Getting Data In

Why are we missing data in Splunk after rsyslog?

Arkon
Explorer

Hello,

I am missing data in my current setup (about 20 to 30%).

  1. Instance A is sending data to Instance B on port 514
  2. Instance B uses rsyslog to get the data and log it into a file called /var/log/app.log
  3. Splunk indexes /var/log/app.log

All the data from Instance A is arriving perfectly well in /var/log/app.log.
However, some events are missing in Splunk.

Would you have any idea about the potential issue please?
Thank you very much in advance


hardikJsheth
Motivator

If instance B is a Splunk instance, I would suggest using a TCP input in inputs.conf; there is no need to route through rsyslog.
For example:
[tcp://instanceA:514]

For more information, refer to the TCP section of the inputs.conf spec.


teunlaan
Contributor

I think he is using UDP 514, in which case rsyslog is a very good idea.
Otherwise you lose data every time Splunk restarts.


Arkon
Explorer

I do have to use UDP 514, but all the data is arriving in the file (plus it all stays within the same VPC, so there is very low risk of losing datagrams). So UDP is fine; Splunk is just not indexing everything from the file.


hardikJsheth
Motivator

We have been using both syslog forwarding and the TCP listener provided by Splunk to onboard data from different sources, such as firewalls, which produce large volumes of data. Even so, all of our data is getting onboarded.

Can you find any pattern in the data that is missed? Do you have a log rotation policy?


Arkon
Explorer

Yes, we do have a log rotation policy. It looks like log rotation is the issue, but I am not entirely sure yet. Is it the same on your side?
Apparently it is the log rotation that breaks everything, even if I try setting a higher initCrcLength or crcSalt=.
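
For anyone testing the rotation theory, here is a minimal sketch of a rotation-aware monitor stanza in inputs.conf. It assumes logrotate renames the live file to something like /var/log/app.log.1 and compresses older copies; the rotated file names, the sourcetype name app_syslog and the index are assumptions, not details taken from the setup described above. Monitoring the rotated copy as well gives Splunk a chance to read events written just before the rotation, and its CRC check should normally recognise the rotated file as content it has already seen rather than indexing it again from the start.

```ini
# inputs.conf on instance B (sketch; adjust names to the real setup)
# The wildcard matches app.log and rotated copies such as app.log.1
[monitor:///var/log/app.log*]
# Skip compressed archives of older rotations
blacklist = \.(gz|bz2|zip)$
# Hypothetical sourcetype and index, not from the original post
sourcetype = app_syslog
index = main
disabled = false
```

The inputs.conf spec also cautions against using crcSalt = <SOURCE> with rolling log files, since a rolled file can then be indexed a second time. If logrotate is configured with copytruncate, it may be worth testing a rename-based rotation instead, because the logrotate documentation notes that data written between the copy and the truncate can be lost.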


teunlaan
Contributor

Are you sure it is "missing"?

We had multiple problems with time extraction:
- another timestamp in the message that was picked up as the event time
- an ID that was picked up as epoch time
- a wrong cut-off for the timestamp (a timestamp without a year, followed by an IP address; the first 2 digits of the IP were used as the year)

To see if the timestamp is wrong, start a real-time search. Just watch the timeline for events popping up in the past.

If they pop up in the past, you have to alter the props.conf for time extraction
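
As a rough illustration of the kind of props.conf change meant here, the sketch below assumes the events start with a classic syslog timestamp such as "Mar 12 08:15:02", followed by a host name or IP address. The sourcetype name app_syslog is hypothetical, and the format string has to be matched to the real data.

```ini
# props.conf (sketch; the stanza name is a hypothetical sourcetype)
[app_syslog]
# The timestamp sits at the very start of the event
TIME_PREFIX = ^
# Month name, day of month, time of day; the event carries no year
TIME_FORMAT = %b %d %H:%M:%S
# Stop after the 15 characters of the timestamp so that digits from a
# following IP address or ID are not read as part of the time
MAX_TIMESTAMP_LOOKAHEAD = 15
```

These settings take effect where the data is parsed (an indexer or heavy forwarder), and they only apply to events indexed after the change.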


Arkon
Explorer

I am going to try a real-time search right now, but when I check whether the data is arriving, I run a very general search that looks for specific content in _raw, with no sourcetype or source filter.
