Getting Data In

When a universal forwarder is unable to connect to an indexer, will the forwarder still be collecting data from the server?

senthamilselvan
Engager

Hi Team,

We have log files on one of our servers; a new one is generated in the directory every 10 minutes, as shown below:

12/13/17 10:10 log1213171010
12/13/17 10:20 log1213171020
12/13/17 10:30 log1213171030
12/13/17 10:40 log1213171040
...........
...........
12/13/17 11:50 log1213171150 and keeps going.

We had an issue: our Splunk indexer was down for about 2 hours. We have since fixed the indexer, but we noticed that the logs above are missing from Splunk for the period when the indexer was down, even though the forwarder was up and running fine the whole time.

I have a few questions about this.
1. When the universal forwarder cannot connect to its (standalone) indexer, does the forwarder still collect data from the server?
2. If the forwarder does collect the data, will it resend the old data once the connection to the indexer is re-established?

Please help me with this.


yannK
Splunk Employee

If the forwarder cannot forward, it pauses forwarding once its tcpout queue is full (a few hundred KB).
This means it also pauses monitoring log files and stops listening to network inputs.
When forwarding can resume, it picks up from the log files on disk, starting at the last position it recorded.
For network inputs, it simply starts listening again.

The possible problems are:

  • Network inputs: for things like streamed syslog, the data is ephemeral by nature, so the missing events cannot be backfilled.
    For those you may want to add a persistent queue (see inputs.conf).

  • Modular inputs/scripts: as the scheduler may skip them, they have to be coded in a way that allows resuming from a saved checkpoint (for example, the Windows event log inputs work like that).

  • Log files with rotation
    Example: a log file /var/log/app.log is rotated to app.log.1, etc.,
    with an inputs.conf stanza like [monitor:///var/log/app.log].
    If you only monitor the first copy of the file and it rotated during the outage, Splunk will not be able to resume in the rotated file.
    For those you may want to check your rotation rules and also monitor the rotated copies from the Splunk inputs:
    [monitor:///var/log/app.log*]
    Splunk uses a checksum of the beginning of each file (the first 256 bytes by default, see initCrcLength in inputs.conf) to identify rotated files.
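
As a rough illustration of the two suggestions above (monitoring rotated copies and adding a persistent queue to a network input), an inputs.conf on the forwarder might look like the sketch below. The path, the UDP port, the sourcetype, and the queue sizes are assumptions chosen for the example, not values from this thread; persistent queues apply to network and scripted inputs, not to file monitors.

```ini
# inputs.conf on the universal forwarder (illustrative sketch only)

# Monitor the live file and its rotated copies (app.log, app.log.1, ...).
# The path is an assumed example.
[monitor:///var/log/app.log*]
disabled = false

# Assumed syslog listener on UDP 514.
# queueSize is the in-memory input queue; persistentQueueSize adds an
# on-disk queue that absorbs events while forwarding is blocked.
# Both sizes here are placeholders to tune for your volume.
[udp://514]
sourcetype = syslog
queueSize = 1MB
persistentQueueSize = 100MB
```

The persistent queue only helps for as long as it has free space, so it should be sized against the expected event rate and the longest outage you want to survive.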

senthamilselvan
Engager

Hi YannK,

Thank you!! I have some more queries. As I mentioned earlier, below are my log samples:
12/13/17 10:10 log1213171010
12/13/17 10:20 log1213171020
12/13/17 10:30 log1213171030
12/13/17 10:40 log1213171040
.......
13/13/17 14:10 log1313171410
The forwarder is configured to capture the files as "log*". Our indexer was down for about 8 hours, and we do not see any data indexed for those 8 hours.
1. Once the indexer is back up after 8 hours, will the forwarder be able to send all the old data from the point where it last sent to the indexer, and will the indexer be able to index all of it?
2. What is the maximum amount of data the forwarder can hold?


saurabh_tek11
Communicator
  1. The forwarder's monitor input should be maintaining a fishbucket, i.e. a marker with the last-read checkpoint, since this is a continuous monitor input. So once the indexer is back up and running, the forwarder should be able to send the (8-hour-old) data from where it last sent to the indexer.

  2. At most, the forwarder's tcpout queue can hold about 500 KB of data in transit at a time.

@yannK[Splunk] please confirm.


yannK
Splunk Employee

Correct, there are 2 mechanisms used to keep track of monitoring positions, which allow inputs to resume.

  • The fishbucket is where positions and file CRCs are tracked for log files (in $SPLUNK_HOME/var/lib/splunk/fishbucket).
  • For modular inputs, the modular input checkpoints (for example WinEventLog, which keeps an event ID per channel) (in $SPLUNK_HOME/var/lib/splunk/modinputs/).

And the tcpout buffer queue is very small (a few hundred KB by default with useACK=false, and 27MB with useACK=true).
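
For reference, the output queue and acknowledgment behavior mentioned above are controlled from outputs.conf on the forwarder. The sketch below is only an illustration: the group name, the indexer address, and the explicit queue size are assumptions, not settings taken from this thread.

```ini
# outputs.conf on the universal forwarder (illustrative sketch only)

[tcpout]
# "primary_indexers" is a hypothetical group name.
defaultGroup = primary_indexers

[tcpout:primary_indexers]
# Assumed indexer host and receiving port.
server = indexer.example.com:9997
# Indexer acknowledgment: the forwarder holds sent data in a wait queue
# until the indexer confirms it, then resends anything unconfirmed.
useACK = true
# Output queue size. The default is "auto", which picks a size based on
# useACK; an explicit value like this one is a placeholder.
maxQueueSize = 10MB
```

A larger queue only buffers a little more data in memory. For monitored files, recovery after a long outage relies on the fishbucket positions rather than on this queue.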
