Identifying empty file upload on Splunk

ashish9433
Communicator

Hi Team,

I have a folder into which batch jobs load data files that are consumed by Splunk. The data files are large and sometimes take hours to finish loading into the directory. Once a file is loaded completely, an empty file is created with the same name as the data file but with the extension .trg.

This empty .trg file just indicates that the data has been completely copied and the next process can start.

When monitoring the directory, the empty files would not be uploaded into Splunk as usual, since they don't contain any data. But is there a way, using the metadata command or a REST command, to find out whether Splunk attempted to upload that empty file, so that I can alert that the file has been completely copied into the directory?

0 Karma

datasearchninja
Communicator

Hi, if you set CHECK_METHOD to modtime in a props.conf on the universal forwarder, then splunkd.log will have an entry when the file is created.

E.g., if /opt/splunkforwarder/etc/system/local/inputs.conf had an entry:

[monitor:///data/test/*.trg]

And there was a /opt/splunkforwarder/etc/system/local/props.conf with an entry:

[source::/data/test/*.trg]
CHECK_METHOD = modtime

Then, when a file is created, splunkd.log will have an entry like the following:

# date
Thu Jul  5 15:19:53 AEST 2018
# touch /data/test/my_new_file.trg
# grep trg /opt/splunkforwarder/var/log/splunk/splunkd.log
07-05-2018 15:20:08.415 +1000 INFO  WatchedFile - Will use tracking rule=modtime for file='/data/test/my_new_file.trg'.

If these splunkd logs are forwarded, you can search them with index=_internal.
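If you would rather check the forwarder's log outside Splunk, here is a minimal Python sketch that pulls the .trg file names out of lines shaped like the one above. The regular expression is based only on that single sample line, so treat the pattern as an assumption and adjust it if your splunkd.log entries look different. The function name and the second sample line are made up for illustration.

```python
import re

# Pattern modelled on the single WatchedFile line quoted above (an assumption:
# other Splunk versions may word this message differently).
TRACKING_LINE = re.compile(
    r"^(?P<timestamp>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4})\s+INFO\s+"
    r"WatchedFile - Will use tracking rule=modtime for file='(?P<path>[^']+\.trg)'"
)


def find_trigger_files(log_lines):
    """Return a (timestamp, path) tuple for each .trg file named in the log lines."""
    hits = []
    for line in log_lines:
        match = TRACKING_LINE.search(line)
        if match:
            hits.append((match.group("timestamp"), match.group("path")))
    return hits


sample = [
    # Line copied from the splunkd.log excerpt above.
    "07-05-2018 15:20:08.415 +1000 INFO  WatchedFile - Will use tracking "
    "rule=modtime for file='/data/test/my_new_file.trg'.",
    # Hypothetical unrelated line, only here to show that it is skipped.
    "07-05-2018 15:20:09.001 +1000 INFO  SomeComponent - unrelated message",
]

for timestamp, path in find_trigger_files(sample):
    print(timestamp, path)
```

To run it against a real log, pass an open file handle for /opt/splunkforwarder/var/log/splunk/splunkd.log instead of the sample list.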

0 Karma

niketn
Legend

@ashish9433 can you not blacklist all *.trg files so they are not indexed into Splunk while monitoring the folder?
Alternatively, you can whitelist the file format of the actual data files so that only those files are allowed.

Refer to documentation: https://docs.splunk.com/Documentation/Splunk/latest/Data/Whitelistorblacklistspecificincomingdata

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

ashish9433
Communicator

@niketnilay I do not want to block those files. In fact, I want a way to identify whether the respective .trg file has become available in the directory or not.

If I see that an empty .trg file is available in the directory, I can alert using Splunk, and that is my requirement.

0 Karma

niketn
Legend

@ashish9433 then in a Splunk search use source="*.trg" to identify the empty file.

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"
0 Karma

ashish9433
Communicator

@niketnilay no, it doesn't work; I do not see any events. I assume that since the file is empty it does not create any event, and with no event there is no entry in the source list. I may be wrong, but I cannot see any result for source=*.trg

Is there any workaround for getting the list of uploaded files using the metadata or REST command?

0 Karma

Azeemering
Builder

Splunk does not ingest the file itself; Splunk ingests the data / text in the file. Since the file is empty, nothing will be indexed into Splunk.
You can use a REST call to check the status of a file: $SPLUNK_HOME/bin/splunk _internal call /services/admin/inputstatus/TailingProcessor:FileStatus

For example: https://localhost:8089/services/admin/inputstatus/TailingProcessor:FileStatus. What you can do is ingest the REST call results back into Splunk and create a dashboard on the status of files.

Per file it will say something like this:

file position 75
file size 75
percent 100.00
type finished reading
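As a rough sketch of how that output could be turned into an alert, the Python below filters a per-file status mapping for .trg entries. It assumes you have already fetched the endpoint and converted the result into a dictionary keyed by file path, with the fields shown above. That dictionary shape, the function name, and the sample paths and values are assumptions for illustration, not a documented format. Whether an empty file is listed at all, and with which type, is something to verify on your own forwarder.

```python
def trigger_file_status(inputs, suffix=".trg"):
    """Return {path: type} for every tracked file whose path ends with the suffix.

    `inputs` is assumed to map a file path to a dict of status fields such as
    "file position", "file size", "percent" and "type".
    """
    return {
        path: status.get("type", "unknown")
        for path, status in inputs.items()
        if path.endswith(suffix)
    }


# Hypothetical sample data: the first entry reuses the values quoted above,
# the second is an invented entry for an empty trigger file.
sample_inputs = {
    "/data/test/my_new_file.csv": {
        "file position": 75,
        "file size": 75,
        "percent": 100.00,
        "type": "finished reading",
    },
    "/data/test/my_new_file.trg": {
        "file size": 0,
        "type": "finished reading",
    },
}

for path, file_type in trigger_file_status(sample_inputs).items():
    print(path, "->", file_type)
```

If the .trg path shows up in the result, that could serve as the signal that the data file has been fully copied.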

0 Karma