Reading 1000+ overwritten json files on time interval

rajkumar3v
New Member

I have 1000+ JSON files in a directory, and those files are overwritten every day. The file names all start with the same characters, as shown below:

1000010496,1000011820,1000013553,1000010097,1000010362...

My issue is that the Splunk forwarder is not reading all the files. I have tried flushing the fishbucket, deleting the indexed data, setting crcSalt, and adding a timestamp to the file names, and none of these helped me get the entire data set. Only a small fraction of the source files show up in Splunk. How can I read these 1000+ files repeatedly without missing data?

The JSON files start like this:

$result = [
{
'advisory_type' => 'Security Advisory',
'date' => '10/12/17',
'advisory_name' => 'CL-SA-2017:0061',
} ....
....

Thanks in advance.


woodcock
Esteemed Legend

The problem is that you have too many files/directories to sort through, and Splunk is getting bogged down tracking everything. You need a housekeeping process (logrotate can do this) that deletes the older log files so they do not hang around "forever". Otherwise this will only get worse. Splunk forwarders really start to bog down when they have to track and sort through thousands of files. Once a forwarder cannot finish one pass over the files before the next pass is due (I have no idea what the numbers are for this), you are in a never-ending cycle of failure and ever-worsening delays. Also, check your inodes, and make sure the splunk user has an unlimited ulimit.
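As a rough way to check the numbers mentioned above, here is a minimal shell sketch. The directory /data/advisories is a hypothetical placeholder for the monitored path, and count_files is just an illustrative helper name, not a Splunk command.

```shell
#!/bin/sh
# Sketch only: /data/advisories is a hypothetical monitored directory.

# Print the number of regular files under a directory (recursively).
count_files() {
    find "$1" -type f | wc -l | tr -d ' '
}

# Usage (uncomment and adjust the path):
# count_files /data/advisories   # how many files the forwarder has to track
# df -i /data/advisories         # inode usage on that filesystem
# ulimit -n                      # open-file limit of the current shell/user
```

Run the ulimit check as the user that runs the forwarder, since limits are per user and per process.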


cpetterborg
SplunkTrust

I find these helpful for raising the limits. For RHEL6 and earlier:

cd /etc/security
cat >> limits.conf <<EOF

*       hard    nofile  102400
*       soft    nofile  10240
*       hard    nproc   16384
*       soft    nproc   16384
EOF

And for RHEL7+:

mkdir -p /etc/systemd/system/splunk.service.d
cat >> /etc/systemd/system/splunk.service.d/filelimit.conf <<EOF

[Service]
LimitNOFILE=10240
EOF

Reboot afterwards.

These can be found around Answers and Docs, but I've provided them here for quick reference. Other Linux distributions will vary, but these settings are typical for most people. Check your version to make sure they will work for you!
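To confirm that the new limit actually reached the running process, one option on Linux is to read the process's limits file under /proc. The sketch below assumes you have already looked up the PID of splunkd yourself; soft_nofile_limit is an illustrative helper name.

```shell
#!/bin/sh
# Print the soft "Max open files" value from a /proc/<pid>/limits style file.
# The fourth whitespace-separated field on that line is the soft limit.
soft_nofile_limit() {
    awk '/^Max open files/ { print $4 }' "$1"
}

# Usage on Linux (replace <splunkd-pid> with the real PID):
# soft_nofile_limit /proc/<splunkd-pid>/limits
```

If the value printed is still the old default, the limits change has not been picked up by the service yet.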


rajkumar3v
New Member

After adding "initCrcLength=1048576", this issue was resolved, but when the sources were overwritten, the unique source count in the search head dropped.
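A possible explanation for why initCrcLength made a difference: by default the forwarder checksums only the first 256 bytes of a file to decide whether it has already seen it, so files that all begin with the same text can look identical. The sketch below estimates how many distinct file beginnings exist in a directory. It uses cksum as a stand-in for Splunk's internal checksum, so treat the result as an approximation; count_distinct_heads is an illustrative helper name.

```shell
#!/bin/sh
# Count distinct "heads" (first N bytes) among the regular files in a directory.
# cksum stands in for Splunk's internal CRC here; this is only an approximation.
count_distinct_heads() {
    dir=$1
    nbytes=${2:-256}   # default mirrors the 256-byte default of initCrcLength
    for f in "$dir"/*; do
        if [ -f "$f" ]; then
            head -c "$nbytes" "$f" | cksum
        fi
    done | sort -u | wc -l | tr -d ' '
}

# Usage (the path is a hypothetical placeholder):
# count_distinct_heads /data/advisories 256
# count_distinct_heads /data/advisories 1048576
```

If the count at 256 bytes is far below the number of files and rises with a larger length, identical file beginnings are a likely cause of the skipped files.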


cpetterborg
SplunkTrust

Have you ever had all of them indexed (like on the initial start of the forwarder, not just re-reading the files after they are updated)?


rajkumar3v
New Member

No, at the time of the first indexing, Splunk didn't read all the files. It listed only 356 sources instead of 1300.


cpetterborg
SplunkTrust

Do you get a full list of the files when you run this on the forwarder?

splunk list monitor

rajkumar3v
New Member

Yes, I am getting the full list, but in the search head I get only 229 unique sources. I think Splunk is monitoring the paths specified in the monitor input but is not reading the files, in order to avoid re-indexing the same file name or content.
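To see exactly which files are monitored but never show up as sources, the two lists can be compared directly. This sketch assumes you have saved each list to a text file with one path per line (for example, paths taken from the monitor listing and an export of the distinct sources from a search); the file names and the missing_sources helper are hypothetical.

```shell
#!/bin/sh
# Print paths listed in the first file that do not appear in the second file.
# Both files are assumed to hold exactly one path per line.
missing_sources() {
    grep -Fvx -f "$2" "$1"
}

# Usage (hypothetical file names):
# missing_sources monitored.txt indexed.txt
```

Looking at what the missing files have in common (size, first lines, modification time) may narrow down why they are skipped.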


rajkumar3v
New Member

If anyone has a solution, please post it.


cpetterborg
SplunkTrust

If you have purchased Splunk and have a valid support contract, I'd submit a case to Splunk support.

Also, if you are not running the latest version of Splunk, you may want to upgrade.

And finally, if there are empty JSON files, they will not show up in the indexers or in searches because there is no data to index. Check for empty files.
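A quick way to check for empty files is find with a size test. This is a minimal sketch; the directory is a hypothetical placeholder and list_empty_files is an illustrative helper name.

```shell
#!/bin/sh
# List zero-byte regular files under a directory (recursively).
list_empty_files() {
    find "$1" -type f -size 0
}

# Usage (hypothetical path):
# list_empty_files /data/advisories
```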
