Host name extraction via regex on indexing - Only indexing a single file?

rturk
Builder

Greetings fellow Splunkers,

I'm having some issues with extracting the correct host name from log file names on indexing.

I keep my log files in the following directory:

/var/splunk/input/mms_logs/

The filename structure is:

mms_HOST-IP-ADDRESS_TIMESTAMP.log

examples:

mms_10.152.58.100_20110101_004000_06137.log
mms_10.152.58.194_20110121_120000_70656.log

All log files have identical file permissions: (rw-r--r--).

To extract the IP address portion of the filename as the host, I used the following regex:

/var/splunk/input/mms_logs/mms_(\d+.\d+.\d+.\d+)_\d+

That works, but it only seems to extract the host name and event data from a single file: I end up with a single source, sourcetype, and host.

This is despite 10,000+ files being in the directory, and the list of data inputs showing that it has detected 12055 files there.
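
For anyone wanting to sanity-check the pattern outside Splunk, here is a minimal Python sketch. It assumes Python's `re` module behaves the same as Splunk's regex engine for these simple constructs. The `STRICT` variant with escaped dots and the `extract_host` helper are illustrative additions, not something from my actual configuration. Note that the pattern as posted leaves the dots unescaped, so each `.` matches any character rather than only a literal dot.

```python
import re

# Pattern as posted: an unescaped "." matches any character.
POSTED = re.compile(r"/var/splunk/input/mms_logs/mms_(\d+.\d+.\d+.\d+)_\d+")
# Stricter variant (an assumption, not from the post): dots escaped.
STRICT = re.compile(r"/var/splunk/input/mms_logs/mms_(\d+\.\d+\.\d+\.\d+)_\d+")


def extract_host(path, pattern=STRICT):
    """Return the captured host portion of path, or None if there is no match."""
    match = pattern.search(path)
    return match.group(1) if match else None


paths = [
    "/var/splunk/input/mms_logs/mms_10.152.58.100_20110101_004000_06137.log",
    "/var/splunk/input/mms_logs/mms_10.152.58.194_20110121_120000_70656.log",
]
hosts = [extract_host(p) for p in paths]
print(hosts)
```

Both patterns capture the same host for the example filenames; the difference only shows up on malformed names, which the stricter pattern rejects.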

If someone can shed some light on this it would be greatly appreciated 🙂

1 Solution

rturk
Builder

Right... a bit of digging around has turned up the goods 🙂

Checked in /var/splunk/logs/splunk/splunkd.log which had multiple instances of the following:

02-23-2011 11:51:25.673 ERROR TailingProcessor - Ignoring path due to: File will not be read, seekptr checksum did not match (file=/var/splunk/input/mms_logs/mms_10.152.58.196_20110206_211500_28809).  Last time we saw this initcrc, filename was different.  You may wish to use a CRC salt on this source.  Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.

One thing I neglected to mention (and had no idea would be an issue) is that this data had previously been indexed. I had since cleaned out the index with:

/opt/splunk/bin/splunk clean eventdata -index main

After searching for "ERROR TailingProcessor - Ignoring path due to: File will not be read, seekptr checksum did not match", I was directed to this question on answers.splunk.com:

http://answers.splunk.com/questions/1568/windows-dhcp-log-files-too-small-to-match-seekptr-checksum

Adding crcSalt = <SOURCE> to the bottom of my inputs.conf and restarting Splunk solved the issue. Now to get my head around why this was needed...
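
As a rough illustration of why a salt can help, here is a small Python sketch of the idea behind the "initcrc" check mentioned in the error message. Everything in it is an assumption for illustration only: `zlib.crc32` stands in for whatever checksum Splunk actually uses, 256 is assumed as the number of leading bytes checked, and the file contents and the `init_crc` helper are hypothetical. The point is only that files which begin with identical bytes produce the same checksum unless something file-specific, such as the source path, is mixed in.

```python
import zlib

INIT_CRC_LENGTH = 256  # assumed number of leading bytes that get checksummed


def init_crc(data: bytes, source: str = "", salt_with_source: bool = False) -> int:
    """Checksum the start of a file, optionally salted with its source path."""
    head = data[:INIT_CRC_LENGTH]
    if salt_with_source:
        # Mixing the path in makes the checksum depend on the filename too.
        head = source.encode("utf-8") + head
    return zlib.crc32(head)


# Two hypothetical log files that start with the same long header.
header = b"#MMS LOG v1\n" + b"-" * 300
file_a = (
    "/var/splunk/input/mms_logs/mms_10.152.58.100_20110101_004000_06137.log",
    header + b"event A\n",
)
file_b = (
    "/var/splunk/input/mms_logs/mms_10.152.58.194_20110121_120000_70656.log",
    header + b"event B\n",
)

unsalted = {init_crc(data) for _, data in (file_a, file_b)}
salted = {init_crc(data, src, salt_with_source=True) for src, data in (file_a, file_b)}
print(len(unsalted), len(salted))
```

Without the salt, the two files are indistinguishable by checksum, which would match the "Last time we saw this initcrc, filename was different" message. Whether my log files really share a common header is not something I have verified here.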

rturk
Builder

Hey meno 🙂 Sorry for not getting back to you earlier; I didn't notice the small text below my question. I managed to find the answer elsewhere, and I now have some leads to follow up on for further clarification.

Thanks 🙂


meno
Path Finder

I would like to see your inputs.conf, props.conf, transforms.conf, if possible 😉
