Greetings fellow Splunkers,
I'm having some issues extracting the correct host name from log file names at index time.
I keep my log files in the following directory:
/var/splunk/input/mms_logs/
The filename structure is:
mms_HOST-IP-ADDRESS_TIMESTAMP.log
examples:
mms_10.152.58.100_20110101_004000_06137.log
mms_10.152.58.194_20110121_120000_70656.log
All log files have identical file permissions: (rw-r--r--).
To extract the IP address portion of the filename as the host, I used the following regex:
/var/splunk/input/mms_logs/mms_(\d+.\d+.\d+.\d+)_\d+
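A pattern like this can be sanity-checked outside Splunk with a few lines of Python. This is only a sketch under assumptions: it escapes the dots so they match literal periods rather than any character, the names HOST_RE, paths and hosts are made up for illustration, and Python's re module is not guaranteed to behave identically to Splunk's regex engine.

```python
import re

# Hypothetical check of the host pattern, with the dots escaped so that
# "." only matches a literal period.
HOST_RE = re.compile(r"/var/splunk/input/mms_logs/mms_(\d+\.\d+\.\d+\.\d+)_\d+")

# The two example filenames from the question, with the input directory prepended.
paths = [
    "/var/splunk/input/mms_logs/mms_10.152.58.100_20110101_004000_06137.log",
    "/var/splunk/input/mms_logs/mms_10.152.58.194_20110121_120000_70656.log",
]

# Capture group 1 is the IP address that should become the host.
hosts = [HOST_RE.match(p).group(1) for p in paths]
print(hosts)  # ['10.152.58.100', '10.152.58.194']
```

Both example filenames yield a distinct IP address, which suggests the pattern itself is not what collapses everything into one host.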
While that works, it only seems to extract the hostname and event data from a single file: searching shows just one source, one sourcetype, and one host.
This is despite 10,000+ files being in the directory, and the list of data inputs showing that it has detected 12055 files in the directory 😕
If someone can shed some light on this it would be greatly appreciated 🙂
Right... a bit of digging around has turned up the goods 🙂
I checked /var/splunk/logs/splunk/splunkd.log, which had multiple instances of the following:
02-23-2011 11:51:25.673 ERROR TailingProcessor - Ignoring path due to: File will not be read, seekptr checksum did not match (file=/var/splunk/input/mms_logs/mms_10.152.58.196_20110206_211500_28809). Last time we saw this initcrc, filename was different. You may wish to use a CRC salt on this source. Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.
One thing I neglected to mention (and had no idea would be an issue) is that this data had previously been indexed, and I had then cleaned out the index with:
/opt/splunk/bin/splunk clean eventdata -index main
After searching for "ERROR TailingProcessor - Ignoring path due to: File will not be read, seekptr checksum did not match", I was directed to this question on answers.splunk.com:
http://answers.splunk.com/questions/1568/windows-dhcp-log-files-too-small-to-match-seekptr-checksum
Adding crcSalt = <SOURCE> to the bottom of my inputs.conf and restarting Splunk solved the issue. Now to get my head around why this was needed...
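One way to picture why the salt helps, going by the error message above ("Last time we saw this initcrc, filename was different"): if a file is recognised by a checksum over its first bytes, then files that start with identical content look like the same file, and mixing the source path into the checksum tells them apart. The sketch below only illustrates that idea. The 256-byte window, the use of zlib.crc32, the shared header, and the init_crc helper are all assumptions made for the example, not Splunk's actual implementation.

```python
import zlib

def init_crc(content: bytes, salt: str = "") -> int:
    """Checksum the first 256 bytes of content, optionally prefixed by a salt."""
    return zlib.crc32(salt.encode() + content[:256])

# Hypothetical shared header, longer than 256 bytes, followed by different events.
header = b"#Fields: date time host msg\n" * 20
file_a = header + b"event from host A\n"
file_b = header + b"event from host B\n"

path_a = "/var/splunk/input/mms_logs/mms_10.152.58.100_20110101_004000_06137.log"
path_b = "/var/splunk/input/mms_logs/mms_10.152.58.194_20110121_120000_70656.log"

# Without a salt, both files produce the same checksum and look like one file.
print(init_crc(file_a) == init_crc(file_b))

# Salting with the source path makes the checksums differ.
print(init_crc(file_a, path_a) == init_crc(file_b, path_b))
```

In this toy model the unsalted checksums collide because only the shared header falls inside the window, while the salted ones differ because each path is unique.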
Hey meno 🙂 Sorry for not getting back to you earlier, but I didn't notice the small text below my question. I managed to find the answer elsewhere, and I now have enough information to look for further clarification.
Thanks 🙂
I would like to see your inputs.conf, props.conf, transforms.conf, if possible 😉