Getting Data In

Problems with CRCs in Monitored Files

Joel_Gerber
Explorer

I have the following inputs.conf stanza, stored in /opt/splunk/etc/apps/search/local/inputs.conf:

[monitor:///home/users/c20dd/[A-Z]*/omdd/*.CSV]
host_regex = \w+\.\w{10}\.\d{4}\.(\w+).*
index = for_testing
sourcetype = C20_OM
crcSalt = <SOURCE>
disabled = 0
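As a quick sanity check on the stanza above, here is a minimal sketch of what host_regex would capture for one of the monitored files. It assumes Python's re module behaves like Splunk's regex engine for this particular pattern (an assumption, since Splunk uses PCRE); the path is the one that appears in the log output in this thread.

```python
import re

# host_regex copied verbatim from the inputs.conf stanza above.
HOST_REGEX = r"\w+\.\w{10}\.\d{4}\.(\w+).*"

# Path taken from the TailingProcessor log lines in this thread.
path = (
    "/home/users/c20dd/CHRHNSACDS0/omdd/"
    "Netops02-GRP1.01_28_2014.2030.CHRHNSACDS0.OT.CSV"
)

# The first capture group is what would become the host value.
match = re.search(HOST_REGEX, path)
host = match.group(1) if match else None
print(host)
```

With this pattern, the capture group lands on the switch name that follows the four-digit time field in the file name, so the regex itself does not look like the cause of the problem.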

My files aren't getting indexed, so I set category.TailingProcessor to DEBUG in my /opt/splunk/etc/log-local.cfg file.

Here is what I get for one of the files in the matched monitor directory:

01-29-2014 14:07:37.479 -0400 ERROR TailingProcessor - File will not be read, seekptr checksum did not match (file=/home/users/c20dd/CHRHNSACDS0/omdd/Netops02-GRP1.01_28_2014.2030.CHRHNSACDS0.OT.CSV).  Last time we saw this initcrc, filename was different.  You may wish to use a CRC salt on this source.  Consult the documentation or file a support case online at http://www.splunk.com/page/submit_issue for more info.
01-29-2014 14:07:37.474 -0400 DEBUG TailingProcessor -   Will attempt to read file: /home/users/c20dd/CHRHNSACDS0/omdd/Netops02-GRP1.01_28_2014.2030.CHRHNSACDS0.OT.CSV.
01-29-2014 14:07:37.474 -0400 DEBUG TailingProcessor -   Will use CRC salt='/home/users/c20dd/CHRHNSACDS0/omdd/Netops02-GRP1.01_28_2014.2030.CHRHNSACDS0.OT.CSV' for this source.
01-29-2014 14:07:37.474 -0400 DEBUG TailingProcessor -   Item '/home/users/c20dd/CHRHNSACDS0/omdd/Netops02-GRP1.01_28_2014.2030.CHRHNSACDS0.OT.CSV' matches stanza: /home/users/c20dd/[A-Z]*/omdd/*.CSV.
01-29-2014 14:07:37.474 -0400 DEBUG TailingProcessor - File state notification for path='/home/users/c20dd/CHRHNSACDS0/omdd/Netops02-GRP1.01_28_2014.2030.CHRHNSACDS0.OT.CSV' (first time).

Why is this happening? From everything I can see, this should've worked.

1 Solution

Joel_Gerber
Explorer

Ok, this is going to seem kinda lame, but I'm going to answer my own question. With the help of ^Brian^ and amrit|wrk on EFNet's #splunk channel, I was able to come up with a solution. First off, here are the first few lines of one of the files in question:

Date, Time, Switch Name, Group Name, Key/Info Field, Reg1 Name, Reg1 Value, Reg2 Name, Reg2 Value, Reg3 Name, Reg3 Value, Reg4 Name, Reg4 Value, Reg5 Name, Reg5 Value, Reg6 Name, Reg6 Value, Reg7 Name, Reg7 Value, Reg8 Name, Reg8 Value, Reg9 Name, Reg9 Value, Reg10 Name, Reg10 Value, Reg11 Name, Reg11 Value, Reg12 Name, Reg12 Value, Reg13 Name, Reg13 Value, Reg14 Name, Reg14 Value, Reg15 Name, Reg15 Value, Reg16 Name, Reg16 Value, Reg17 Name, Reg17 Value, Reg18 Name, Reg18 Value, Reg19 Name, Reg19 Value, Reg20 Name, Reg20 Value, Reg21 Name, Reg21 Value, Reg22 Name, Reg22 Value, Reg23 Name, Reg23 Value, Reg24 Name, Reg24 Value, Reg25 Name, Reg25 Value, Reg26 Name, Reg26 Value, Reg27 Name, Reg27 Value, Reg28 Name, Reg28 Value, Reg29 Name, Reg29 Value, Reg30 Name, Reg30 Value, Reg31 Name, Reg31 Value, Reg32 Name, Reg32 Value
01-30-2014,06:30:07,Host_Name,NPILNS,0,LMTRU,2562,NTERMATT,550,ORIGABN,1768,PERCLFL,0,ORIGFAIL,111,ORIGBLK,0,TERMBLK,0,NORIGATT,6334
01-30-2014,06:30:07,Host_Name,MS,MS 0,MSERR,0,MSFLT,0,MSDIA,0,MSDIAF,0,MSMBP,0,MSMBU,0,MSSBU,0,MSCDERR,0,MSCDFLT,0,MSCDDIA,0,MSCDDIAF,0,MSCDMBP,0,MSCDMBU,0,MSCDSBU,0,MSPTERR,0,MSPTFLT,0,MSPTDIA,60,MSPTDIAF,0,MSPTMBP,0,MSPTMBU,0,MSPTSBU,0
01-30-2014,06:30:07,Host_Name,MS,MS 1,MSERR,0,MSFLT,0,MSDIA,0,MSDIAF,0,MSMBP,0,MSMBU,0,MSSBU,0,MSCDERR,0,MSCDFLT,0,MSCDDIA,0,MSCDDIAF,0,MSCDMBP,0,MSCDMBU,0,MSCDSBU,0,MSPTERR,0,MSPTFLT,0,MSPTDIA,60,MSPTDIAF,0,MSPTMBP,0,MSPTMBU,0,MSPTSBU,0

Here is my new monitor:// configuration that works:

[monitor:///home/users/c20dd/[A-Z]*/omdd/*.CSV]
host_regex = \w+\.\w{10}\.\d{4}\.(\w+).*
index = for_testing
sourcetype = C20_OM
initCrcLength = 4096
disabled = 0

The line that fixed everything was initCrcLength = 4096. The problem was that my CSV header is so long that, for some reason (I don't have a good answer on this), Splunk wasn't able to tell my files apart by their contents, even with the crcSalt I had specified before. With the increased initCrcLength, the files were gobbled up perfectly.
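To illustrate why a long shared header can matter, here is a minimal sketch, not Splunk's actual implementation. It assumes the initial checksum covers only the first initCrcLength bytes of a file (256 by default, per the inputs.conf spec) and uses zlib.crc32 as a stand-in for whatever checksum Splunk computes internally. The two "files" are hypothetical: they share a header modeled on the one above and differ only in a data row. The sketch does not model crcSalt, so it does not explain why the salt alone failed to help here.

```python
import zlib

# Hypothetical stand-ins for two monitored files: both begin with the same
# long CSV header (well over 256 bytes) and differ only in one data row.
header = "Date, Time, Switch Name, Group Name, Key/Info Field, " + ", ".join(
    f"Reg{i} Name, Reg{i} Value" for i in range(1, 33)
) + "\n"
file_a = (header + "01-30-2014,06:30:07,Host_A,NPILNS,0,LMTRU,2562\n").encode()
file_b = (header + "01-30-2014,06:30:07,Host_B,NPILNS,0,LMTRU,2562\n").encode()


def initial_crc(data: bytes, length: int) -> int:
    """Checksum of the first `length` bytes (zlib.crc32 is an assumed stand-in)."""
    return zlib.crc32(data[:length])


# With a 256-byte window, only header bytes are hashed, so both files collide.
same_at_256 = initial_crc(file_a, 256) == initial_crc(file_b, 256)

# With a 4096-byte window, the differing data row is included in the checksum.
same_at_4096 = initial_crc(file_a, 4096) == initial_crc(file_b, 4096)

print(len(header), same_at_256, same_at_4096)
```

In this sketch the 256-byte checksums are identical because the window never reaches past the header, while the 4096-byte checksums differ. That is consistent with initCrcLength = 4096 letting Splunk distinguish the files.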

amrit was the guy who developed the new field extraction code introduced in Splunk 6.0, and even he wasn't able to explain why Splunk couldn't handle my data without any CRC configuration present. He seemed to think that his new code should handle it. Oh well, at least it works now 🙂


mmekroud
Explorer

Dear all,
I tried this solution and it worked perfectly (tested on Splunk 6.5.2).

Thanks for the rich experience.

Regards,
Mo


mcronkrite
Splunk Employee

Hi OP, wondering if your files were on an auto rotate backup schedule?


JayJohns
Engager

Great, thanks. I will try it out.


Joel_Gerber
Explorer

I also have the following props.conf stanza, stored in /opt/splunk/etc/system/local/props.conf:

[C20_OM]
FIELD_DELIMITER = ,
INDEXED_EXTRACTIONS = CSV
FIELD_HEADER_REGEX = ^(Date.*)$
pulldown_type = true