Getting Data In

Why are the timestamps different when indexing CSV files locally versus being sent via universal forwarder?

ragedsparrow
SplunkTrust
SplunkTrust

I'm having an issues with timestamps on CSV files. Here is what a sample of raw data looks like:

    DATE,HOUR,WORKFILE_COUNT,RATE_PER_WORKFILE,EST_NBR_WORKFILES_PER_HOUR,WORK_HOLD_FILE_COUNT,WORK_DONE_FILE_COUNT,UNPROCESSED_FILE_COUNT,BAD_FILE_COUNT
09/13/15,15,1172,247.238,14560.9,0,0,0,13
09/13/15,16,1190,247.011,14574.3,0,0,0,13
09/13/15,17,1217,227.313,15837.2,0,0,0,13
09/13/15,18,782,231.839,15528,0,0,0,13
09/13/15,19,648,240.61,14962,0,0,0,13
09/13/15,20,629,238.669,15083.6,0,0,0,13
09/13/15,21,394,242.297,14857.8,0,0,0,13
09/13/15,22,493,295.181,12195.9,0,0,0,13
09/13/15,23,257,280.949,12813.7,0,0,0,13
09/14/15,00,64,235.125,15311,0,0,0,13

Here is the props.conf in /etc/system/local on the indexer:

[csv]
NO_BINARY_CHECK = true
TIMESTAMP_FIELDS = DATE,HOUR
TIME_FORMAT = %m/%d/%y%n%k
disabled = false

Now when I index locally using the CSV sourcetype, everything looks good and the timestamp is correct:

rightTimeStamps

Now, when I have the same file sent via universal forwarder, the modtime is used as the timestamp, not the DATE, HOUR columns in the csv file:

alt text

I have tried a variety of things with the props.conf file, but all seems to be failing when it is sent via the forwarder.

0 Karma
1 Solution

gcato
Contributor

Hi ragedsparrow,

I guess your INDEXED_EXTRACTIONS = csv is configured in the forwarder's props.conf (Splunk v6) then the forwarder will be parsing the csv file into a structured stream first and then passing it to the indexer ready for immediate indexing (i.e. the indexer will not parse the file again).

Refer to the following in the docs: http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Extractfieldsfromfileheadersatindextime#Cavea...

You will need to deloy your props configuration to your forwarders, or define another sourcetype for the csv files and ensure the INDEXED_EXTRACTIONS does not happen on the forwarder so the data is forwarded to the indexer in an unstructured state.

Hope this makes sense and helps.

View solution in original post

gcato
Contributor

Hi ragedsparrow,

I guess your INDEXED_EXTRACTIONS = csv is configured in the forwarder's props.conf (Splunk v6) then the forwarder will be parsing the csv file into a structured stream first and then passing it to the indexer ready for immediate indexing (i.e. the indexer will not parse the file again).

Refer to the following in the docs: http://docs.splunk.com/Documentation/Splunk/6.2.2/Data/Extractfieldsfromfileheadersatindextime#Cavea...

You will need to deloy your props configuration to your forwarders, or define another sourcetype for the csv files and ensure the INDEXED_EXTRACTIONS does not happen on the forwarder so the data is forwarded to the indexer in an unstructured state.

Hope this makes sense and helps.

ragedsparrow
SplunkTrust
SplunkTrust

gcato, that is exactly what was happening (a good indicator, I need to read documentation more). I was modifying my props.conf file on the indexer and not on the forwarder, that's why the timestamps were working on my test data that I was bringing in locally on the indexer and not on the forwarded data. Thank you so much for helping with this headache for me.

0 Karma

gcato
Contributor

Glad it helped. A case of been there, done that for me.

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...