Getting Data In

Why does Splunk selectively ignore duplicate events (not ingest events) from unique sources?

williamcharlton
Path Finder

I'm trying to learn how Splunk works by presenting it small sets of data and observing the results. The results of my most recent test really surprise me. I'm not sure what to make of them.

I have a 4-server Splunk scenario:

  1. deployment server
  2. index server
  3. search head server
  4. A deployment client server (w/ a Splunk Universal Forwarder)

I used the deployment server web interface to create a monitor input for *.csv files on the deployment client server, using the csv sourcetype. The data is ingested into a single index.

I created 3 CSV files: testdata01.csv, testdata02.csv, and testdata03.csv. Each csv file has a heading row and 30 "event" rows, like this:

"Date","Field1","Field2","Field3","Field4","Field5"
"2019-01-01 00:00:29 ","testData1-86400","testData2-86400","testData3-86400","testData4-86400","testData5-86400"

.
.
.
"2019-01-01 00:00:00 ","testData1-86371","testData2-86371","testData3-86371","testData4-86371","testData5-86371"

For each data row, the Date decrements by one second (:29 down to :00). Likewise, the numeric value that appears in a row's 5 fields decrements by 1 (86400 down to 86371). The three CSV files each have the exact same set of 30 events.

I dropped the three files into the monitored folder and then performed a search from the search head. To my surprise, I see only 30 events from testdata01.csv. It appears that Splunk ignored the 30 events from testdata02.csv and the 30 events from testdata03.csv.

I expected all 90 events to be ingested because each set of 30 has a unique source.

Why does Splunk selectively ignore events (not ingest events) from multiple CSV files?


gcusello
SplunkTrust

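Hi williamcharlton0028,
Splunk doesn't index the same file twice, even if it has a different name. Before reading a file, it computes a CRC over the first bytes of the file (256 by default) and skips the file if that CRC matches one it has already indexed. Your three files have identical content, so the second and third look like a file Splunk has already seen.
If you want to index three files with the same content and different names, add this option to your inputs.conf, in the stanza that reads the three files:

crcSalt = <SOURCE>

This tells Splunk to include the full source path in the CRC, so files with different names are treated as different files even when their content is identical.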

Bye.
Giuseppe
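
To make the behaviour above concrete, here is a minimal Python sketch of the idea. It is a toy model, not Splunk's actual implementation: the functions `file_key` and `ingest`, the `/data/...` paths, and the use of `zlib.crc32` are all assumptions made for illustration. Only the 256-byte default and the effect of `crcSalt = <SOURCE>` come from the answer and the inputs.conf documentation.

```python
import zlib

# Number of leading bytes considered; 256 is the documented default
# of Splunk's initCrcLength setting.
INIT_CRC_LENGTH = 256


def file_key(content, source, salt_with_source):
    """Toy stand-in for the forwarder's initial CRC (hypothetical helper)."""
    head = content[:INIT_CRC_LENGTH]
    if salt_with_source:
        # Models crcSalt = <SOURCE>: the full source path becomes part of the CRC.
        head = source.encode("utf-8") + head
    return zlib.crc32(head)


def ingest(files, salt_with_source):
    """Return the sources that would be read, skipping already-seen CRCs."""
    seen = set()
    ingested = []
    for source, content in files.items():
        key = file_key(content, source, salt_with_source)
        if key in seen:
            continue  # looks like a file that was already indexed
        seen.add(key)
        ingested.append(source)
    return ingested


# Rebuild the test data from the question: a header plus 30 identical rows per file.
lines = ['"Date","Field1","Field2","Field3","Field4","Field5"']
for i in range(30):
    n = 86400 - i
    fields = ",".join('"testData%d-%d"' % (f, n) for f in range(1, 6))
    lines.append('"2019-01-01 00:00:%02d ",%s' % (29 - i, fields))
csv_bytes = "\n".join(lines).encode("utf-8")

# Hypothetical monitored paths; all three files carry the same bytes.
files = {"/data/testdata0%d.csv" % i: csv_bytes for i in (1, 2, 3)}

print(len(ingest(files, salt_with_source=False)))  # 1
print(len(ingest(files, salt_with_source=True)))   # 3
```

Without the salt, only the first file is read, which matches the 30 events seen in the search. With the source path mixed in, each file gets its own key and all 90 events would be read.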
