Getting Data In

Why are lines in a gzip'ed CSV showing up twice in searches?

estepgi
New Member

Hi.

I just installed Splunk for the first time today. As a test, I took a CSV file and indexed it, and it worked fine. Then I created a new file in CSV format and gzip'ed it.

test.csv.gz
field,val
blah,whatever

It indexed fine. I then edited the file using vi, adding a new line:
newfield,morestuff

I then searched the results again. Now "newfield,morestuff" shows up once in the results, but "blah,whatever" shows up twice. I tried adding more lines and saw the same pattern: the most recent line shows up once, but the older lines are duplicated in the search results.

I then added | dedup _raw to the search and the duplicates went away. However, I'm looking for a more elegant solution.
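
For anyone reading along, here is a rough Python sketch of what that workaround amounts to: keep the first occurrence of each raw event text and drop later repeats. It only illustrates the idea and is not how Splunk implements dedup; the function name dedup_raw and the sample results list are made up for this example.

```python
def dedup_raw(events):
    """Keep the first occurrence of each raw event text, preserving order."""
    seen = set()
    unique = []
    for raw in events:
        if raw not in seen:
            seen.add(raw)
            unique.append(raw)
    return unique


# Hypothetical search results matching the pattern described above:
# the newest line appears once, the older line appears twice.
results = ["newfield,morestuff", "blah,whatever", "blah,whatever"]
deduped = dedup_raw(results)
print(deduped)
```

This hides the duplicates at search time only. The repeated events are still in the index, which is why it is a workaround and not a fix.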

By the way, I also tried unzipping the file, editing it, then gzipping it again, with the same results.

Thanks for your help!

kbarker302
Communicator

It sounds like you were using the "upload" method of adding data to Splunk, which would produce duplicates like the ones you describe. A better way would be to have Splunk monitor your CSV file for changes (Add Data > Monitor > Files & Directories). That way, you can make as many changes as you want to your CSV file without having to re-upload it, and Splunk will detect and index only the changes you've made.

estepgi
New Member

Thanks for the response. Actually, I was already using the "continuously monitor" option that you recommend. I definitely don't want to re-upload my files. As I said, this works well for plaintext CSV files, but it leads to duplication for gzip'ed CSV files.
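
One difference between the two cases that might matter (this is a guess on my part, not something confirmed in this thread): appending a line to a plaintext file leaves the old bytes untouched at the start of the file, but appending a line to a gzip'ed file means recompressing, so the old archive is no longer a prefix of the new one. If the monitor decides what is "new" by looking at the bytes on disk, a changed archive could look like a completely different file. The small Python sketch below only demonstrates the byte-level difference using the test data from the original post; it says nothing about Splunk internals.

```python
import gzip

# The test data from the original post, before and after the vi edit.
original = b"field,val\nblah,whatever\n"
appended = original + b"newfield,morestuff\n"

# Plaintext: the old file contents are a byte-for-byte prefix of the new ones.
plain_is_prefix = appended.startswith(original)

# Gzip: compress both versions (mtime=0 keeps the header deterministic).
old_gz = gzip.compress(original, mtime=0)
new_gz = gzip.compress(appended, mtime=0)

# The old archive is not a prefix of the new archive, because the
# compressed stream and its trailing checksum both change.
gz_is_prefix = new_gz.startswith(old_gz)

print("plaintext prefix preserved:", plain_is_prefix)
print("gzip prefix preserved:", gz_is_prefix)
```

On my reading, this would at least be consistent with older lines being indexed a second time after each edit, while the plaintext file only contributes its new line.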
