Is it possible to index a CSV file making column A...

kearaspoor · ‎09-16-2017

I have a bunch of CSV files that were generated (and will continue to be generated) from a filled-out, human-readable form, so column A holds all the field names and column B holds all the values. Since each CSV file contains only those two columns, there is no traditional header information in row 1.

Is there some way to tell Splunk to recognize column A as the header? Or to set KV_MODE to recognize the comma as the field=value delimiter instead of the traditional = sign? Or would it be best to index the files as if they were unstructured and write a custom regex for each row? That would be simple enough, if time-consuming, but likely to break if the format ever changes.

Sukisen1981 · ‎09-17-2017

Well, I don't know about index time, but I guess you want an output with a single row holding the second-column values, and as many columns as there are rows in the CSV, with the first-column values as the column headers. I tried the test data below as an input CSV:

test 3
field1 abc
field2 4.5

So your expected output will be something like this:
test field1 field2
3 abc 4.5
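
The reshaping shown above can also be sketched outside Splunk, for example as a pre-processing step before the file is handed to Splunk. This is only a minimal Python sketch, not a Splunk feature: it assumes the files are comma-separated (the test data above is shown with spaces), that every row is a field name followed by one value, and the function name pivot_two_column_csv is made up for illustration.

```python
import csv
import io


def pivot_two_column_csv(text):
    """Turn 'field,value' rows into one header row plus one value row.

    Assumption: each non-empty row holds a field name in column A and
    its value in column B; a missing value becomes an empty string.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    header = [row[0].strip() for row in rows]
    values = [row[1].strip() if len(row) > 1 else "" for row in rows]

    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(header)  # column A becomes the header row
    writer.writerow(values)  # column B becomes the single data row
    return out.getvalue()


# Hypothetical comma-separated version of the test data above
sample = "test,3\nfield1,abc\nfield2,4.5\n"
print(pivot_two_column_csv(sample), end="")
# test,field1,field2
# 3,abc,4.5
```

The result is an ordinary CSV with a header in row 1, which is the shape that header-based CSV handling normally expects.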

Now, assuming your first column (the headers) does not change too often, just as it would not if the first line of the CSV were a header row as usual, a bit of manipulation gets the output into the Statistics tab in the desired form. I guess you want to perform further processing once the output is in that form:

| sort test
| transpose
| rename column as test, "row 1" as "field 1", "row 2" as "field 2"
| head 1

One-time manual work is needed to map the fields properly:
'test' needs to be replaced by the first value in the CSV.
'row 1', 'row 2' ... 'row n' have to be renamed to the first-column values, the way I renamed row 1 and row 2 to field 1 and field 2.
You will of course have many rows, and I wonder how many we are talking about and how much manual effort that means, but once it is done it will work. Even if, say, a couple of new rows are added later, all you need to do is rename the newly added rows to the new first-column values.

I am guessing your column 1 values won't change much, but how many rows do you have to start with that would need to be mapped manually using this approach? If there are too many, this approach may be too cumbersome.

kearaspoor · ‎09-18-2017

Thank you for the suggestions. I'll keep them in mind for when I get to the searching stage, but the part that has me stopped is getting the data indexed in the first place.

Is it possible to index a CSV file making column A be the header?
