Looking for some help with a simple CSV input. I will show my inputs.conf, props.conf, transforms.conf, and an example CSV file below. Everything is working fine except for one small issue. See details below:
I have an input monitoring a directory for a CSV file. The file has two lines of garbage header info, some field values contain commas (the CSV generator wraps those entire fields in double quotes), and some lines begin with an empty field (meaning the first character is a comma, delimiting to the second field). The CSV has 11 fields in total, of which the last three are empty, and I am only extracting the first eight.
All field extractions are happening perfectly with one exception: on lines that have an empty first field (and therefore start with a comma), field_1 is not extracted with a value of null(); it is extracted as a comma followed by data from the succeeding fields. The remaining fields on those lines are extracted correctly, though.
Long story short, I would like lines that have no value in field_1 to extract field_1 as null(), or NULL, or something like that. I've attempted adding "KEEP_EMPTY_VALS = true" to the transforms.conf, with no change in results.
Any help is greatly appreciated. Thanks!
inputs.conf
[monitor://D:\Logs\mydirectory]
disabled = false
index = myindex
sourcetype = mysourcetype
whitelist = myfile.+\.csv
ignoreOlderThan = 14d
crcSalt = <SOURCE>
props.conf
[mysourcetype]
DATETIME_CONFIG = CURRENT
description = my description goes here
LINE_BREAKER = ([\r\n]+)
SHOULD_LINEMERGE = false
TZ = US/Eastern
TRANSFORMS-ignore_myheader1 = ignore_myheader1
TRANSFORMS-ignore_myheader2 = ignore_myheader2
REPORT-my_extract = my_extractions
transforms.conf
[my_extractions]
DELIMS = ","
FIELDS = "field_1", "field_2", "field_3", "field_4", "field_5", "field_6", "field_7", "field_8"
[ignore_myheader1]
# Stanza that ignores the first line of the file
REGEX = plain text in myheader1 to be discarded
DEST_KEY = queue
FORMAT = nullQueue
[ignore_myheader2]
# Stanza that ignores the second line of the file
REGEX = plain text in myheader2 to be discarded
DEST_KEY = queue
FORMAT = nullQueue
myfileexample.csv
plain text in myheader1 to be discarded,field_2,field_3,field_4,field_5,field_6,field_7,field_8,field_9,field_10,field_11
plain text in myheader2 to be discarded,field_2,field_3,field_4,field_5,field_6,field_7,field_8,field_9,field_10,field_11
,field_2,field_3,field_4,field_5,field_6,field_7,field_8,,,
,field_2,field_3,field_4,field_5,field_6,field_7,field_8,,,
,field_2,field_3,field_4,field_5,field_6,field_7,field_8,,,
field_1,field_2,field_3,field_4,field_5,field_6,field_7,field_8,,,
field_1,field_2,field_3,field_4,field_5,field_6,field_7,field_8,,,
field_1,field_2,field_3,field_4,field_5,field_6,field_7,field_8,,,
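One idea I sketched but have not tried yet: swapping the DELIMS-based transform for a REGEX-based one, since a per-field capture group can match an empty first field explicitly instead of swallowing the leading comma. This is an untested sketch using my stanza and field names from above; each group matches either a quoted value or a run of non-commas.

[my_extractions]
# Untested sketch: each named group captures a quoted value or a run of
# non-comma characters, so an empty first field yields an empty string.
REGEX = ^(?<field_1>"[^"]*"|[^,]*),(?<field_2>"[^"]*"|[^,]*),(?<field_3>"[^"]*"|[^,]*),(?<field_4>"[^"]*"|[^,]*),(?<field_5>"[^"]*"|[^,]*),(?<field_6>"[^"]*"|[^,]*),(?<field_7>"[^"]*"|[^,]*),(?<field_8>"[^"]*"|[^,]*)

If anyone knows whether this behaves differently from DELIMS with respect to empty leading fields, I'd appreciate confirmation before I test it.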
Thank you, the documentation was helpful and addressed my issue.
I would accept your answer as correct, but I can't find how to do that. Didn't there used to be a button for that?!
All good 🙂 I submitted a comment, not an answer.
If it helped, you can upvote my comment.
glad you figured it out
Hello there,
Kindly check this doc:
http://docs.splunk.com/Documentation/Splunk/6.6.0/Data/Extractfieldsfromfileswithstructureddata
Thanks for the reply.
I already attempted INDEXED_EXTRACTIONS = CSV and the related settings, but reading the document you linked, I learned these need to be applied on the forwarder (I had placed them on the indexer when I attempted it). When I have time I will try it out and reply here with results.
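For anyone following along, here is roughly the props.conf shape I plan to test on the forwarder. This is an untested sketch using my sourcetype and field names; I'm not yet certain whether PREAMBLE_REGEX will skip both garbage header lines with a single pattern, or whether FIELD_NAMES can safely list fewer names than the file has columns, so treat those as assumptions to verify.

[mysourcetype]
INDEXED_EXTRACTIONS = csv
# Assumption: both garbage header lines start with this text,
# so one preamble pattern skips them both
PREAMBLE_REGEX = ^plain text in myheader
FIELD_DELIMITER = ,
FIELD_NAMES = field_1,field_2,field_3,field_4,field_5,field_6,field_7,field_8
DATETIME_CONFIG = CURRENT
TZ = US/Eastern

I'll report back once I've tried it on the forwarder.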