Hi there,
I have the next CSV file:
"CLM_TIMESTAMP","CLM_DATE","CLM_NUMBER"
"1569301200","24/09/2019 00:00:00","389721519283162"
"1569301400","24/09/2019 00:00:00",""
"1569301600","24/09/2019 00:00:00",""
When forwarded to index, the CLM_NUMBER "" then appears indexed as 'CLM_NUMBER = ' so if you look at statistics it says that 100% of events has the field, however, I only want to have the field in those cases where it really appears, and to consider empty double quotes to null.
Any idea to solve this problem?
I have tried to apply SEDCMD commands to the source and change the _raw but even if I change the whole _raw to "I have changed the raw", all of the fields of the csv still remains with the values that contains in the csv.
I have found a "solution" that fits for me:
PROPS.CONF
[my_sourcetype]
CHARSET = ISO-8859-1
TZ = America/Sao_Paulo
TIME_PREFIX = \" # Line to get first field as timestamp
TIME_FORMAT=%s
MAX_TIMESTAMP_LOOKAHEAD=10
SHOULD_LINEMERGE = false
disabled = false
pulldown_type = true
KV_MODE = none
NO_BINARY_CHECK = true
PREAMBLE_REGEX = .CLM_NUMBER" #Line to avoid header indexing
SEDCMD-changeeventformat1 = s/(\"[^\"]\"),(\"[^\"]\"),(\"[^\"]\")/clm_timestamp=\1 clm_date=\2 clm_number=\3/g
SEDCMD-changeeventformat2 = s/ \w+=\"\"//g #This line deletes empty fields
Not a beautiful solution but after hours of tries is the only solution I have found.
Hope is helpful for others.
Hi
try something like this:
| makeresults | eval CLM_TIMESTAMP="1569301200", CLM_DATE="24/09/2019 00:00:00", CLM_NUMBER="389721519283162"
| append [ | makeresults | eval CLM_TIMESTAMP="1569301400", CLM_DATE="24/09/2019 00:00:00", CLM_NUMBER="" ]
| append [ | makeresults | eval CLM_TIMESTAMP="1569301600", CLM_DATE="24/09/2019 00:00:00", CLM_NUMBER="" ]
| eval CLM_NUMBER=if(tonumber(CLM_NUMBER)>0,CLM_NUMBER,NULL)
| stats count BY CLM_NUMBER
Bye.
Giuseppe
Many thanks for your answer Giuseppe. I forgot to mention that I need it at index time.