What you need is a configuration kind of like this in inputs.conf:
[monitor:///data/ftp/paloalto/PA*.csv]
sourcetype = paloalto
host = paloaltohostname
You might be able to do more sophisticated host processing if the information is available, e.g., in the data or in the file path. Then, in props.conf:
[paloalto]
REPORT-paextract = paloalto_extracts
KV_MODE = none
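KV_MODE = none just turns off some default extractions that don't usually work in a CSV file. And then two more clauses: the source:: clause (which, being a source-based stanza, goes in props.conf) and the extraction clause in transforms.conf: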
[source::...paloalto....csv]
sourcetype = paloalto
priority = 100
[paloalto_extracts]
DELIMS = ","
FIELDS = "Domain", "Receive_Time", "Serial_Number", "Threat_Content_Type" ,
# And so on for the fields.
The first clause here exists to disable/override some default behavior that is clumsy and confusing (in particular, automatic generation of headers). In theory, Splunk should have auto-generated the second clause (or something like it) based on the header in the CSV file and the fact that the name ended in .csv, but it doesn't work well, so we turn it off. The second clause explicitly creates the header that we do want.
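For the more sophisticated host processing mentioned above, here is a rough sketch of pulling the host out of the file path with host_regex in inputs.conf. It assumes the files are named something like PA_<hostname>.csv, which is purely a made-up naming scheme for illustration; adjust the regex to whatever your paths really look like.

```ini
# inputs.conf (sketch; the PA_<hostname>.csv naming is a hypothetical assumption)
[monitor:///data/ftp/paloalto/PA*.csv]
sourcetype = paloalto
# Fallback host, used when host_regex does not match the file path
host = paloaltohostname
# The first capture group, matched against the file path, becomes the host
host_regex = /PA_([^/]+)\.csv$
```

If the hostname is a whole directory name in the path rather than part of the file name, host_segment (which takes the number of the path segment) may be the simpler choice.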
Another suggestion is to use the TA for the Palo Altos and the Palo Alto App. It will parse the data automatically.
The source:: clause exists to override the autogenerate behavior.
Thanks gkanapathy. I'm still a little new to Splunk as far as advanced configurations go, and I'm still learning to grasp the concepts of the transforms.conf and props.conf files. Nevertheless, I appreciate the point in the right direction. I was actually going to try this today before reading this comment, since I was dealing with setting up custom field extractions. Thanks!!!
-Brian
BunnyHop, yes, I actually have that same issue. I named the sourcetype palo_alto and yet I get sourcetypes like palo_alto1 and palo_alto2. It doesn't really bother me too much; I'm just confirming what you are saying.
Actually, in my experience, CSV files get auto-learned even if you specify the sourcetype, and the fields are not extracted. I have found this to be true up through version 4.0.11. I haven't had a chance to upgrade to 4.0.12.