Getting Data In

CSV file ingestion not respecting column headings

ericlarsen
Path Finder

I'm trying to monitor a CSV file (via a UF) with column headings included in the file. I want the column headings to be extracted at search time.

Sample file output:
"Name","DatabaseSize","UsedDatabaseSpace","AvailableNewMailboxSpace","NumMailboxes","TotalItemCount"
"SFG-DB01","306.9 GB (329,503,997,952 bytes)","257.1 GB (276,068,106,240 bytes)","49.77 GB (53,435,891,712 bytes)","223"
"SFG-DB02","350.4 GB (376,212,291,584 bytes)","300.7 GB (322,833,514,496 bytes)","49.71 GB (53,378,777,088 bytes)","362"
"SFG-DB03","308.6 GB (331,383,570,432 bytes)","236.1 GB (253,546,692,608 bytes)","72.49 GB (77,836,877,824 bytes)","151"

inputs.conf:
[monitor://E:\fileName*.csv]
index = test
sourcetype = mySourcetypeLog
ignoreOlderThan = 24h
crcSalt =

props.conf:
[mySourcetypeLog]
SHOULD_LINEMERGE = false
REPORT-getfields = mySourcetypeLog_fields

transforms.conf:
[mySourcetypeLog_fields]
DELIMS=","
FIELDS = "Name","DatabaseSize","UsedDatabaseSpace","AvailableNewMailboxSpace","NumMailboxes","TotalItemCount"

When I run a oneshot, the data is ingested correctly (one event per log record) but the extracted fields are not showing up.
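
For reference, the oneshot was run roughly like this from the Splunk CLI (the file name here is just illustrative):

splunk add oneshot "E:\fileName_sample.csv" -sourcetype mySourcetypeLog -index test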

Any help would be appreciated.
Thanks.

1 Solution

adonio
Ultra Champion

I recommend following the docs on indexing CSV files here:
http://docs.splunk.com/Documentation/Splunk/6.5.3/Data/Extractfieldsfromfileswithstructureddata
inputs.conf (as you already have it):

[monitor://E:\fileName*.csv]
index = test
sourcetype = mySourcetypeLog
ignoreOlderThan = 24h
crcSalt =

props.conf (on the indexer(s)):

[mySourcetypeLog]
SHOULD_LINEMERGE=false
NO_BINARY_CHECK=true
CHARSET=AUTO
INDEXED_EXTRACTIONS=csv
KV_MODE=none
category=Structured
description=Comma-separated value format. Set header and other settings in "Delimited Settings"
disabled=false
pulldown_type=true
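
With INDEXED_EXTRACTIONS the header fields are parsed at index time, so once the file is re-indexed under this sourcetype a simple search should show them, for example (field names taken from your sample header):

index=test sourcetype=mySourcetypeLog
| table Name DatabaseSize UsedDatabaseSpace AvailableNewMailboxSpace NumMailboxes TotalItemCount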

screenshots:

[screenshot: props.conf settings]

[screenshot: fields extracted from the CSV header]

On the left-hand side of the first screenshot you can see the props.conf settings; in the second screenshot you can see all the fields extracted nicely from the header.
Hope it helps.

adonio
Ultra Champion

Why do you want the column headings extracted at search time? Any particular reason?
This doc: http://docs.splunk.com/Documentation/Splunk/6.5.3/Data/Extractfieldsfromfileswithstructureddata
explains in detail the best practices for indexing CSV data, with nice config samples and sample data to work with.

ericlarsen
Path Finder

I don't want the users to have to create extracted fields for every single field when the field names are already included in the CSV file.

adonio
Ultra Champion

When you bring the data in as described in the docs, the users will not have to create fields at all.
Note that you have 6 fields in your example but values for only 5 of them; in that case, per the docs, Splunk will not extract the field that has no values.
Also, some values are strings, for example DatabaseSize = "308.6 GB (331,383,570,432 bytes)". You will probably want to extract a numeric field from these values, e.g. a field named DatabaseSizeGB with the value 308.6. There are multiple ways to do it; one search-time sketch is below.
Submitting a full answer with screenshots here.
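
A minimal sketch of one approach, assuming the DatabaseSize field is already extracted (the field name DatabaseSizeGB is just illustrative):

index=test sourcetype=mySourcetypeLog
| rex field=DatabaseSize "^(?<DatabaseSizeGB>[\d.]+)\s+GB"
| eval DatabaseSizeGB = tonumber(DatabaseSizeGB)
| table Name DatabaseSize DatabaseSizeGB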

ericlarsen
Path Finder

Ignore the sample file; it's just for illustrative purposes.

I was able to get it to work by setting sourcetype = csv in inputs.conf.
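
For reference, the working stanza now looks roughly like this (the other settings are unchanged):

[monitor://E:\fileName*.csv]
index = test
sourcetype = csv
ignoreOlderThan = 24h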

adonio
Ultra Champion

Great. Please mark the question as answered and upvote any comments/answers that you think helped with the resolution.
Have a great weekend.
