DELIMS/FIELDS with a field that has sub fields?

narwhal
Splunk Employee

I have a large CSV data file (CDR) that has some 300 fields. Looks something like:

value1,value2,value3,...,value51,"subvalue52.1,subvalue52.2,...,subvalue52.20",value53,...,value300

The gotcha is field52. field51 is properly extracted, but field52 isn't. I'm not worried yet about the subextraction--right now, I just want field52 to be the whole thing inside the quotes.

From transforms.conf:

[my-report-stanza-name]

DELIMS = ","

FIELDS = f1,f2,f3,...,f300 (where f1-300 are LONG NAMES)

Is it because my f1-f300 are LONG?

Do I have the syntax for DELIMS wrong? (like, is that saying the delim char can be any of " OR , OR ' ?)

Once I do get this right, what's the best way to subextract f52?

adTHANKSvance gang!

-tv


narwhal
Splunk Employee

It appears from my testing that there is a line-length limitation in the "FIELDS =" definition. So I am now extracting the fields under short names ("F001", "F002", etc.) and then using FIELDALIAS to give them longer names.

Now all fields (including the CSV embedded in quotes inside another field) are properly extracted. I then sub-extract the embedded field with another stanza.

To be more precise:

props.conf (REPORT and FIELDALIAS are both props.conf settings):

[myBigCSV]
REPORT-foo = BigCSV, SubCSV
FIELDALIAS-F001 = F001 AS MyFirstBigFieldName
FIELDALIAS-F003c = F003c AS MyThirdSubField

transforms.conf:

[BigCSV]
DELIMS = ","
FIELDS = "F001","F002","F003"

[SubCSV]
SOURCE_KEY = F003
DELIMS = ","
FIELDS = "F003a","F003b","F003c"

A very elegant and easy to maintain config.
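
On the DELIMS syntax question from the original post: as far as I can tell from the transforms.conf spec, the outer double quotes only wrap the delimiter set, and each character inside them is treated as a delimiter. So DELIMS = "," means comma only, not quote-or-comma. A short sketch of the forms follows; the stanza names are made up for illustration.

```ini
# transforms.conf -- stanza names below are hypothetical

# One quoted set: the event holds values only, names come from FIELDS.
# The delimiter is the comma alone; the quotes are just the wrapper.
[csv_values_only]
DELIMS = ","
FIELDS = "F001","F002","F003"

# Several characters inside one quoted set: any of | , ; ends a value.
[multi_char_delims]
DELIMS = "|,;"
FIELDS = "F001","F002","F003"

# Two quoted sets: the first splits the event into field/value pairs,
# the second splits each pair into name and value (e.g. a=1|b=2).
[pipe_eq]
DELIMS = "|", "="
```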

-tv

emotz
Splunk Employee

You have DELIMS set up correctly, but how are the subfields delimited? Commas?
You will probably have to write a custom field extraction for the big f52, and for all of the subfields too.
That seems like a great data set.
Good luck!
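
If you do go the custom-extraction route, a regex-based transform is one way to grab the quoted field whole. This is only a sketch under assumptions: the stanza names f52_whole and f52_sub and the field names f52, f52_01, etc. are made up, [myBigCSV] is the sourcetype name used elsewhere in this thread, and the regex assumes none of the first 51 values contain commas or quotes of their own.

```ini
# transforms.conf
[f52_whole]
# Skip 51 comma-terminated values, then capture what sits between the quotes.
REGEX = ^(?:[^,]*,){51}"([^"]*)"
FORMAT = f52::$1

# Split the captured value on commas; the sub-field names are placeholders.
[f52_sub]
SOURCE_KEY = f52
DELIMS = ","
FIELDS = "f52_01","f52_02","f52_03"

# props.conf
[myBigCSV]
# Order matters: f52 must exist before f52_sub can read it via SOURCE_KEY.
REPORT-f52 = f52_whole, f52_sub
```

The remaining 299 fields could still come from a DELIMS-based stanza; this only covers the one awkward field.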
