DELIMS/FIELDS with a field that has sub fields?

narwhal
Splunk Employee

I have a large CSV data file (CDR) that has some 300 fields. Looks something like:

value1,value2,value3,...,value51,"subvalue52.1,subvalue52.2,...,subvalue52.20",value53,...,value300

The gotcha is field52. field51 is properly extracted, but field52 isn't. I'm not worried yet about the subextraction--right now, I just want field52 to be the whole thing inside the quotes.
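To make the expected result concrete, here is a minimal sketch in Python (not Splunk) that contrasts a naive comma split with a quote-aware parse. The sample line is hypothetical and shortened to four fields; it only illustrates the behavior I'm after, not how Splunk implements DELIMS.

```python
import csv
import io

# Hypothetical sample shaped like the CDR line above, shortened to 4 fields.
line = 'value1,value2,"sub3.1,sub3.2,sub3.3",value4'

# A naive split treats every comma as a delimiter, including those inside quotes.
naive = line.split(",")

# csv.reader honors double quotes, so the quoted run stays a single field.
quoted = next(csv.reader(io.StringIO(line)))

print(len(naive), naive)
print(len(quoted), quoted)
```

The quote-aware parse keeps the third field whole, which is the outcome wanted for field52.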

From transforms.conf:

[my-report-stanza-name]
DELIMS = ","
FIELDS = f1,f2,f3,...,f300

(where f1 through f300 are LONG NAMES)

Is it because my f1-f300 are LONG?

Do I have the syntax for DELIMS wrong? (like, is that saying the delim char can be any of " OR , OR ' ?)

Once I do get this right, what's the best way to subextract f52?

adTHANKSvance gang!

-tv


narwhal
Splunk Employee

From my testing, there appears to be a line-length limit on the "FIELDS =" definition. So I now extract the fields under short names ("F001", "F002", etc.) and then use FIELDALIAS settings to give them longer names.

Now all fields (including the quoted CSV embedded inside another field) are extracted properly. I then sub-extract the embedded field with a second stanza.
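With some 300 fields, typing the short names and aliases by hand gets tedious. A rough sketch of generating them in Python follows; the helper names (short_name, build_config) and the second and third long names are made up for illustration, so treat it as a starting point rather than a finished tool.

```python
# Hypothetical long names; the real file has about 300 of them.
long_names = ["MyFirstBigFieldName", "MySecondBigFieldName", "MyThirdBigFieldName"]


def short_name(index, width=3):
    """Return a zero-padded short name such as F001 for a 1-based index."""
    return f"F{index:0{width}d}"


def build_config(names):
    """Build the FIELDS line and one FIELDALIAS line per long name.

    FIELDS is a transforms.conf setting; FIELDALIAS is a props.conf setting.
    """
    shorts = [short_name(i) for i in range(1, len(names) + 1)]
    fields_line = "FIELDS = " + ",".join(f'"{s}"' for s in shorts)
    alias_lines = [f"FIELDALIAS-{s} = {s} AS {n}" for s, n in zip(shorts, names)]
    return fields_line, alias_lines


fields_line, alias_lines = build_config(long_names)
print(fields_line)
print("\n".join(alias_lines))
```

Keeping the long names in one list means the short names and the aliases cannot drift out of step when a column is added or renamed.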

To be more precise:

props.conf:

[myBigCSV]
REPORT-foo = BigCSV, SubCSV
FIELDALIAS-F001 = F001 AS MyFirstBigFieldName
FIELDALIAS-F003c = F003c AS MyThirdSubField

transforms.conf:

[BigCSV]
DELIMS = ","
FIELDS = "F001","F002","F003"

[SubCSV]
SOURCE_KEY = F003
DELIMS = ","
FIELDS = "F003a","F003b","F003c"

A very elegant and easy to maintain config.
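For anyone who wants to sanity-check the two-stage idea outside Splunk, here is a small Python sketch that mimics it: split the whole event first, then split only the value of F003, the way SOURCE_KEY points the second stanza at that field. The event string and the split_delims helper are hypothetical, and Python's csv module only approximates Splunk's delimiter handling.

```python
import csv
import io


def split_delims(text, names):
    """Split text on commas, honoring double quotes, and pair values with names."""
    values = next(csv.reader(io.StringIO(text)))
    return dict(zip(names, values))


# Hypothetical event: the third field is a quoted, comma-separated list.
event = 'a,b,"x,y,z"'

# Stage 1, like [BigCSV]: split the whole event.
fields = split_delims(event, ["F001", "F002", "F003"])

# Stage 2, like [SubCSV] with SOURCE_KEY = F003: split only that field's value.
fields.update(split_delims(fields["F003"], ["F003a", "F003b", "F003c"]))

print(fields)
```

The outer field F003 stays available as a whole alongside its three sub-fields.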

-tv

emotz
Splunk Employee

You have DELIMS set up correctly, but how are the subfields delimited? Commas?
You will probably have to write a custom field extraction for the big f52, and all of the sub fields too.
That seems like a great data set.
Good luck!
