Solved: How do I parse one large comma-delimited field int...

bstill · ‎10-23-2017

I have an event that looks similar to the following:

2017-10-18 16:59:30.943, MetaDataFoo="ValueFoo", Event_Time="2017-10-18 16:59:30.943", Measurement="45.89502,44.895,44.99505,44.89502,44.89502,45.42877,45.96756,46.09344,46.22437,46.315"

I am trying to parse the "Measurement" field values into separate fields per value. For example:

Measurement="45.89502,44.895,44.99505,44.89502,44.89502,45.42877,45.96756,46.09344,46.22437,46.315"

Turns Into the following fields:
Measurement_0 = 45.89502
Measurement_1 = 44.895
Measurement_2 = 44.99505
Measurement_3 = 44.89502
etc...

I feel like this could be done with eval and mvindex, but I really don't want to write an eval for each new value since my actual dataset has 1024 values in the Measurement field. Any help would be much appreciated.

gcusello · ‎10-23-2017

Hi bstill,
use regex in this way

Measurement\=\"(?<Measurement_0>[^,]*),(?<Measurement_1>[^,]*),(?<Measurement_3>[^,]*),(?<Measurement_4>[^,]*),(?<Measurement_5>[^,]*),(?<Measurement_6>[^,]*),(?<Measurement_7>[^,]*),(?<Measurement_8>[^,]*),(?<Measurement_9>[^,]*),(?<Measurement_10>[^,]*)

or

| rex field=Measurement "(?<Measurement_0>[^,]*),(?<Measurement_1>[^,]*),(?<Measurement_3>[^,]*),(?<Measurement_4>[^,]*),(?<Measurement_5>[^,]*),(?<Measurement_6>[^,]*),(?<Measurement_7>[^,]*),(?<Measurement_8>[^,]*),(?<Measurement_9>[^,]*),(?<Measurement_10>[^,]*)"

Test it at https://regex101.com/r/d63INQ/1
Bye.
Giuseppe

View solution in original post

elliotproebstel · ‎10-23-2017

I think this run anywhere code should provide structure for the solution:
| stats count | eval Measurement="first,second,third,fourth,fifth" | eval temp_measurements=split(Measurement, ",") | eval total_indexes=mvcount(temp_measurements) | eval indexval=mvrange(0,total_indexes,1) | mvexpand indexval | eval Measurement_{indexval}=mvindex(temp_measurements,indexval) | fields - temp_measurements indexval total_indexes | stats values(*) AS * BY Measurement

For inspiration, I credit the non-accepted solution by @aweitzman on this post:
https://answers.splunk.com/answers/168916/iterate-over-all-values-of-a-multivalue-field.html

I don't think this is super efficient, but it shouldl at least scale to cover large numbers of Measurements in small numbers of events.

cesaccenturefed · ‎02-17-2022

Thank you so much, I got half way there with streamstats, but then got lost, this essentially got rid of some <streamstats field> -1 logic that was frustrating me!

had some events like this that i wanted dynamic columns for.

level1, level2, level3

level1, level2, level3, level4

level1, level3, level3, level4, .... level7

level1

couldnt figure how dynamically create columns with regex.

gcusello · ‎10-23-2017

Hi bstill,
use regex in this way

Measurement\=\"(?<Measurement_0>[^,]*),(?<Measurement_1>[^,]*),(?<Measurement_3>[^,]*),(?<Measurement_4>[^,]*),(?<Measurement_5>[^,]*),(?<Measurement_6>[^,]*),(?<Measurement_7>[^,]*),(?<Measurement_8>[^,]*),(?<Measurement_9>[^,]*),(?<Measurement_10>[^,]*)

or

| rex field=Measurement "(?<Measurement_0>[^,]*),(?<Measurement_1>[^,]*),(?<Measurement_3>[^,]*),(?<Measurement_4>[^,]*),(?<Measurement_5>[^,]*),(?<Measurement_6>[^,]*),(?<Measurement_7>[^,]*),(?<Measurement_8>[^,]*),(?<Measurement_9>[^,]*),(?<Measurement_10>[^,]*)"

Test it at https://regex101.com/r/d63INQ/1
Bye.
Giuseppe

bstill · ‎10-23-2017

Thanks for the answer. It may be that I have to do it that way. I was trying to not have to define it for every value, since the Measurement array size may be dynamic in the future and the array is so large.

How do I parse one large comma-delimited field into multiple fields with numbered names?

Introducing the 2024 SplunkTrust!

Introducing the 2024 Splunk MVPs!

Splunk Custom Visualizations App End of Life