Interesting regex/transforms.conf question

kubowler99 · ‎03-05-2012

My dilemma:

We have a log file that dumps out info from an array.

Four fields:

Count
FieldA
FieldB
FieldC

In the log file, the field 'Count' will provide the number of entries in the array. There will then be 'Count' instances of each field (FieldA, FieldB, FieldC).

For example:

2012.02.28 00:02:00.000|Count: 1, FieldA[0]: abcdefg, FieldB[0]: 12345, FieldC[0]: 1234abcd
2012.02.28 00:02:01.000|Count: 2, FieldA[0]: abcdefg, FieldB[0]: 12345, FieldC[0]: 1234abcd, FieldA[1]: hijklmn, FieldB[1]: 67890, FieldC[1]: 5678efgh

For metrics we don't need to distinguish FieldA[0] from FieldA[1], but we do want the data from every instance of FieldA. I'm not sure whether some combination of repetition and grouping will accomplish this. I'll be testing it out, but I'd also like feedback from anyone who has done this before.

Is there a 'simple' (and I use that term lightly) way to dynamically parse the fields using Splunk (transforms.conf, etc.)?

Stephen_Sorkin · ‎03-05-2012

Yes, this is a good use of multivalued fields and the MV_ADD property in transforms.conf. First, add a stanza in props.conf like:

[source::.../yourfile.log*]
REPORT-numberedfields = numberedfields

Then add the corresponding stanza in transforms.conf. It directs Splunk to match a field name followed by a bracketed index (which is ignored), then the value, and, when the same field name is encountered more than once, to accumulate the values into a single multivalued field:

[numberedfields]
REGEX = (\w+)\[\d+\]: ([^,]+)
FORMAT = $1::$2
MV_ADD = True
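
To sanity-check the pattern outside Splunk, here is a minimal Python sketch that applies the same regular expression to the second sample event and accumulates repeated names into lists, roughly what MV_ADD does at search time. The function name extract_numbered_fields is made up for illustration, and Python's re module is only an approximation of Splunk's PCRE engine, so treat this as a sketch and not as a description of Splunk internals. Note that Count has no bracketed index, so this pattern does not capture it.

```python
import re
from collections import defaultdict

# Same pattern as the REGEX line in the [numberedfields] stanza.
PATTERN = re.compile(r"(\w+)\[\d+\]: ([^,]+)")


def extract_numbered_fields(event):
    """Hypothetical helper: gather all values of each numbered field
    under one name, imitating the accumulation that MV_ADD performs."""
    fields = defaultdict(list)
    for name, value in PATTERN.findall(event):
        # The bracketed index is matched but not captured, so
        # FieldA[0] and FieldA[1] both land under "FieldA".
        fields[name].append(value)
    return dict(fields)


# Second sample event from the question.
event = (
    "2012.02.28 00:02:01.000|Count: 2, "
    "FieldA[0]: abcdefg, FieldB[0]: 12345, FieldC[0]: 1234abcd, "
    "FieldA[1]: hijklmn, FieldB[1]: 67890, FieldC[1]: 5678efgh"
)

fields = extract_numbered_fields(event)
for name, values in fields.items():
    print(name, values)
```

Each of FieldA, FieldB, and FieldC ends up holding two values, one per array entry, which matches the goal of reporting on all instances of a field regardless of index.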
