Consider:
Log:
2020-04-01 10:20:30 firstabc secondxyz
props.conf
[test]
REPORT-a = report_a, report_b
transforms.conf
[report_a]
REGEX=first(?<a>\w+)
[report_b]
REGEX=second(?<a>\w+)
Question 1: What is the value of the field "a"?
Question 2: Will the results be the same with this props.conf:
[test]
REPORT-a = report_a
REPORT-b = report_b
Challenge: try guessing without testing first 🙂
I'll spare you a search - here is a link for a previous discussion with two different opinions: https://answers.splunk.com/answers/320868/what-is-the-order-of-execution-precedence-of-multi.html
Question 3: do you get the expected results?
This post is not a 1 April joke 🙂
Edit 02.04.2020: This test case actually confirms the second statement: "fields are not overridden so once an earlier-executed transform has given a field a value, later-executed ones will not update/overwrite that original value". Otherwise the "a" field would have the value "xyz".
I was previously ready to bet that "later-executed ones can update/overwrite that original value", but as you can see that is not the case.
The purpose of this post is to ask the community to help clarify this. Maybe somebody has a link to where this behaviour is documented.
@PavelP
In your example:
[test]
REPORT-a = report_a, report_b
and
[test]
REPORT-a = report_a
REPORT-b = report_b
When Splunk runs the search-time extraction, it executes both transforms stanzas, report_a and report_b. From the transforms' point of view the second version is essentially the same: splitting REPORT-a into REPORT-a and REPORT-b does not change the outcome, because both transforms use the same KEY 'a' in the REGEX.
When report_a runs, the KEY 'a' has no value yet, so it is set to the initial extraction (abc). Then report_b runs, the extractor finds that the KEY 'a' already has a value, and the newly extracted value is discarded.
@to4kawa pointed in the right direction. The later value extracted by report_b is discarded because MV_ADD defaults to false.
That is documented here: https://docs.splunk.com/Documentation/Splunk/latest/Admin/Transformsconf#GLOBAL_SETTINGS
MV_ADD = [true|false]
* NOTE: This setting is only valid for search-time field extractions.
* Optional. Controls what the extractor does when it finds a field which
already exists.
* If set to true, the extractor makes the field a multivalued field and
appends the newly found value, otherwise the newly found value is
discarded.
* Default: false
You can use alternating stanzas, but then the KEY needs to be different. If the KEY needs to stay the same, use MV_ADD to create a multivalued field.
props.conf
[test]
REPORT-a = report_a, report_b
transforms.conf
[report_a]
REGEX=first(?<a>\w+)
[report_b]
REGEX=second(?<b>\w+)
This will return the extraction as:
a = abc
b = xyz
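For the same-KEY case mentioned above, a minimal sketch of the MV_ADD variant might look like the following. It reuses the stanza names and regexes from the question. Setting MV_ADD only on report_b is an assumption based on the spec text quoted above (the setting controls what an extractor does when the field already exists), and this has not been tested here.

```ini
# props.conf
[test]
REPORT-a = report_a, report_b

# transforms.conf
[report_a]
REGEX=first(?<a>\w+)

[report_b]
REGEX=second(?<a>\w+)
# Assumption: only the later-executed transform needs this, since it is
# the one that finds "a" already populated.
MV_ADD = true
```

Going by the spec wording, this should make "a" a multivalued field holding both abc and xyz instead of discarding xyz.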
There is also this excellent document on lexicographical ordering. Although it does not apply to this scenario, it gives good insight into how to name props / transforms.
https://docs.splunk.com/Documentation/SplunkCloud/8.0.2003/Knowledge/Searchtimeoperationssequence#Le...
Hope this clarifies.
Thank you @anmolpatel and @to4kawa, I think you've helped me understand this topic. The answer is fully and clearly documented, in a few words, in the spec file:
MV_ADD = [true|false]
* NOTE: This setting is only valid for search-time field extractions.
* Optional. Controls what the extractor does when it finds a field which
already exists.
* If set to true, the extractor makes the field a multivalued field and
appends the newly found value, otherwise the newly found value is
discarded.
* Default: false
I'll emphasize it for myself: If set to true, the extractor makes the field a multivalued field and appends the newly found value, otherwise the newly found value is discarded.
It is a trap for somebody like me, coming from a programming background where a value can always be overwritten. Splunk logic differs from programming logic here. In the Splunk universe, using a transform, there are only two options for handling the situation 'what the extractor does when it finds a field which already exists': append the new value and make the field multivalued (MV_ADD = true), or discard the new value (MV_ADD = false, the default).
And here is the point where REPORT is different from EXTRACT: using extract you can overwrite fields (shown here with the rex command):
|makeresults | eval _raw="firstabc1 secondxyz1" | rex "first(?<a>\w+)" | rex "second(?<a>\w+)"
I thank you guys, I'll accept both answers!
Hello @anmolpatel , thank you for your time and help!
You wrote: "In your second version, you're assigning the extracted value to a new Key b, so Splunk does not discard the value. The new Key has not been assigned a value, thus setting it to the extracted value."
Actually both versions extract a field "a"; there is no field "b" in the examples. The only difference is how the stanzas in transforms.conf are called.
I get the same results in both cases: the later-executed value does not override an existing value. In other words, "fields are not overridden so once an earlier-executed transform has given a field a value, later-executed ones will not update/overwrite that original value".
My conclusion so far is that extractions with transforms using alternative stanzas should be avoided, because only the first matched transform will be applied.
I'm going to check whether any existing Apps/Add-ons have this kind of configuration, to understand whether using alternative transforms is a "bad practice" or has a legitimate application.
@PavelP I misread the transforms and so did the extraction incorrectly. I've updated my answer to address the original query.
It is not that only the first matched transform is applied. The KEY value is assigned by the first extraction, so Splunk does not override it with the value it finds from the second REGEX. When using the same KEY in the REGEX, you can create a multivalued field instead of overriding the extraction.
If you swap the stanzas, you will get a different result:
[test]
REPORT-a = report_b, report_a
A1: abc; you need MV_ADD = true
A2: same
A3: set MV_ADD = true in transforms.conf
thank you for trying!
In this case the second statement is correct:
"so once an earlier-executed transform has given a field a value, later-executed ones can update/overwrite that original value."
OR
"Although the execution does not "stop early" fields are not overridden so once an earlier-executed transform has given a field a value, later-executed ones will not update/overwrite that original value."
I'm wondering if it is documented somewhere 🙂
later-executed ones can update/overwrite that original value.
wow, I did not know that.
Thank you