I'm noticing some weird behavior in a search: I have to inline some regexes in order to get the MR job to work.
Step 0: Create a field extraction in an app that is not search
Here are the relevant contents of
$HUNK_HOME/etc/apps/{non_searchapp_app}/local/props.conf :
[myvix_sourcetype]
EXTRACT-myField = ^(?:[^\|\n]*\|){6}(?<my_field>[^\|]+)
Step 1: Verify Field Extraction works
Example Search: (Smart Mode)
index=myvix source=*events*
Indeed, on the left-hand side I see that my_field is recognized, with events counted for each unique value of my_field.
So Hunk auto-field detection is working.
Step 2: Now check to see the field is being extracted by the search
Example Search: (Smart Mode)
index=myvix source=*events* | table _time, my_field
I get the following results, with my_field empty:
_time my_field
2015-05-26 16:19:57
2015-05-26 16:19:57
...
Known Workaround
Inline the rex and don't rely on the field extraction in props.conf.
index=myvix source=*events* | rex field=message "^(?:[^\|\n]*\|){6}(?<my_field>[^\|]+)" | table _time, my_field
results in the following:
_time my_field
2015-05-26 16:19:57 my_field_value-A
2015-05-26 16:19:57 my_field_value-B
Interesting corollary:
Inlining the following regex against _raw (i.e. field=_raw) does not work!
index=myvix source=*events* | rex field=_raw "^(?:[^\|\n]*\|){6}(?<my_field>[^\|]+)" | table _time, my_field, _raw
results:
_time my_field _raw
2015-05-26 16:19:57 {"header": {"time": 1432675197252, "threadId": "qtpXXXX", "requestMarker": "abadbeef42c8", "env": "production", "server": "some-prod-server", "service": "some-service"}}
2015-05-26 16:19:57 {"header": {"time": 1432675197253, "threadId": "qtpYYYY", "requestMarker": "8badbeef9139", "env": "production", "server": "some-otherprod-server", "service": "some-other-service"}}
Notice that the rex on _raw doesn't work because the 'message' field of the Avro record is not being included in _raw. Only the 'header' field is included.
FWIW, the regex was generated using the "Event Action -> Extract Fields" UI from the main search view.
Interesting corollary++:
And as one last attempt to self-service and figure this out, I added message to the table command, and it works! Go figure.
index=myvix source=*events* | rex field=_raw "^(?:[^\|\n]*\|){6}(?<my_field>[^\|]+)" | table _time, my_field, _raw, message
results:
_time my_field _raw message
2015-05-26 16:19:57 my_field_value-A {"header": {"time": 1432675197252, "threadId": "qtpXXXX", "requestMarker": "abadbeef42c8", "env": "production", "server": "some-prod-server", "service": "some-service"}, "message": "t.blah.X.blah.blah.blah - |x|xxx|xxx|xxxx|xxx-xxxx|my_field_value-A|xxxx|x|x|blah&blah&blah|xxx/xxx|x|x|"} t.blah.X.blah.blah.blah - |x|xxx|xxx|xxxx|xxx-xxxx|my_field_value-A|xxxx|x|x|blah&blah&blah|xxx/xxx|x|x|
2015-05-26 16:19:57 my_field_value-B {"header": {"time": 1432675197253, "threadId": "qtpYYYY", "requestMarker": "8badbeef9139", "env": "production", "server": "some-otherprod-server", "service": "some-other-service"}, "message": "t.blah.X.blah.blah.blah - |x|xxx|xxx|xxxx|xxx-xxxx|my_field_value-B|xxxx|x|x|blah&blah&blah|xxx/xxx|x|x|"} t.blah.X.blah.blah.blah - |x|xxx|xxx|xxxx|xxx-xxxx|my_field_value-B|xxxx|x|x|blah&blah&blah|xxx/xxx|x|x|
So it seems I have to tell Hunk ahead of time which "raw fields" to include, and only then will it "auto extract"?
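To rule out the regex itself, here is a minimal Python sketch that runs the same pattern against the sample values shown in this post. It assumes Python's `re` module as a stand-in for the Splunk regex engine (so the named group is written `(?P<my_field>...)` instead of `(?<my_field>...)`), and the variable and function names are mine, purely for illustration. It says nothing about how Hunk decides which Avro fields end up in _raw; it only shows that the pattern matches whenever the message text is present and cannot match a header-only record.

```python
import re

# Same pattern as in props.conf, rewritten with Python's (?P<name>...) group syntax.
PATTERN = re.compile(r"^(?:[^|\n]*\|){6}(?P<my_field>[^|]+)")

# Sample values copied from the search results in this post.
MESSAGE = ("t.blah.X.blah.blah.blah - |x|xxx|xxx|xxxx|xxx-xxxx|"
           "my_field_value-A|xxxx|x|x|blah&blah&blah|xxx/xxx|x|x|")
HEADER = ('{"header": {"time": 1432675197252, "threadId": "qtpXXXX", '
          '"requestMarker": "abadbeef42c8", "env": "production", '
          '"server": "some-prod-server", "service": "some-service"}')

# _raw as returned without 'message' in the table command (header only).
RAW_HEADER_ONLY = HEADER + "}"
# _raw as returned once 'message' is added to the table command.
RAW_WITH_MESSAGE = HEADER + ', "message": "' + MESSAGE + '"}'


def extract(text):
    """Return the captured my_field value, or None when the pattern does not match."""
    match = PATTERN.search(text)
    return match.group("my_field") if match else None


print(extract(MESSAGE))           # my_field_value-A
print(extract(RAW_HEADER_ONLY))   # None (no pipes in the header-only record)
print(extract(RAW_WITH_MESSAGE))  # my_field_value-A
```

So the regex is fine as long as the message text is actually in the string being searched, which points at the contents of _raw (or of the message field) at search time rather than at the pattern.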