I have a field that contains column names delimited by spaces that I need to break out into separate fields for filtering purposes. The values are simply the column names, like shown below:
THIS_COLUMN THAT_COLUMN
I currently have an mvexpand
solution in place which works but takes a huge amount of time to process through the duplicate events. I really want to parse the column names out into separate columns in the same event and pass them through a foreach
statement, or something similar.
I've seen solutions that would work given a finite amount of column names but not for when the amount of values vary. Is there a structure I can apply to recognize the amount of columns and add them to the event. Here's my current search for reference:
base search | eval UID = _cd + index+splunk_server | eval singleColumns=split(column_name, " ") |mvexpand singleColumns | search NOT [|inputlookup Known_Bad_Columns | rename bad_columns as singleColumns ] | dedup UID | stats count by field1, field2 | sort by count desc
Your search is broken, even as-is. It is doing nothing. Here is why.
This gives you events, each with a unique UID:
base search
| eval UID = _cd + index+splunk_server
This breaks each event, into several events, EACH WITH THE SAME UID AS THE ORIGINAL JOINED/PARENT EVENT:
| eval singleColumns=split(column_name, " ")
|mvexpand singleColumns
This drops SOME of those events, but the ones that are left still share the original UID:
| search NOT [|inputlookup Known_Bad_Columns | rename bad_columns as singleColumns ]
Here is the problem. This throws away all events for each value of UID except for the one on top. This means that the very first field in column_name
is the only event that is left. This makes no sense to me and probably is not what you are intending to do. You need to remove all the SPL after the following line and take a good hard look at what the following command is doing. It is not what you think:
| dedup UID
I cannot comment on this but presumably field1
and field2
are 2 of the columns in column_name
, right?
| stats count by field1, field2
You mus ALWAYS use a number with sort
:
| sort 0 count desc
Can you please provide a couple of test examples of how this data appears in the raw events? This will make it easier for us to see the specific issue you are running into.