Splunk Search

How do I merge a variable number of fields and expand rows?

skelly99
Explorer

Hi,

I have a dataset with single line events that contains a variable number of fields. The number of fields is determined by the number of databases that exists in one host.

Here is an example of a host with two databases, though there could be many more than 2 databases on one host — up to 30 or even more.

host=foo  DB1_DBNAME=foo_db  DB1_DBVERSION=9.6.0 DB1_DBENV=DEV  DB2_DBNAME=bar_db  DB2_DBVERSION=9.6.2 DB2_DBENV=DEV 

I'm looking to create a search that returns a result set that lists multiple rows for the host with each row having only the results for a single database — so, in the example above, something like this

host            DB_DBNAME    DB_DBVERSION   DB_DBENV  
foo             foo_db               9.6.0                       DEV                
foo             bar_db               9.6.2                      DEV

Thanks

Tags (2)
0 Karma

harsmarvania57
Ultra Champion

Hi @skelly99,

Try below query

<yourBaseSearch>
| eval TEMP_DB_DBNAME=mvzip(DB1_DBNAME, DB2_DBNAME), TEMP_DB_DBENV=mvzip(DB1_DBENV, DB2_DBENV), TEMP_DB_DBVERSION=mvzip(DB1_DBVERSION,DB2_DBVERSION)
| makemv delim="," TEMP_DB_DBNAME
| makemv delim="," TEMP_DB_DBENV
| makemv delim="," TEMP_DB_DBVERSION
| eval temp=mvzip(TEMP_DB_DBNAME,TEMP_DB_DBENV, ";"), temp1=mvzip(temp,TEMP_DB_DBVERSION, ";")
| mvexpand temp1
| eval DB_DBNAME=mvindex(split(temp1,";"),0), DB_DBENV=mvindex(split(temp1,";"),1), DB_DBVERSION=mvindex(split(temp1,";"),2)
| table host, DB_DBNAME, DB_DBENV, DB_DBVERSION

Here is run anywhere search

| makeresults
| eval _raw="host=foo DB1_DBNAME=foo_db DB1_DBVERSION=9.6.0 DB1_DBENV=DEV DB2_DBNAME=bar_db DB2_DBVERSION=9.6.2 DB2_DBENV=DEV"
| kv
| eval TEMP_DB_DBNAME=mvzip(DB1_DBNAME, DB2_DBNAME), TEMP_DB_DBENV=mvzip(DB1_DBENV, DB2_DBENV), TEMP_DB_DBVERSION=mvzip(DB1_DBVERSION,DB2_DBVERSION)
| makemv delim="," TEMP_DB_DBNAME
| makemv delim="," TEMP_DB_DBENV
| makemv delim="," TEMP_DB_DBVERSION
| eval temp=mvzip(TEMP_DB_DBNAME,TEMP_DB_DBENV, ";"), temp1=mvzip(temp,TEMP_DB_DBVERSION, ";")
| mvexpand temp1
| eval DB_DBNAME=mvindex(split(temp1,";"),0), DB_DBENV=mvindex(split(temp1,";"),1), DB_DBVERSION=mvindex(split(temp1,";"),2)
| table host, DB_DBNAME, DB_DBENV, DB_DBVERSION
0 Karma

skelly99
Explorer

Thanks @harsmarvania57 - I can see that solution works however my one concern is how to expand this when we have lots of databases returned on a single host

I don't know at this stage what the maximum will be but I know there are 10 or more is common and I've seen up to 30 databases.

Wondering if there is there a way we can determine the number of DB related field and loop around this value to create the multi-value TEMP_DB* fields?

0 Karma

harsmarvania57
Ultra Champion

Can you please below query for multiple DB

<yourBaseSearch>
| fillnull DBNAME, DBENV, DBVERSION
| foreach DB*_DBNAME [ eval DBNAME=mvzip(DBNAME, <<FIELD>>) ]
| foreach DB*_DBENV [ eval DBENV=mvzip(DBENV, <<FIELD>>) ]
| foreach DB*_DBVERSION [ eval DBVERSION=mvzip(DBVERSION, <<FIELD>>) ]
| makemv delim="," DBNAME
| makemv delim="," DBENV
| makemv delim="," DBVERSION
| eval temp=mvzip(DBNAME,DBENV, ";"), temp1=mvzip(temp,DBVERSION, ";")
| rex mode=sed field=temp1 "s/(?:\d\;)+\d//g"
| mvexpand temp1
| where temp1!=""
| eval DB_DBNAME=mvindex(split(temp1,";"),0), DB_DBENV=mvindex(split(temp1,";"),1), DB_DBVERSION=mvindex(split(temp1,";"),2)
| table host, DB_DBNAME, DB_DBENV, DB_DBVERSION

Below is run anywhere search

| makeresults
| eval _raw="host=foo DB1_DBNAME=foo_db DB1_DBVERSION=9.6.0 DB1_DBENV=DEV DB2_DBNAME=bar_db DB2_DBVERSION=9.6.2 DB2_DBENV=DEV DB3_DBNAME=test_db DB3_DBVERSION=9.6.3 DB3_DBENV=PROD"
| kv
| fillnull DBNAME, DBENV, DBVERSION
| foreach DB*_DBNAME [ eval DBNAME=mvzip(DBNAME, <<FIELD>>) ]
| foreach DB*_DBENV [ eval DBENV=mvzip(DBENV, <<FIELD>>) ]
| foreach DB*_DBVERSION [ eval DBVERSION=mvzip(DBVERSION, <<FIELD>>) ]
| makemv delim="," DBNAME
| makemv delim="," DBENV
| makemv delim="," DBVERSION
| eval temp=mvzip(DBNAME,DBENV, ";"), temp1=mvzip(temp,DBVERSION, ";")
| rex mode=sed field=temp1 "s/(?:\d\;)+\d//g"
| mvexpand temp1
| where temp1!=""
| eval DB_DBNAME=mvindex(split(temp1,";"),0), DB_DBENV=mvindex(split(temp1,";"),1), DB_DBVERSION=mvindex(split(temp1,";"),2)
| table host, DB_DBNAME, DB_DBENV, DB_DBVERSION
0 Karma

skelly99
Explorer

Hi - thanks again for the post - the search works fine when there are two or databases on a host but when only one database is present nothing gets reported.

What I can see is that when an event only contains DB1_DBNAME, DB1_DBENV, DB1_DBVERSION the foreach statement clobbers the field set with the fillnull statement

The foreach statement works if no wildcard is used, ie foreach DB1_DBNAME but when DB*_DBNAME is used then the DBNAME field is no longer present. The behaviour is different when a wildcard is used and DB1_DBNAME and DB2_DBNAME are in the event.

To make this more intriguing is your run anywhere search works when I curtail the eval statement to read
eval _raw="host=foo DB1_DBNAME=foo_db DB1_DBVERSION=9.6.0 DB1_DBENV=DEV

Still head scratching - any suggestions welcome

0 Karma

harsmarvania57
Ultra Champion

It is strange that it is not working for you with single DB, can you please replace <<FIELD>> with '<<FIELD>>' in all 3 foreach command ? foreach command repeat eval for every DB*_DBNAME so if you have only DB1_DBNAME then it will do eval only once and zip DBNAME (which is 0) and DB1_DBNAME.

Search query which I have provided is working with 1 Database as well, I have tested below query in Splunk 7.1.2 and it is generating output.

Below run anywhere search I am running and it is generating correct output

| makeresults
| eval _raw="host=foo DB1_DBNAME=foo_db DB1_DBVERSION=9.6.0 DB1_DBENV=DEV"
| kv
| fillnull DBNAME, DBENV, DBVERSION
| foreach DB*_DBNAME [ eval DBNAME=mvzip(DBNAME, '<<FIELD>>') ]
| foreach DB*_DBENV [ eval DBENV=mvzip(DBENV, '<<FIELD>>') ]
| foreach DB*_DBVERSION [ eval DBVERSION=mvzip(DBVERSION, '<<FIELD>>') ]
| makemv delim="," DBNAME
| makemv delim="," DBENV
| makemv delim="," DBVERSION
| eval temp=mvzip(DBNAME,DBENV, ";"), temp1=mvzip(temp,DBVERSION, ";")
| rex mode=sed field=temp1 "s/(?:\d\;)+\d//g"
| mvexpand temp1
| where temp1!=""
| eval DB_DBNAME=mvindex(split(temp1,";"),0), DB_DBENV=mvindex(split(temp1,";"),1), DB_DBVERSION=mvindex(split(temp1,";"),2)
| table host, DB_DBNAME, DB_DBENV, DB_DBVERSION
0 Karma

skelly99
Explorer

Thanks again for your help - I've now managed to get this working using your logic but what I found in my data set is that the foreach DB* loop would only work if there were two fields

So if DB1_DBNAME and DB2_DBNAME had values it worked - if there was only DB1_DBNAME then the foreach statement failed - as an example

assume DB1_DBNAME=foo is set in the event
fullnull DBNAME
| foreach DB
_DBNAME [ eval DBNAME=mvzip(DBNAME, '<>') ]*
the result here would be that the DBNAME field no longer existed

However
assume DB1_DBNAME=foo set in the event
fullnull DBNAME DB2_DBNAME
| foreach DB
_DBNAME [ eval DBNAME=mvzip(DBNAME, '<>') ]*
would result in DBNAME being set as 0,foo,0

So I got this to work across all events by running a fillnull statement to create the extra fields required across the data set which allowed the foreach loop to work with only single DB events

Does not explain why your run anywhere search works by my data set does not. I'm wondering if its something to do with the initial field extraction??
The fields we are using in the data set are being extracted automatically as named value pairs
That is in the feed we get DB1_DBNAME=foo DB1_DBVERSION=9.6.2......

I notice in your run anywhere search that you run the kv command - should I be added anything to the source type definition to ensure correct KV pair extraction?

0 Karma

harsmarvania57
Ultra Champion

When you search your Base search without foreach are you getting DB1_DBNAME field on left hand side in Interesting Fields ? If yes then can you please run below command and let me know what will be in DBNAME field for single DB events?

<yourBaseSearch>
| fillnull DBNAME, DBENV, DBVERSION
| foreach DB*_DBNAME [ eval DBNAME=mvzip(DBNAME, '<<FIELD>>') ]
| foreach DB*_DBENV [ eval DBENV=mvzip(DBENV, '<<FIELD>>') ]
| foreach DB*_DBVERSION [ eval DBVERSION=mvzip(DBVERSION, '<<FIELD>>') ]

For me in run anywhere search DBNAME field value is 0,foo_db

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...