Splunk Search

Multivalued field mapping

gelica
Communicator

Hi,

I have events of the form

----
name
----
Drive: C:
Free Space: 894.1 GB
Total Space: 953.1 GB

Drive: D:
Free Space: 89.1 GB
Total Space: 113.1 GB

My events contain multiple drives with different names, which may have more attributes below "Total Space". There may also be drives without the space-attributes.

I'm currently extracting Drive, Free_Space and Total_Space as three multivalued fields. My problem is that since each event has many drives, I need to know which values of Free_Space and Total_Space correspond to which drive.

I don't want to create separate events for the different drives.

I don't know how I should solve this, so if anyone has any suggestions please tell me! 🙂

1 Solution

wpreston
Motivator

I'm pretty sure you will need to expand these into one event per user/drive letter to get the correlation you need. It can be done using the search language only, so the newly created events are not persistent: they will be created for this search only.

See example number 3 on the mvexpand page for an example of how to do it. To account for the drive letters that may not have space attributes, I would use fillnull. Off the top of my head but following the example, the search should be something like:

...your search parameters 
| fillnull value="N/A" Free_Space Total_Space
| eval fields = mvzip(Drive,Free_Space) 
| eval fields = mvzip(fields,Total_Space) 
| mvexpand fields 
| rex field=fields "(?<Drive>[^,]+),(?<Free_Space>[^,]+),(?<Total_Space>[^,]+)"

This should give you a single event for every Drive, Free_Space, and Total_Space; and will maintain User-to-Drive-to-Space field correlation. I hope this helps. Someone else out there may know a way to do it without having to create a separate event for each combination, but I'm not aware of one.
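For anyone who wants to see the mechanics outside of Splunk, here is a rough Python sketch of what the mvzip / mvexpand / rex steps do to one event. The helper names `mvzip`, `mvexpand` and `expand_drives` are my own stand-ins, not Splunk APIs, and I'm assuming mvzip pairs values purely by position, the way Python's `zip` does.

```python
import re

def mvzip(left, right, delim=","):
    """Join two value lists position by position (stand-in for SPL mvzip)."""
    return [f"{a}{delim}{b}" for a, b in zip(left, right)]

def mvexpand(event, field):
    """Yield one copy of the event per value of a multivalued field."""
    for value in event[field]:
        yield {**event, field: value}

# Same idea as the rex step: pull the three parts back out of "fields".
PATTERN = re.compile(
    r"(?P<Drive>[^,]+),(?P<Free_Space>[^,]+),(?P<Total_Space>[^,]+)"
)

def expand_drives(event):
    zipped = mvzip(mvzip(event["Drive"], event["Free_Space"]), event["Total_Space"])
    rows = []
    for row in mvexpand({**event, "fields": zipped}, "fields"):
        match = PATTERN.fullmatch(row["fields"])
        if match:
            # Overwrite the multivalued originals with this row's single values.
            row.update(match.groupdict())
        rows.append(row)
    return rows

event = {
    "User": "name",
    "Drive": ["C", "D"],
    "Free_Space": ["894.1", "89.1"],
    "Total_Space": ["953.1", "113.1"],
}
rows = expand_drives(event)
for row in rows:
    print(row["User"], row["Drive"], row["Free_Space"], row["Total_Space"])
```

Each resulting row carries exactly one drive together with its own free and total space, which is the correlation the search is after.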

Edit: Answering questions from comments. Note that for some reason I cannot add comments to an answer at my workplace, so I have to edit the original answer in order to reply.

Great questions! Here are my answers:

On the question of fillnull, you are correct. I put it in there to handle drive letters that might not have space attributes. The more I think about it, though, fillnull would not work as needed because, as you said, Splunk will not know if there are values missing between drive letters. Can you post an example of how the data looks when there are no space attributes for a drive letter? I'm wondering if you can use the KEEP_EMPTY_VALS parameter in transforms.conf to get around this.
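To make the alignment problem concrete, here is a small Python sketch. It assumes values are paired purely by position, as Python's `zip` does; the third drive "E" and the empty-string placeholder are made up for illustration and do not come from your data.

```python
# Three drives, but D is an optical drive that reported no space attributes,
# so the multivalued Free_Space field only holds two values.
drives = ["C", "D", "E"]
free_space = ["894.1", "89.1"]  # these belong to C and E

# Positional pairing hands E's value to D, and E ends up with nothing.
misaligned = list(zip(drives, free_space))
print(misaligned)

# If the extraction kept an empty placeholder for the missing value,
# the positions would line up again.
free_space_padded = ["894.1", "", "89.1"]
aligned = list(zip(drives, free_space_padded))
print(aligned)
```

The second case is the situation I am hoping an empty-value-preserving extraction could produce; whether it is achievable depends on how the raw event looks.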

On the charting question, I would suggest playing around with stats to get the chart you want. Try something like this for starters:

... | stats sum(Free_Space) as "Free Space" sum(Total_Space) as "Total Space" by sourcetype,Drive
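In case the grouping is not obvious, this is roughly what that stats search computes, sketched in Python. The function `sum_by`, the sourcetype value "disk_report" and the third sample row are hypothetical, and the sketch assumes the extracted values are plain numbers with no "GB" suffix.

```python
from collections import defaultdict

def sum_by(rows, keys, fields):
    """Sum the given numeric fields for each distinct combination of keys."""
    totals = defaultdict(lambda: {f: 0.0 for f in fields})
    for row in rows:
        group = tuple(row[k] for k in keys)
        for f in fields:
            totals[group][f] += float(row[f])
    return dict(totals)

# One row per drive, as produced after expanding the events.
rows = [
    {"sourcetype": "disk_report", "Drive": "C", "Free_Space": "894.1", "Total_Space": "953.1"},
    {"sourcetype": "disk_report", "Drive": "D", "Free_Space": "89.1", "Total_Space": "113.1"},
    {"sourcetype": "disk_report", "Drive": "C", "Free_Space": "800.0", "Total_Space": "953.1"},
]

totals = sum_by(rows, ["sourcetype", "Drive"], ["Free_Space", "Total_Space"])
for group, sums in sorted(totals.items()):
    print(group, {name: round(value, 1) for name, value in sums.items()})
```

You end up with one result row per (sourcetype, Drive) pair, which is the shape a column or stacked chart needs.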

On the rex question, yes this does extract the Drive, Free_Space and Total_Space fields from the field called "fields". What it also does is tell Splunk to ONLY extract those fields from the field called "fields". The values from the previous (original) search time extraction are overwritten for each field with the values extracted using the rex, while leaving all other extractions untouched. This way, after using mvexpand to create multiple events from a single event, the values of Drive, Free_Space, and Total_Space are extracted once per (new) event. Since each new event has a different value in fields, you come away with the proper combinations of User, Drive and Space fields.

In your example data above, if you don't use the rex portion of the search command but use everything else, you should get the following results for this event:

 - Event 1 - 
User=name Drive=C Drive=D Free_Space=894.1 Free_Space=89.1 Total_Space=953.1 Total_Space=113.1 fields=C,894.1,953.1
 - Event 2 - 
User=name Drive=C Drive=D Free_Space=894.1 Free_Space=89.1 Total_Space=953.1 Total_Space=113.1 fields=D,89.1,113.1

However, if you include the rex portion of the search command, you get the following fields for this event:

 - Event 1 - 
User=name Drive=C Free_Space=894.1 Total_Space=953.1 fields=C,894.1,953.1
 - Event 2 - 
User=name Drive=D Free_Space=89.1 Total_Space=113.1 fields=D,89.1,113.1


gelica
Communicator

@wpreston I didn't notice your edit until just now. Thank you very much for your detailed answers! 🙂

If the drives don't have the space attributes they look something like this:

drive: d:
model: hl-dt-st dvdram gh24ns95 scsi cdrom device
driver: c:\windows\system32\drivers\cdrom.sys, 6.01.7601.17514 (english), , 0 bytes

I looked at what KEEP_EMPTY_VALS does, and if I understand it correctly it would only work if these drives looked like below?

drive: d:
model: hl-dt-st dvdram gh24ns95 scsi cdrom device
driver: c:\windows\system32\drivers\cdrom.sys, 6.01.7601.17514 (english), , 0 bytes
free space:
total space:

gelica
Communicator

Another question I thought of:
I can't really figure out how fillnull works in this example. I never see any of the space fields with the value "N/A". I also tried skipping the other commands and just running my search with fillnull and then stats list(...).

And when I think more about it, Splunk can't know which drives have the space attributes and which don't (right...?), so I would have a problem if a drive without space attributes appears in between drives that have them?


gelica
Communicator

This works great, thank you! I didn't think of the mvzip command at all.

Is there any way to plot these values over source? (Each source would have as many bars as it has drives, each bar with the height of the space value. Or maybe stacked bars.)

(I also wonder what the purpose of the rex command is in your example. As far as I know it would extract the drive, free space and total space fields from the field called fields? Those fields are already extracted, so I tried removing that command and got the same results, but I wonder if it does some extra mapping?)
