Solved: Trouble with field extraction

aferone · ‎12-13-2011

I have the following coming in through the *NIX df script. The formatting is lost in this form, but it comes through in a column and row format, as if it was being run from a *NIX command line.

Filesystem Type Size Used Avail UsePct MountedOn
/dev/sda9 ext4 19G 699M 18G 4% /
none devtmpfs 5.9G 256K 5.9G 1% /dev
/dev/sda1 ext4 184M 51M 124M 29% /boot
/dev/sda5 ext4 46G 369M 44G 1% /var
/dev/sda6 ext4 156G 338M 148G 1% /home
/dev/sda7 ext4 138G 1.5G 130G 2% /opt
/dev/sda8 ext4 184G 3.8G 171G 3% /usr
/dev/sdb1 xfs 26T 644G 25T 3% /logs

I want to extract the "Used" value in the xfs line to report on and chart. It looks like that field extractor wizard is not available for this source. I was playing around with the "rex" command in my search, but I couldn't get anywhere with that, either.

Any help would be great for this rex noob!

Thanks!

MHibbin · ‎12-13-2011

Hi,

I believe the command you are looking for is "multikv". So you could do

 * | multikv fields Filesystem Type Size Used Avail UsePct MountedOn

to extract all the fields (they may be useful later). Then, to find the value for "xfs", you could do something like:

 * | multikv fields Filesystem Type Size Used Avail UsePct MountedOn | table Type Used

to show all Type / Used pairings, or

 * | multikv fields Filesystem Type Size Used Avail UsePct MountedOn | search xfs

to return only the xfs row.

Hope this answers your question.

If it does answer your question, could you please mark the question as answered (click the empty tick next to the answer) for the benefit of the community.

Regards,

MHibbin

MHibbin · ‎12-14-2011

This is probably a tidier version of the search command...

source=*df.sh | multikv fields Filesystem Size Used Avail Use% | rex field=Used "(?i)(?P<Used>\d+)[A-Z]+" | table _time Filesystem Used

This pulls the digits out of each Used value, drops the unit suffix, and overwrites the Used field with the result.
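To see what that pattern captures without running a search, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's regex engine for a pattern this simple, which I believe is the case but have not verified inside Splunk. The decimal-tolerant variant is my own illustration and not part of the search above.

```python
import re

# Same pattern as in the rex command; Python also accepts the (?P<name>...) syntax.
PATTERN = re.compile(r"(?i)(?P<Used>\d+)[A-Z]+")

# Hypothetical variant that also keeps a decimal part such as "1.5".
DECIMAL_PATTERN = re.compile(r"(?i)(?P<Used>\d+(?:\.\d+)?)[A-Z]+")


def extract_used(value, pattern=PATTERN):
    """Return the text captured in the Used group, or None when nothing matches."""
    match = pattern.search(value)
    return match.group("Used") if match else None


for sample in ["644G", "699M", "1.5G"]:
    print(sample, "->", extract_used(sample), "/", extract_used(sample, DECIMAL_PATTERN))
# 644G -> 644 / 644
# 699M -> 699 / 699
# 1.5G -> 5 / 1.5
```

Note that the original pattern only keeps the digits directly in front of the unit letter, so "1.5G" comes out as 5. That is one more reason to have df report whole numbers in a single unit.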

One thought, though: you SHOULD make sure the script (df) outputs all values in the same unit by using something like...

df -h --block-size=G

OR

df -h --block-size=GB

(See the man page for the difference between the two block sizes.)

Otherwise you could end up mixing values that were in G or M and treating them as if they had the same block size. This may not affect your analysis of the xfs file system, but it will make the command/results more future-proof.
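To illustrate why mixed units matter, here is a rough Python sketch that sums a few Used values from the sample output, first naively and then after converting everything to gibibytes. The function name and the suffix table are my own, and it assumes the binary (powers of 1024) suffixes that df -h prints.

```python
# Multipliers relative to one gibibyte, assuming df -h style binary suffixes.
FACTORS_TO_GIB = {"K": 1 / 1024**2, "M": 1 / 1024, "G": 1.0, "T": 1024.0}


def to_gib(value):
    """Convert a df -h style size such as '699M' or '1.5G' to a float in GiB."""
    number, suffix = value[:-1], value[-1].upper()
    if suffix not in FACTORS_TO_GIB:
        raise ValueError(f"unrecognised size suffix in {value!r}")
    return float(number) * FACTORS_TO_GIB[suffix]


used = ["699M", "1.5G", "644G"]

# Stripping the suffix and adding the numbers treats M and G as the same unit.
naive_sum = sum(float(v[:-1]) for v in used)

# Converting to a common unit first gives a meaningful total.
normalised_sum = sum(to_gib(v) for v in used)

print(naive_sum)                 # 1344.5
print(round(normalised_sum, 2))  # 646.18
```

The naive total is roughly double the real one here, purely because 699M was counted as if it were 699G.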

Hope this helps, but I would definitely recommend using the rex command in this answer.

Regards,

Matt

P.S. Thanks to Draineh for the suggestion!

aferone · ‎12-14-2011

Nice! We have the report running perfectly now.

Thanks again for all of your help!!!

MHibbin · ‎12-13-2011

Converted from comment, as this would not show "\"

Yeah sure, assuming we are still working with the table command, just add the field "_time" to your search command, e.g. ...

*| multikv fields Filesystem Size Used Avail Use% | eval Used=if(match(Used,"\d+G"),rtrim(Used, "G"),Used) |eval Used=if(match(Used,"\d+K"),rtrim(Used, "K"),Used)| eval Used=if(match(Used,"\d+M"),rtrim(Used, "M"),Used)| table _time Used Filesystem

MHibbin · ‎12-14-2011

Please see my other answer, where I use the rex command, for a "tidier" version...

MHibbin · ‎12-14-2011

I'm sure you have worked this out, but in the previous comment where I used "eval rtrim", wherever there is a "d+" there should be a backslash in front of it (see my other answer for an example).

aferone · ‎12-13-2011

Yet another question. I am getting the proper value, but I can't get the timestamp to show up now. All I would need is the indexed time, the same time you'd see when you run the search raw, without any conditions or manipulations. Is there an easy way to get that date? THANKS AGAIN!!

aferone · ‎12-13-2011

Once again, thank you very much for the great info!

MHibbin · ‎12-13-2011

The only reason I mention this is because you may have values in units other than G (if the script is using df -h).

Hope this helps!

MHibbin · ‎12-13-2011

If they are not, you could change the script to output in GB (or whatever unit you choose; check the man page for df), i.e. have it output the results of df --block-size=GB. If modifying the script is not possible, you could try the following command

...See next comment!

MHibbin · ‎12-13-2011

If all the "Used" values are in Gigabytes (G), you can use

*| multikv fields Filesystem Size Used Avail Use% | eval Used=rtrim(Used, "[K,G,M]") | stats sum(Used)

..... See next comment!
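A note on the rtrim call above: as I understand the eval documentation, rtrim treats its second argument as a set of characters to strip from the right, not as a pattern, so the brackets and commas in "[K,G,M]" are just extra characters in that set. A small Python sketch of the same idea, using str.rstrip as a stand-in (the rtrim function here is my own helper, not Splunk's implementation):

```python
def rtrim(value, chars):
    """Rough stand-in for eval's rtrim: strip any of `chars` from the right end."""
    return value.rstrip(chars)


print(rtrim("644G", "[K,G,M]"))  # 644
print(rtrim("644G", "KMG"))      # 644
print(rtrim("26T", "[K,G,M]"))   # 26T, because T is not in the set
```

Under that assumption, "KMG" would do the same job as "[K,G,M]", and a value in T would be left untouched and so would not be summed as a number.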

aferone · ‎12-13-2011

One more question. How would I drop the "G" from the value? Splunk is not recognizing the value as a number because of the "G".

Thanks!

MHibbin · ‎12-13-2011

good job! good luck.

aferone · ‎12-13-2011

That did it! Thanks again for the great help!

MHibbin · ‎12-13-2011

Just combine them...

| multikv fields Filesystem Type Size Used Avail UsePct MountedOn | search xfs | table Type Used

This should work. Let me know if it does, and if so, please mark the question as answered.

aferone · ‎12-13-2011

Thank you so much for the help!

So, with the "| table Type Used", I get all of the values under "Used".

With "|search xfs", I get the entire xfs line.

How would I get just the value of "Used" from the "xfs" line?

Thanks again!!

MHibbin · ‎12-13-2011

The multikv command takes tabular input (i.e. STDOUT from df) and extracts fields based on the header row (and your choice of fields).

So you could use just ...

*|multikv fields Used Type

for your requested fields.

PLEASE NOTE: my "*|" should be replaced with your own search for df-related events.
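If it helps to picture what multikv does with the df output, here is a minimal Python sketch of the same idea: use the header row as field names, turn every following row into its own record, and optionally keep only the requested fields. This only illustrates the concept and is not how Splunk implements the command; parse_table and its fields argument are names I made up.

```python
SAMPLE = """\
Filesystem Type Size Used Avail UsePct MountedOn
/dev/sda9 ext4 19G 699M 18G 4% /
none devtmpfs 5.9G 256K 5.9G 1% /dev
/dev/sdb1 xfs 26T 644G 25T 3% /logs
"""


def parse_table(text, fields=None):
    """Turn a whitespace-delimited table into one dict per row, keyed by the header."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        row = dict(zip(header, line.split()))
        if fields is not None:
            # Keep only the requested fields, similar in spirit to "multikv fields ...".
            row = {key: val for key, val in row.items() if key in fields}
        rows.append(row)
    return rows


rows = parse_table(SAMPLE, fields={"Type", "Used"})
xfs_used = [row["Used"] for row in rows if row["Type"] == "xfs"]
print(xfs_used)  # ['644G']
```

Filtering on Type only works here because Type was among the requested fields, which is the same reason it needs to be in the multikv fields list if you want to use it later in the search.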
