Splunk Search

How to use split to extract a delimited value?

wbeaucha
Explorer

I'd like to be able to extract a numerical field from a delimited log entry, and then create a graph of that number over time. I am trying to extract the colon (:) delimited field directly before "USERS" (2nd field from the end) in the log entries below:

14-07-13 12:54:00.096 STATS: maint.47CMri_3.47CMri_3.: 224: UC.v1:7:USERS

14-07-12 02:53:59.250 STATS: maint.47BMrr_3.47BMrr_3.: 64: UC.v1:12:USERS

I have tried to use regex to extract this value without success. Like 99.999% of the people on this planet, I am not a regex expert. I did try the regex extraction apps. Now I see that split() may do this but can't find documentation that really explains how to put the resulting fields into variables that can be piped into timechart. Is there a splunk analog of the Unix "cut" command?

Thanks!

1 Solution

somesoni2
SplunkTrust
SplunkTrust

Try this

source="/logpath/logfile.log" earliest=-7d@d latest=@d :USERS | eval fields=split(_raw,":") | eval num=mvindex(fields,6) | timechart sum(num) as loggedUsers

View solution in original post

somesoni2
SplunkTrust
SplunkTrust

Try this

source="/logpath/logfile.log" earliest=-7d@d latest=@d :USERS | eval fields=split(_raw,":") | eval num=mvindex(fields,6) | timechart sum(num) as loggedUsers

somesoni2
SplunkTrust
SplunkTrust

The timechart span is breaking events in 5 min set (events happened on 10:00 to 10:05 will be grouped together) so a sum is required in case there are multiple entries in one set. If you're sure that there won't be multiple entries in 5 min period, your can change sum with max/min/first/last.

0 Karma

wbeaucha
Explorer

Ok, I think I have something that works without breaking the flash chart's 10000 line max. I had to add an option to change the timespan that it was summing to 5 minutes. I'm not sure why I have to sum at all, but whatever, the graph is displaying the right data now.

source="/logpath/logfile.log" earliest=-7d@d latest=@d :USERS | eval fields=split(_raw,":") | eval num=mvindex(fields,6) | timechart span=5m sum(num) as loggedUsers

Thanks again!

0 Karma

wbeaucha
Explorer

That's much closer. I changed it to just show the past day below.

source="/logpath/logfile.log" earliest=-1d@d latest=@d :USERS | eval fields=split(_raw,":") | eval num=mvindex(fields,6) | timechart sum(num) as loggedUsers

I now get a graph line showing the number of users that looks closer to reality, but it looks like it is summing the numbers every 30 minutes? I'm looking for a graph showing the actual data.

0 Karma

somesoni2
SplunkTrust
SplunkTrust

Searches provided by @richgalloway should work to get that [assuming, no of colons are always 7 (for split) and "USERS" is the hard-coded string (for rex)].

0 Karma

wbeaucha
Explorer

Yes, I'm trying to extract numbers 7 and 12 from the example logs and the keyword USERS is always the last field in this log entry.

0 Karma

richgalloway
SplunkTrust
SplunkTrust

Something like this should do the job.

your search | eval fields=split(_raw,":") | eval num=mvindex(fields,7) | ...

If you want to do it using regex, then this string should get you what you want.

your search | rex ":(?<num>\d+):USERS" | ...
---
If this reply helps you, Karma would be appreciated.

wbeaucha
Explorer

The num field contains the number of users logged into our app at the point in time when the log entry was created. I want a graph showing number of users logged in vs time. Yes, I'd like to plot all the values for num in the graph. Thanks again!

0 Karma

somesoni2
SplunkTrust
SplunkTrust

What is this num field anyways (what value it holds) and what do you want to chart on (max, avg)? Timechart values(num) basically give you a multivalued fields with list of unique values at a given time span. You should use some aggregate command for the graph. You want to plot all the values for num in the graph??

0 Karma

wbeaucha
Explorer

Yes I am using field number 6. Sorry that was not reflected in the cut and paste.

source="/logpath/logfile.log" earliest=-7d@d latest=@d :USERS | eval fields=split(_raw,":") | eval num=mvindex(fields,6) | timechart values(num)

I edited the chart format via the format drop down and changed the null values to connect, but that did not change anything, graph still shows no data. I notice that in statistics, each date _time single value has many values(num) associated with it.

0 Karma

somesoni2
SplunkTrust
SplunkTrust

Once you fix the mvindex, ensure that you've set the following property for the chart. Default value is "gaps" and that's why you just see the vertices.

charting.chart.nullValueMode=connect

0 Karma

richgalloway
SplunkTrust
SplunkTrust

I think I was off on the mvindex command. Try 6 instead of 7. To verify your work, replace the timechart command with table _raw, num. If the data displayed is what you expected then put the timechart back.

---
If this reply helps you, Karma would be appreciated.
0 Karma

wbeaucha
Explorer

Thanks richgalloway. Cool, now we are getting somewhere. Using the command below I see the values(num) showing up in statistics by day, but the graph in Visualization is only showing the vertices, and no plot line.

source="/logpath/logfile.log" earliest=-7d@d latest=@d :USERS | eval fields=split(_raw,":") | eval num=mvindex(fields,7) | timechart values(num)

Am I doing something wrong?

Thanks,
Warren

0 Karma

somesoni2
SplunkTrust
SplunkTrust

Are you trying to extract numbers 7 and 12 from your example logs? Is the keyword USERS constant in value and/or in its position?

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...