Splunk Search

Problem computing error rate from two different queries for a graph. . .

dwfarris
Explorer

Other answers I have found don't quite work in my case. I have seen similar questions where this can be done based on, say, "type=" fields, but the append/join suggestions don't quite work either.

Hoping someone has a simple solution while I continue to dig for one myself...

This query LOOKS like it produces what I want (total errors / total count * 100), but the eval for the error rate does not use the correctly correlated bucket error count. It seems to always use the first error count for every bucket's percentage...

Every bucket's error rate comes out as 12 / (that second's total count) * 100, because 12 appears to be the error count of the first 1-second bucket.

Each of the joined queries works fine by itself to create a nice line graph:
index=prod_stuff source="xyzzy"
| bucket _time span=1s
| stats count AS totalCount by _time, host
| join
    [ search index=prod_stuff source="xyzzy" ("FATAL" OR "ERROR" OR "stringa" OR . . . )
      NOT ("WARN" OR "string1" OR "string2" OR "string3" OR "string4" OR . . . )
      | bucket _time span=1s
      | stats count AS totalErrors by _time, host ]
| eval errorRate=totalErrors/totalCount*100
| xyseries _time, host, errorRate

This produces a very nice-looking graph, as long as I don't care too much about the numbers being correct.

I don't care whether this is done with a join; I'm just looking for the most efficient way to do it.

Thanks

1 Solution

sloshburch
Splunk Employee

I'm having trouble following the exact problem, but here are some things I see that might be contributing:

  • No field names are specified for the join. That could cause unexpected behavior in how the data is merged.
  • bucket _time ... | stats ... | xyseries ... is awfully similar to using timechart, so I just wanted to make sure you knew about that command (see the sketch after the search below).
  • You might want to round the result of the eval. Not strictly necessary, just making sure you knew it was available.
  • I have a vendetta against joins. Would you be able to run both searches as a single root search and then simply use an eval inside the stats command to get both counts? You could also use the like function of eval instead of the _raw wildcard match, like below.

    index=prod_stuff source="xyzzy"
    | bucket _time span=1s
    | stats count AS totalCount, count(eval(_raw="FATAL" OR _raw="ERROR" OR _raw="stringa"...)) AS totalErrors by _time, host
    | eval errorRate = round( totalErrors/totalCount*100 , 2 )
    | xyseries _time, host, errorRate
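
For reference, here is a minimal, untested sketch of the timechart variant. Since the real match strings are elided in the thread, searchmatch("FATAL OR ERROR") stands in for the actual filter list; swap in your own terms. (And if you do keep the join, naming the correlation fields explicitly, e.g. | join _time, host [ search ... ], would at least make the merge behavior predictable.)

    index=prod_stuff source="xyzzy"
    | eval isError=if(searchmatch("FATAL OR ERROR"), 1, 0)
    | bucket _time span=1s
    | stats sum(isError) AS totalErrors, count AS totalCount by _time, host
    | eval errorRate=round(totalErrors/totalCount*100, 2)
    | timechart span=1s max(errorRate) by host

The stats by _time, host still does the per-host math; timechart then replaces the xyseries step and handles the time axis for you.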


dwfarris
Explorer

Thanks,

I used @SloshBurch's answer and, with some modifications, got it to work.

(Thanks @somesoni2, but I did not try yours.)

Agreed, I don't like joins/appends either, but of my several initial attempts, that one came closest to describing the problem...

I did have to modify it, so for others who come across this: I could not use _raw="*string*". Those wildcards would not work.

Same structure, but I had to use like for the string comparison instead.

This search works fine now:

index=prod_stuff source="good_stuff*"
| bucket _time span=1s 
| stats count as totalCount, count(eval((like(_raw,"%ERROR%") OR like(_raw,"%string1%") OR like(_raw,"%string2%")) AND NOT like(_raw,"%string3%") AND NOT like(_raw,"%string4%") AND NOT like(_raw,"%string5%") AND NOT like(_raw,"%string6%"))) as totalErrors by _time, host 
| eval errorRate=round(totalErrors/totalCount*100,2) 
| xyseries _time,host,errorRate
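
As a side note, eval's searchmatch function can express the same include/exclude logic as a single search string instead of chained like() calls. This is an untested sketch with the same placeholder terms; unlike like(), searchmatch does term matching the way the search bar does (which is why the bare quoted strings worked in the original subsearch), so check it against your data:

    index=prod_stuff source="good_stuff*"
    | bucket _time span=1s
    | stats count AS totalCount,
            count(eval(searchmatch("(ERROR OR string1 OR string2) NOT (string3 OR string4 OR string5 OR string6)"))) AS totalErrors
            by _time, host
    | eval errorRate=round(totalErrors/totalCount*100, 2)
    | xyseries _time, host, errorRate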

somesoni2
SplunkTrust

Try something like this. The append stacks the error-count results under the total-count results, and the stats values(total*) then merges the two rows for each _time/host pair back together:

index=prod_stuff source="xyzzy"
| bucket _time span=1s
| stats count AS totalCount by _time, host
| append
    [ search index=prod_stuff source="xyzzy" ("FATAL" OR "ERROR" OR "stringa" OR . . . )
      NOT ("WARN" OR "string1" OR "string2" OR "string3" OR "string4" OR . . . )
      | bucket _time span=1s
      | stats count AS totalErrors by _time, host ]
| stats values(total*) AS total* by _time, host
| eval errorRate=totalErrors*100/totalCount
| timechart span=1s max(errorRate) by host useother=f