We have log entries with multiple key-value pairs. All of the keys I'm interested in have a common prefix and all of the values are decimal numbers. Unfortunately, not every entry contains all of the keys.
An entry could look like this:
<TIMESTAMP, LOGLEVEL etc.> prefix-<SUFFIX1>=<VALUE1>, prefix-<SUFFIX3>=<VALUE2>, ...
What I want is a table where the first column contains the union of all of the keys with the common prefix, the other columns should contain statistics like median, different percentiles, standard deviation.
What I got so far:
host=... index=... | rex max_match=100 "(?<field>prefix-\w+?)=(?<val>\d+)" | stats median(val) as "Median", perc90(val) as "90th percentile", perc99(val) as "99th percentile", min(val) as "min", max(val) as "max", stdev(val) as "Standard deviation", count(val) as "count" by field
The problem: Because not all log entries contain all keys, somewhere along the pipes the value 0 seems to be assumed for those missing keys. As a result, the statistical measures get totally skewed.
I would be thankful for any help.
How about this
your base search | table prefix-* | eval temp=1 | untable temp field value | stats median(val) as "Median", perc90(val) as "90th percentile", perc99(val) as "99th percentile", min(val) as "min", max(val) as "max", stdev(val) as "Standard deviation", count(val) as "count" by field
How about this
your base search | table prefix-* | eval temp=1 | untable temp field value | stats median(val) as "Median", perc90(val) as "90th percentile", perc99(val) as "99th percentile", min(val) as "min", max(val) as "max", stdev(val) as "Standard deviation", count(val) as "count" by field
I'm impressed, that solved my issue!
Two small corrections: table prefix_*
(Splunk changes hyphens to underscores when a field is recognized) and untable temp field val
.
But apart from that: Brilliant! Thank you very much!
fieldsummary
seems to mostly do what I want, but it is missing percentiles and median