I can create a table of numerical outliers for requests to a web service with something like:
| timechart limit=11 useother=f usenull=f span=30m exactperc97(time_taken) as Perc
| streamstats window=24 avg(Perc) as avg stdev(Perc) as std
| eval m=2, lower=avg-(std * m), upper=avg+(std * m), outlier = if(Perc < lower OR Perc > upper, 1, 0)
| table _time avg* std* upper lower Perc outlier
That gives me the outliers for all requests as a whole. However, I want to detect outliers per root component of the request path, something like:
| timechart limit=11 useother=f usenull=f span=30m exactperc97(time_taken) as Perc by root
| streamstats window=24 avg(Perc) as avg stdev(Perc) as std by root
The problem appears to be that streamstats never sees Perc as a field name because of the timechart split-by clause, so I'm not sure how to make that work.
I worked it out.
With a split-by clause, the timechart output fields are named after the values of 'root', not after the 'as Perc' alias. Wildcards in the streamstats therefore do the trick, and foreach then runs the evaluations once per root:
| timechart limit=11 useother=f usenull=f span=30m exactperc97(time_taken) as Perc by root
| rename * as path_*
| rename path__* as _*
| streamstats window=24 avg(*) as avg_* stdev(*) as std_*
| eval m=2
| foreach avg_* [ eval "lower_<<MATCHSTR>>"='<<FIELD>>'-('std_<<MATCHSTR>>' * m), "upper_<<MATCHSTR>>"='<<FIELD>>'+('std_<<MATCHSTR>>' * m), "outlier_<<MATCHSTR>>" = if('<<MATCHSTR>>' < 'lower_<<MATCHSTR>>' OR '<<MATCHSTR>>' > 'upper_<<MATCHSTR>>', 1, 0) ]
| table _time path_* upper_* lower_* outlier_*
Of course, rendering that is non-trivial if there is more than one root, but it does mean that you can easily provide a dynamic, self-populating dropdown and then filter on the selected field in the table statement.
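As a rough sketch of that filter, assume a dashboard dropdown that sets a token to one of the root values. The token name selected_root is hypothetical and not part of the search above. The final table line could then become:
| table _time path_$selected_root$ upper_path_$selected_root$ lower_path_$selected_root$ outlier_path_$selected_root$
The upper, lower and outlier fields carry the path_ prefix because <<MATCHSTR>> in the foreach matches everything after avg_. The dropdown itself could be populated by the same base search ending in something like:
| stats count by root
| fields root
Since the timechart uses limit=11, the dropdown may list roots that have no column in the table.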
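The idea behind the two renames is to get a standard set of path_* names while preserving the default _* internal fields: the first rename prefixes everything, and the second strips the prefix again from the fields that ended up as path__*.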
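One possible variation, given only as an untested sketch, keeps 'root' as a real field by bucketing with bin and stats instead of timechart, so Perc survives as a field name and neither the renames nor the foreach are needed. It assumes global=f on streamstats, so that the 24-row window is kept separately per root rather than shared across all roots:
| bin _time span=30m
| stats exactperc97(time_taken) as Perc by _time root
| streamstats global=f window=24 avg(Perc) as avg stdev(Perc) as std by root
| eval m=2, lower=avg-(std * m), upper=avg+(std * m), outlier = if(Perc < lower OR Perc > upper, 1, 0)
| table _time root avg std upper lower Perc outlier
This is not equivalent to the timechart version. There is no limit=11, so every root is kept, and stats does not emit rows for empty time buckets, so for a sparse root the 24-row window can span more than 12 hours.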
Would appreciate any pointers on how to improve this...