I have written the following search to detect numeric outliers (based on the daily syslog message count) for the device "test-device":
index = syslog hostname = "test-device" | regex message = (?i)".fail."
| timechart span=1d count
| eventstats avg("count") as avg stdev("count") as stdev
| eval upperBound=(avg+stdev*exact(1))
| eval isOutlier=if('count' > upperBound, 1, 0)
Is it possible to extend this search to cover multiple devices, so that I get something like this?
Device_name, is_outlier_time_1, is_outlier_time_2, is_outlier_time_3
test-device_1, 0, 0, 1
test_device_2, 1, 0, 1
test_device_3, 0, 0, 0
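Try this. Note that timechart with a BY clause returns one column per hostname rather than a "count" field and a "hostname" field, so untable is used to turn the result back into one row per day and device before the per-device statistics are computed. limit=0 stops timechart from grouping all but the top 10 devices into OTHER:
index = syslog
| regex message = (?i)".fail."
| timechart span=1d limit=0 count BY hostname
| untable _time hostname count
| eventstats avg("count") as avg stdev("count") as stdev BY hostname
| eval upperBound=(avg+stdev*exact(1))
| eval isOutlier=if('count' > upperBound, 1, 0)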
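To get the layout asked for in the question (one row per device, one column per day), the per-device outlier flag can be pivoted with xyseries. Below is a minimal sketch that you can paste into a search bar without any syslog data. It assumes three hypothetical device names and random daily counts generated with makeresults, and the "day" field is only an illustrative name for the column labels:

```spl
| makeresults count=30
| streamstats count as n
| eval _time=relative_time(now(), "@d") - n*86400
| eval hostname=mvappend("test_device_1", "test_device_2", "test_device_3")
| mvexpand hostname
| eval count=random()%100
| eventstats avg("count") as avg stdev("count") as stdev BY hostname
| eval upperBound=(avg+stdev*exact(1))
| eval isOutlier=if('count' > upperBound, 1, 0)
| eval day=strftime(_time, "%Y-%m-%d")
| xyseries hostname day isOutlier
```

On real data, the last two lines should work the same way once each row holds _time, hostname, count, and isOutlier. Because the counts here are random, the flags will differ on every run.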
Please take a look at the Detect Numeric Outliers assistant, try the "Fields to split by" option, and click on the Show SPL buttons. You'll see that eventstats and streamstats both accept a 'by' clause and that the Toolkit includes a 'splitby' macro that you may find helpful. For example, using one of the built-in datasets, you can get pretty close to what I think you're looking for:
| inputlookup hostperf.csv
| eval _time=strptime(_time, "%Y-%m-%dT%H:%M:%S.%3Q%z")
| timechart span=10m max(rtmax) as responsetime
| head 1000
| eval host=random()%3
| streamstats window=200 current=true median("responsetime") as median by "host"
| eval absDev=(abs('responsetime'-median))
| streamstats window=200 current=true median(absDev) as medianAbsDev by "host"
| eval lowerBound=(median-medianAbsDev*exact(20)), upperBound=(median+medianAbsDev*exact(20))
| eval isOutlier=if('responsetime' < lowerBound OR 'responsetime' > upperBound, 1, 0)
| `splitby("host")`
| fields _time, "responsetime", lowerBound, upperBound, isOutlier, *