Detecting numerical outliers from time series data for multiple devices

kiril123
Path Finder

I have written the following search to detect numeric outliers (based on the daily syslog message count) for the device "test-device":

index = syslog hostname = "test-device"
| regex message="(?i).fail."
| timechart span=1d count
| eventstats avg("count") as avg stdev("count") as stdev
| eval upperBound=(avg+stdev*exact(1))
| eval isOutlier=if('count' > upperBound, 1, 0)

Is it possible to extend this search to cover multiple devices?

Ideally I would get something like this:

Device_name, is_outlier_time_1, is_outlier_time_2, is_outlier_time_3
test-device_1, 0, 0, 1
test_device_2, 1, 0, 1
test_device_3, 0, 0, 0

woodcock
Esteemed Legend

Try this:

index = syslog
| regex message="(?i).fail."
| timechart span=1d count BY hostname
| untable _time hostname count
| eventstats avg("count") as avg stdev("count") as stdev BY hostname
| eval upperBound=(avg+stdev*exact(1))
| eval isOutlier=if('count' > upperBound, 1, 0)
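
If the goal is the exact layout from the question (one row per device, one column per day), the following is a rough sketch of one way to get there rather than a tested search. It assumes the same `index`, `hostname`, and `message` fields as the original search; the `day` field is a name introduced here for illustration. It uses `bin` plus `stats` instead of `timechart`, because `timechart ... BY` keeps only the top 10 series by default and lumps the rest into OTHER.

```spl
index=syslog
| regex message="(?i).fail."
| bin _time span=1d
| stats count BY _time hostname
| eventstats avg(count) AS avg stdev(count) AS stdev BY hostname
| eval upperBound=(avg+stdev*exact(1))
| eval isOutlier=if(count > upperBound, 1, 0)
| eval day=strftime(_time, "%Y-%m-%d")
| xyseries hostname day isOutlier
```

One caveat: unlike `timechart`, `stats` produces no row for a day on which a device logged no matching events, so those days are left out of the average and standard deviation and show up as empty cells after `xyseries`.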

aoliner_splunk
Splunk Employee

Please take a look at the Detect Numeric Outliers assistant in the Machine Learning Toolkit: try the "Fields to split by" option and click the Show SPL buttons. You'll see that eventstats and streamstats both accept a 'by' clause, and that the Toolkit includes a 'splitby' macro that you may find helpful. For example, using one of the built-in datasets, you can get pretty close to what I think you're looking for:

| inputlookup hostperf.csv 
| eval _time=strptime(_time, "%Y-%m-%dT%H:%M:%S.%3Q%z") 
| timechart span=10m max(rtmax) as responsetime 
| head 1000 
| eval host=random()%3 
| streamstats window=200 current=true median("responsetime") as median by "host" 
| eval absDev=(abs('responsetime'-median)) 
| streamstats window=200 current=true median(absDev) as medianAbsDev by "host" 
| eval lowerBound=(median-medianAbsDev*exact(20)), upperBound=(median+medianAbsDev*exact(20)) 
| eval isOutlier=if('responsetime' < lowerBound OR 'responsetime' > upperBound, 1, 0) 
| `splitby("host")` 
| fields _time, "responsetime", lowerBound, upperBound, isOutlier, *