The latest questions for the topic "stdev"

Calculating stdev by individual users
I'm trying to build the following logic and failing: For each user in my Windows Event Logs, calculate the stdev and boundaries for the distinct count (averaged daily) of servers logged into, **for each specific user.** I would then theoretically set an alert to yell when any user exceeds their threshold.
I have read the "Finding and removing outliers" doc, but that doesn't seem to allow creating upper and lower limits **for each user, or "by user"**, etc. I've tried to modify that information to fit this model and failed. Maybe I'm not understanding it correctly. My attempts look generally like this:
| eventstats dc(dest_nt_host) as new_dc, avg(new_dc) as new_avg, stdev(new_avg) by user as new_stdev
| eval upper = new_avg+(new_stdev*2)
| eval lower = new_avg-(new_stdev*2)
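One way the attempt above might be restructured, offered as a sketch rather than a confirmed answer: the distinct count has to be computed per user per day first, and only then can the average and standard deviation be taken over those daily values for each user. The base search `index=your_windows_index` and the field names `daily_dc`, `avg_dc` and `stdev_dc` are placeholders and do not come from the post; `user` and `dest_nt_host` do.

```spl
index=your_windows_index
| bin _time span=1d
| stats dc(dest_nt_host) AS daily_dc BY user _time
| eventstats avg(daily_dc) AS avg_dc stdev(daily_dc) AS stdev_dc BY user
| eval upper = avg_dc + (stdev_dc * 2)
| eval lower = avg_dc - (stdev_dc * 2)
| where daily_dc > upper
```

Each remaining row is a user and day whose server count is above that user's own upper bound, which is the condition an alert could trigger on.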
Any advice or guidance on this problem would be greatly appreciated!
Tags: splunk-enterprise, stdev, eventlogs, windowseventlogs
Posted: Wed, 16 Jan 2019 16:18:49 GMT by danataylor

What is the best way to get the running average and standard deviations for HTTP request character length?
What would be the best way to search for anomalies/outliers for HTTP request character length by source IP? Looking for HTTP requests whose standard deviation may indicate a potential hack/indicator of compromise.
Thx
Tags: splunk-enterprise, stats, stdev, averages, anomaly
Posted: Mon, 27 Aug 2018 20:34:28 GMT by jwalzerpitt

Standard deviation by hour for last business week and compare it with today's numbers for the same hour
I need help framing a query that gives me the standard deviation of 5 values (for the last business week), compares it with today's traffic for the same hour, and triggers an alert if the difference is more than x%.
All I could get was the values for the same hour every business day last week, using a simple chart command, and I couldn't get past that.
index=ABC sourcetype=DEF uri="/sample/event/test" earliest=-6d@w1 AND latest=-1d@w6 date_hour>5 date_hour<=18 | chart limit=100 span=1h dc(unique_id) over date_hour by date_mday
Result
date_hour 19 20 21 22 23
7 60366 61630 62768 62533 64369
I need the data in the format below, or at least the 3 values I am looking for:
StdDev(Last business week between 8 am - 9 am ET) Current_Hour's_Traffic Difference_In_%
500 450 10
Thanks a lot.
Tags: stdev
Posted: Mon, 26 Mar 2018 06:37:22 GMT by edookati

How to calculate moving standard deviation in Splunk?
_time  Prev Week(count)  Prev 2 week(count)  avg   3*Std Dev    Current count  Delta  RAG
1:30   8                 7                   7.5   2.121320344  8              0.5    OK
2:00   9                 9                   9     0            5              4      Alert
2:30   10                11                  10.5  2.121320344  11             0.5    OK
3:00   11                10                  10.5  2.121320344  8              2.5    Alert
3:30   12                12                  12    0            7              5      Alert
4:00   12                12                  12    0            10             2      Alert
4:30   13                14                  13.5  2.121320344  8              5.5    Alert
5:00   13                13                  13    0            8              5      Alert
5:30   14                13                  13.5  2.121320344  7              6.5    Alert
Tags: splunk-enterprise, stdev, moving
Posted: Sat, 10 Mar 2018 06:16:45 GMT by payal23

Compare the output of two searches and timechart the latest output only if different than the older output
I have two timecharts that only hit on http status code of 500 (one for the past hour and one for the same hour but last week). I want to display the value of the past hour only if it differs from the value of the same hour of last week. I believe that using stdev is the way to go but am unable to figure out exactly how to place it to get it to work (append/join the searches together then test or if it can be done in one search). The final result that I am looking for is a timechart with the hits of the status code of 500 only if the past hour's output is different than the same hour of last week. The main search that I am working with is as follows:
index=myindex sourcetype=mysourcetype field1=myfield1 http_status="500" field2!="what_i_dont_want" | timechart count by field2 limit=20 useother=false | sort -count
I'm not sure whether the following gets me what I want to see. Based on other answers to similar questions, I believe it should work, but for some reason I do not receive any output in the Statistics tab:
index=myindex sourcetype=mysourcetype field1=myfield1 http_status="500" field2!="what_i_dont_want" earliest=-60m@m latest=now | timechart count AS TodayLastHour by field2 limit=20 useother=false | appendcols [search index=myindex sourcetype=mysourcetype field1=myfield1 http_status="500" field2!="what_i_dont_want" earliest=-169h@h latest=-168h@h | timechart count AS LastWeekLastHour by field2 limit=20 useother=false] | where TodayLastHour != LastWeekLastHour | timechart count by TodayLastHour limit=20 useother=false
I plan on visualizing the chart as a linechart and am not sure if there is a way to show a linechart that contains only differences (if the values are the same as last week, don't show).
Tags: splunk-enterprise, timechart, linechart, stdev, union
Posted: Wed, 06 Dec 2017 15:57:00 GMT by robrang558

Calculate Standard Deviation for a table
I have a table in the format below:
Time ProcessTime1 ProcessTime2 ABC_Count
11/14/2017 100 112 30000
11/15/2017 118 205 30546
11/16/2017 119 121 43000
11/17/2017 141 192 95000
It produces a visualization like below, where ABC_Count is the overlay which is the red line in the below graph.
![alt text][1]
I want to have stdev plotted in the graph below, but when I run the search query below it gives me zero as the stdev:
mybase search | stats values(ProcessTime1), values(ProcessTime2), values(ABC_Count), stdev(ProcessTime1), stdev(ProcessTime2) by Time
Basically, if I just remove the "by Time" in the above search, I get stdev calculated properly, but then the visualization is not available.
Any idea how stdev can be plotted in the same graph?
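One likely explanation, offered as an assumption about the data rather than a confirmed diagnosis: with `by Time`, each group holds a single row, so the standard deviation inside each group is zero. A sketch that keeps one row per Time for the chart, while computing the deviation across all rows, could use `eventstats` instead of `stats`. The output names `stdev_PT1` and `stdev_PT2` are made up for illustration.

```spl
mybase search
| eventstats stdev(ProcessTime1) AS stdev_PT1 stdev(ProcessTime2) AS stdev_PT2
| table Time ProcessTime1 ProcessTime2 ABC_Count stdev_PT1 stdev_PT2
```

Because `eventstats` writes the same overall value onto every row, each deviation would show up in the chart as a flat line next to the original series.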
[1]: /storage/temp/219968-visualization.jpg
Tags: graph, stdev, aggregate, deviation
Posted: Wed, 06 Dec 2017 10:07:48 GMT by ashish9433

How can I group field names to compare values within a specified time range?
I have around 200 KPIs, each with a field name of the form *_KPI, where the prefix is a number, and each *_KPI has different values.
For example, 100_KPI has the values 0, 1, 56, 100 and so on. Is it possible to group all the field names under one field name, KPI? I need to compare the latest value of each KPI with the 7-day average date_hour count, group it by KPI, and display only the KPIs that have a large deviation in a single panel.
Tags: splunk-enterprise, panel, grouping, stdev, field-name
Posted: Tue, 28 Nov 2017 07:46:36 GMT by deepashri_123

Compare standard deviation results for two sets of results
I have a query that uses stdev on the field value "queue_length" by field "queue_name". I need a query that gives me results only if stdev_5m > 2*stdev_hour. The issue is that sometimes a "queue_name" doesn't appear in the search for the previous five minutes but does appear for the previous hour. That's why the Splunk query below gives the wrong result: it isn't comparing the same queue_name, it compares the columns row by row, regardless of which queue name each row holds.
index=cvt_metrics sourcetype=report_service_broker_queue earliest=-1h| where queue_length > 0 | stats stdev(queue_length) AS stdev_hour by queue_name | appendcols [ search index=cvt_metrics sourcetype=service_broker_queue earliest=-5m| where queue_length > 0 | stats stdev(queue_length) AS stdev_5m by queue_name] | eval Result=if(stdev_5m > 2*stdev_hour, "Error", "OK") | search Result="Error"
Tags: splunk-enterprise, stdev
Posted: Mon, 27 Nov 2017 14:18:58 GMT by gauravg_cvent

Using Standard Deviation to track SSH traffic
I'm looking for a way to track the average SSH traffic between two IP addresses (source IP and destination IP) and hopefully find when a host is doing more SSH traffic than usual and alert on it. I've been looking through some of the standard deviation paperwork and I think I found a search I wanted to do, but the standard deviation I get is zero, which doesn't make sense.
Here is what I've been playing around with.
sourcetype="cisco:asa" dest_port=22
| stats count by src_ip, dest_ip
| stats mean(count) as mean, stdev(count) AS stdev by src_ip
| eval stdv_percentage=(mean/stdev)*100
Tags: splunk-enterprise, stdev, ssh, deviation
Posted: Tue, 24 Oct 2017 16:48:32 GMT by serwin

How to alert when a deviation has been detected in volume between two time periods?
I currently use the following query to compare volume counts between current day and a week ago:
sourcetype=abc index=xyz source=foo earliest=-0d@d latest=now |
bucket _time span=30m |
stats count by _time |
eval ReportLabel="Today" |
append [search sourcetype=abc index=xyz source=foo earliest=-7d@d latest=-6d@d |
bucket _time span=30m |
stats count by _time |
eval ReportLabel="PreviousWeek" |
eval _time=_time+(60*60*24*7)] |
chart max(count) as count over _time by ReportLabel
I'm interested in leveraging this query (if possible) to alert me if volume counts between the two time periods deviate by a certain percentage. Since the alert would run every 30 minutes, I'd have to adjust the timeframes accordingly.
- How would I capture a specific half hour period from the previous week to reference against current day?
- How could a deviation calculation be applied?
Tags: splunk-enterprise, search, alert, stdev, deviation
Posted: Wed, 11 Oct 2017 17:02:48 GMT by bcaunt

Help understanding standard deviation alert for entries that have a count of 0?
I have seen several similar questions asked, but they are often answered in different ways so I'm hoping whoever answers this can explain why they created the search string the way they did.
I have multiple hosts and I want to create an alert if the count of events reduces by more than 2 standard deviations on a per hour basis for the last four weeks for each host. I have seen many examples that used buckets while others used timechart. My understanding is that bucket will not include entries that have a count of 0 so timechart should be used, is this correct?
index=is1
| timechart span=1h count by host
| stats stdev(count) AS Stdev
| eval thresh=Stdev*2
| where count < thresh
| table host count
Tags: splunk-enterprise, stats, timechart, alert, stdev
Posted: Mon, 09 Oct 2017 20:16:01 GMT by glenngermiathen

Create alert when average events greater than 2 standard deviations from rolling average
I know that there are several threads on answers that reference alerts based on standard deviation. I have tried a few of them and the use cases do not seem to meet what I need.
I would like to create an alert that will fire when the average of events over 5 minutes is greater than 2 standard deviations of the average of events over 60 minutes. This post is the closest I have found, but I am still stuck.
https://answers.splunk.com/answers/227404/alert-when-sample-is-2-standard-deviations-from-mo.html?utm_source=typeahead&utm_medium=newquestion&utm_campaign=no_votes_sort_relev
Any assistance would be appreciated.
Thanks
Tags: alerting, stdev
Posted: Wed, 23 Aug 2017 20:24:19 GMT by jodros

Find out if count at a specific time is below the average
UPDATE:
I have created a search/alert that should notify me if:
1. Index data is 0 for a particular hour
2. Index data count is below the normal 5 percentile of the count (Mean - 2*Standard Deviation)
| tstats count WHERE earliest=-30d@-3h latest=now index=* by index, _time span=1h | makecontinuous span=1h _time | eval count=if(isnull(count),0,count)
| eval time_group = floor(tonumber(strftime(_time,"%H"))/3)
| bin _time as myday span=1d
| eval weekday=strftime(_time,"%a")
| where time_group=floor(tonumber(strftime(now(),"%H"))/3)-1 AND weekday=strftime(now(),"%a")
| eventstats min(_time) as _time sum(count) AS ThreeHourCount avg(count) as MonthlyAverageCount stdev(count) as MonthlyStdDev by index time_group myday weekday
| eval MonthlyAverageCount=round(MonthlyAverageCount,2), MonthlyStdDev=round(MonthlyStdDev,2)
| where strftime(now(),"%Y-%m-%d")=strftime(_time,"%Y-%m-%d") AND (count=0 OR ThreeHourCount<WeeklyAverageCount-(2*WeeklyStdDev))
So I start by checking every hour, then put everything in 3-hour blocks (since my search will be running every 3 hours).
Based on these searches, do you see anything wrong with my steps? Any room for improvement? Also, this takes about 40 seconds in the cloud. Is there any way to make it faster?
Tags: splunk-cloud, timechart, average, tstats, stdev
Posted: Fri, 04 Aug 2017 21:25:29 GMT by mkarimi17

What is the best way to get the running average and standard deviations for external port connects?
So I am looking at Cisco ASA logs and wondering what the best method would be to create an alert when the number of external connection attempts to port 23 (in my network) is +/- 2 standard deviations from the daily average.
Thank you
Tags: search, alert, average, port, stdev
Posted: Tue, 11 Jul 2017 16:42:14 GMT by packet_hunter

How to generate a search to find the number of days that exceeds mean by certain ranges?
https://answers.splunk.com/answers/547878/how-to-generate-a-search-to-find-the-number-of-day.html
Hello all!
I'm trying to find the number of days that the daily count of my event exceeds the daily mean + standard deviation for a 3-week period. I also need to return the number of days that exceeds the mean + 2 stdevs and mean + 3 stdevs, and keep it all together.
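The counting described above could be sketched as follows, under stated assumptions: `index=your_index your_event_filter` stands in for the real base search, "3-week period" is read as the last 21 full days, and every field name here is invented for illustration. Daily counts are built first, the mean and standard deviation are attached to every day with `eventstats`, and one flag per threshold is then summed.

```spl
index=your_index your_event_filter earliest=-21d@d latest=@d
| bin _time span=1d
| stats count AS daily_count BY _time
| eventstats avg(daily_count) AS mean_count stdev(daily_count) AS stdev_count
| eval over_1sd = if(daily_count > mean_count + stdev_count, 1, 0)
| eval over_2sd = if(daily_count > mean_count + 2 * stdev_count, 1, 0)
| eval over_3sd = if(daily_count > mean_count + 3 * stdev_count, 1, 0)
| stats sum(over_1sd) AS days_over_1sd sum(over_2sd) AS days_over_2sd sum(over_3sd) AS days_over_3sd
```

The result is a single row with the three day counts side by side. Days with no events at all produce no row in this sketch, so they do not pull the mean down.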
Is there an easy way to do this?
Tags: splunk-enterprise, search, stdev, mean
Posted: Thu, 15 Jun 2017 16:37:50 GMT by jrnastase

Is it possible to print a line chart with: line with value, line with mean+stdev and line with mean-stdev?
I'm trying to print a line chart with three values:
- value
- mean(value) - stdev(value)
- mean(value) + stdev(value)
I'm trying this:
stats mean(percentIdle) AS mean, stdev(percentIdle) AS stdev |
eval down= mean-stdev |
eval up= mean+stdev |
timechart first(down) as "min" first(up) as "max" first(percentIdle) as "percentIdle"
And similar variations but nothing works.
Does anyone know how to do this?
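A possible reason the attempt returns nothing useful: `stats` collapses all events into one row, so neither `_time` nor `percentIdle` is left for `timechart` to work with. A sketch that reverses the order, charting first and then attaching the overall mean and standard deviation to every time bucket, is shown here. `your base search`, the 5-minute span, and the names `mean_idle`, `stdev_idle`, `lower` and `upper` are assumptions made for illustration.

```spl
your base search
| timechart span=5m avg(percentIdle) AS percentIdle
| eventstats avg(percentIdle) AS mean_idle stdev(percentIdle) AS stdev_idle
| eval lower = mean_idle - stdev_idle
| eval upper = mean_idle + stdev_idle
| fields _time percentIdle lower upper
```

Note that the mean and deviation in this version are taken over the bucketed averages, not over the raw events, so the band will differ slightly from one computed on raw values.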
Thank you!
Tags: timechart, linechart, stdev, mean
Posted: Thu, 06 Apr 2017 15:49:05 GMT by erabadan

How to calculate min/max/avg/stdev by each line
The data are all numeric fields, such as:
cluster, field_1, field_2, field_3, field_4, field_5
1 3 56 6 767 8
1 56 6 5432 5 7
2 6 65 987 356 6767
2 65 56 4321 4 56
3 3 5656 65 56456 56
I'd like to calculate min/max/avg/stdev of each line.
I understand that I can use stats min(*) max(*) avg(*) stdev(*) by xxxx,
but stats returns the results as new columns, like min(field_1) max(field_2) avg(field_3) .....
But I want to display min/max/avg/stdev on each line, just like the Tableau or Excel pivot table function.
For example,
new_field, field_1, field_2, field_3, field_4, field_5,
min 3 6 6 5 7
max 30 31 2,719 386 8
avg 30 31 2,719 386 8
stdev 37.47 35.35 3836.76 538.81 0.707
I mean, stats can calculate in parallel but I want to calculate it vertically.
Anyone have any idea? If I could, I'd like to group by cluster number with min/max/avg.
Thanks,
Tags: splunk-enterprise, max, avg, stdev, min
Posted: Wed, 29 Mar 2017 11:38:35 GMT by goji

How to calculate stdev for a count of one field based on another?
I am trying to figure out how to calculate the stdev of the number of emails a user sends. I have the following search so far where I am calculating the count of MessageId per SenderAddress:
`index=exchange | stats count as MessageId by SenderAddress`
I'm getting tripped up trying to bring in/carry over the count of MessageId per SenderAddress. Do I need to create an eval for this field and then plug that into the rest of the search of:
`| eventstats mean(field) AS mean_field, stdev(field) AS stdev_field | eval Z_score=round(((field-mean_field)/stdev_field),2) | where Z_score>1.5 OR Z_score<-1.5 | table _time, SenderAddress, FromIP, field, mean_field, Z_score | sort -Z_score`
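As a sketch of how the two pieces might be joined, assuming the count is given its own name instead of reusing `MessageId`: the alias created by `stats` is itself the field that `eventstats` and `eval` can work on, so no extra eval is needed to carry it over. The names `msg_count`, `mean_count` and `stdev_count` are hypothetical. `_time` and `FromIP` are left out of the table because `stats ... by SenderAddress` no longer carries them.

```spl
index=exchange
| stats count AS msg_count BY SenderAddress
| eventstats mean(msg_count) AS mean_count stdev(msg_count) AS stdev_count
| eval Z_score = round((msg_count - mean_count) / stdev_count, 2)
| where Z_score > 1.5 OR Z_score < -1.5
| table SenderAddress msg_count mean_count Z_score
| sort - Z_score
```

This compares each sender against the population of all senders; comparing a sender against their own history would need a time bucket in the `stats` step.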
Thx
Tags: splunk-enterprise, eval, count, stdev
Posted: Fri, 17 Mar 2017 16:06:38 GMT by jwalzerpitt

Average Index License Deviation Over The Past Month - Search Check
Hi Folks;
I am looking to get the deviation of license usage for each particular index over a 30 day period. My hope is to use this data to forecast the estimated max per month that a particular group (index) may use. I have the following basic search, but was hoping to get an extra set of eyes to make sure my math is correct:
earliest=-30d@d latest=@d index=_internal source=*license_usage.log* type=Usage idx="*" | stats sum(b) AS Bytes stdev(b) AS Deviation by idx | eval GB = Bytes/1024/1024/1024 | eval DevMB = Deviation/1024/1024 | eval "Daily Avg Usage" = GB/30 | rename idx as Index | table Index "Daily Avg Usage" "DevMB"
Tags: splunk-enterprise, stdev, licenses
Posted: Fri, 16 Dec 2016 15:49:26 GMT by paimonsoror

How to edit my search to filter transactions based on standard deviation of event count?
I have a search that is grouping events into transactions and includes the eventcount as part of it. The transaction is based on a source IP address and service. My goal is to determine the standard deviation of event count by source IP and then filter it to display only transactions that actually have an event count that falls outside of the standard deviation.
Search:
Search | transaction source_ip service maxspan=5m maxpause=60s eventcount | stats avg(eventcount) as average stdev(eventcount) as standarddev by source_ip | eval upperlimit=average+standarddev, oddball=if(eventcount > upperlimit,1,0) | search oddball=1
My problem is that I get it to figure out average, standarddev, and upperlimit for each source IP just fine, but it doesn't actually figure if each transaction group is "oddball" and then filter it out.
Thoughts?
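A possible cause, stated as a reading of the search rather than a verified diagnosis: `stats ... by source_ip` replaces the individual transactions with one summary row per source IP, so `eventcount` no longer exists when the `eval` runs. A sketch that keeps every transaction and attaches the per-IP statistics to it uses `eventstats`. `your base search` is a placeholder, and the sketch relies on `transaction` adding the `eventcount` field on its own.

```spl
your base search
| transaction source_ip service maxspan=5m maxpause=60s
| eventstats avg(eventcount) AS average stdev(eventcount) AS standarddev BY source_ip
| eval upperlimit = average + standarddev
| where eventcount > upperlimit
```

Only transactions whose event count is above their own source IP's upper limit remain, which replaces the `oddball` flag and the final `search`.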
Troy
Tags: search, count, transaction, filter, stdev
Posted: Thu, 08 Dec 2016 15:10:17 GMT by troyward