Splunk Search

Why does max value turn into average in my results for date range greater than 3 Days?

mgrimes
New Member

So I have a query that is

  • | timechart count | timechart per_second(count) as TPS | timechart span=1d max (TPS)

This query results in the correct data out when ran on a single day or range of 2.

However, if this query is ran for a time range that is greater than 2 days, the last "timechart span=1d max(TPS)" seems to become "timechart span=1d average(TPS)" instead.

If you debug this by inspecting the evens individually the results below show that it's taking the average per day rather than the max.

alt text

I confirmed that the value represented in the month report is the average by exporting the report for an individual day in excel and simply performing the average function on that column.

alt text

0 Karma
1 Solution

somesoni2
Revered Legend

Give this a try

your base search  | timechart span=1s count as TPS | timechart span=1d max(TPS)

View solution in original post

somesoni2
Revered Legend

Give this a try

your base search  | timechart span=1s count as TPS | timechart span=1d max(TPS)

mgrimes
New Member

_time max(TPS)
2016-05-01

2016-05-02 677
2016-05-03 705

The results so far seem to be the max total number of events of a given day

However, I'm attempting to find the max number of events per second per day.

0 Karma

somesoni2
Revered Legend

That's strange. Could just run the first timechart (with span=1s) and see what is the type (in magniture) of values that you get?

0 Karma

mgrimes
New Member

Query: cs_host="services*" OR cs_host="lookup*" | timechart span=1s count

Results:

_time   count
2016-05-03 21:34:10 17
2016-05-03 21:34:11 10
2016-05-03 21:34:12 18
2016-05-03 21:34:13 15
2016-05-03 21:34:14 19

Hopefully I ran the query that you suggested. It seems to return the correct number of queries per second

0 Karma

somesoni2
Revered Legend

THis looks promising. Can you try this now

cs_host="services*" OR cs_host="lookup*" | timechart span=1s count | timechart span=1d max(count) as max_TPS

AND

cs_host="services*" OR cs_host="lookup*" | timechart span=1s count | bucket span=1d _time | stats max(count) as max_TPS by _time
0 Karma

mgrimes
New Member

Results for: cs_host="services*" OR cs_host="lookup*" | timechart span=1s count | timechart span=1d max(count) as max_TPS

_time   max_TPS
2016-05-01   
2016-05-02  677
2016-05-03  705

Results for cs_host="services*" OR cs_host="lookup*" | timechart span=1s count | bucket span=1d _time | stats max(count) as max_TPS by _time

_time   max_TPS
2016-05-02 00:00:00 677
2016-05-03 00:00:00 705

Like before they seem to just take the total of the average per second. I changed the data range from 3 days to 2 to check if anything is changing but it's consistent with finding the totals

0 Karma

somesoni2
Revered Legend

I don't see any issue with query, but could be the data is like that. So, let verify that. Please run this query and let me know result.

cs_host="services*" OR cs_host="lookup*" | timechart span=1s count | eval Date=strftime(_time,"%m/%d/%Y")  | eventstats max(count) as max_TPS by Date | where count=max_TPS
0 Karma

mgrimes
New Member

Results:

_time   count   Date    max_TPS
2016-05-01 20:27:22 923 05/01/2016  923
2016-05-02 03:13:13 796 05/02/2016  796
0 Karma

somesoni2
Revered Legend

You can probably increase the limit but again, since you're looking for 1 second bucket, the new limit will again get hit for larger period. This is a classic use-case for summary indexing. What you should do is to create a scheduler search to run at an interval and look at data for same period as the interval and save the summary in a summary index. You search that should refer to already summarized data from the summary index.

I would recommend you to go through following and setup that.

https://wiki.splunk.com/Community:Summary_Indexing
http://docs.splunk.com/Documentation/Splunk/6.3.0/Knowledge/Configuresummaryindexes
http://docs.splunk.com/Documentation/Splunk/6.3.0/Knowledge/Usesummaryindexing

0 Karma

somesoni2
Revered Legend

Regarding your data values, could you just search for events at the time mentioned in the result and check if there really are that many events OR not for that minute.

0 Karma

mgrimes
New Member

So it seems that there are that many events for that particular second so it seems that your query is correct to being with. I also saw that it started with 40-50 until it spiked to 600+ which is possible for our production traffic on occasion.

This means that I was misreading my query from the start
cs_host="services*" OR cs_host="lookup*" | timechart count | timechart per_second(count)

I was under the assumption this was correct for finding the events per second, however it seems to be taking the average of some sort base off the results. It is also spaced by 30 minutes.

_time   per_second(count)
2016-06-23 09:00:00 21.842778
2016-06-23 09:30:00 22.245000
2016-06-23 10:00:00 23.039444
2016-06-23 10:30:00 23.239444
2016-06-23 11:00:00 26.157222
2016-06-23 11:30:00 22.617778
2016-06-23 12:00:00 22.279444
2016-06-23 12:30:00 22.024444
2016-06-23 13:00:00 23.281667
2016-06-23 13:30:00 21.914444

For this particular query that you have provided cs_host="services*" OR cs_host="lookup*" | timechart span=1s count | eval Date=strftime(_time,"%m/%d/%Y") | eventstats max(count) as max_TPS by Date | where count=max_TPS is there a way for me to exclude this outliers easily with a conditional statement? I am still fairly new with Splunk syntax

0 Karma

mgrimes
New Member

Thank you for all your help. I really appreciate all the work you did with solving my issue and debugging!

0 Karma

somesoni2
Revered Legend

We sure can.

Here I'm excluding counts where are 50% or more of the max value for the day.

cs_host="services*" OR cs_host="lookup*" | timechart span=1s count | eval Date=strftime(_time,"%m/%d/%Y") | eventstats max(count) as max_TPS by Date | where count<max_TPS*0.5 | timechart span=1d max(count) as TPS

OR
Here I calculated the max_TPS as the 95 percentile value so those 5% of outliers should get filtered

cs_host="services*" OR cs_host="lookup*" | timechart span=1s count | eval Date=strftime(_time,"%m/%d/%Y") | eventstats perc95(count) as max_TPS by Date | where count<max_TPS | timechart span=1d max(count) as TPS
0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...