Splunk Search

Line graph: Count per hour with a trendline that plots the average count every 24 hours.

3666142
Path Finder

I have a line graph that displays the number of transactions per hour. I want a trendline to go with it, but I want it to give me the average transactions every 24 hours. I have to use epochTime because I'm using the client's timestamp instead of Splunk's _time.

In other words, the query works except the trendline is plotting the average every hour instead of every 24 hours

index=ABCD
| eval epochTime=strptime(end, "%Y-%m-%d %H:%M:%S.%6N")
| bin epochTime span=1h
| stats count as "Transactions" by epochTime
| convert ctime(epochTime) as epochTime
| trendline sma2("Transactions") as Trend

Do you think I should create another bin for the trendline?
If someone could help me out that would absolutely amazing! Thank you.

0 Karma

dmarling
Builder

You are only giving yourself the last 2 hours in your trendline command to create the trend when you use sma2. If you truly want it to be a rolling 24 hour average you need to ensure that your earliest/latest time range when running the search includes at least the prior 24 hours of data and then your query just needs to be adjusted to reflect you want a simple moving average of 24 hours. Also the trendline command appears to only work when there is a _time field when I was testing it. Here's what I would recommend:

index=ABCD earliest=-26h latest=now
| eval _time=strptime(end, "%Y-%m-%d %H:%M:%S.%6N") 
| timechart fixedrange=false span=1h count as "Transactions"
| trendline sma24("Transactions") as Trend
| where isnotnull(Trend)

The reason why I changed it to timechart with a span of 1 hour from your stats is that your stats will not include hours where there are 0 events occurring unless that is your goal. If you do not want to have zero hours deflating your 24 hour average then you can still use timechart, but will need an eval to null the zeros and then use streamstats/eval combo instead of trendline. I'm not a huge fan of this method but it will function:

index=ABCD earliest=-26h latest=now
| eval _time=strptime(end, "%Y-%m-%d %H:%M:%S.%6N") 
| timechart fixedrange=false cont=true span=1h count as "Transactions" 
| eval Transactions=if(Transactions=0, null(), Transactions)
| streamstats current=f window=24 sum(Transactions) as prior24
| eval Trend=prior24/24
| eventstats max(_time) as lasttime
| where isnotnull(Trend) and _time>=relative_time(lasttime, "-3h")
| fields - lasttime prior24
If this comment/answer was helpful, please up vote it. Thank you.
0 Karma

3666142
Path Finder

Let me get back to you on this. I am waiting for the client to send me more data so I can properly test this. Thank you for the help. I'll let you know if this works or not.

0 Karma

3666142
Path Finder

I didn't get the response that I wanted, but thank you for the help. Turns out my client does not need a trendline anymore. Sorry for the inconvenience.

0 Karma

3666142
Path Finder

Here is an example of the data that I'm ingesting. I did have to take irrelevant data out because of confidentiality reasons.

{
end: 2019-07-18 14:59:51.904915

start: 2019-07-18 14:59:47.734263

exampleInteger: 417
}

So I am given a start/end timestamp. I have successfully create a line graph (it graphs on on the end timestamp as the x axis) that plots a count of all the events every hour. For example, between 2019-07-18 14:00:00.000000 AND 2019-07-18 14:59:59.999999, I got a count of 7394. I want to take that 7394, along with 23 other counts throughout (because there are 24 hours in a day), and average them. For example:

2019-07-18 00:00:00.000000 AND 2019-07-18 00:59:59.999999 COUNT: 7394
2019-07-18 01:00:00.000000 AND 2019-07-18 01:59:59.999999 COUNT: 8053
2019-07-18 02:00:00.000000 AND 2019-07-18 02:59:59.999999 COUNT: 7265
2019-07-18 03:00:00.000000 AND 2019-07-18 03:59:59.999999 COUNT: 7237
2019-07-18 04:00:00.000000 AND 2019-07-18 04:59:59.999999 COUNT: 7656
2019-07-18 05:00:00.000000 AND 2019-07-18 05:59:59.999999 COUNT: 7528
2019-07-18 06:00:00.000000 AND 2019-07-18 06:59:59.999999 COUNT: 7948
2019-07-18 07:00:00.000000 AND 2019-07-18 07:59:59.999999 COUNT: 7254
2019-07-18 08:00:00.000000 AND 2019-07-18 08:59:59.999999 COUNT: 7744
2019-07-18 09:00:00.000000 AND 2019-07-18 09:59:59.999999 COUNT: 7752
2019-07-18 10:00:00.000000 AND 2019-07-18 10:59:59.999999 COUNT: 7936
2019-07-18 11:00:00.000000 AND 2019-07-18 11:59:59.999999 COUNT: 7410
2019-07-18 12:00:00.000000 AND 2019-07-18 12:59:59.999999 COUNT: 7047
2019-07-18 13:00:00.000000 AND 2019-07-18 13:59:59.999999 COUNT: 7008
2019-07-18 14:00:00.000000 AND 2019-07-18 14:59:59.999999 COUNT: 7394
2019-07-18 15:00:00.000000 AND 2019-07-18 15:59:59.999999 COUNT: 7647
2019-07-18 16:00:00.000000 AND 2019-07-18 16:59:59.999999 COUNT: 7954
2019-07-18 17:00:00.000000 AND 2019-07-18 17:59:59.999999 COUNT: 7541
2019-07-18 18:00:00.000000 AND 2019-07-18 18:59:59.999999 COUNT: 7145
2019-07-18 19:00:00.000000 AND 2019-07-18 19:59:59.999999 COUNT: 7421
2019-07-18 20:00:00.000000 AND 2019-07-18 20:59:59.999999 COUNT: 7045
2019-07-18 21:00:00.000000 AND 2019-07-18 21:59:59.999999 COUNT: 7453
2019-07-18 22:00:00.000000 AND 2019-07-18 22:59:59.999999 COUNT: 7155
2019-07-18 23:00:00.000000 AND 2019-07-18 23:59:59.999999 COUNT: 7124

The average of this 24 hour period would be 7462

I need a line graph that shows the past months worth of data. On this graph, I have to lines: the first lines is the hour per hour count and the other is the 24 hour average of those hour per hour counts.

I hope I might that clear... I know it's a lot of information. Thank you.

0 Karma

dmarling
Builder

Using your example and the answer I gave, This will generate a "Trend" field as soon as the 24th available hour. Here's a run anywhere example using the data you gave me.

| makeresults count=1
| eval data="2019-07-18 00:00:00.000000 AND 2019-07-18 00:59:59.999999 COUNT: 7394
2019-07-18 01:00:00.000000 AND 2019-07-18 01:59:59.999999 COUNT: 8053
2019-07-18 02:00:00.000000 AND 2019-07-18 02:59:59.999999 COUNT: 7265
2019-07-18 03:00:00.000000 AND 2019-07-18 03:59:59.999999 COUNT: 7237
2019-07-18 04:00:00.000000 AND 2019-07-18 04:59:59.999999 COUNT: 7656
2019-07-18 05:00:00.000000 AND 2019-07-18 05:59:59.999999 COUNT: 7528
2019-07-18 06:00:00.000000 AND 2019-07-18 06:59:59.999999 COUNT: 7948
2019-07-18 07:00:00.000000 AND 2019-07-18 07:59:59.999999 COUNT: 7254
2019-07-18 08:00:00.000000 AND 2019-07-18 08:59:59.999999 COUNT: 7744
2019-07-18 09:00:00.000000 AND 2019-07-18 09:59:59.999999 COUNT: 7752
2019-07-18 10:00:00.000000 AND 2019-07-18 10:59:59.999999 COUNT: 7936
2019-07-18 11:00:00.000000 AND 2019-07-18 11:59:59.999999 COUNT: 7410
2019-07-18 12:00:00.000000 AND 2019-07-18 12:59:59.999999 COUNT: 7047
2019-07-18 13:00:00.000000 AND 2019-07-18 13:59:59.999999 COUNT: 7008
2019-07-18 14:00:00.000000 AND 2019-07-18 14:59:59.999999 COUNT: 7394
2019-07-18 15:00:00.000000 AND 2019-07-18 15:59:59.999999 COUNT: 7647
2019-07-18 16:00:00.000000 AND 2019-07-18 16:59:59.999999 COUNT: 7954
2019-07-18 17:00:00.000000 AND 2019-07-18 17:59:59.999999 COUNT: 7541
2019-07-18 18:00:00.000000 AND 2019-07-18 18:59:59.999999 COUNT: 7145
2019-07-18 19:00:00.000000 AND 2019-07-18 19:59:59.999999 COUNT: 7421
2019-07-18 20:00:00.000000 AND 2019-07-18 20:59:59.999999 COUNT: 7045
2019-07-18 21:00:00.000000 AND 2019-07-18 21:59:59.999999 COUNT: 7453
2019-07-18 22:00:00.000000 AND 2019-07-18 22:59:59.999999 COUNT: 7155
2019-07-18 23:00:00.000000 AND 2019-07-18 23:59:59.999999 COUNT: 7124"
| rex field=data max_match=0 "(?<data>[^\n]+)"
| mvexpand data
| rex field=data " AND (?<end>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{6})"
| eval _time=strptime(end, "%Y-%m-%d %H:%M:%S.%6N")
| rex field=data "COUNT: (?<COUNT>\d+)"
| table _time COUNT
| timechart span=1h fixedrange=false values(COUNT) as Transactions
| trendline sma24("Transactions") as Trend

If you go to the visualization tab with the above query and make it a line chart it you will see a dot for the first trend event on the 24th hour. If you have a month of data you will see that trendline continue from the initial 24th hour until the last event reported.

If this comment/answer was helpful, please up vote it. Thank you.

Sukisen1981
Champion

something like this -?
I used the default _audit index, so that you can use the code as it is
index="_audit"
| eval day=strftime(_time,"%m-%d")
| stats count by _time,day
| eventstats avg(count) as avg by day
I have created a new variable day to have a marker for the day and the eventstats calculates the average for the count of the stats command on a per day basis, which is of course a 24 hours scale

0 Karma

3666142
Path Finder

Not exactly. I see where you were going with it, but the average create became a constant value. Plus, I don't know to work it with the previous query. Thank you! I appreciate the help.

0 Karma

Sukisen1981
Champion

I am sorry , your requirement is not clear to me.
Based on the default index query, can you tell me your expected output?
If you plot the average for 24 hrs every hour it will be constant till 24 hrs, but it seems that is not what you expect. Can you please clarify?

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...