Splunk Search

Average on a value over time

matthewparry
Path Finder

Hi,

I have this data..

Jul 31 23:17:54 83.231.181.65 Jul 31 23:17:54.861457 host1 INFO switch=switch0 [DATA] switchport=32 temp=37.8
Jul 31 23:17:54 83.231.181.65 Jul 31 23:17:54.861851 host1 INFO switch=switch1 [DATA] switchport=33 temp=41.2
Jul 31 23:17:54 83.231.181.65 Jul 31 23:17:54.862236 host1 INFO switch=switch3 [DATA] switchport=11 temp=36

I am trying to get the avg of the temp over time but failing

I have tried using transaction, but then it does not work over time, as the data becomes one event under a new time.

I have tried using timechart, but the span is incorrect, and setting span=50us drives the indexer to 100%.

| timechart avg(pdu_temp) span=50us

Any ideas?

1 Solution

matthewparry
Path Finder

To answer my own question...

| bucket _time span=5m | stats avg(temp) as Temperature by _time

This groups the events into 5 min "buckets" and gets the average of the field, so it seems to do the trick.

Is there a different/better way?


Ayn
Legend

Uhm, I don't see the reason for not using timechart?

 | timechart span=5m avg(temp) as Temperature

inmzombie
New Member

Because avg in timechart only takes the last result; it doesn't work over all the results.


lguinn2
Legend

Sorry, inmzombie, but you are wrong about that.
The timechart command calculates the average temperature for each time range (in this case, time ranges are set to a 5-minute span).
This is exactly what the | bucket _time span=5m | stats avg(temp) as Temperature by _time does.
There is no difference.

In both cases, you will get a series of averages with a timestamp.
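(Strictly speaking, there is one cosmetic difference: timechart renders a continuous series of time bins, while bucket + stats only emits buckets that actually contain events. If you want the stats version to fill the gaps too, makecontinuous can do it — a sketch only, assuming the temp field from this thread:

| bucket _time span=5m | stats avg(temp) as Temperature by _time | makecontinuous _time span=5m

The averages themselves are identical either way.)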


matthewparry
Path Finder

Would you be able to give examples of seeing an average each hour?


lguinn2
Legend

You want the average temperature over what time span? Would you like to see an average for each hour over the last 7 days? (ie, what is the average temp at 9 am, 10am, etc.) Would you like to see the average by day over the last 7 days?

There are tons of ways to do this... although doing it by millisecond is probably not practical over any time span longer than 5 minutes...
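For example — assuming the temp field from the original question — an hourly average over the last 7 days:

| timechart span=1h avg(temp) as Temperature

And the average for each hour of the day (9 am, 10 am, etc.) across the whole search range:

| eval hour=strftime(_time,"%H") | stats avg(temp) as Temperature by hour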


Taruchit
Contributor

Hello Sir,

I am trying to calculate average results every hour for a week. The average is to be calculated on a field extracted with rex. I tried using the following, but it did not work:

Method 1: |stats avg(error) as errors by _time

Method 2: |stats avg(error) as errors by date_hour

Method 3: |bucket _time span=1h |stats avg(error) as errors by date_hour

Please help.

 


gcusello
SplunkTrust
SplunkTrust

Hi @Taruchit,

did you try the timechart command?

| timechart span=1h avg(error) as errors

For more info, see https://docs.splunk.com/Documentation/Splunk/8.1.2/SearchReference/Timechart

Ciao.

Giuseppe


Taruchit
Contributor

Hello Sir,

Thank you for your response. I tried using |timechart span=1h avg(error) as errors

I got a table with the headings _time and errors.

The _time column has timestamps with a gap of an hour between them; however, there is no data under the errors column.


gcusello
SplunkTrust
SplunkTrust

Hi @Taruchit,

could you share your full search?

Ciao.

Giuseppe


Taruchit
Contributor

Due to security reasons I won't be able to post the full query, but I will mask the content wherever needed.

index="xxx" host=ap000*li source=xxx/xx.log AND GET OR POST AND "[xxx.xxx.xxx.xxx.*.*.*] (ajp-/*.*.*.*: 8089-*)" AND "com.netflix.hystrix.exception.HystrixRuntimeException" | rex ".*\w{8}-\w{4}-\w{4}-\w{4}-\w{12}\s*:\s*com.netflix.hystrix.exception.HystrixRuntimeException:\s*(?<error>.*?)\s*(\.|$)" | timechart span=1h avg(error) as errors

 

I also want to add some more points, when I tried running following query, I get count of errors in past 24 hours on hourly basis and count of errors during the same time window in previous week.

index="xxx" host=ap000*li source=xxx/xx.log AND GET OR POST AND "[xxx.xxx.xxx.xxx.*.*.*] (ajp-/*.*.*.*: 8089-*)" AND "com.netflix.hystrix.exception.HystrixRuntimeException" | rex ".*\w{8}-\w{4}-\w{4}-\w{4}-\w{12}\s*:\s*com.netflix.hystrix.exception.HystrixRuntimeException:\s(?<error>.*?)\s*(\.|$)"| chart count as error by date_hour |JOIN date_hour [search index="xxx" host=ap000*li source=xxx/xx.log AND GET OR POST AND "[xxx.xxx.xxx.xxx.*.*.*] (ajp-/*.*.*.*: 8089-*)" AND "com.netflix.hystrix.exception.HystrixRuntimeException" (earliest=-7d@w1 latest=@w6) | rex ".*\w{8}-\w{4}-\w{4}-\w{4}-\w{12}\s*:\s*com.netflix.hystrix.exception.HystrixRuntimeException:\s*(?<error>.*?)\s*(\.|$)" | chart count as errors by date_hour] | table date_hour error errors |sort date_hour |rename error AS "In past 24 hours" |rename errors AS "In past 1 week"

If I run the query for past 24 hours today, it gives three columns:-

Column 1:-

date_hour: time window like 7,8, 9, 10...

Column 2:-

In past 24 hours: It gives count of errors on each row during time interval of 1 hour in past 24 hours.

Column 3:-

In past 1 week: It gives count of errors on each row during time interval of 1 hour in last week(15 February 2021 to 19 February 2021).


gcusello
SplunkTrust
SplunkTrust

Hi @Taruchit,

about the first search, I cannot tell whether the regex is correct; could you share an example of your (also masked!) logs?

Otherwise, you can test the regex yourself on regex101.com.

In addition, please insert your code in the Code Sample box (the "<>" button), because I cannot tell whether the regex is readable.

At least, try escaping the dots in the regex: "com\.netflix\.hystrix\.exception\.HystrixRuntimeException".
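Applied to the full rex from this thread, the escaped version would look like this (untested; unescaped dots do still match a literal dot, but they can also match unintended characters):

| rex ".*\w{8}-\w{4}-\w{4}-\w{4}-\w{12}\s*:\s*com\.netflix\.hystrix\.exception\.HystrixRuntimeException:\s*(?<error>.*?)\s*(\.|$)"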

About the second search, there are two problems:

  • there is a limit of 50,000 results in subsearches, so you might not get all the results;
  • the join command is very slow, and it can be replaced with a single search.

Anyway, try something like this (I cannot test it, so I don't know if it runs, but see the approach):

index="xxx" host=ap000*li source=xxx/xx.log (GET OR POST) "[xxx.xxx.xxx.xxx.*.*.*] (ajp-/*.*.*.*: 8089-*)" "com.netflix.hystrix.exception.HystrixRuntimeException" ((earliest=-24h@h latest=@h) OR (earliest=-7d@w1 latest=@w6))
| rex ".*\w{8}-\w{4}-\w{4}-\w{4}-\w{12}\s*:\s*com\.netflix\.hystrix\.exception\.HystrixRuntimeException:\s*(?<error>.*?)\s*(\.|$)"
| eval week=if(_time>=now()-86400,"In past 24 hours","In past 1 week")
| chart count as errors OVER week by date_hour

Ciao.

Giuseppe


Taruchit
Contributor

Thank you for your response and sharing the information and approach.

Regarding point 1, I have posted the masked log and its rex below. I tested it on regex101.com: https://regex101.com/r/p0MIAB/1

2021-02-22 00:58:30,255 ERROR [com.xxx.yyy.interfaces.exception.web.ServerErrorController] (ajp-/100.10.100.100:8000-10) com.netflix.hystrix.exception.HystrixRuntimeException: Error message.

rex: -

.*\w{8}-\w{4}-\w{4}-\w{4}-\w{12}\s*:\s*com.netflix.hystrix.exception.HystrixRuntimeException:\s*(?<error>.*?)\s*(\.|$)

 

Regarding point 2, I executed the query for past 24 hours and observed it gave hourly count of errors from 22 February 2021 to 01 March 2021. I will try to work on this approach and build the query to get the results of last 24 hours along with it.


gcusello
SplunkTrust
SplunkTrust

Hi @Taruchit,

let me know if you solve your problem or if you need more help.

Ciao and happy splunking.

Giuseppe
