Splunk Search

Eventstats not attaching a stat to every event

brajaram
Communicator

My data is structured with a series of events for any given user, that need to be summed up to get the complete response time for a user. Then I need to find summary statistics of these response times across all users. The query I have to get that to work is:

index=...
| eval responseTime=coalesce('r1', 'r2')
| eventstats sum(responseTime) as SumID by uid
| table SumID _time uid
| dedup uid
| timechart ...

Response time can be in one of two fields depending on event (r1 or r2). I create a new response time field (responseTime) and then attach the sum of the times per member to each event with eventstats, then timechart it all. If I do this search across 1 day, there are no issues. If I do it across a week, only the most recent days provide values. However, if I put the last day as several days prior, then it fills the first 2 days or so and then returns no values for the prior days. Investigating, it seems as if the field SumID is not populating for many events in eventstats. However, the field responseTime and uid do exist in those events that SumID does not exist. How is that possible for eventstats to not propagate to all events?

Tags (2)
0 Karma

DalJeanis
Legend

Just from the title, that happens when the event does not contain one of the by fields.

However, in this case, it is probably the dedup command. This eliminates all but the most recent event for each uid.

I suspect, from reading your code, that you need to rewrite the whole thing from scratch. SInce the purpose of the eventstats command is to keep the events and add summary data, and you are trying immediately to throw away most of that, you should probably switch to stats.

Consider this - you are trying to use timechart, so you are expecting to compare things across time periods. Let's say "across days".

That means, you need at the very least to bin the events by day before you sum() the response time.

Try something like

index=... 
| eval responseTime=coalesce('r1', 'r2') 
| bin _time span=1d
| stats sum(responseTime) as SumID by uid _time 
| timechart span=1d avg(SumID) as AvgID 
0 Karma

brajaram
Communicator

It isn't dedup causing it. I used the query

index=....
| eval responseTime=coalesce('r1', 'r2')
| eventstats sum(responseTime) as SumID by uid | search NOT SumID=*

to find out where SumID didn't exist. In those events, I saw no SumID and it had valid values for uid and responseTime.

Your approach is probably better in the long run for efficiency, but I'm still baffled by the fact that I'm losing the eventstats value somehow.

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...