Solved: How to get an accurate timechart count of matching...

det0n8r · ‎09-28-2015

I'm struggling with counting session table exports that dump active sessions every five minutes. Basically I keep running into a problem where the count overlaps with a previous export every few intervals.

The data looks something like this:

Time="3:00:00.000 PM" User="user 1"
Time="3:00:00.000 PM" User="user 2"
Time="3:00:00.000 PM" User="user 3"
Time="3:05:00.000 PM" User="user 1"
Time="3:05:00.000 PM" User="user 2"
Time="3:10:00.000 PM" User="user 1"

Here's a sample search:

... | timechart span=5m count(_raw) as ActiveSessions

This produces the chart in the attached screenshot, where the data overlaps at certain points and events from the previous polling interval are incorrectly counted.

I'm guessing that this is because the exports aren't running/completing exactly every five minutes, and so the span is intermittently counting two sets of exports.

For example, how do you get a proper count if an extract is off by a second in a given polling interval, like this?

Time="3:00:00.000 PM" User="user 1"
Time="3:00:00.000 PM" User="user 2"
Time="3:00:00.000 PM" User="user 3"
Time="3:05:01.000 PM" User="user 1"
Time="3:05:01.000 PM" User="user 2"
Time="3:10:00.000 PM" User="user 1"

Is there another way to get at this metric? I started looking into concurrency, but didn't have much luck.

masonmorales · ‎09-28-2015

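Normalize the times using "bin", dedup on the bucketed time and the user, then chart it. Field names are case-sensitive, so use "User" as it appears in your events.
For example:
| bin _time span=5m | dedup _time User | timechart span=5m count as ActiveSessions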
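
If it helps to see why this works, here is a rough Python sketch (not Splunk code) of what the bin, dedup, and count steps do to the events. The timestamps are hypothetical: I've assumed the second export ran one second early (3:04:59 PM) so that it lands in the same 5-minute bucket as the 3:00:00 PM export. The helper names (bin_time, count_per_bucket, at) are made up for illustration, and bucketing from midnight is a simplification of how Splunk aligns 5-minute spans.

```python
from collections import Counter
from datetime import datetime, timedelta

SPAN = 300  # bucket width in seconds, mirroring span=5m


def bin_time(ts, span=SPAN):
    """Floor a timestamp to the start of its bucket, similar in spirit to `bin _time span=5m`."""
    midnight = ts.replace(hour=0, minute=0, second=0, microsecond=0)
    elapsed = int((ts - midnight).total_seconds())
    return midnight + timedelta(seconds=elapsed - elapsed % span)


def count_per_bucket(events, dedup):
    """Count events per bucket; with dedup=True, keep one row per (bucket, user) pair."""
    rows = [(bin_time(ts), user) for ts, user in events]
    if dedup:
        rows = set(rows)  # stands in for `dedup _time User`
    return dict(sorted(Counter(bucket for bucket, _ in rows).items()))


def at(hour, minute, second):
    """Hypothetical helper: build a timestamp on the day of the original post."""
    return datetime(2015, 9, 28, hour, minute, second)


# Assumed sample data: the second export ran one second early.
events = [
    (at(15, 0, 0), "user 1"),
    (at(15, 0, 0), "user 2"),
    (at(15, 0, 0), "user 3"),
    (at(15, 4, 59), "user 1"),
    (at(15, 4, 59), "user 2"),
    (at(15, 10, 0), "user 1"),
]

raw_counts = count_per_bucket(events, dedup=False)
deduped_counts = count_per_bucket(events, dedup=True)

for bucket in raw_counts:
    print(bucket.strftime("%H:%M"), "raw =", raw_counts[bucket], "deduped =", deduped_counts[bucket])
```

With this made-up data the 3:00 PM bucket goes from a raw count of 5 down to 3 once duplicates are dropped. Note that the 3:05 PM bucket stays empty, since the early export landed in the previous bucket.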

I'm not sure how you're getting the data into Splunk (DB Connect 2?), but the other option is to fix it on the ingest side. 😉

Hope this helps!

det0n8r · ‎09-29-2015

Thank you sir! Using bin followed by a dedup definitely fixed the overlapping count!

To answer your question on the ingest method: these extracts are captured as standard output from a PowerShell script that is executed by the universal forwarder. I suspect that performance issues on the SDK calls to the data source are to blame for the slight deviations in the interval duration.

masonmorales · ‎09-29-2015

Would you mind posting that as a new question please?

det0n8r · ‎09-29-2015

I've posted my new question here: http://answers.splunk.com/answers/312862/how-to-normalize-event-counts-of-disparate-data-ex.html

How to get an accurate timechart count of matching events if an extract is off by a second in a given polling interval?
