Splunk Search

Different distinct count for stats and timechart (same time interval)

hemendralodhi
Contributor

Hello,

For the same base query, I am getting different distinct count results from timechart and stats over the same time range (an older time range, so that no new events are coming in).

stats - query - mysearch | stats dc(field)

I ran the query over the 2 hours between 16:00 and 18:00 and got a result of 507.

Result (running it separately for each 1 hr interval)
16:00 - 17:00 - 293
17:00 - 18:00 - 223
** The sum (516) differs by 9 from the result of running it over the complete 2 hr.

timechart - query - mysearch | timechart span=1h dc(field)

Result (for the whole 2 hours)
16:00 - 293
17:00 - 223

Result (running it separately for each 1 hr interval)
16:00 - 293
17:00 - 223

It seems stats gives the correct result when searched over separate 1 hr intervals, but not when run over the full 2 hours.

I am at a loss as to what is happening.

Please advise.

Thanks

0 Karma

knielsen
Contributor

I don't see a problem with the numbers you posted. Whenever a value of the field appears in both hours, it contributes to the distinct count for each hour, but it is counted only once when you look at the full two hours.

hemendralodhi
Contributor

Thanks for the response. If you compare the total count with the individual counts, they are different for stats. Running stats for 2 hrs gives count = 507. Running it for each hour separately gives count = 293 + 223 = 516.

0 Karma

knielsen
Contributor

Yes, and that is totally fine. Consider this example with 3 events in 2 hours:

16:00 field=123
16:01 field=234
17:01 field=123

If you do a dc(field) for the first hour, you get 2, because there are 2 different values for field. If you do a dc(field) for the second hour alone, you get 1, because there is only one value. That doesn't mean you get 3 if you do a dc(field) over the whole time range. The result is 2, because there are still only two distinct values for field. So the sum of dc() for the first hour and dc() for the second hour, which is 3, differs from the dc() over the whole time range. That is perfectly fine.
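As a rough illustration outside of Splunk, the same three events can be modelled in Python. This is only a sketch: the `dc` helper is a hypothetical stand-in for Splunk's dc() built on a Python set, and the hours are split by string prefix purely for this example.

```python
# The three example events as (time, field value) pairs.
events = [("16:00", "123"), ("16:01", "234"), ("17:01", "123")]


def dc(rows):
    """Distinct count of field values (hypothetical stand-in for dc(field))."""
    return len({value for _, value in rows})


# Split the events by hour, like running the search once per hour.
first_hour = [e for e in events if e[0].startswith("16")]
second_hour = [e for e in events if e[0].startswith("17")]

per_hour_sum = dc(first_hour) + dc(second_hour)  # 2 + 1
whole_range = dc(events)                         # value 123 is counted once
overlap = per_hour_sum - whole_range             # values seen in both hours

print(per_hour_sum, whole_range, overlap)
```

The gap between the per-hour sum and the whole-range count is the number of values that appear in both hours.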

Your numbers tell us that 9 field values occurred in both the first and the second hour; the rest occurred in only one of the two hours.
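Put as set arithmetic, with the symbols A and B introduced here only for illustration (A is the set of distinct field values seen in 16:00-17:00, B the set seen in 17:00-18:00), the posted numbers fit the inclusion-exclusion identity:

```latex
\begin{aligned}
|A \cup B| &= |A| + |B| - |A \cap B| \\
507 &= 293 + 223 - |A \cap B| \\
|A \cap B| &= 516 - 507 = 9
\end{aligned}
```

Here |A ∪ B| is what stats dc(field) returns over the full two hours, and |A| and |B| are the per-hour values that timechart span=1h reports.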

0 Karma

gcusello
SplunkTrust

Hi hemendralodhi,

Could you share the time ranges you used in the stats and timechart searches?

Do you get the same result if you run your stats search now? Events indexed later may have changed the count.

Bye.
Giuseppe

0 Karma

hemendralodhi
Contributor

Thanks for your response. I ran the search with a time range from a few hours back.

Time Range Used : 16:00:00.000 - 18:00:00.000

0 Karma