Splunk Search

Why does dedup count and dc return a different number of values?

tmaltizo
Path Finder

Running separate searches with dc returns per-status numbers that don't match those from a dedup followed by count; only the total matches. The results below are for the "All time" time range, but the discrepancy persists regardless of the time range.

=====================================================

Using dc

index="forescout" sourcetype="fs_av_compliance" description="Server*" status="compliant" | stats dc(src_ip)

2804

index="forescout" sourcetype="fs_av_compliance" description="Server*" status="non-compliant" | stats dc(src_ip)

614

index="forescout" sourcetype="fs_av_compliance" description="Server*"| stats dc(src_ip)

2922

=====================================================

Using count

index="forescout" sourcetype="fs_av_compliance" description="Server*" | dedup src_ip | stats count by status | addcoltotals

compliant = 2767
non-compliant = 155
addcoltotals = 2922

Any insight is much appreciated!
Trista


sundareshr
Legend

Here's an example

_time=1 index=forescout ip=x.x.x.x status=compliant
_time=2 index=forescout ip=x.x.x.x status=compliant
_time=3 index=forescout ip=x.x.x.x status=non-compliant

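With the above sample data, dc(ip) will return 1 for compliant and 1 for non-compliant, whereas dedup ip | stats count by status will return only 1 for compliant, because dedup keeps just the first event it encounters for each ip (here, the first one listed) and discards the rest.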

For a more appropriate comparison, try 'dedup ip status | stats count by status | addtotals'
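
To try this without touching real data, something like the following should work as a rough sketch. It fabricates the three sample events with makeresults and streamstats; the helper field n and the output name distinct_ips are made up for illustration and are not part of the original searches.

```spl
| makeresults count=3
| streamstats count AS n
| eval ip="x.x.x.x", status=if(n<3, "compliant", "non-compliant")
| stats dc(ip) AS distinct_ips BY status
```

This should return one row per status, each with distinct_ips=1. Swapping the last line for a dedup followed by a count shows the other behaviour:

```spl
| makeresults count=3
| streamstats count AS n
| eval ip="x.x.x.x", status=if(n<3, "compliant", "non-compliant")
| dedup ip
| stats count BY status
```

Here only the first generated event survives the dedup, so the result should be a single row for compliant. Using dedup ip status instead keeps one event per ip/status pair and should bring the two approaches back in line.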


somesoni2
SplunkTrust

Suppose your data set is this:

src_ip  status
--------------------
src1    Compliance
src1    Compliance
src2    Non-compliance
src1    Non-compliance
src2    Compliance
src3    Compliance
src4    Non-compliance

Output of query 1 (distinct count of src_ip where status=Compliance) is 3 (src1, src2 and src3).
Output of query 2 (distinct count of src_ip where status=Non-compliance) is 3 (src2, src1 and src4).
Output of query 3 (distinct count of src_ip regardless of status) is 4 (src1, src2, src3 and src4).

This is what query 4 looks like up to and including dedup src_ip (which keeps only the first event for each src_ip):

src_ip  status
-----------
src1    Compliance
src2    Non-compliance
src3    Compliance
src4    Non-compliance

So the count of src_ip with status=Compliance is now 2,
the count of src_ip with status=Non-compliance is now 2,
and the total count is still 4, as there are still 4 distinct src_ip values.
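
The walkthrough above can be reproduced with a self-contained search along these lines. This is only a sketch: the seven rows are generated with makeresults in the same order as the table, and the helper field n and the Total label are illustrative choices, not something from the original searches.

```spl
| makeresults count=7
| streamstats count AS n
| eval src_ip=case(n=1,"src1", n=2,"src1", n=3,"src2", n=4,"src1", n=5,"src2", n=6,"src3", n=7,"src4")
| eval status=if(n=3 OR n=4 OR n=7, "Non-compliance", "Compliance")
| dedup src_ip
| stats count BY status
| addcoltotals labelfield=status label=Total
```

This should give Compliance=2, Non-compliance=2 and a Total row of 4. Replacing the dedup and stats count lines with stats dc(src_ip) AS count BY status should give 3 and 3 instead, which is the per-status result of queries 1 and 2.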

Hope this helps.

tmaltizo
Path Finder

This definitely helps @somesoni2! Thank you!


tmaltizo
Path Finder

Thanks for the clarification @sundareshr!


tmaltizo
Path Finder

@sundareshr,

If dc counts each unique ip/status and dedup counts only the first instance, then why are the totals the same?

... | dedup src_ip | stats count(src_ip) = 2928
... | stats dc(src_ip) = 2928

When I run the following....
... | dedup src_ip status | stats count by status | addtotals

compliant = 2809, total=2809
non-compliant = 616, total=616
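
One way to look at this, offered as a sketch rather than a definitive answer: dedup src_ip leaves exactly one event per distinct src_ip, so counting those events gives the same number as dc(src_ip) no matter which status each retained event has. The per-status numbers add up to more than the total because any src_ip that has reported both statuses is counted once under each. A search like the one below should list how many such addresses there are; the field names status_count, statuses and ips_with_multiple_statuses are made up for illustration.

```spl
index="forescout" sourcetype="fs_av_compliance" description="Server*"
| stats dc(status) AS status_count values(status) AS statuses BY src_ip
| where status_count > 1
| stats count AS ips_with_multiple_statuses
```

If compliant and non-compliant are the only two status values in this data, the result would be expected to equal 2809 + 616 - 2928 = 497. Note also that addtotals adds a total to each row, which is why each row above shows its own count as the total; addcoltotals adds a summary row instead.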
