Splunk Search

Distinct Count Query

shonky
New Member

I've looked around for answers on this, but unfortunately I've not found an answer to date. I have a list of data, but some of this is duplicate and as such I need to remove a whole row rather than just that value itself, ie:

Store Value 2 Value 3

Store 1 Device_1 Black

Store 1 Device_2 Black

Store 1 Device_3 ----

Store 1 Device_1 Black

What I would like to do is chart this so that I have:

Store Count of Value 2 (Unique) Count of Value 3
Store 1 3 2

When I use

... | chart dc(value 2), count (value 3) by store

It omits only the duplicate value in Value 2, but my count in value 3 is too high, ie:

Store Count of Value 2 (Unique) Count of Value 3
Store 1 3 3


Hi - Sorry, I think the "---" is actually empty in my data. So that part should be okay.

If I extend the values for two more line:

Store Value 2 Value 3
Store 1 Device_1 Black

Store 1 Device_2 Black

Store 1 Device_3

Store 1 Device_1 Black

Store 1 Device_1 Black

Store 1 Device_4 Black

If I exported the data to Excel, I would initially filter all of the results so that I only had unique values in Value 2:

Store Value 2 Value 3
Store 1 Device_1 Black

Store 1 Device_2 Black

Store 1 Device_3

Store 1 Device_4 Black

I would then run a pivot table which would show me the following:

           Value 2      Value 3

Store 1 4 3

So essentially, I want to remove any duplicate "lines" based on Value 2.

I hope that makes more sense. Sorry though.

Tags (2)
0 Karma

emiller42
Motivator

So you're telling Splunk to give you a distinct count of Value 2, which is does. (There are 3 distinct values) and a count of all items in Value 3, which is does. (I'm assuming the '----' is actually NULL in your records, so again there are 3 values)

What I'm not sure about is what you want the count to be for Value 3. Do you want a count of all records (what your query asks for) or a distinct count? (What your expected result set shows) I'm not sure why you're not using dc for both counts.

Now, if the '----' is actually an empty field, it won't be included in the count. So perhaps you're expecting it to be included? (That would cause a dc(Value 3) to return 2 like your expected results) If so, you can fillnull to give all nulls some value which would then be counted.

... | fillnull value=NULL "value 3" | chart dc(value 2) dc(value 3) by store
0 Karma

emiller42
Motivator

If it can have multiple values, then you can simply

... | dedup store "value 2" "value 3" | chart count(value 2) count(value 3) by store

and that should get what you're looking for.

0 Karma

shonky
New Member

I was going to say no, that it can have more than 1 answer. But thinking about it, I think you may be right if I limit my time span to a short enough period. I will give it a go and come back and rate your answer. I think it should work. My own fault for not figuring this one out.

0 Karma

emiller42
Motivator

So I'm assuming each Value 2 can only have one possible Value 3? (Including null) If so, the dedup command would fix this for you.

... | dedup "value 2" | chart count(value 2) count(value 3)

Would get you what you're looking for in the example.

0 Karma

shonky
New Member

I updated my question. I think I need to do a filter on my data before I count it - ie remove the unique values based on value 2 and then do the count. Just not sure how to remove the duplicate values first.

0 Karma

shonky
New Member

Because the the Value 2 is a duplicate.

Essentially I only want to count the unique devices. Line 4 - Store 1, Device 1, Black - has already showed in my logs.

Think of it as being like a repeat visitor to a website. It might show his IP address and then other information about the visitor. I don't want to count the other information twice as the visitor has already been there.

0 Karma

Ayn
Legend

In your sample data, I see 3 lines with a value for Value 3, so I don't follow why you would like the count to be 2?

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...