I've looked around for answers on this, but unfortunately I've not found an answer to date. I have a list of data, but some of this is duplicate and as such I need to remove a whole row rather than just that value itself, ie:
Store Value 2 Value 3
Store 1 Device_1 Black
Store 1 Device_2 Black
Store 1 Device_3 ----
Store 1 Device_1 Black
What I would like to do is chart this so that I have:
Store Count of Value 2 (Unique) Count of Value 3
Store 1 3 2
When I use
... | chart dc(value 2), count (value 3) by store
It omits only the duplicate value in Value 2, but my count in value 3 is too high, ie:
Store Count of Value 2 (Unique) Count of Value 3
Store 1 3 3
Hi - Sorry, I think the "---" is actually empty in my data. So that part should be okay.
If I extend the values for two more line:
Store Value 2 Value 3
Store 1 Device_1 Black
Store 1 Device_2 Black
Store 1 Device_3
Store 1 Device_1 Black
Store 1 Device_1 Black
Store 1 Device_4 Black
If I exported the data to Excel, I would initially filter all of the results so that I only had unique values in Value 2:
Store Value 2 Value 3
Store 1 Device_1 Black
Store 1 Device_2 Black
Store 1 Device_3
Store 1 Device_4 Black
I would then run a pivot table which would show me the following:
Value 2 Value 3
Store 1 4 3
So essentially, I want to remove any duplicate "lines" based on Value 2.
I hope that makes more sense. Sorry though.
So you're telling Splunk to give you a distinct count of Value 2, which is does. (There are 3 distinct values) and a count of all items in Value 3, which is does. (I'm assuming the '----' is actually NULL in your records, so again there are 3 values)
What I'm not sure about is what you want the count to be for Value 3. Do you want a count of all records (what your query asks for) or a distinct count? (What your expected result set shows) I'm not sure why you're not using dc for both counts.
Now, if the '----' is actually an empty field, it won't be included in the count. So perhaps you're expecting it to be included? (That would cause a dc(Value 3) to return 2 like your expected results) If so, you can fillnull to give all nulls some value which would then be counted.
... | fillnull value=NULL "value 3" | chart dc(value 2) dc(value 3) by store
If it can have multiple values, then you can simply
... | dedup store "value 2" "value 3" | chart count(value 2) count(value 3) by store
and that should get what you're looking for.
I was going to say no, that it can have more than 1 answer. But thinking about it, I think you may be right if I limit my time span to a short enough period. I will give it a go and come back and rate your answer. I think it should work. My own fault for not figuring this one out.
So I'm assuming each Value 2 can only have one possible Value 3? (Including null) If so, the dedup command would fix this for you.
... | dedup "value 2" | chart count(value 2) count(value 3)
Would get you what you're looking for in the example.
I updated my question. I think I need to do a filter on my data before I count it - ie remove the unique values based on value 2 and then do the count. Just not sure how to remove the duplicate values first.
Because the the Value 2 is a duplicate.
Essentially I only want to count the unique devices. Line 4 - Store 1, Device 1, Black - has already showed in my logs.
Think of it as being like a repeat visitor to a website. It might show his IP address and then other information about the visitor. I don't want to count the other information twice as the visitor has already been there.
In your sample data, I see 3 lines with a value for Value 3, so I don't follow why you would like the count to be 2?