Splunk Search

dedup only when values match in two fields

CharterBT
Explorer

Here's an interesting problem. I need to write a query where Splunk removes an event only when its values in two specific fields both match those of an event it has already found. For example, a mocked-up sample of my results shows this:

0.0.0.0 test
0.0.0.0 pass
0.0.0.0 pass

I'd like Splunk to only remove the second instance of "0.0.0.0 pass" while keeping the first instance as well as the "0.0.0.0 test" in my results.

Is there an easy way to do this? If it helps, the field name for the numbers is src and for the words is cs5. Any help is appreciated.


alacercogitatus
SplunkTrust

Dedup should be able to do this. If you post a little more about your end game, there may be a more optimized approach. For example, do you want counts of how many times this happens?

your_search | dedup src cs5

http://docs.splunk.com/Documentation/Splunk/6.0/SearchReference/Dedup


lukejadamec
Super Champion

To find the version, from Splunkweb in the upper right, click About.


CharterBT
Explorer

I'm not sure of the version, but it's not 6 yet. Thanks for the tips. I'll try them and let you know how it goes.


alacercogitatus
SplunkTrust

That's where stats count by cs5 src works a little faster: stats is done at the indexer, while dedup is done at the search head. According to the docs, dedup src cs5 should do the same thing. What version are you using?
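To make the comparison concrete, here is a rough sketch of the two forms side by side. It assumes the fields really are extracted as src and cs5, as stated in the question; your_search is only a placeholder for the base search, and the trailing table/fields commands are there just to trim the output to the two columns of interest.

```spl
your_search | dedup src cs5 | table src cs5
your_search | stats count by src cs5 | fields src cs5
```

On the mocked-up sample from the question, both should come back with two rows, one for 0.0.0.0 / test and one for 0.0.0.0 / pass, although the row order may differ between the two. If the dedup form behaves differently on your data, that would be worth checking against the docs for your version.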

CharterBT
Explorer

One other thing. I tried "dedup src, cs5", but it didn't retain any new "src" records after it found its first duplicate src value. I need the dedup to be a little smarter and only remove duplicate entries of the src/cs5 combination.


alacercogitatus
SplunkTrust

Then a better search is: your_search | stats dc(cs5) as DistinctInfections by src. This gives you each individual source and how many different infections they have over the time range. If you want how many of each infection per src, do your_search | stats count by cs5 src.
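Spelled out as separate searches, the two stats forms look like the first two lines below. The third line is an optional variant, not something from the original answer: it uses the standard values() stats function to also list the malware names per computer, and the Infections field name is made up for illustration.

```spl
your_search | stats dc(cs5) as DistinctInfections by src
your_search | stats count by cs5 src
your_search | stats values(cs5) as Infections dc(cs5) as DistinctInfections by src
```

Against the three sample events in the question, the first search should give a single row for 0.0.0.0 with DistinctInfections of 2, and the second should give a count of 2 for pass and 1 for test. The counts can simply be ignored if all that matters is that the combination showed up at some point.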

CharterBT
Explorer

No, I don't need to know how many times it repeats.

Each # value is a computer, and each word value is a type of malware. Some computers have multiple infections, so I just need to remove the instances where that computer/malware combination has already been identified. My search is covering a month-long timeframe, so I don't need to count every time it shows up, just that it did at some point.

Does that help?
