Splunk Search

Dedup with multiple criteria

HeinzWaescher
Motivator

Hi,

I want to use the dedup command with more than one criterion.

First I used | dedup A and had 100 events afterwards.
Then I used | dedup A, B and had 70 events afterwards. As I understand it, the number of events should increase, because I've made the dedup criteria more specific, so fewer duplicates should be identified. Am I completely wrong?

Best

Heinz


landen99
Motivator

dedup keepempty=t A B
http://docs.splunk.com/Documentation/Splunk/6.2.2/SearchReference/Dedup

My understanding, per the documentation linked above, is that dedup on several fields treats events as duplicates only when all of the listed fields match, and that by default it drops events in which any of those fields is missing. keepempty=t keeps those events instead.
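
A minimal sketch for checking this with synthetic events. The field values and the helper field n are made up for illustration. Per the dedup documentation linked above, two events count as duplicates only when every listed field matches, and with the default keepempty=false any event missing one of the listed fields is dropped. Here events 1 and 2 share A but differ in B, and events 3 to 5 have no B at all (case() returns null when no condition matches).

```spl
| makeresults count=5
| streamstats count AS n
| eval A=case(n<=2, "x", n==3, "y", n==4, "z", n==5, "w")
| eval B=case(n==1, "b1", n==2, "b2")
| dedup A B
```

With these five events, replacing the last line with | dedup A should leave 4 events, | dedup A B should leave 2 (the three events without B are dropped), and | dedup keepempty=t A B should leave all 5. That is the same pattern as a count going down after adding a second field.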


HeinzWaescher
Motivator

A further question regarding the dedup command:

Let's say the fields A & B can appear multiple times in an event.
For example:

Event 1:
A=1
A=2
B=3
B=4
timestamp=X

Event 2:
A=1
A=2
B=3
B=4
timestamp=X

Event 3:
A=1
A=2
B=3
B=4
timestamp=Y

| dedup A,B,timestamp

Does this take all field values of A and B into account, leaving two events (event 1 and event 3)?

Thanks in advance

Heinz


HeinzWaescher
Motivator

thanks for confirming!


linu1988
Champion

Yes, it keeps one event for each distinct combination of the fields above.
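
One way to check this on your own instance is a quick test with synthetic multivalue events. This is only a sketch: the helper field n is made up, and mvappend is used here to build A and B as multivalue fields that mimic the three example events. No result is asserted here, so compare the output against the expectation of two remaining events.

```spl
| makeresults count=3
| streamstats count AS n
| eval A=mvappend("1", "2"), B=mvappend("3", "4")
| eval timestamp=if(n<=2, "X", "Y")
| dedup A B timestamp
```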

HeinzWaescher
Motivator

Ah, now numbers are changing in the correct direction 🙂

And when I want to ignore events where the dedup fields don't exist, I can just use

sourcetype=* AND
A=* AND
B=*
| dedup A,B

Thanks a lot!


Ayn
Legend

Then that's your problem there. You can do ... | fillnull B | ... if you want B to have a placeholder value (0 by default) in events that don't have it. That will make dedup work.
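
A sketch of that approach, assuming the placeholder string "missing" (a made-up value; pick anything that cannot occur as a real value of B). Without value=, fillnull writes 0.

```spl
sourcetype=*
| fillnull value="missing" B
| dedup A B
```

Note that after fillnull all events that lacked B share the same B value, so among themselves they are effectively deduplicated on A alone.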

HeinzWaescher
Motivator

Hey Ayn,

Yes, normally it should exist in all events. Is there a command to find out whether there are events without the field B, and to filter them out?

Edit:

Just tried it out with sourcetype=* AND NOT B=*
This returns a few events.
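
To see how many events are affected, a small sketch that counts events with and without B in one pass. The field name has_B is hypothetical and only used for this illustration.

```spl
sourcetype=*
| eval has_B=if(isnotnull(B), "yes", "no")
| stats count BY has_B
```

To keep only the events that do have B, | where isnotnull(B) in front of the dedup should have the same effect as adding B=* to the base search.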


Ayn
Legend

Does B exist in all your events? IIRC dedup will drop the events that don't have it otherwise.
