Splunk Search

Dedup with multiple criteria

HeinzWaescher
Motivator

Hi,

I want to use the dedup command with more than one criterion.

First I used | dedup A and had 100 events afterwards.
Then I used | dedup A, B and had 70 events afterwards. In my understanding the number of events should increase, because I've made the dedup criteria more specific, so fewer duplicates should be identified?! Am I completely wrong?

Best

Heinz

0 Karma

landen99
Motivator

dedup keepempty=t A B
http://docs.splunk.com/Documentation/Splunk/6.2.2/SearchReference/Dedup

My understanding is that dedup on 3 fields finds all matches on any two of them as duplicates. I will cite my source for that in a moment or just provide the results of a test case in support of that assertion, but I remember learning it in a Splunk course and testing it myself for validation.
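
A test case along these lines could be sketched with synthetic events. The counter n and the values a1, a2 and b1 are made up for illustration, and B is deliberately left unset on every other event:

```spl
| makeresults count=6
| streamstats count AS n
| eval A=if(n<=3, "a1", "a2")
| eval B=if(n%2==0, "b1", null())
| dedup A B
| table n A B
```

Swapping the dedup line for | dedup A or | dedup keepempty=t A B and comparing the result counts shows how events without B are treated. According to the dedup documentation, keepempty defaults to false, so events in which any of the listed fields is missing are dropped.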

0 Karma

HeinzWaescher
Motivator

A further question regarding the dedup command:

Let's say the fields A & B can appear multiple times in an event.
For example:

Event 1:
A=1
A=2
B=3
B=4
timestamp=X

Event 2:
A=1
A=2
B=3
B=4
timestamp=X

Event 3:
A=1
A=2
B=3
B=4
timestamp=Y

| dedup A,B,timestamp

Does this take all field values of A & B into account, so that two events remain (event 1 and event 3)?

Thanks in advance

Heinz

0 Karma

HeinzWaescher
Motivator

thanks for confirming!

0 Karma

linu1988
Champion

Yes, it keeps one event for each distinct combination of the values above.
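
If you want to check this on your own instance, a rough sketch with synthetic multivalue events could look like the following. The counter n is made up for illustration, and split is only used here to build multivalue A and B fields that mirror the three example events:

```spl
| makeresults count=3
| streamstats count AS n
| eval A=split("1,2", ","), B=split("3,4", ","), timestamp=if(n<3, "X", "Y")
| dedup A B timestamp
| table n A B timestamp
```

The n column in the result shows which of the three events survived the dedup.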

HeinzWaescher
Motivator

Ah, now numbers are changing in the correct direction 🙂

And when I want to ignore events where the dedup criteria don't exist, I can just use

sourcetype=* AND
A=* AND
B=*
| dedup A,B

Thanks a lot!

0 Karma

Ayn
Legend

Then that's your problem there. You can do ... | fillnull B | ... if you want B with an empty value in events that don't have it. That will make dedup work.
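
As a minimal sketch, assuming a base search like the one used elsewhere in this thread: the placeholder string NONE is an arbitrary choice, and without a value argument fillnull fills in 0 instead.

```spl
sourcetype=* A=*
| fillnull value="NONE" B
| dedup A B
```

Events that lack B then all share the same placeholder value, so they are deduplicated on A alone instead of being dropped.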

HeinzWaescher
Motivator

Hey Ayn,

Yes, normally it should exist in all events. Is there a command to find out whether there are events without the field B, and to filter them out?

Edit:

Just tried it out with sourcetype=* AND NOT B=*
This results in a few events.
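
Another way to get the same information in one pass might be to flag each event and count both groups. The field name has_B is made up for this sketch:

```spl
sourcetype=* A=*
| eval has_B=if(isnull(B), "no", "yes")
| stats count BY has_B
```

To filter the events without B out instead of counting them, | where isnotnull(B) in place of the eval and stats lines should do it.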

0 Karma

Ayn
Legend

Does B exist in all your events? IIRC dedup will fail otherwise.
