Dedup with multiple criteria

HeinzWaescher
Motivator

Hi,

I want to use the dedup command with more than one criterion.

First I used | dedup A and had 100 events afterwards.
Then I used | dedup A, B and had 70 events afterwards. As I understand it, the number of events should increase, because I've made the dedup criteria more specific, so fewer duplicates should be identified. Am I completely wrong?

Best

Heinz

landen99
Motivator

dedup keepempty=t A B
http://docs.splunk.com/Documentation/Splunk/6.2.2/SearchReference/Dedup

My understanding is that dedup on 3 fields finds all matches on any two of them as duplicates. I will cite my source for that in a moment or just provide the results of a test case in support of that assertion, but I remember learning it in a Splunk course and testing it myself for validation.

HeinzWaescher
Motivator

A further question regarding the dedup command:

Let's say the fields A & B can appear multiple times in an event.
For example:

Event 1:
A=1
A=2
B=3
B=4
timestamp=X

Event 2:
A=1
A=2
B=3
B=4
timestamp=X

Event 3:
A=1
A=2
B=3
B=4
timestamp=Y

| dedup A,B,timestamp

Does this take all field values of A & B into account, so that two events remain (event 1 and event 3)?

Thanks in advance

Heinz

HeinzWaescher
Motivator

thanks for confirming!

linu1988
Champion

Yes, it keeps one event for each distinct combination of the fields above.

HeinzWaescher
Motivator

Ah, now numbers are changing in the correct direction 🙂

And when I want to ignore events where the dedup fields don't exist, I can just use

sourcetype=* AND
A=* AND
B=*

| dedup A,B

Thanks a lot!

Ayn
Legend

Then that's your problem there. You can do ... | fillnull B | ... if you want B filled with a placeholder value (0 by default) in events that don't have it. That will make dedup work.
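To make the effect concrete, here is a minimal Python sketch that models the documented dedup behaviour: by default (keepempty=false), events missing any of the dedup fields are dropped, which is why adding a field can lower the count. The function names `dedup` and `fillnull` and the sample events are hypothetical stand-ins for illustration, not Splunk's implementation.

```python
# Toy model of the documented dedup behaviour; not Splunk's implementation.
def dedup(events, fields, keepempty=False):
    """Keep the first event for each distinct combination of `fields`."""
    seen = set()
    kept = []
    for event in events:
        if any(f not in event for f in fields):
            # A dedup field is missing: dropped by default, kept with keepempty=t.
            if keepempty:
                kept.append(event)
            continue
        key = tuple(event[f] for f in fields)
        if key not in seen:
            seen.add(key)
            kept.append(event)
    return kept


def fillnull(events, field, value="0"):
    """Give `field` a placeholder value in events that lack it."""
    return [{field: value, **event} for event in events]


# Hypothetical sample data: the last two events have no field B.
events = [
    {"A": "x", "B": "1"},
    {"A": "x", "B": "2"},
    {"A": "y", "B": "1"},
    {"A": "z"},
    {"A": "w"},
]

print(len(dedup(events, ["A"])))                       # models: dedup A
print(len(dedup(events, ["A", "B"])))                  # models: dedup A B
print(len(dedup(events, ["A", "B"], keepempty=True)))  # models: dedup keepempty=t A B
print(len(dedup(fillnull(events, "B"), ["A", "B"])))   # models: fillnull B | dedup A B
```

In this sample, dedup on A alone keeps more events than dedup on A and B, because the two events without B fall out of the second result. Filling B first, or setting keepempty=t, keeps them.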

HeinzWaescher
Motivator

Hey Ayn,

Yes, normally it should exist in all events. Is there a command to find out whether there are events without the field B and to filter them out?

Edit:

Just tried it out with sourcetype=* AND NOT B=*
This returns a few events.

Ayn
Legend

Does B exist in all your events? IIRC dedup will drop events that don't have it.
