Splunk Search

How to compare search results with lookup for duplicates?

solaced
Explorer

I have a lookup which I want to compare search results against and find duplicate values.   How do I ignore duplicates found that already exist in my dataset? And only identify duplicates using results from my search compared to the dataset?

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2

Labels (1)
0 Karma
1 Solution

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

View solution in original post

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

You can use isnull() function like below;

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND isnull(from_lookup)

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.

solaced
Explorer

Thank you.  I added eval from_lookup="0" to the base search which accomplishes the same as your answer. Though yours is more elegant 🙂 

I appreciate your help, thank you so much!

0 Karma

solaced
Explorer

Thanks @scelikok  I think it's on the right track.  Unfortunately the output does not contain any results now from my index search.  Which I would expect show an empty  from_lookup value.

It only contains events from the inputlookup table which have from_lookup = 1

0 Karma

scelikok
SplunkTrust
SplunkTrust

Hi @solaced,

Can you please try the below search? If I understood correctly I added an extra field to be able to filter values from lookup.

index=myindex sourcetype=k_logs (ns4:phoneNo OR emailInfo OR address) AND DummyOrgName AND "<requestType>UPDATEUSER</requestType>"
| xmlkv
| rename "ns4:phoneNo" as phoneNo
| search orgName = "DummyOrgName"
| eval phoneAndEmail= coalesce(phoneNo, address)
| fields phoneAndEmailphoneNo address ipAddress userName
| table phoneAndEmailphoneNo address ipAddress userName
| append [|inputlookup thisLookup.csv | table phoneAndEmail phoneNo address ipAddress userName | eval from_lookup="1"]
| stats values(phoneNo) AS phoneNo values(address) AS address values(ipAddress) AS ipAddress values(userName) AS userName values(from_lookup) as from_lookup dc(userName) AS UserCount by phoneAndEmail
| where UserCount>2 AND from_lookup="1"

 

If this reply helps you an upvote and "Accept as Solution" is appreciated.
0 Karma

solaced
Explorer

To be clear, I'm looking at the value phoneAndEmail and if it is used on multiple userName

I want to compare results from my index against my dataset, and not compare results WITHIN the dataset.

0 Karma
Get Updates on the Splunk Community!

More Ways To Control Your Costs With Archived Metrics | Register for Tech Talk

Tuesday, May 14, 2024  |  11AM PT / 2PM ET Register to Attend Join us for this Tech Talk and learn how to ...

.conf24 | Personalize your .conf experience with Learning Paths!

Personalize your .conf24 Experience Learning paths allow you to level up your skill sets and dive deeper ...

Threat Hunting Unlocked: How to Uplevel Your Threat Hunting With the PEAK Framework ...

WATCH NOWAs AI starts tackling low level alerts, it's more critical than ever to uplevel your threat hunting ...