Solved: Re: Comparing Results from two separate searches?

avajax0 · ‎10-28-2021

Greetings,

I'm looking to craft a correlation that allows me to compare the results between two separate searches. Here's the use case:

I have two indexes: one contains Threat Intelligence data (including domain names, which is what matters for this case), and the other holds all DNS requests.
I'm looking to craft a Splunk correlation that reads each domain in the DNS requests, compares it to the Threat Intelligence data, and reports any matches.

For instance, maybe something along the lines of the logic below:

index=Threat_Intelligence
| table DomainName
| where DomainName IN [search index=DNS | table RequestedDomain]

FYI: The latest Threat Intelligence feeds are pulled every morning and updated in Splunk. I thought about using lookup tables or KV Store lookups, but we're pulling in several files each morning, two of which are close to 1 GB in size. It looks like Splunk Cloud caps these lookups at 10,000 events by default, and I've read that one should be cautious about increasing this limit.

somesoni2 · ‎10-29-2021

Try something like this:

index=Threat_Intelligence OR index=DNS
| eval DomainName=coalesce(DomainName,RequestedDomain)
| stats dc(index) as indexes by DomainName
| where indexes=2

(This will list the DomainNames that appear in both indexes.)
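
For readers who want to see the logic outside of Splunk, here is a minimal Python sketch of what the search above does: merge the two field names into one, collect the distinct indexes seen for each domain, and keep the domains seen in both. The sample events and the helper name `domains_in_both` are made up for illustration; only the index and field names come from this thread.

```python
from collections import defaultdict

def domains_in_both(events):
    """Rough emulation of:
    eval DomainName=coalesce(DomainName,RequestedDomain)
    | stats dc(index) as indexes by DomainName | where indexes=2
    """
    indexes_by_domain = defaultdict(set)
    for event in events:
        # coalesce: use DomainName if present, otherwise RequestedDomain
        domain = event.get("DomainName")
        if domain is None:
            domain = event.get("RequestedDomain")
        if domain is None:
            continue  # events without either field contribute nothing
        indexes_by_domain[domain].add(event["index"])
    # dc(index) == 2 means the domain was seen in both indexes
    return sorted(d for d, idx in indexes_by_domain.items() if len(idx) == 2)

# Hypothetical sample events, not real data
sample_events = [
    {"index": "Threat_Intelligence", "DomainName": "evil.example"},
    {"index": "Threat_Intelligence", "DomainName": "unused-bad.example"},
    {"index": "DNS", "RequestedDomain": "evil.example"},
    {"index": "DNS", "RequestedDomain": "evil.example"},
    {"index": "DNS", "RequestedDomain": "harmless.example"},
]

matches = domains_in_both(sample_events)
```

Note that a domain requested many times in DNS still counts as one index, so repeated requests do not create false hits; only presence in both indexes does.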

avajax0 · ‎11-01-2021

Thank you, this works like a charm. I see the logic, essentially just checking to see if a particular domain exists in both indexes (Threat_Intelligence and DNS), which would indicate a hit.

tread_splunk · ‎10-29-2021

Reading this again, I think you're looking for a subsearch. Something along these lines.

index=DNS
    [search index=Threat_Intelligence
    | table DomainName
    | rename DomainName as RequestedDomain
    | format]
| dedup RequestedDomain
| table RequestedDomain

(I've used RequestedDomain here to match the field name in your question; adjust it if your DNS events use a different name.)

So the subsearch returns a string like ( ( RequestedDomain="IntelDomainName1" ) OR ( RequestedDomain="IntelDomainName2" ) ... ). Run the subsearch on its own and you'll see what it builds. This string gets applied to your DNS data as a filter, leaving only the events in the DNS index whose domain names are referenced in the Threat_Intelligence index.
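
To make the subsearch idea concrete, here is a small Python sketch of the same two steps: build an OR filter from the threat intelligence domains, then keep only the DNS events that match it. The function names and sample data are hypothetical, the DNS-side field name is a parameter (shown as `RequestedDomain`, as in the original question), and the exact spacing of the string Splunk generates may differ from what this produces.

```python
def build_filter(intel_domains, field="RequestedDomain"):
    """Build an OR filter string similar in shape to what `format` returns."""
    clauses = ['( {}="{}" )'.format(field, domain) for domain in intel_domains]
    return "( " + " OR ".join(clauses) + " )"

def filter_dns(dns_events, intel_domains, field="RequestedDomain"):
    """Keep only DNS events whose domain is in the threat intelligence list."""
    wanted = set(intel_domains)
    return [event for event in dns_events if event.get(field) in wanted]

# Hypothetical sample data, not real events
intel = ["evil.example", "worse.example"]
dns = [
    {"RequestedDomain": "evil.example"},
    {"RequestedDomain": "harmless.example"},
]

filter_string = build_filter(intel)
hits = filter_dns(dns, intel)
```

The point of the sketch is the order of operations: the filter is computed first from the (ideally small) threat intelligence side and then applied while reading the DNS side.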

avajax0 · ‎11-01-2021

Thank you, this appears to work well also. It seems to be a little less efficient, but it still gets the job done.

tread_splunk · ‎11-02-2021

Thanks @avajax0. I agree my solution isn't as elegant as the one posted by @somesoni2, but I think mine might actually have performance advantages. I'm suggesting this here in case anyone would like to comment. The way I see it, in my example the subsearch returns a filter list which is applied to the DNS index. DNS might have thousands of host names, and hopefully the threat intelligence index will only have a handful. Therefore the subsearch in my example could reduce the number of events fetched from the DNS index by an order of magnitude, improving the overall performance of the query. "Filter as early as possible" is a best-practice rule I've heard repeated often. Interested to hear any other opinions. Good luck!

tread_splunk · ‎10-29-2021

Some example events from each index would be useful.
