Splunk Search

Comparing two fields from different sources

cirrusfa
Explorer

Dear all,

I would like to compare two fields on a sequential way coming from different sourcetypes already indexed at splunk. For instance, the sourcetype 1 has the querys done by clients to the DNS. The sourcetype 2 contains a dynamic list of malicious domains. I would like to correlate both sourcetypes in order to know whether a client is trying to resolve a malicious domain. Example:
sourcetype 1 (DNS)
.www.facebook.com
.www.google.com
.www.linkedin.com
.www.malicious2.com

sourcetype 2 (malicious domains)
malicious1.com
malicious2.com

Expected result:

malicious2.com

Thanks a lot for the support!

Cheers,

0 Karma

cirrusfa
Explorer

Thanks @aholzer and @somesoni2 for your quick reply. Join indeed is the command that I have to use. However, I'm still having a problem with my query. When I'm doing the join, the matching between both searchs is not 100% accurated as there are some false positives included as a result of the query. For instance:

Sourcer type 1 (dns)
Domaingood1
Domaingood2
Domaingood3
Maliciousdomain1
Maliciousdomain2
Domaingood3

Sourcer type 2 (maldomains)
Maliciousdomain1
Maliciousdomain2
Maliciousdomain3

Result of the query
Maliciousdomain1 --> OK
Maliciousdomain2 --> OK
Domaingood3 --> False positive

In order to debug this error, I have split the left side (the search to obtain the domains resolved by the DNS) and right side (the malicious domains) of the join in a different search and saved the results in two different files (csv files). So, I have one csv with the domains resolved by the DNS and another csv with the malicious domains. After that, I did the join of both csv files in order to verify whether the result is the same:

| inputlookup dns.csv| join value [inputlookup feesII.csv |fields value]  

The output of this query is 100% correct and shows the results desired without falses positives. So, I don't understand why the first query is showing falses positives.

Does anyody have an idea?

Thank you in advance,

Cheers

0 Karma

cirrusfa
Explorer

The query looks like:

sourcetype="DNS" NOT (dns_record_type="PTR") | rex mode=sed "s/(\d+)/./g" | rex field=_raw "Rcv\s+[\d.]+\s+[\d,a-zA-Z]+\s+[A-Z]+\s[[\d,\s,A-z]+]\s+[A-Z]+\s+." | eval domaindns = substr(domaindns, 1, len(domaindns)-1)|stats count by domaindns | rename domaindns as value | fields - count | join value [search sourcetype="CTI" type="domain" | stats count by value| fields - count]

0 Karma

cirrusfa
Explorer

Thanks @aholzer, I have introduced the changes. When I'm doing the query only with My IP is working but when there is no a filter by IP the query is showing falses positives. If i perform the search from the last hour i have (8500 DNS records), (2 malicious domains) and the results are 7 domains (the 2 malicious so OK + 5 falses positives so good domains).

0 Karma

aholzer
Motivator

The reason you are getting the dupe is because of your "| stats count by dns_client,domaindns". Remove the "dns_client" from the by clause, and it should work.

0 Karma

cirrusfa
Explorer

I have not included the entire query (regular expersion) as i do not have enought space available.

0 Karma

cirrusfa
Explorer

sorry for not include my answer as a comment (first time using the forum). This is the query (big query) I'm running:

sourcetype="DNS" NOT (dns_record_type="PTR") dns_client="Myip"| rex mode=sed "s/(\d+)/./g" | rex field=_raw "Regular Expresion to extract the domainname as domaindns" | stats count by dns_client,domaindns | fields - count | join domaindns[search sourcetype="Feeds" type="domain" | stats count by value| rename value as domaindns|fields - count]

Thanks

0 Karma

aholzer
Motivator

You should give us the search you are trying to run.

Also you shouldn't post comments or updates to your question as answers, it just confuses things.

0 Karma

somesoni2
Revered Legend

Extending first answer from @aholzer, you can do following (little faster):

sourcetype="sourcetype1" | stats count by client,DNS | rename DNS as mals_domain | fields - count| join mals_domain [search sourcetype="sourcetype2" | stats count by mals_domain | fields - count]

Above should give you client name along with mals domain accessed by him/her. Will return 0 records if a client is not hitting any malicious domains.

aholzer
Motivator

This should do it:

sourcetype="DNS" | join field1 [search sourcetype="malicious domains" | rename field2 as field1]

Where "field1" is the field from DNS sourcetype that has the domain, and field2 is the field from "malicioius domains" that has the domain.

This will get you all the DNS events that have a matching domain from DNS in malicious domains.

You can probably do something like this too:

(sourcetype="DNS" OR sourcetype="malicious domains") | rename field2 as field1 | dedup field1, sourcetype | stats count by field1 | where count > 1 | table field1

This will give you all the values of "field1" that have more than 1 sourcetype attached to them.

Hope this helps.

Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...