Comparing two fields from different sources

cirrusfa · ‎11-25-2013

Dear all,

I would like to compare two fields on a sequential way coming from different sourcetypes already indexed at splunk. For instance, the sourcetype 1 has the querys done by clients to the DNS. The sourcetype 2 contains a dynamic list of malicious domains. I would like to correlate both sourcetypes in order to know whether a client is trying to resolve a malicious domain. Example:
sourcetype 1 (DNS)
.www.facebook.com
.www.google.com
.www.linkedin.com
.www.malicious2.com

sourcetype 2 (malicious domains)
malicious1.com
malicious2.com

Expected result:

malicious2.com

Thanks a lot for the support!

Cheers,

cirrusfa · ‎11-26-2013

Thanks @aholzer and @somesoni2 for your quick reply. Join indeed is the command that I have to use. However, I'm still having a problem with my query. When I'm doing the join, the matching between both searchs is not 100% accurated as there are some false positives included as a result of the query. For instance:

Sourcer type 1 (dns)
Domaingood1
Domaingood2
Domaingood3
Maliciousdomain1
Maliciousdomain2
Domaingood3

Sourcer type 2 (maldomains)
Maliciousdomain1
Maliciousdomain2
Maliciousdomain3

Result of the query
Maliciousdomain1 --> OK
Maliciousdomain2 --> OK
Domaingood3 --> False positive

In order to debug this error, I have split the left side (the search to obtain the domains resolved by the DNS) and right side (the malicious domains) of the join in a different search and saved the results in two different files (csv files). So, I have one csv with the domains resolved by the DNS and another csv with the malicious domains. After that, I did the join of both csv files in order to verify whether the result is the same:

| inputlookup dns.csv| join value [inputlookup feesII.csv |fields value]

The output of this query is 100% correct and shows the results desired without falses positives. So, I don't understand why the first query is showing falses positives.

Does anyody have an idea?

Thank you in advance,

Cheers

cirrusfa · ‎11-26-2013

The query looks like:

cirrusfa · ‎11-26-2013

Thanks @aholzer, I have introduced the changes. When I'm doing the query only with My IP is working but when there is no a filter by IP the query is showing falses positives. If i perform the search from the last hour i have (8500 DNS records), (2 malicious domains) and the results are 7 domains (the 2 malicious so OK + 5 falses positives so good domains).

aholzer · ‎11-26-2013

The reason you are getting the dupe is because of your "| stats count by dns_client,domaindns". Remove the "dns_client" from the by clause, and it should work.

cirrusfa · ‎11-26-2013

I have not included the entire query (regular expersion) as i do not have enought space available.

cirrusfa · ‎11-26-2013

sorry for not include my answer as a comment (first time using the forum). This is the query (big query) I'm running:

Thanks

aholzer · ‎11-26-2013

You should give us the search you are trying to run.

Also you shouldn't post comments or updates to your question as answers, it just confuses things.

somesoni2 · ‎11-25-2013

Extending first answer from @aholzer, you can do following (little faster):

sourcetype="sourcetype1" | stats count by client,DNS | rename DNS as mals_domain | fields - count| join mals_domain [search sourcetype="sourcetype2" | stats count by mals_domain | fields - count]

Above should give you client name along with mals domain accessed by him/her. Will return 0 records if a client is not hitting any malicious domains.

aholzer · ‎11-25-2013

This should do it:

sourcetype="DNS" | join field1 [search sourcetype="malicious domains" | rename field2 as field1]

Where "field1" is the field from DNS sourcetype that has the domain, and field2 is the field from "malicioius domains" that has the domain.

This will get you all the DNS events that have a matching domain from DNS in malicious domains.

You can probably do something like this too:

(sourcetype="DNS" OR sourcetype="malicious domains") | rename field2 as field1 | dedup field1, sourcetype | stats count by field1 | where count > 1 | table field1

This will give you all the values of "field1" that have more than 1 sourcetype attached to them.

Hope this helps.

Comparing two fields from different sources

.conf24 | Registration Open!

ICYMI - Check out the latest releases of Splunk Edge Processor

Introducing the 2024 SplunkTrust!