Filter records based on whether any record in a group matches a given criterion

landen99
Motivator

In general, I am trying to filter records based on whether any record in a group matches a given criterion.

Specifically, I want to discard all DNS records from any IP for which at least one DNS record has an R in it (to the left of the Q). For example, if an IP produces a record like the following, all records from that IP should be discarded:
3/17/2014 2:00:18 PM 06E4 PACKET 000000000F458210 UDP Rcv xx.yyy.zz.48 1234 R Q [0001 D NOERROR] A .server005.mycorp.com.

I used the field extractor tool to put the R and Q values into a field named dns_type.

How do I write a search that keeps a group only if none of its records contain a specified field value? Do we have to use streamstats, or is there an easier way?

1 Solution

lguinn2
Legend

Without more information about how you define "a group", this is something of a shot in the dark, but here goes.

I assume that dns_type can have "R Q" as a value, and that there is a field named ip containing the IP address.

To simply eliminate events with that dns_type, you could do this:

yoursearchhere dns_type!="R Q"

But let's assume that you want to eliminate events from any ip that was associated with an event with a dns_type of "R Q":

yoursearchhere NOT [ search dns_type="R Q" | dedup ip | fields ip ]

This should work as long as the number of IPs returned from the subsearch is relatively small. It will not be a particularly fast search, though, as both the subsearch and the use of NOT will slow things down.

lguinn2
Legend

The subsearch (which starts with [ and ends with ]) returns a set of results. The only thing I want from those results is a list of the src_ip values. The "fields src_ip" command causes the subsearch to return only the src_ip values.

You can actually see the subsearch results if you check out the search job inspector after the search is finished.
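
As a rough illustration of what the subsearch hands back, you can run it on its own with the format command appended. This is only a sketch: it assumes the sourcetype=dns, dns_type and src_ip names used elsewhere in this thread, and the IP addresses mentioned below are made up.

```
sourcetype=dns dns_type="R"
| dedup src_ip
| fields src_ip
| format
```

The result is a single field named search holding an expression along the lines of ( ( src_ip="10.0.0.1" ) OR ( src_ip="10.0.0.2" ) ). That expression is what gets substituted for the bracketed subsearch, so the outer search effectively becomes NOT ( ( src_ip="10.0.0.1" ) OR ( src_ip="10.0.0.2" ) ).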

Also, you can probably make your search go even faster if you don't put the pipes in it - they just add extra steps:

sourcetype=dns mysearch NOT [search .... ]

That should work fine.

landen99
Motivator

I have the field extracted so that dns_type is either R or Q even when there is "R Q" in the text. The search you suggested works:

sourcetype=dns | mysearch | search NOT [ search dns_type="R" | dedup src_ip | fields src_ip ]

I never knew you could bracket a search string under a NOT. Thank you. Why do you have "fields src_ip"?

lguinn2
Legend

What happens if you do this?

yoursearchhere NOT [ search dns_type="R Q" | dedup src_ip | fields src_ip ]

lguinn2
Legend

I would avoid streamstats - I don't think it will be particularly helpful here, and it will probably not be faster than a subsearch.

landen99
Motivator

What do you think about creating a new field for each source IP called Query_only? The search would filter out any records with Query_only=False, then set Query_only to False for a source IP if a record contains more than a single "Q", and then filter those out as well.

| while dns_type_ip!=False | if(dns_type!="Q", eval dns_type_ip="False" by src_ip) | while dns_type_ip!=False

...or maybe streamstats. I am not completely clear on how it is used; I have only seen it used to evaluate the current record in light of evaluations on previous records.
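
The per-source-IP flag idea above can be sketched in real SPL with eventstats, which attaches a per-group aggregate to every event in the group. This is a minimal sketch, not a tested solution: it assumes the sourcetype=dns, src_ip and dns_type names from this thread, assumes dns_type is exactly "R" on reply records, and uses reply_count as a hypothetical field name introduced only for illustration.

```
sourcetype=dns
| eventstats count(eval(dns_type="R")) AS reply_count BY src_ip
| where reply_count=0
| fields - reply_count
```

Here eventstats counts the reply records for each src_ip and writes that count onto every event from the same IP, so the where clause keeps only events from IPs that never produced a reply. If dns_type turns out to be multivalued on "R Q" lines, the eval condition would likely need adjusting.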

landen99
Motivator

In the DNS example, the group is the source IP. So if a source IP generates DNS reply traffic, then all DNS traffic from that IP should be discarded. Records from IP addresses without any DNS reply traffic would be kept.

It would probably be more clear to use the word "field" instead of group, but I am grouping these records by source ip in my mind so I am thinking in terms of these source ip groups.

lguinn2
Legend

What do you mean by "a group"?
