Splunk Search

How do I find rows in a table with more fields, which are NOT EQUAL to any row in table with less fields?

aovsiannikov
Explorer

I.e.

<search1>: ... | table id, f1, f2, f3
<search2>: ... | table id, f1, f2

I need to find all records in <search1> that are not equal to any record in <search2>

If I do something like

<search1> | search NOT [<search2>]

Splunk will not consider record, let's say <record1>: id=someID, f1=1, f2=2, f3=3 in <search1> different from record <record2>: id=someID, f1=1, f2=2 from <search2> , because field f3 will not be presented in the boolean expression generated by the subsearch. So it will not pass <record1> to final recordset, but I need it there.

So what is the best approach there?

0 Karma

aovsiannikov
Explorer

Guys, thanks a lot for replies, I'd like to experiment a little bit before accepting. But I'd like to avoid to relay on exact fieldset in search 1 and 2, I'd like to have some unified approach. Means, I'd like to avoid using f3 explicitly in search. Let's imagine that fieldset for search 1 is defined in some variable, I don't know exact fieldset, it's configurable, I don't want to parse it. So the spl should be independent from what is in configuration. Also I have no _raw field, both search are came from lookup.

Now all I can imagine is to search index1 twice, something like this:
index=index1 | search NOT [index=index2 | search [index=index1]]

0 Karma

gcusello
SplunkTrust
SplunkTrust

Hi aovsiannikov,
if results in search2 are less than 50,000 you can use a subsearch like the following

index=index1 NOT [ search index=index2 | fields id f1 f2 f3]
| ...

where id, f1, f2, f3 are the fields to compare results from the searches.
If you want to compare the full record you can use as row the field _raw.

If instead results in search2 are more than 50,000, you have to use a different approach, something like this:

index=index1 OR index=index2
| stats count BY index id f1 f2 f3
| where index=index1 AND count=1

if instead you want to compare the full raw you can run something like this:

index=index1 OR index=index2
| stats count BY index _raw
| where index=index1 AND count=1

Bye.
Giuseppe

FrankVl
Ultra Champion

Assuming that id is actually an identifier of a record, it could be as simple as this, right?

search1 OR search2
| evenstats count by id
| search f3=* count=1

(the f3=* is to return the rows belonging to search1. if f3 is not always populated for those items, then use some other characterizing property of the items that came from search1).

If you also need to take f1 and f2 into account for matching items between the two data sets:

search1 OR search2
| evenstats count by id,f1,f2
| search f3=* count=1
0 Karma

briancronrath
Contributor

Assuming these tables aren't over 50k rows (default max of a join subsearch) you could join the results together, and do some eval tricks to compare the data and filter it out. So something like:

 <search1>: ... | table id, f1, f2, f3 | join type=left id [ <search2>: ... | table id, f1, f2 | rename f1 as f1_x | rename f2 as f2_x] | eval filter_out=if(f1=f1_x AND f2=f2_x,"filter","keep") | search filter_out=keep
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...