Splunk Search

Limiting Results of matching values in an array field

yepyepyayyooo
New Member

Does anyone know of a way to return only the values in a string array field of the parent search that match a subsearch?

index="email" sourcetype="email_links" 
    [ search index="sinkholed" sourcetype="bad_http" 
    | rename raw_host as "extracted_host{}"
    | fields "extracted_host{}" ] 
| stats dc("rcptto{}") as recipient_dc values("rcptto{}") values("extracted_host{}") values(subject) by from
| sort recipient_dc

The query works fine, except I'm getting back more than I want. The "extracted_host{}" field in the results contains every value in that field's array instead of just the values that matched. For example, say the subsearch returns a sinkholed domain called baddomain.com. The results I see in "extracted_host{}" are:

baddomain.com
www.w3.org
abc123advertisement.com
etcetcetc.com

I would like to return only what matched in the subsearch. Any assistance is greatly appreciated.

manjunathmeti
Champion

The field "extracted_host{}" in the main search is a JSON array. When you filter on extracted_host{}=baddomain.com in the main search, the filter selects whole events, so every other value in an array that contains baddomain.com also appears in the search results.

You need to expand the field extracted_host{} in the main search before filtering it with a subsearch.

index="email" sourcetype="email_links" 
| mvexpand extracted_host{}
| search
    [ search index="sinkholed" sourcetype="bad_http" 
    | rename raw_host as "extracted_host{}"
    | fields "extracted_host{}" ] 
| stats dc("rcptto{}") as recipient_dc values("rcptto{}") values("extracted_host{}") values(subject) by from
| sort recipient_dc
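
The effect of the expansion can be checked on synthetic data without touching either index. The following is only a sketch: the field name host and the three domain values are made up for illustration, and makeresults stands in for a single email event.

```spl
| makeresults
| eval host=split("baddomain.com,www.w3.org,abc123advertisement.com", ",")
| mvexpand host
| search host="baddomain.com"
```

With the mvexpand line in place, the search should return one row whose host is baddomain.com. If you remove that line, the same filter still matches the event, but host keeps all three values, which is the behavior described in the question.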

yepyepyayyooo
New Member

Thanks! That worked. How would you go about performing this on multiple multi-value fields?

| mvexpand extracted_host{}, url

manjunathmeti
Champion

Welcome! Yes, you can filter on multiple multi-value fields. mvexpand takes a single field, so expand each field with its own mvexpand command:

<base search>
| mvexpand extracted_host{}
| mvexpand url
| search
    [ <search> | fields extracted_host{}, url ]
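
One thing to keep in mind with chained mvexpand commands is that they produce every combination of the expanded values. Here is a small sketch on made-up data (the field names host and url and all values are hypothetical):

```spl
| makeresults
| eval host=split("a.example,b.example", ","), url=split("http://a.example/x,http://b.example/y", ",")
| mvexpand host
| mvexpand url
| table host url
```

This yields four rows, one per host and url combination, rather than two. A subsearch that returns both fields then keeps only the rows whose pair of values matches a row from the subsearch. On events with large arrays the number of combinations grows quickly, so it may be worth expanding only the fields you actually need to filter on.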

wmyersas
Builder

I'll presume raw_host is a multivalue field. If that is the case, do the following:

index=email sourcetype=email_links
| search
    [ search index=sinkholed sourcetype=bad_http
    | mvexpand raw_host
    | stats count by raw_host
    | fields - count
    | rename raw_host as <field-in-outer-search> ]
| <rest of search>

That should show you only the email_links events that contain domains that were sinkholed.
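
If you want to inspect what a subsearch like this hands to the outer search, you can run it on its own and append the format command. The sketch below uses made-up raw_host values in place of the sinkholed index, and assumes the outer field is "extracted_host{}" as in the original question:

```spl
| makeresults
| eval raw_host=split("baddomain.com,sinkhole.example,baddomain.com", ",")
| mvexpand raw_host
| stats count by raw_host
| fields - count
| rename raw_host as "extracted_host{}"
| format
```

The result should be a single field named search holding the distinct hosts as field=value terms joined by OR, which is the expression the outer search filters on. The stats count by raw_host step is what removes the duplicate value.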
