Splunk Search

SPL search pattern efficiency: join vs eventstats

elliotproebstel
Champion

We have a Splunk app that was developed in-house to track indicators that are submitted to a blocklist. Here's a simplified version of the workflow:

  1. Analyst submits indicator to be blocked/unblocked/whitelisted. An event like this is logged: index="blocklist" user="jsmith" indicator="badguy@test.com" source="threatfeed" status="submitted" action="block". (Additional, sensitive fields are present in real events.)
  2. Lead analyst reviews submissions and approves or rejects each indicator. An event like this is logged: index="blocklist" user="dtownsend" indicator="badguy@test.com" source="threatfeed" status="approved" action="block"
  3. Lead analyst compiles newly approved indicators and sends them to an operations team for implementation. An event like this is logged: index="blocklist" user="dtownsend" indicator="badguy@test.com" source="threatfeed" status="distributed" action="block"
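
For anyone who wants to experiment without access to the real index, the three sample events above can be mocked with makeresults. This is only a sketch: the field values are the sample ones from the list, the helper field n is a throwaway counter that is not part of the real data, and the sensitive fields are left out.

```
| makeresults count=3
| streamstats count AS n
| eval user=if(n=1, "jsmith", "dtownsend"),
       indicator="badguy@test.com",
       source="threatfeed",
       status=case(n=1, "submitted", n=2, "approved", n=3, "distributed"),
       action="block"
| fields - n
```

Any of the filtering pipelines discussed below (everything after the initial index="blocklist" clause) can be appended to this to check the logic on a known input.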

I am trying to revise the queries that populate some of the dashboards that analysts use to interact with the blocklist data, and I'd like some guidance on search patterns. I've been running local tests on the various approaches, but the results aren't as conclusive as I'd like.

Approval/Rejection Dashboard
This page should display all indicators that have been submitted in the last seven days and have not yet been approved or rejected. The engineer who built this app used the following query structure to populate the dashboard:

index="blocklist" status="submitted"
| join type=left indicator action source 
[ search index="blocklist" (status="approved" OR status="rejected")
  | eval has_been_reviewed="true" ]
| search NOT (has_been_reviewed="true")

I've learned to be wary anytime I see join, and I understand that negative searches (i.e. searches using NOT) are less efficient than positive searches. So I was planning to revise the above into this:

index="blocklist" (status="submitted" OR status="approved" OR status="rejected")
| eventstats dc(status) AS status_count values(status) AS status BY action indicator
| search status_count=1 status="submitted"

However, I wanted to first ask - is eventstats more efficient? Or is there an even better pattern I could be using for this search? Thanks!


DalJeanis
Legend

Yes, this should be much better than the join.

I'd tend to do it this way, which is pretty much equivalent to yours performance-wise...

index="blocklist"  (status="submitted" OR status="approved" OR status="rejected")
| eventstats max(eval(case(status="rejected" OR status="approved","Yes"))) as decisioned 
     BY action indicator
| where status="submitted" AND isnull(decisioned)

Updated to use where.
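
A further variant worth testing, offered as a sketch rather than as part of the accepted answer: if the dashboard only needs one row per pending indicator rather than the raw events, stats can replace eventstats. Because stats is a transforming command, part of the aggregation can typically run on the indexers, whereas eventstats has to keep every event; whether that matters at this data volume is best confirmed in the Job Inspector. The assumptions here are that earliest=-7d stands in for the dashboard's seven-day window, that source belongs in the grouping key (as it did in the original join), and that the field names statuses and submitters are illustrative only.

```
index="blocklist" (status="submitted" OR status="approved" OR status="rejected") earliest=-7d
| stats values(status) AS statuses values(user) AS submitters BY action indicator source
| where mvcount(statuses)=1 AND statuses="submitted"
```

The trade-off is that only the fields named in the stats clause survive, so any additional fields the dashboard displays would need their own values() or latest() aggregation.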
