Splunk Search

How to filter a dense search prior to a transaction?

bruceclarke
Contributor

All,

I'm trying to transact on two searches. The first search returns very quickly (there are only a few events to match), but the second search is fairly dense. I've optimized the search fairly well, but I think there is a better way to do this.

My search is below, and you'll notice that I'm running the "Subsearch" search twice, then transacting on the same field that I'm filtering the "Dense search" by. This method also means that I need to wait for "Subsearch" to return twice before my transaction ever gets run.

Is there a better way to filter the Dense search down prior to piping to the transaction?

Message="Subsearch" OR (Message="Dense search" AND [search Message="Subsearch" | fields requestId])
| transaction requestId maxevents=2 maxspan=5s

Thanks!

MuS
Legend

Hi bruceclarke,

First, let me state that it all depends on your use case, which is unknown to me, but there are some ways to achieve this.

Let me show you some run-anywhere searches and how they perform in my current setup (one search head using 6 indexers).

Let's establish a baseline with this basic search:

index=_internal OR index=_audit * earliest=-1d@d latest=-0d@d

This search has completed and has returned 9,801,705 results by scanning 9,801,705 events in 282.547 seconds - although it will only display the first 1000 events in the UI.
Next, I tried to rebuild your dense search using a subsearch and a transaction:

index=_audit OR index=_internal * earliest=-1d@d latest=-0d@d [ search index=_internal earliest=-1d@d latest=-0d@d | fields user ] | transaction user

This search has completed and has returned 2,121 results by scanning 9,801,705 events in 726.277 seconds.
The subsearch here expands to a list such as ( ( user="splunk-system-user" ) OR ( user="nobody" ) OR ( user="nobody" ) OR ( user="nobody" ) OR ( user="nobody" ) OR ( user="splunk-system-user" ) OR ...; the list was truncated because of the limits.conf setting.
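
If the subsearch has to stay, one way to keep that list short is to deduplicate the field inside the subsearch, so each value appears only once in the generated OR clause. A minimal sketch on the same run-anywhere data; the user=* filter and the dedup are additions for illustration, and this variant is not one of the timed searches in this post:

```
index=_audit OR index=_internal earliest=-1d@d latest=-0d@d
    [ search index=_internal user=* earliest=-1d@d latest=-0d@d
      | dedup user
      | fields user ]
| transaction user
```

The same idea would apply to the original question by adding | dedup requestId before | fields requestId in its subsearch.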

The next step was to test the same search without the subsearch. The user field exists in both _internal and _audit, and the transaction groups on that field anyway, so a simple user=* filter is enough:

index=_audit OR index=_internal user=*  earliest=-1d@d latest=-0d@d | transaction user

This search has completed and has returned 2,420 results by scanning 9,801,705 events in 201.553 seconds. As you can see we already have improved the search performance. Okay, let's go full throttle now:

index=_audit OR index=_internal user=* earliest=-1d@d latest=-0d@d  | stats count by user, index

This search has completed and has returned 363 results by scanning 9,801,705 events in 40.771 seconds.
We get fewer results back now because stats returns one row per user and index combination, with a count, rather than the events themselves. Each user can have multiple events, though.

So what if you need some information out of the events based on the user field? No problem, try this:

index=_audit OR index=_internal user=* earliest=-1d@d latest=-0d@d | stats values(action) AS action, count by user, index

This search has completed and has returned 363 results by scanning 9,801,705 events in 75.97 seconds.
As you can see, this search took a bit longer. It returns a stats table listing the actions for each user, along with the index where the events are stored.
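
If the appeal of transaction is its summary fields, stats can approximate them. transaction adds duration and eventcount fields to each grouped event; the sketch below derives similar values per user and index. The field names start, end, eventcount and duration are illustrative choices, and this variant was not timed alongside the searches above:

```
index=_audit OR index=_internal user=* earliest=-1d@d latest=-0d@d
| stats min(_time) AS start, max(_time) AS end, count AS eventcount, values(action) AS action by user, index
| eval duration = end - start
```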

Again, maybe you do need the subsearch/transaction combination in your use case. But based on the example you provided and the limits that come with using a subsearch, you should give stats a try.
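
Applied to the search in the question, a stats version might look like the sketch below. It reads both message types in a single pass, so the "Subsearch" search is not run separately, and then keeps only the requestId values that have at least one event of each type within 5 seconds. The names subCount, denseCount and duration are hypothetical. This only approximates maxevents=2 maxspan=5s: transaction would split a longer run of events into several transactions, whereas this drops any requestId whose events span more than 5 seconds.

```
Message="Subsearch" OR Message="Dense search"
| stats count(eval(Message="Subsearch")) AS subCount, count(eval(Message="Dense search")) AS denseCount, range(_time) AS duration by requestId
| where subCount>=1 AND denseCount>=1 AND duration<=5
```

Note that all of the dense events are still read; any saving would come from avoiding the subsearch and the transaction, not from scanning fewer events.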

Here are some other great answers related to this:

hope this helps to speed up things a bit ...

cheers, MuS

bruceclarke
Contributor

@somesoni2 - I'll give that a try. I'm curious - Does Splunk cache the Message="Subsearch" search? If not, it must be running the same search twice, which would be great to avoid if possible.

somesoni2
Revered Legend

Try this; it may work slightly better:

(Message="Subsearch" OR Message="Dense search") [search Message="Subsearch" | fields requestId]
| transaction requestId maxevents=2 maxspan=5s