All,
I'm trying to transact on two searches. The first search returns very quickly (there are only a few events to match), but the second search is fairly dense. I've optimized the search fairly well, but I think there is a better way to do this.
My search is below, and you'll notice that I'm running the "Subsearch" search twice, then transacting on the same field that I'm filtering the "Dense search" by. This method also means that I need to wait for "Subsearch" to return twice before my transaction ever gets run.
Is there a better way to filter the Dense search down prior to piping to the transaction?
Message="Subsearch" OR (Message="Dense search" AND [search Message="Subsearch" | fields requestId]) | transaction requestId maxevents=2 maxspan=5s
Thanks!
Hi bruceclarke,
First, let me state that it all depends on your use case, which is unknown to me, but there are some ways to achieve this.
Let me show you some run-anywhere searches and how they perform in my current setup (one search head using 6 indexers).
Let's create some baseline by doing this basic search:
index=_internal OR index=_audit * earliest=-1d@d latest=-0d@d
This search has completed and has returned 9,801,705 results by scanning 9,801,705 events in 282.547 seconds - although it will only display the first 1000 events in the UI.
Next I tried to rebuild your dense search and used a subsearch and a transaction in this:
index=_audit OR index=_internal * earliest=-1d@d latest=-0d@d [ search index=_internal earliest=-1d@d latest=-0d@d | fields user ] | transaction user
This search has completed and has returned 2,121 results by scanning 9,801,705 events in 726.277 seconds.
The subsearch here results in a truncated list of ( ( user="splunk-system-user" ) OR ( user="nobody" ) OR ( user="nobody" ) OR ( user="nobody" ) OR ( user="nobody" ) OR ( user="splunk-system-user" ) OR ....
The list was truncated because of the limits.conf setting.
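The truncated list above repeats the same user values many times, so one thing that might help is to collapse the subsearch to one row per user before it is handed to the outer search. This is only a sketch and I have not timed it on the setup described here; it assumes the user field is extracted in _internal at search time. Using stats inside the subsearch is my suggestion, not something from the measured runs in this answer.

```
index=_audit OR index=_internal earliest=-1d@d latest=-0d@d
    [ search index=_internal user=* earliest=-1d@d latest=-0d@d
      | stats count BY user
      | fields user ]
| transaction user
```

With one row per distinct user, the generated OR list is much shorter, so it is less likely to run into the subsearch limits. The outer search still has to scan the same events.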
The next step was to test the same search without the subsearch, because the user field is available in both _internal and _audit, and the transaction will use this field:
index=_audit OR index=_internal user=* earliest=-1d@d latest=-0d@d | transaction user
This search has completed and has returned 2,420 results by scanning 9,801,705 events in 201.553 seconds. As you can see we already have improved the search performance. Okay, let's go full throttle now:
index=_audit OR index=_internal user=* earliest=-1d@d latest=-0d@d | stats count by user, index
This search has completed and has returned 363 results by scanning 9,801,705 events in 40.771 seconds.
We get fewer results back now, because stats returns one row per user and index instead of the events themselves. Each user can still have multiple events, which is what the count shows.
So what if you need some information out of the events based on the user field? No problem, try this:
index=_audit OR index=_internal user=* earliest=-1d@d latest=-0d@d | stats values(action) AS action, count by user, index
This search has completed and has returned 363 results by scanning 9,801,705 events in 75.97 seconds.
As you can see, this search took a bit longer and returns a stats table of the action values for each user, along with the index where the events are stored.
Again, maybe you need to use the subsearch/transaction combination in your use case. But based on the example you provided and the limits you run into when using a subsearch, you should give stats a try.
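Applied to the search from the question, the stats approach could look roughly like the sketch below. It assumes that Message takes exactly the two values shown in the question and that requestId is extracted on both kinds of events; the field names events, message_types and span_seconds are made up for illustration.

```
Message="Subsearch" OR Message="Dense search"
| stats count AS events,
        dc(Message) AS message_types,
        range(_time) AS span_seconds,
        values(Message) AS Message
        BY requestId
| where message_types=2 AND span_seconds<=5
```

This keeps only the requestId values that show up with both messages within 5 seconds, and it does not need the subsearch at all. It is not an exact replacement for transaction with maxevents=2 maxspan=5s: stats groups every event for a requestId into one row, so a requestId with many "Dense search" events spread over more than 5 seconds would be dropped here.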
hope this helps to speed up things a bit ...
cheers, MuS
@somesoni2 - I'll give that a try. I'm curious - Does Splunk cache the Message="Subsearch" search? If not, it must be running the same search twice, which would be great to avoid if possible.
Try this; it may work slightly better:
(Message="Subsearch" OR Message="Dense search" ) [search Message="Subsearch" | fields requestId] | transaction requestId maxevents=2 maxspan=5s