I have log lines that I need to group by 4 or 5 fields so that I can find the duration. I am using transaction, but it takes a long, long, long time even for a 4-hour period. What's the best way to get around it?
Thanks
As martin_mueller noted, transaction can be rather resource expensive, and the stats variant he presented works well in a lot of situations. However, you might also (really really) need to limit the number of events that the transaction command has to operate on. This can be done by setting time constraints, specifying index, host, source, sourcetype etc., and filtering out unwanted events, e.g. NOT debug.
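As a minimal sketch of such a narrowed base search (the index, sourcetype, and grouping field names here are hypothetical placeholders for your own):

index=app sourcetype=app_logs NOT debug earliest=-4h | transaction session_id user_id host

The more of the filtering you push into the base search before the pipe, the fewer events transaction has to hold in memory.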
Hope this helps,
k
Indeed - regardless of transaction, that's a good approach for every search, and it will make the stats substitute faster as well.
An often quicker way to compute pseudo-transactions is stats with a by-attribute. Consider this:
some search | stats earliest(_time) as _time range(_time) as duration by transaction_id
This will compute the start time and duration of each transaction. If you need more fields, you can add them to the stats call - the fewer you need, the faster it will be.
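For instance, to carry a couple of extra fields along and count the events per transaction (the status field here is a hypothetical example):

some search | stats earliest(_time) as _time range(_time) as duration values(status) as statuses count by transaction_id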
Note, this lacks features of transaction such as maxspan, maxpause, and so on. If you need those, you will likely have to stick with transaction. You can use those options to optimize the transaction query too, by the way: if you know the maximum duration of a transaction, for example, setting maxspan accordingly can drastically reduce the number of open transactions and speed things up without switching to stats.
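As a rough sketch (the 15-minute span and 5-minute pause are hypothetical values you would derive from your own data):

some search | transaction transaction_id maxspan=15m maxpause=5m

With a tight maxspan, Splunk can close transactions early instead of keeping them all open for the whole search window.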