Hello!
I've been told to use stats values() instead of transaction for performance reasons. However, with long log files that contain many fields, the stats command seems to have some limitation, as I get fewer fields returned than when using transaction.
Should I still be using the stats command? How can I ensure all fields and values are included?
The comparison isn't immediately apples to apples. Depending on what you are trying to achieve, the stats values command could get you the desired functionality, but it also might not.
The transaction command combines events that share a common field into a single event. Single event is key because the transaction command is what we call a centralized streaming command. In essence, after you run the command, the results are still visible under the Events tab in Splunk. The options of the transaction command allow you to limit the events grouped together to a certain time span. It also adds an eventcount and a duration field. The most important thing to remember is that the entire raw events are combined into a single event by just concatenating them.
For example: some search | transaction src_ip
gives all the different events with a single src_ip concatenated together.
The stats values command is what is known as a transforming command. The result is no longer raw events that can be viewed on the Events tab, but statistics that are viewed in a table on the Statistics tab. It just provides a multivalue table entry for a particular field.
For example: some search | stats values(src_ip)
gives a multivalue table entry of all the different source IPs in the preceding data.
The transaction command is resource-hungry, so whether to use it depends on your use case. If the functionality you need is achievable with the stats values command, definitely use that; but if you really need the functionality of the transaction command, then you will have to use it.
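To make the difference concrete, here is a rough toy model in Python (not Splunk code, and not how Splunk implements either command). The sample events and the helper names stats_values_by and transaction_by are made up for illustration. It only mirrors the behavior described above: stats values keeps a table of distinct field values per group, while transaction keeps the concatenated raw events and adds an eventcount.

```python
from collections import defaultdict

# Hypothetical sample events; field names and values are invented for illustration.
events = [
    {"_raw": "sessionID=1234 action=login src_ip=10.0.0.1",
     "sessionID": "1234", "action": "login", "src_ip": "10.0.0.1"},
    {"_raw": "sessionID=1234 action=view src_ip=10.0.0.1",
     "sessionID": "1234", "action": "view", "src_ip": "10.0.0.1"},
    {"_raw": "sessionID=1234 action=logout src_ip=10.0.0.2",
     "sessionID": "1234", "action": "logout", "src_ip": "10.0.0.2"},
]


def stats_values_by(events, key):
    """Toy model of `stats values(*) as * by <key>`.

    Returns one table row per key value; each field holds its distinct
    values in sorted order. The raw event text is not kept (an assumption
    of this model).
    """
    rows = defaultdict(lambda: defaultdict(set))
    for ev in events:
        for field, value in ev.items():
            if field not in (key, "_raw"):
                rows[ev[key]][field].add(value)
    return {k: {f: sorted(v) for f, v in fields.items()}
            for k, fields in rows.items()}


def transaction_by(events, key):
    """Toy model of `transaction <key>`.

    Returns one combined event per key value: the raw events joined
    together, plus an eventcount field.
    """
    groups = defaultdict(list)
    for ev in events:
        groups[ev[key]].append(ev["_raw"])
    return {k: {"_raw": "\n".join(raws), "eventcount": len(raws)}
            for k, raws in groups.items()}


if __name__ == "__main__":
    print(stats_values_by(events, "sessionID"))
    print(transaction_by(events, "sessionID"))
```

In this model the stats-style result can only contain what was extracted as a field, while the transaction-style result still carries the full raw text of every event in the group.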
References:
[1] http://docs.splunk.com/Documentation/Splunk/6.4.1/SearchReference/Transaction
[2] http://docs.splunk.com/Documentation/Splunk/6.4.1/SearchReference/CommonStatsFunctions
It's interesting that you make the comparison between stats values() and transaction, as they were designed to do completely different things. It would be nice to understand the use case.
A similar case at Transactions/Stats?
It was suggested there to use the map command to delineate the beginning and end of the transaction - a cheerful idea.
Thanks. I assume I probably want the "transaction" command, although it is not always obvious to me what the right choice is. (See the comment on the question for details.)
Great, thanks for the nice explanation! It does not always seem obvious which choice is best here...
Could you provide examples of each command and of what's missing from the stats output?
I have multiple log files with multiple events (with multiple fields) for a session which I'd like to display for a given session. Put simply, I can then do either:
1. search sessionID=1234 | stats values(*) as * by sessionID | table * | transpose
2. search sessionID=1234 | transaction sessionID | table * | transpose
The goal is to list all fields and values for easy inspection. With the first search I hit some limitation in log lines with many fields. This does not occur in the second search.