Splunk Search

How do I create an multivalue field after stats that don't include it?

dj69
Explorer

Is there a way to aggregate data and then show additional fields as mv fields without running another search? I want to aggregate on only 2 fields but then show additional "thrown away" fields as multi value in the final results.

Sample data:

User    Machine Function
Adam    1           x
Adam    1           y
Adam    1           y
Betty   2           x

Working query:

stats count by User, Machine, Function

Results:

User    Machine     Function    Count
A       1           x           1
A       1           y           2
B       2           x           1

Tried query (doesn't work):

stats count by User, Machine | mvcombine Function | stats sum(Count), User, Machine, Function

Desired results:

User    Machine     Function    Count
A       1           x           3
                      y
B       2           x           1
0 Karma

elliotproebstel
Champion

I think this query will get you the table you want:

| stats values(Function) AS Function, count AS Count BY User, Machine

dj69
Explorer

A similar query over a week's worth of my real data shows 260 seconds for the eventstats version and 330 seconds for the values version. Yours is simpler to write but not as quick. Will confirm by running both again though.

0 Karma

dj69
Explorer

You beat me to my own answer but my approach was slightly different. Waiting for the moderator to approve it but basically I used eventstats which keeps the other values:
| eventstats count by User, Machine | mvcombine Function | eval Function=mvdedup(Function)

0 Karma

dj69
Explorer

Actually I think I just answered my own question, eventstats keeps the data in a form I can combine:

eventstats count by User, Machine | mvcombine Function

Results:

Function    Machine User    count
x           1       Adam    3
y
y
x           2       Betty   1
0 Karma

dj69
Explorer

So now my question, which approach is "better" as far as query costs would be concerned?

0 Karma

dj69
Explorer

Eventstats with many other fields has to be cleaned up, however by removing _* fields and specifying the exact fields to keep, so it becomes:
eventstats count by User, Machine | fields Count, User, Machine, Function | fields - _* | mvcombine Function | eval Function=mvdedup(Function) | sort -Count

0 Karma

elliotproebstel
Champion

You can certainly test both approaches over your data set to see which completes more efficiently, but I'm pretty sure my approach is more efficient. Stats is almost always more efficient than eventstats, I believe, specifically because it discards extra fields immediately.

dj69
Explorer

I would have thought the same, but wanted to know for sure.

So I've run each version of the query over and over again, both for short durations and for long. The variability of server responses while everyone else is running queries makes this hard to test. However, when running them simultaneously, they are mostly in a dead heat (~977 seconds over 30 days, ~10-15 seconds over 4 hours).

When running one of them over and over again, and then doing the same with the other, the average performance of the eventstats is slightly better. I think what might be going on is that while eventstats is using more memory during the query, this doesn't cost much in the way of performance. Stats values, on the other hand, might be wasting performance in upkeep while it throws things away.

elliotproebstel
Champion

Fair enough! Thanks for running the experiments and sharing your results. I've definitely found with Splunk that it's worth the time to run experiments like this up front to tune my approaches, because what works well for one data set may not be as efficient for another. So you're very smart to set up these A/B tests and choose your final approach accordingly.

0 Karma

dj69
Explorer

I pasted the wrong "tried" query but it doesn't really matter. I'm just trying to show that I can't use mvcombine once stats throws away the other fields. The alternative, to mvcombine at the end, can't combine if the Count column changes.

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...