Splunk Search

Single-Field Multi-Value Count Difference by Multiple other Fields

landen99
Motivator

I am trying to find the difference of the dns type values for each domain in each time bucket. Let's say there are 10 queries and 22 responses for a given domain. The answer would be 22-10=12. My current search looks like this:

sourcetype=dns | bucket _time span=10m | rex "(?i)^.+\\s{2}\\..*?(?P<domain_root>[^\\.]+\\.[^\\.]+)(?=.$)" | stats count as c by domain_root _time dns_type

This produces the number of queries and replies per domain, but I don't know how to subtract them. I came across what may be a different approach using:

| stats count AS c0 count(eval(dns_type="Q")) AS cq count(eval(dns_type="R")) AS cr by domain_root _time | eval d=cq-cr

If this second approach is viable then would it be followed up with eval d=cq-cr even though cr and cq are for each domain and time bucket or is there another solution? Even if the count-eval method has a solution, I am still interested in understanding how the calculations are done when the "by" term is invoked and I am interested in knowing if a solution exists for the first method (above) using "count by domain_root time dns_type"

0 Karma
1 Solution

martin_mueller
SplunkTrust
SplunkTrust

The second stats seems reasonable to me. The issue with the first one is that eval works on a per-event / per-row basis, so you'd have to merge each pair first before doing the calculation - the second stats already does that for you.

View solution in original post

martin_mueller
SplunkTrust
SplunkTrust

The second stats seems reasonable to me. The issue with the first one is that eval works on a per-event / per-row basis, so you'd have to merge each pair first before doing the calculation - the second stats already does that for you.

martin_mueller
SplunkTrust
SplunkTrust

That'll run. Let func be avg for example, then this will add a field c to every event that is the value of a plus the average of a calculated for each combination of x, y, and z.

0 Karma

landen99
Motivator

You are very sharp. Let's consider eventstats, which keeps "a" then.

eventstats func(a) AS b by x y z | eval c=a+b
0 Karma

martin_mueller
SplunkTrust
SplunkTrust

That's going to be trouble as well, there is no field called a after the stats.

0 Karma

landen99
Motivator

You are correct. The "sum" function requires the stat function. I was thinking about something more like this:

stats func(a) AS b by x y z | eval c=a+b
0 Karma

martin_mueller
SplunkTrust
SplunkTrust

That eval isn't going to run.

As for that stats, it will create a table with four columns: x y z b
You'll get one row for every combination of x, y, and z, and b will be func(a) for events matching that combination.

0 Karma

landen99
Motivator

So

stats func(a) AS b by x y z | eval s=sum(b)

in effect creates the variable b.x.y.z so that eval "s=sum(b)"is really in effect "s=sum(b.x.y.z)" which sums for each unique combination of x, y, and z so that "table s x y z" can show a different value of "s" for each x, y, and z combination. Is this correct?

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Yup, per row / per domain_root and _time.

0 Karma

landen99
Motivator

If I did the second stats approach, how would the eval look?

| stats count AS c0 count(eval(dns_type="Q")) AS cq count(eval(dns_type="R")) AS cr by domain_root _time | eval d=cq-cr

If so, then would this "d" be per domain and time bucket?

0 Karma
Get Updates on the Splunk Community!

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...