Solved: Single-Field Multi-Value Count Difference by Multi...

landen99 · ‎05-02-2014

I am trying to find the difference of the dns type values for each domain in each time bucket. Let's say there are 10 queries and 22 responses for a given domain. The answer would be 22-10=12. My current search looks like this:

sourcetype=dns | bucket _time span=10m | rex "(?i)^.+\\s{2}\\..*?(?P<domain_root>[^\\.]+\\.[^\\.]+)(?=.$)" | stats count as c by domain_root _time dns_type

This produces the number of queries and replies per domain, but I don't know how to subtract them. I came across what may be a different approach using:

| stats count AS c0 count(eval(dns_type="Q")) AS cq count(eval(dns_type="R")) AS cr by domain_root _time | eval d=cq-cr

If this second approach is viable then would it be followed up with eval d=cq-cr even though cr and cq are for each domain and time bucket or is there another solution? Even if the count-eval method has a solution, I am still interested in understanding how the calculations are done when the "by" term is invoked and I am interested in knowing if a solution exists for the first method (above) using "count by domain_root time dns_type"

martin_mueller · ‎05-02-2014

The second stats seems reasonable to me. The issue with the first one is that eval works on a per-event / per-row basis, so you'd have to merge each pair first before doing the calculation - the second stats already does that for you.

View solution in original post

martin_mueller · ‎05-02-2014

The second stats seems reasonable to me. The issue with the first one is that eval works on a per-event / per-row basis, so you'd have to merge each pair first before doing the calculation - the second stats already does that for you.

martin_mueller · ‎05-05-2014

That'll run. Let func be avg for example, then this will add a field c to every event that is the value of a plus the average of a calculated for each combination of x, y, and z.

landen99 · ‎05-05-2014

You are very sharp. Let's consider eventstats, which keeps "a" then.

eventstats func(a) AS b by x y z | eval c=a+b

martin_mueller · ‎05-05-2014

That's going to be trouble as well, there is no field called a after the stats.

landen99 · ‎05-05-2014

You are correct. The "sum" function requires the stat function. I was thinking about something more like this:

stats func(a) AS b by x y z | eval c=a+b

martin_mueller · ‎05-05-2014

That eval isn't going to run.

As for that stats, it will create a table with four columns: x y z b
You'll get one row for every combination of x, y, and z, and b will be func(a) for events matching that combination.

landen99 · ‎05-05-2014

So

stats func(a) AS b by x y z | eval s=sum(b)

in effect creates the variable b.x.y.z so that eval "s=sum(b)"is really in effect "s=sum(b.x.y.z)" which sums for each unique combination of x, y, and z so that "table s x y z" can show a different value of "s" for each x, y, and z combination. Is this correct?

martin_mueller · ‎05-02-2014

Yup, per row / per domain_root and _time.

landen99 · ‎05-02-2014

If I did the second stats approach, how would the eval look?

| stats count AS c0 count(eval(dns_type="Q")) AS cq count(eval(dns_type="R")) AS cr by domain_root _time | eval d=cq-cr

If so, then would this "d" be per domain and time bucket?

Single-Field Multi-Value Count Difference by Multiple other Fields

Splunk Custom Visualizations App End of Life

Introducing Splunk Enterprise 9.2

Adoption of RUM and APM at Splunk