How to sum a column's values along multiple catego...

rey123 · ‎01-16-2019

I have the following data:

A B C Pkg Area Count

NP bcd D02 abc.d PP 1656
NP bcd D05 abc.d PP 870
NP bcd D01 abc.d PP 100
NP cde D05 lmn.o PP 50
NP cde D10 lmn.o PP 350
NP cde D07 lmn.o PP 200

I want to sum the Count column by A and B, but display the resulting rows with the value of Column C which contributed to the maximum Count in the above summation.

So, expected result:

A B C Pkg Area Count

NP bcd D02 abc.d PP 2626
NP cde D10 lmn.o PP 600

Tried the following:

| rex "java\.lang\.(?P[A-Z]+(NP))"
| rex field=_raw "(?\S+)\.[A-Z]\S+\((?<b>\w+)\)"
| search A=NP
...

| stats values(Pkg), values(Area), values(C) , count as Count by A, B, C
| eventstats sum(Count) as Count1 by A, B  
| search Area=PP 
| sort -Count1

Result:

A B C Pkg Area Count Count1

NP bcd D02 abc.d PP 1656 2626
NP bcd D05 abc.d PP 870 2626
NP bcd D01 abc.d PP 100 2626
NP cde D05 lmn.o PP 50 600
NP cde D10 lmn.o PP 350 600
NP cde D07 lmn.o PP 200 600

If I modify the query as follows:

| rex "java\.lang\.(?P[A-Z]+(NP))"
| rex field=_raw "(?\S+)\.[A-Z]\S+\((?<b>\w+)\)"
| search A=NP
...

| stats values(Pkg), values(Area), values(C) , count as Count by A, B, C
| eventstats sum(Count) as Count1 by A, B  first(Count1) as Top_Count by C
| search Area=PP 
| sort -Count1

..the Top_Count column shows up empty:

A B C Pkg Area Count Top_Count

NP bcd D02 abc.d PP 1656

NP bcd D05 abc.d PP 870

NP bcd D01 abc.d PP 100

NP cde D05 lmn.o PP 50

NP cde D10 lmn.o PP 350

NP cde D07 lmn.o PP 200

Removing the Count column at the end..

| rex "java\.lang\.(?P[A-Z]+(NP))"
| rex field=_raw "(?\S+)\.[A-Z]\S+\((?<b>\w+)\)"
| search A=NP
...

| stats values(Pkg), values(Area), values(C) , count as Count by A, B, C
| eventstats sum(Count) as Count1 by A, B  first(Count1) as Top_Count by C
| search Area=PP 
| sort -Count1
| fields - Count

..hides all the Count columns:

A B C Pkg Area Top_Count

NP bcd D02 abc.d PP

NP bcd D05 abc.d PP

NP bcd D01 abc.d PP

NP cde D05 lmn.o PP

NP cde D10 lmn.o PP

NP cde D07 lmn.o PP

I am not sure why all the Count columns get removed considering the Fields - command is applied only at the very end.

Any pointers? Thank you!

woodcock · ‎01-18-2019

Like this:

| makeresults
| eval raw="NP,bcd,D02,abc.d,PP,1656 NP,bcd,D05,abc.d,PP,870 NP,bcd,D01,abc.d,PP,100 NP,cde,D05,lmn.o,PP,50 NP,cde,D10,lmn.o,PP,350 NP,cde,D07,lmn.o,PP,200" 
| makemv raw 
| mvexpand raw 
| rename raw AS _raw 
| rex "^(?<A>[^,]+),(?<B>[^,]+),(?<C>[^,]+),(?<Pkg>[^,]+),(?<Area>[^,]+),(?<Count>[^,]+)$" 

| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"

| sort 0 - Count
| eventstats first(C) AS C BY A B Area
| stats sum(Count) AS Count BY A B C Pkg Area

mayurr98 · ‎01-17-2019

Hi Try this run anywhere search

| makeresults 
| eval data="NP bcd D02 abc.d PP 1656,NP bcd D05 abc.d PP 870,NP bcd D01 abc.d PP 100,NP cde D05 lmn.o PP 50,NP cde D10 lmn.o PP 350,NP cde D07 lmn.o PP 200" 
| makemv data delim="," 
| mvexpand data 
| rex field=data "(?<a>[^\s]+)\s(?<b>[^\s]+)\s(?<c>[^\s]+)\s(?<pkg>[^\s]+)\s(?<area>[^\s]+)\s(?<count>.*)" 
| table a b c pkg area count 
| eventstats sum(count) as sum max(count) as max by a b 
| where count==max 
| table a b c pkg area sum

put this at the end of your main search

| table a b c pkg area count 
    | eventstats sum(count) as sum max(count) as max by a b 
    | where count==max 
    | table a b c pkg area sum

let me know if this helps!

rey123 · ‎01-17-2019

Thank you very much, however (and I should have mentioned this earlier), the tabular data that I shared is not a fixed list. It is generated dynamically at runtime based on Splunk logs matching a lookup file (the part of the query I skipped and indicated with '...').

Hence, I am unfortunately not sure how the 'eval' function could work.

harishalipaka · ‎01-17-2019

hi @rey123

try like this

add in your query'' | stats sum(Count) as Count1 by A, B

Thanks
Harish

rey123 · ‎01-17-2019

Thank you, this seems to replace evalstats with stats. Tried, but didn't work, unfortunately (no data appeared).