I have the following data:
NP bcd D02 abc.d PP 1656
NP bcd D05 abc.d PP 870
NP bcd D01 abc.d PP 100
NP cde D05 lmn.o PP 50
NP cde D10 lmn.o PP 350
NP cde D07 lmn.o PP 200
I want to sum the Count column by A and B, but display the resulting rows with the value of Column C which contributed to the maximum Count in the above summation.
So, expected result:
NP bcd D02 abc.d PP 2626
NP cde D10 lmn.o PP 600
Tried the following:
| rex "java\.lang\.(?P[A-Z]+(NP))"
| rex field=_raw "(?\S+)\.[A-Z]\S+\((?<b>\w+)\)"
| search A=NP
...
| stats values(Pkg), values(Area), values(C) , count as Count by A, B, C
| eventstats sum(Count) as Count1 by A, B
| search Area=PP
| sort -Count1
Result:
NP bcd D02 abc.d PP 1656 2626
NP bcd D05 abc.d PP 870 2626
NP bcd D01 abc.d PP 100 2626
NP cde D05 lmn.o PP 50 600
NP cde D10 lmn.o PP 350 600
NP cde D07 lmn.o PP 200 600
If I modify the query as follows:
| rex "java\.lang\.(?P[A-Z]+(NP))"
| rex field=_raw "(?\S+)\.[A-Z]\S+\((?<b>\w+)\)"
| search A=NP
...
| stats values(Pkg), values(Area), values(C) , count as Count by A, B, C
| eventstats sum(Count) as Count1 by A, B first(Count1) as Top_Count by C
| search Area=PP
| sort -Count1
..the Top_Count column shows up empty:
NP bcd D02 abc.d PP 1656
NP bcd D05 abc.d PP 870
NP bcd D01 abc.d PP 100
NP cde D05 lmn.o PP 50
NP cde D10 lmn.o PP 350
NP cde D07 lmn.o PP 200
Removing the Count column at the end..
| rex "java\.lang\.(?P[A-Z]+(NP))"
| rex field=_raw "(?\S+)\.[A-Z]\S+\((?<b>\w+)\)"
| search A=NP
...
| stats values(Pkg), values(Area), values(C) , count as Count by A, B, C
| eventstats sum(Count) as Count1 by A, B first(Count1) as Top_Count by C
| search Area=PP
| sort -Count1
| fields - Count
..hides all the Count columns:
NP bcd D02 abc.d PP
NP bcd D05 abc.d PP
NP bcd D01 abc.d PP
NP cde D05 lmn.o PP
NP cde D10 lmn.o PP
NP cde D07 lmn.o PP
I am not sure why all the Count columns get removed considering the Fields - command is applied only at the very end.
Any pointers? Thank you!
Like this:
| makeresults
| eval raw="NP,bcd,D02,abc.d,PP,1656 NP,bcd,D05,abc.d,PP,870 NP,bcd,D01,abc.d,PP,100 NP,cde,D05,lmn.o,PP,50 NP,cde,D10,lmn.o,PP,350 NP,cde,D07,lmn.o,PP,200"
| makemv raw
| mvexpand raw
| rename raw AS _raw
| rex "^(?<A>[^,]+),(?<B>[^,]+),(?<C>[^,]+),(?<Pkg>[^,]+),(?<Area>[^,]+),(?<Count>[^,]+)$"
| rename COMMENT AS "Everything above generates sample event data; everything below is your solution"
| sort 0 - Count
| eventstats first(C) AS C BY A B Area
| stats sum(Count) AS Count BY A B C Pkg Area
Hi Try this run anywhere search
| makeresults
| eval data="NP bcd D02 abc.d PP 1656,NP bcd D05 abc.d PP 870,NP bcd D01 abc.d PP 100,NP cde D05 lmn.o PP 50,NP cde D10 lmn.o PP 350,NP cde D07 lmn.o PP 200"
| makemv data delim=","
| mvexpand data
| rex field=data "(?<a>[^\s]+)\s(?<b>[^\s]+)\s(?<c>[^\s]+)\s(?<pkg>[^\s]+)\s(?<area>[^\s]+)\s(?<count>.*)"
| table a b c pkg area count
| eventstats sum(count) as sum max(count) as max by a b
| where count==max
| table a b c pkg area sum
put this at the end of your main search
| table a b c pkg area count
| eventstats sum(count) as sum max(count) as max by a b
| where count==max
| table a b c pkg area sum
let me know if this helps!
Thank you very much, however (and I should have mentioned this earlier), the tabular data that I shared is not a fixed list. It is generated dynamically at runtime based on Splunk logs matching a lookup file (the part of the query I skipped and indicated with '...').
Hence, I am unfortunately not sure how the 'eval' function could work.
hi @rey123
try like this
add in your query'' | stats sum(Count) as Count1 by A, B
Thank you, this seems to replace evalstats with stats. Tried, but didn't work, unfortunately (no data appeared).