Match day and get the sum by day, also get the per...

known_user · ‎09-21-2017

My data looks like this, I've grouped it by a common field. I want to match the date_mday and get the sum of the events for that day.

commonField list(field1) list(date_mday) list(count)
abc f222 efg 20 10
abc f333 ccc 20 20
abc f222 efg 20 30
abc f334 ccc 20 40 -- sum of count for same date_mday - 10 + 20 + 30 + 40 = 100
abc f114 ddd 19 10
abc f113 ccd 19 9 -- sum of count for outliers for same date_mday - 10+9 = 19

def f222 efg 22 10
def f333 ccc 22 25 -- sum of count for same date_mday - 10+25+5 = 40
def f111 bbb 22 5
def f111 bbb 20 15

There are some outliers(in italic) in the data. Then, I want to get the percentage of the outlier vs the total sum.

I'm using the stats command for grouping the data running over a 30 days range, like this:

 search string here | stats list(field1),list(field2),list(date_mday),list(count) by commonField

lfedak_splunk · ‎09-21-2017

Hey @known_user, if @rich7177 solved your problem, please don't forget to accept an answer! You can upvote posts as well. (Karma points will be awarded for either action.) Happy Splunking!

Richfez · ‎09-21-2017

I'm not sure I understand the methodology you are trying to replicate with "outliers", but let me try a few slightly more general tips and see if those get you what you want.

First, this might get you a lot closer:

 search string here | stats count, list(field1),list(field2),list(count) by commonField, date_mday

Notice I added a "count" to the output side of your stats, and moved date_mday to the group by section. Now, I'm not sure I'm reading your data right up in the question, but I think your output should be a count per day per common field (with those remaining fields as a list, just like before).

commonfield, date_mday, list(f1), list(f2), count
abc, 19, efg..., ghi..., 18
abc, 20, efg..., ghi..., 14
xyz, 19, efg..., ghi..., 13
xyz, 20, efg..., ghi..., 15

Again, I'm making up what I think your data looks like. So for the 20th, you'd have 14+15, the 19th would be 18+13. Now, to get a total per day, you can use eventstats on the end of your existing search. Evenstats won't "split it up again" it'll just add stats to the whole pile. I'm busting up the search into rows to make it easier to read...

search string here 
| stats count, list(field1),list(field2),list(count) by commonField, date_mday
| eventstats count AS Daily_Count BY date_mday

In this case, eventstats has been told to do a count (and call it "Daily_Count") of the events each date_mday. Your output should now be something like ...

commonfield, date_mday, list(f1), list(f2), count, Daily_Count
abc, 19, efg..., ghi..., 18, 31
abc, 20, efg..., ghi..., 14, 29
xyz, 19, efg..., ghi..., 13, 31
xyz, 20, efg..., ghi..., 15, 29

Hopefully my math's right - all of the events that exist on the 19th will have a Daily_Count of 31. All the events on the 20th will have 29.

So, while I'm not sure if this really solves your problem or not, I hope it gives you the tools you need to solve it yourself.

If you have further difficulties or if you can provide a simpler, less complex example (or actual real live events!), please comment back!

(And also, if you DO resolve it yourself with these tools and tips, please paste the answer back here too!)

Happy Splunking!
Rich

Match day and get the sum by day, also get the percentage

.conf24 | Registration Open!

ICYMI - Check out the latest releases of Splunk Edge Processor

Introducing the 2024 SplunkTrust!