I noticed this too. It looks like the chart command sums up the field differently???
Here’s what I’m talking about:
source=*file001* index=index001 sourcetype=delta "duration"=* earliest=@w latest=now
| bucket span=1w _time AS WeekStart
| chart sum("duration") AS Sum, count("duration") as Count, count BY WeekStart
| convert ctime(WeekStart)
source=*file001* index=index001 sourcetype=delta "duration"=* earliest=@w latest=now
| bucket span=1w _time AS WeekStart
| stats sum("duration") AS Sum, count("duration") as Count, count BY WeekStart
| convert ctime(WeekStart)
Thanks
I think I found what is wrong. Some duration values have commas such as 2,455 so not sure how the splunk functions treat numbers and strings. I am using a regex to fix the issue. will post here when I finish it.
you don't needregex, just use this
| convert num(duration)
I would use convert rmcomma command instead of rex. Like this
source=*file001* index=index001 sourcetype=delta "duration"=* earliest=@w latest=now
| bucket span=1w _time AS WeekStart | convert rmcomma(duration)
| stats sum("duration") AS Sum, count("duration") as Count, count BY WeekStart
| convert ctime(WeekStart)
I wanna add rmcomma as the answer for this thread. How do I do that? I want to give points to everyone who helped me.
I think your comments was the answer for your question. You can just vote up/like all the comments that helped you to resolving the issue.
What result do you get when you run this
source=*file001* index=index001 sourcetype=delta "duration"=* earliest=@w latest=now
| bucket span=1w _time AS WeekStart
| chart sum("duration") AS Sum, count("duration") as Count, count BY WeekStart
| convert ctime(WeekStart)
| append [search source=*file001* index=index001 sourcetype=delta "duration"=* earliest=@w latest=now
| bucket span=1w _time AS WeekStart
| stats sum("duration") AS Sum, count("duration") as Count, count BY WeekStart
| convert ctime(WeekStart) ]
First, when investigating things like this, put in an exact timeframe so that the moment you happen to run the queries cannot change the results. I'd bet the difference of 1 in the Count figure is due to the difference between start time of the jobs.
earliest="01/18/2017:02:00:00" latest="01/18/2017:02:02:00"
Second, it's looking like when you limit the query to the exact time, you'll end up with the chart results being exactly twice the stats results for the item Sum. I'll poke around and see if I can find an explanation.
In the meantime, please add a term to each query to get the average of the "duration" field. Also, please check whether there are any records with null values or multiple values in the duration field.
| checkmv = mvcount("duration")
| search checkmv !=1
nope it didnt return me any events with multi value in the 'duration' field. Yeah it is weird, there is a difference of 1 when counting 2253 vs 2254. But with CHART the sum up value doubles.
That makes sense (no mv on duration field) otherwise the count values would've been doubled as well.
If you used the mvcount!=1, then it also means no null records also (they would be zero).
Try removing the quotes around "duration" - here, run these
source=*file001* index=index001 sourcetype=delta "duration"=* earliest=@w latest=now
| eval myDuration1 = duration
| eval myDuration2 = "duration"
| bucket span=1w _time AS WeekStart
| chart
sum(myDuration1) AS Sum1, count(myDuration1) as Count1,
sum(myDuration2) AS Sum2, count(myDuration2) as Count2,
sum("duration") AS Sum3, count("duration") as Count3,
sum(duration) AS Sum4, count(duration) as Count4,
count as EventCount BY WeekStart
| convert ctime(WeekStart)
source=*file001* index=index001 sourcetype=delta "duration"=* earliest=@w latest=now
| eval myDuration1 = duration
| eval myDuration2 = "duration"
| bucket span=1w _time AS WeekStart
| stats
sum(myDuration1) AS Sum1, count(myDuration1) as Count1,
sum(myDuration2) AS Sum2, count(myDuration2) as Count2,
sum("duration") AS Sum3, count("duration") as Count3,
sum(duration) AS Sum4, count(duration) as Count4,
count as EventCount BY WeekStart
| convert ctime(WeekStart)
with any luck, 1 and 4 will be correct and 2 and 3 will match what you've gotten before.
I fixed the issue by removing the comma from the 'duration' field:
| rex mode=sed field="Ads viewed upto 25_ of total ad duration" "s/,//g"
Now both queries match. It looks like stats will treat a number like 3,455 as a string and not the number 3455. Once I removed the comma it was summing up correctly.
does that make sense? sorry I am new to Splunk.
Yes that's the way to go. (validated this behavior difference between stats and chart using following runwhere sample
| gentimes start=-1 | eval duration="2,345" | stats sum(duration) ----no result
|gentimes start=-1 | eval duration="2,345" | chart sum(duration) --- result is 2345
)