I am using this search to produce a monthly report ranking the top pages in a section of a site. My date range always starts on Sep 1, 2016 and runs to the end of each month. What I am finding is that the count through Feb 28 is actually lower than the count through Jan 31 for some URLs. It seems like such a simple search. What other factor could be reducing the effective number of events over time?
source="/home/suchandso/access-logs/website.com" AND uri_path="/sectionpath*" | stats count by uri_path
Thanks!
Okay, here's a candidate. Please run the following two items and see which one matches your numbers for Sept only.
earliest="09/01/2016:00:00:00" latest="09/30/2016:00:00:00" source="/home/suchandso/access-logs/website.com" AND uri_path="/sectionpath*" | stats count by uri_path
earliest="09/01/2016:00:00:00" latest="10/01/2016:00:00:00" source="/home/suchandso/access-logs/website.com" AND uri_path="/sectionpath*" | stats count by uri_path
NEVER MIND THE ABOVE.
Here's a quick and useful way to look at your data. Try this, modifying the URLs to one that was consistent and two that were inconsistent.
earliest="09/01/2016:00:00:00" latest="10/01/2016:00:00:00"
source="/home/suchandso/access-logs/website.com" AND
(uri_path="/sectionpath/mytestA" OR uri_path="/sectionpath/mytestB" OR uri_path="/sectionpath/mytestC" )
| bin _time span=1month
| stats count as eventcount by _time uri_path
| chart sum(eventcount) as eventcount over _time by uri_path
| addtotals row=t col=t labelfield="Month" label="Total for Period" fieldname="Total for Month"
Giving you results that look somewhat like this:

Month              /sectionpath/mytestA  /sectionpath/mytestB  /sectionpath/mytestC  Total for Month
2016-09                              70                    31                    46              147
2016-10                              95                    49                   102              246
2016-11                              95                    48                   107              250
2016-12                             105                    45                   106              256
2017-01                              89                    51                   106              246
2017-02                              90                    40                    91              221
2017-03                              14                     6                    18               38
Total for Period                    558                   270                   576             1404
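For anyone who wants to replicate that bucketing-and-totals logic outside Splunk, here is a minimal Python sketch of the same pipeline (the events and paths below are made-up illustrations, not real data):

```python
from collections import defaultdict
from datetime import datetime

# Toy events: (date string, uri_path). These values are hypothetical.
events = [
    ("2016-09-03", "/sectionpath/mytestA"),
    ("2016-09-15", "/sectionpath/mytestB"),
    ("2016-10-02", "/sectionpath/mytestA"),
    ("2016-10-20", "/sectionpath/mytestA"),
]

# Equivalent of: bin _time span=1month | stats count by _time uri_path
counts = defaultdict(int)
for ts, path in events:
    month = datetime.strptime(ts, "%Y-%m-%d").strftime("%Y-%m")
    counts[(month, path)] += 1

# Equivalent of: chart sum(eventcount) over _time by uri_path | addtotals row=t col=t
months = sorted({m for m, _ in counts})
paths = sorted({p for _, p in counts})
table = {m: {p: counts.get((m, p), 0) for p in paths} for m in months}
for m in months:
    table[m]["Total for Month"] = sum(table[m][p] for p in paths)
table["Total for Period"] = {
    p: sum(table[m][p] for m in months) for p in paths + ["Total for Month"]
}
```

The point of the row and column totals is the same as in the SPL version: the grand total must equal both the sum of the monthly totals and the sum of the per-URL totals, which makes discrepancies jump out immediately.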
Thanks for the query logic, much faster than exporting one month at a time.
As to your two candidate queries, the second one is the exact query I ran for Sept only. What was your hypothesis in comparing the two?
Follow-up after a long time... that particular comparison was just trying to eliminate the possibility that some records on the last day of the month were being dropped because of their event time.
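For the record, the difference between those two candidate searches comes down to Splunk treating the time window as half-open: `earliest` is inclusive, `latest` is exclusive. A quick Python sketch of that boundary behavior:

```python
from datetime import datetime

def in_range(event_time, earliest, latest):
    # Splunk-style half-open window: earliest is inclusive, latest is exclusive.
    return earliest <= event_time < latest

# A hypothetical event at noon on the last day of September.
sep30_event = datetime(2016, 9, 30, 12, 0, 0)

# latest="09/30/2016:00:00:00" stops at the midnight that *begins* Sep 30,
# so the entire last day of the month falls outside the window...
assert not in_range(sep30_event, datetime(2016, 9, 1), datetime(2016, 9, 30))

# ...while latest="10/01/2016:00:00:00" covers all of September.
assert in_range(sep30_event, datetime(2016, 9, 1), datetime(2016, 10, 1))
```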
With regard to the overall issue, it seems like the subsearch limits could cause numbers for each individual URL to go either direction, since I don't believe there is a guarantee which events from which indexers will arrive first. Nonetheless, I don't see any way that the described symptoms could be generated based on the code and comments above. Sure wish he had posted what he found.
My guess, based on the fact that the OP disappeared shortly after we posted the chart ... over _time code, is that it enabled him to effectively find and stomp the bug, since it automatically provides a summary row and column for the individual data.
By the way, are you comparing the results that you saved from last month, or are you comparing the results that you get NOW for a search that ends on Jan 31? If records had been deleted in the meantime, a later search is obviously not going to find the missing records.
I am comparing both the results I saved from last month AND the results I get now for a search that ends on Jan 31. For some URLs, the Sep-Jan number is higher than the Sep-Feb number when I run both reports now. AND the Sep-Jan number when I ran it last month was higher than the Sep-Jan number when I run it now. See the comments I will post below; it gets more bizarre...
Okay, I'd limit the query to a single one of the URLs that differs and see if the behavior can be replicated at a more granular level. Most of the ways that Splunk inherently limits search results disappear once the volume of results gets under the throttle. Thus, if the problem goes away when you limit the query to a small selection, that tells you where to look for the glitch.
If I'm understanding what you're suggesting, then I would expect to see lower totals for larger aggregate time ranges. But in this case, I'm seeing the opposite: lower numbers when I manually total up smaller time ranges (e.g. 1 month increments) than when I use larger time ranges (e.g. 6 months). And just now I tried one URL and got a smaller number for Sep-Jan than when I pull that same time range for all URLs, which would seem to be a much bigger chunk for Splunk to process.
So could the glitch or limit you're suggesting be that some events are being counted more than once?
Depends on how you are using it. There are some places where Splunk has hidden and counter-intuitive limits on what it will return. Is that search in a panel? Are you retrieving the results through loadjob?
Do the same search from the search head, specifying these parameters, and send to background...
earliest="09/01/2016:00:00:00" latest="02/01/2017:00:00:00" source="/home/suchandso/access-logs/website.com" AND uri_path="/sectionpath*" | stats count by uri_path
earliest="09/01/2016:00:00:00" latest="03/01/2017:00:00:00" source="/home/suchandso/access-logs/website.com" AND uri_path="/sectionpath*" | stats count by uri_path
If the results don't look reasonable for the second one relative to the first, then find out what is falling out by breaking out Feb independently...
earliest="09/01/2016:00:00:00" latest="03/01/2017:00:00:00" source="/home/suchandso/access-logs/website.com" AND uri_path="/sectionpath*" | eval month=strftime(_time,"%m")| eval month=case(month==2,2,true(),0)| stats count by month uri_path
or break every month out independently and see if anything jumps out at you...
earliest="09/01/2016:00:00:00" latest="03/01/2017:00:00:00" source="/home/suchandso/access-logs/website.com" AND uri_path="/sectionpath*" | eval month=strftime(_time,"%m")| stats count by month uri_path
updated to put quotes around the time format string "%m"
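The month-breakdown step has a rough Python equivalent, shown below. Note that the format string has to be quoted in SPL for the same reason it is a string literal here; note also that "%m" alone folds the same month of different years together, which is harmless for a Sep-Mar range but would make "%Y-%m" the safer choice over longer spans. The event dates are made up for illustration:

```python
from datetime import datetime

# Mimics: eval month=strftime(_time, "%m") | stats count by month
events = [datetime(2016, 9, 5), datetime(2016, 10, 1), datetime(2017, 2, 14)]

counts = {}
for t in events:
    month = t.strftime("%m")  # month number as a zero-padded string
    counts[month] = counts.get(month, 0) + 1
```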
I am not running the search in a panel, I am entering it as a raw search and I save those searches as reports. To my knowledge I am not using loadjob.
Your search parameters eval month=strftime(_time,%m)| eval month=case(month==2,2,true(),0)| stats count by month uri_path
and eval month=strftime(_time,%m)| stats count by month uri_path
both threw errors for me.
So I ran the report for each month, limiting the date range to only that month, e.g. Sep 1 - Sep 30, Oct 1 - Oct 31, etc. Then I manually totaled each month of data for five selected URLs. For each URL, my manually totaled number for Sep-Feb exactly matches what I get when I run the report with the full Sep-Feb date range.
However, my manually totaled numbers for Sep-Jan match the full Sep-Jan date range reports exactly for only two of the five URLs. The manual totals are lower for the other three. And for Sep-Dec, the data for only one of the five URLs matches.
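That reconciliation is worth automating: if the single-month windows tile the full range exactly (half-open, back to back), the sum of the monthly counts must equal the full-range count, and any gap points at events being dropped, deleted, or double-counted somewhere. A toy Python check, borrowing the mytestA numbers from the example table earlier in the thread and assuming a matching full-range result:

```python
# Per-URL counts taken from single-month reports (mytestA, Sep-Jan).
monthly_counts = {
    "2016-09": 70,
    "2016-10": 95,
    "2016-11": 95,
    "2016-12": 105,
    "2017-01": 89,
}

# Hypothetical count reported by a single Sep 1 - Feb 1 search for the same URL.
full_range_count = 454

# If the month windows tile the range exactly, the two numbers must agree.
discrepancy = full_range_count - sum(monthly_counts.values())
assert discrepancy == 0
```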
I doubt this would cause the problem, but you don't need the AND.
Have you tried taking out the uri_path="/sectionpath*"
and checking? I also doubt that's it, but you never know. It seems simple enough...
You are correct: I didn't need the AND. I have not yet tried taking out the uri_path="/sectionpath*" because there would be millions of events and it would take way too long to run.