I have noticed that the search results between table and stats can vary if one of the fields returns a null result. But this is something else, though the effect is similar. I am searching web logs, and there is a user field that sometimes has an empty pair of double-quotes, other times a hyphen, and the rest of the time a meaningful username. If I do a search using stats adding or removing user from the field list changes the number of records I get in return, even though I am making sure the field cannot be null. I'm greatly puzzled by this, and I am hoping someone has an insight about it. My current version of Splunk is 6.1.5
In the sample data I have, including the user field returns a count total of 208 in the Statistics tab, while the Events tab shows 547 events.
However, omitting the user field returns 374 in the Statistics page (Events tab is still 547).
Here is the search with the user field included:
index="web_index" status="40*"
| eval user=if(user="\"\"", "nullquotes", user)
| eval user=if(user="-", "nullhyphen", user)
| eval user=if(isnull(user), "nullfield", user)
| stats earliest(_time) as first, latest(_time) as last count as Occurrences by clientip user status method uri
| eval first=strftime(first,"%m/%d/%Y %H:%M:%S")
| eval last=strftime(last,"%m/%d/%Y %H:%M:%S")
| search Occurrences > 15
| sort -Occurrences
| addtotals Occurrences col=t row=f
This returns the following:
clientip user status method uri first last Occurrences
128.000.000.113 nullquotes 401 GET /webcontent/birt/pages/common/blank.html 01/09/2017 14:44:55 01/09/2017 15:44:26 150
128.000.000.113 joeblow 401 GET /webcontent/birt/pages/common/blank.html 01/09/2017 14:44:54 01/09/2017 14:44:55 42
128.000.000.113 janedoe 401 GET /webcontent/birt/pages/common/blank.html 01/09/2017 15:39:44 01/09/2017 15:39:45 16
208
But then if I drop the user field I get this back:
clientip status method uri first last Occurrences
128.000.000.113 401 GET /webcontent/birt/pages/common/blank.html 01/09/2017 14:44:54 01/09/2017 15:44:26 208
128.000.000.113 401 GET /webcontent/birt/ajax/utility/BirtUtility.js 01/09/2017 14:43:37 01/09/2017 15:44:25 20
128.000.000.113 401 GET /webcontent/birt/ajax/utility/Debug.js 01/09/2017 14:43:35 01/09/2017 15:44:10 18
128.000.000.113 401 GET /webcontent/birt/ajax/core/BirtDndManager.js 01/09/2017 14:43:36 01/09/2017 15:44:18 16
128.000.000.113 401 GET /webcontent/birt/ajax/core/BirtEvent.js 01/09/2017 14:43:37 01/09/2017 15:44:19 16
128.000.000.113 401 GET /webcontent/birt/ajax/core/BirtEventDispatcher.js 01/09/2017 14:43:37 01/09/2017 15:44:19 16
128.000.000.113 401 GET /webcontent/birt/ajax/core/Mask.js 01/09/2017 14:43:36 01/09/2017 15:44:10 16
128.000.000.113 401 GET /webcontent/birt/ajax/lib/prototype.js 01/09/2017 14:43:35 01/09/2017 15:44:10 16
128.000.000.113 401 GET /webcontent/birt/ajax/utility/Constants.js 01/09/2017 14:43:36 01/09/2017 15:44:19 16
128.000.000.113 401 GET /webcontent/birt/styles/style.css 01/09/2017 14:43:35 01/09/2017 15:44:10 16
128.000.000.113 401 GET /webcontent/triStyles 01/09/2017 14:43:35 01/09/2017 15:44:10 16
374
That last number in each search result is the total of the Occurrences column -- 208 in the first search, and 374 in the second.
Can you change the if conditions on user and include a case for empty user field using len(trim(user))=0 i.e. user=
| eval user=case(user=="\"\"", "nullquotes", user=="-", "nullhyphen", isnull(user), "nullfield",len(trim(user))=0,"nullempty",1=1, user)
Also try running the query in verbose mode with user!=* as your base search and see what top 10 stats shows up in the interesting fields.
Or
Alternatively when you take out user from split by field list can you add values(user) as statistical aggregate and see if you get empty rows or any spooky values not handled in the case statement i.e. 1=1 condition.
PS: I tested with makeresults and user="" hits nullempty condition
| makeresults
| eval user=""
| eval user=case(user=="\"\"", "nullquotes", user=="-", "nullhyphen", isnull(user), "nullfield",len(trim(user))=0,"nullempty",1=1, user)
| stats earliest(_time) as first, latest(_time) as last count as Occurrences by user
| eval first=strftime(first,"%m/%d/%Y %H:%M:%S")
| eval last=strftime(last,"%m/%d/%Y %H:%M:%S")
| sort - Occurrences
| table first last Occurrences user
| addtotals Occurrences col=t row=f
To find JUST the records with user values you have not accounted for, use the | set diff generating command.
http://docs.splunk.com/Documentation/Splunk/6.5.1/SearchReference/Set
It's going to looks something like
| set diff [my first search that contains all users] [my second search that contains what I have thought of]
What do you mean, "what I have thought of" ?
Right now, "all users" is the search where you are not checking the user field, and "which I have thought of" is the search that you posted with limitations on the user field. (null, quotes, etc). That second one is what I meant by "what you have thought of".
If you use the diff command with those, you will be able to look at the individual records which are falling out.
Hmmm. There's another simple way. You could also do this -
index="web_index" status="40*"
| eval myuser = user.""
| stats count by myuser
| sort -count
That will get you a count of all users of any kind, and you can review to see where your guys are getting lost.
By the way, be sure to do your validations without the line
| search Occurrences > 15
or you'll ALWAYS be missing stuff.
One last thing. You might try this as well.
| eval user=if(user="", "nullempty", user)
I was already doing | eval user=if(isnull(user), "nullfield", user)
and I have also tried | eval user=coalesce(user,"nullfield")
as well as your suggestioni of | eval user=if(user="", "nullempty", user)
and of | eval myuser = user.""
, substituting myuser for user in the stats command. None of these changed the behavior I've described above. Removing the check for > 15 does, of course, return more records, but the behavior with user included and with user removed is the same -- the record count changes. Darndest thing.
Try doing a |user="" |stats count to see if that double quote you're talking about is where the mismatch is. Also run a search specifically looking for null values and empty quotes in your user field and check if there are nulls anywhere else in those events.
I assume you meant | eval user=""
🙂 I just tried that with user in the list of stats columns, and the count went back to 374. Of course, the user column is empty in the search results. So what value could be in one (or more) of the records for the field user that breaks the stats command results? In any case, that's pretty interesting.