Hey there,
I've been learning how to use the search features in Splunk and trying to find a way to get some user-agent metrics from all of our aggregated Tomcat log data. There are 3 key fields that I'm looking at to determine how many browsers are hitting different websites that we run:
site (this is the domain URL from the logs, consider the format to be blahblah.companyname.com)
tr_cookie (a tracking cookie with a unique id per session)
http_user_agent (the user agent we pull from each log entry)
What I need to do is generate a report that counts how many http_user_agents exist using the tr_cookie as a source (this cookie will never have two different http_user_agent values) and then break that down into statistics on a per 'site' basis. I have two searches so far:
index=tomcat_logs "GET /company/index.php " | stats dc(tr_cookie) as sessions by http_user_agent
The above gives me the overall breakdown of how many unique sessions can be attributed to different http_user_agents. So it will say that 100 people are using ie8. 50 people are using Firefox and etc, ex:
Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko 541
Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko 198
Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36 172
But what I really need is something like this:
somecompany.companyname.com:
Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko 300
Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko 100
Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36 75
someothercompany.companyname.com:
Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko 200
Mozilla/5.0 (Windows NT 6.1; Trident/7.0; rv:11.0) like Gecko 110
Mozilla/5.0 (X11; Linux i686) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/40.0.2214.94 Safari/537.36 63
The closest that I have gotten is this search, but it just lists the statistics line by line with the "site" line repeating over and over:
index=tomcat_logs "GET /company/index.php " | stats dc(SKP_TS) as sessions by http_user_agent, site
Example:
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322) XXXXX.companyname.com 1
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 1.1.4322; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729) XXXXX.companyname.com 1
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729) XXXXX.companyname.com 1
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 3.0.04506.30; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729) XXXXX.companyname.com 2
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; WOW64; Trident/7.0; .NET4.0C; .NET4.0E; InfoPath.3; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729) YYYYY.companyname.com 1
Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 10.0; WOW64; Trident/7.0; Touch; .NET4.0C; .NET4.0E; .NET CLR 2.0.50727; .NET CLR 3.0.30729; .NET CLR 3.5.30729; Tablet PC 2.0) ZZZZZ.companyname.com 1
Any idea how to make this happen?
This works.
| search .....
| stats dc(SKP_TS) as sessions by site, http_user_agent
| streamstats current=false last(site) as previousSite
| eval site=if(match(site,previousSite),"",site)
| fields site, http_user_agent, sessions
I used sideview's answer from https://answers.splunk.com/answers/25102/question-regarding-grouping-of-results-into-a-table.html
This works.
| search .....
| stats dc(SKP_TS) as sessions by site, http_user_agent
| streamstats current=false last(site) as previousSite
| eval site=if(match(site,previousSite),"",site)
| fields site, http_user_agent, sessions
I used sideview's answer from https://answers.splunk.com/answers/25102/question-regarding-grouping-of-results-into-a-table.html
That does indeed the give the effect I'm looking for.
Like this:
... | stats dc(SKP_TS) AS sessions BY site http_user_agent
Also close but it looks like this:
Site: User-Agent: Sessions:
blah.companyname.com Internet Explorer 8 3
blah.companyname.com Internet Explorer 9 2
blah.companyname.com Internet Explorer 10 5
junk.companyname.com Internet Explorer 8 2
junk.companyname.com Internet Explorer 9 6
junk.companyname.com Internet Explorer 10 5
Try this
... | stats values(http_user_agent) as UA dc(SKP_TS) as sessions by site
That's somewhat closer but what that returns is the total count of unique SKP_TS sessions for each site, along side each unique value of the user-agent. What I need is the unique count of SKP_TS sessions per user-agent per site.
So instead of:
Site: User-Agent: Sessions:
blah.companyname.com Internet Explorer 8 10
Internet Explorer 9
Internet Explorer 10
junk.companyname.com Internet Explorer 8 13
Internet Explorer 9
Internet Explorer 10
It should be:
Site: User-Agent: Sessions:
blah.companyname.com Internet Explorer 8 3
Internet Explorer 9 2
Internet Explorer 10 5
junk.companyname.com Internet Explorer 8 2
Internet Explorer 9 6
Internet Explorer 10 5
I know the syntax doesn't make sense but in my mind it's like this:
index=tomcat_logs "GET /company/index.php " | (stats dc(tr_cookie) as sessions by http_user_agent) by site
Essentially using by twice.