Hello
I am trying to extract the browser information from the raw data below and haven't been able to. Can anyone please explain how to get it? I haven't yet managed to write complex regular expressions successfully.
2012-11-26 19:41:42 10.64.182.218 GET /_js/mbox.js - 80 - 10.64.182.224 Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0;+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+InfoPath.2;+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+MS-RTC+LM+8;+.NET4.0C;+.NET4.0E)
Regards
theou
I couldn't find a definitive list of permissible characters for user agent strings. So, as long as all log entries have the same format and the user agent is the last field, you can try this regex:
\S*$
That just matches the run of non-space characters at the end of the log entry. Since the fields of the log entry are delimited by spaces (spaces inside the user agent are logged as "+"), you should be good to go with that. Otherwise you can try something like:
Mozilla[\.\d\w:;+/()-]*
Which is "Mozilla" followed by all the characters I found in example user agent strings. You could also try:
Mozilla[^\s]*
Which just means anything not a space following "Mozilla".
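If you want to check the three patterns above outside Splunk, here is a minimal Python sketch run against the sample entry from the question. The variable names (line, patterns, results) are made up for illustration, and Python's re module is an assumption; Splunk's own regex engine may differ in small details.

```python
import re

# Sample log entry copied from the question.
line = (
    "2012-11-26 19:41:42 10.64.182.218 GET /_js/mbox.js - 80 - 10.64.182.224 "
    "Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0;"
    "+.NET+CLR+1.1.4322;+.NET+CLR+2.0.50727;+InfoPath.2;"
    "+.NET+CLR+3.0.4506.2152;+.NET+CLR+3.5.30729;+MS-RTC+LM+8;"
    "+.NET4.0C;+.NET4.0E)"
)

# The three patterns suggested above (labels are hypothetical).
patterns = {
    "last_field": r"\S*$",                     # last space-delimited field
    "char_class": r"Mozilla[\.\d\w:;+/()-]*",  # "Mozilla" plus an explicit character set
    "non_space": r"Mozilla[^\s]*",             # "Mozilla" plus anything that is not whitespace
}

# re.search returns the first (leftmost) match for each pattern.
results = {name: re.search(p, line).group(0) for name, p in patterns.items()}

for name, user_agent in results.items():
    print(name, "->", user_agent)
```

On this entry all three patterns capture the same text, the whole user agent field. They would diverge on entries where the user agent is not the last field or does not start with "Mozilla".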
Regex is complicated but powerful; it's worth learning.
This app could very well be exactly what you're looking for. http://splunk-base.splunk.com/apps/48017/ta-uas_parser
Sorry, the task of making sense out of user agent strings is ridiculously complex, because there's simply no universal standard for how they're formatted. The web analytics app might have some inbuilt support for this.
Oh, I get what you meant now. Any idea on how to approach this?
That's my point - if you just catch the initial "Mozilla" you won't be able to differentiate between browsers at all. Both Opera and Internet Explorer commonly use "Mozilla" at the beginning of their user-agent string.
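To illustrate the point, here is a rough Python sketch that looks past the leading "Mozilla" for tokens such as "MSIE" or "Opera" and then counts the results. The rule list, the browser_family helper, and the sample user agents are all hypothetical and written for illustration; real user agent strings vary far more than this, so treat it as a starting point rather than a reliable parser.

```python
import re
from collections import Counter

# Hypothetical token rules, checked in order. Order matters: some Opera
# user agents also contain "MSIE", and Chrome user agents contain "Safari/".
RULES = [
    ("Opera", re.compile(r"Opera|OPR/")),
    ("Internet Explorer", re.compile(r"MSIE|Trident/")),
    ("Chrome", re.compile(r"Chrome/")),
    ("Safari", re.compile(r"Safari/")),
    ("Firefox", re.compile(r"Firefox/")),
]

def browser_family(user_agent):
    """Return the first rule name whose token appears in the user agent."""
    for name, pattern in RULES:
        if pattern.search(user_agent):
            return name
    return "Other"

# Illustrative user agents in the "+"-for-space form used by the log.
samples = [
    "Mozilla/4.0+(compatible;+MSIE+8.0;+Windows+NT+5.1;+Trident/4.0)",
    "Opera/9.80+(Windows+NT+5.1;+U;+en)+Presto/2.10.229+Version/11.62",
    "Mozilla/5.0+(Windows+NT+6.1;+rv:16.0)+Gecko/20100101+Firefox/16.0",
]

# Tally how often each browser family appears.
counts = Counter(browser_family(ua) for ua in samples)
for name, total in counts.most_common():
    print(name, total)
```

A dedicated parser, such as the app linked above, will cover many more cases than a handful of token rules.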
That's fine. In the whole raw data there are a few entries from Opera and Internet Explorer too. I just need to make a table showing which browsers were used the most.
You do know that pretty much all browsers use "Mozilla" in their user-agent string? http://en.wikipedia.org/wiki/User_agent#Format
Nope. Just the browser info. In this data that's only Mozilla.
Which info do you want? The whole user-agent string?