In my IIS logs I am trying to extract the OS and browser versions from the cs_User_Agent field. I know the cs_User_Agent field is complex and confusing, but is there an easy way to extract just those two values from it?
Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+Trident/5.0)
Mozilla/4.0+(compatible;+MSIE+6.0;+MS+Web+Services+Client+Protocol+2.0.50727.3603)
These are two examples of the results I get from the cs_User_Agent field.
Thank you.
Sorry to plug my own stuff, but you might want to take a look at these:
These are add-ons that parse the user-agent string to get more value out of it.
Thanks,
Dave
They both work independently. The difference is in the fields that they produce, and the results. The TA-browscap produces some extra fields that might be useful for web developers, such as the browser's capabilities (JavaScript, ActiveX, etc.). I couldn't decide between the two, so I published both.
What is the difference between the two? If I download only the TA-browscap and not the TA-uas_parser will it still work?
Plug! These are great! And a lot less work than building it yourself! Sorry that I didn't check for an app before I posted an answer.
You can look around on the Internet for "IIS detect browser from user agent". You will get over a million hits, but I doubt that you will find an easy answer.
I would set up a lookup table that uses wildcards to determine the browser and OS based on user agent.
The table could look like this:
user_agent,browser,browser_version,OS
"Mozilla/4.0 (*; MSIE 6.0; Windows*",Internet Explorer,6.0,Windows
"Mozilla/5.0 (Windows;*Firefox/2.0.0.6",Firefox,2.0,Windows
"Mozilla/5.0 (Macintosh; *Chrome/5.0.375.38 Safari/533.4",Chrome,5.0,Mac
"Opera/9.01 (Windows *",Opera,9.01,Windows
"Opera/9.20 (Windows *",Opera,9.20,Windows
"Mozilla/4.0 (*MSIE 7.0; Windows*",Internet Explorer,7.0,Windows
transforms.conf
[yourlookupname]
match_type = WILDCARD(user_agent)
default_match = Not found
filename = browser_lookup.csv
max_matches = 1
min_matches = 1
case_sensitive_match = false
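Put as many lines in your table as you can. When you run your reports in Splunk, the lookup will return "Not found" for the browser and OS if the user-agent isn't in the table. When you find one of those, you can add it to the table. Because max_matches = 1, only the first matching row in file order is used, so list more specific patterns above broader ones.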
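If you want to try out patterns before putting them into the CSV, here is a rough Python sketch that imitates a wildcard lookup outside Splunk. It is not how Splunk implements lookups, just an approximation: `*` matches any run of characters, matching ignores case, the first matching row wins, and a default is returned when nothing matches. The inline table is a small hypothetical subset, and `wildcard_to_regex`, `load_rows`, and `lookup` are names made up for this example. It also assumes the logged value uses `+` where the real user-agent has spaces (as in the two IIS examples in the question), so it swaps them before matching; patterns written with spaces would presumably need the same treatment on the Splunk side.

```python
import csv
import io
import re

# Hypothetical subset of browser_lookup.csv, kept inline so the sketch is self-contained.
LOOKUP_CSV = '''user_agent,browser,browser_version,OS
"Mozilla/4.0 (*; MSIE 6.0; Windows*",Internet Explorer,6.0,Windows
"Mozilla/4.0 (*MSIE 7.0; Windows*",Internet Explorer,7.0,Windows
"Opera/9.01 (Windows *",Opera,9.01,Windows
'''

# Mirrors default_match: every output field gets the same default value.
NOT_FOUND = {"browser": "Not found", "browser_version": "Not found", "OS": "Not found"}


def wildcard_to_regex(pattern):
    """Build a case-insensitive regex in which '*' matches any run of characters."""
    literal_parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(literal_parts), re.IGNORECASE | re.DOTALL)


def load_rows(text):
    """Parse the CSV text and precompile each pattern, preserving file order."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        rows.append((wildcard_to_regex(row["user_agent"]), row))
    return rows


def lookup(raw_user_agent, rows):
    """Return the output fields of the first matching row, or the default."""
    # Assumption: the log writes spaces as '+', as in the IIS examples.
    user_agent = raw_user_agent.replace("+", " ")
    for regex, row in rows:
        if regex.fullmatch(user_agent):
            return {key: row[key] for key in NOT_FOUND}
    return dict(NOT_FOUND)


if __name__ == "__main__":
    table = load_rows(LOOKUP_CSV)
    samples = [
        "Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1)",
        "Mozilla/5.0+(compatible;+MSIE+9.0;+Windows+NT+6.1;+Trident/5.0)",
    ]
    for sample in samples:
        print(sample, "->", lookup(sample, table))
```

Running it over a sample of real user-agent values is a quick way to see which strings still fall through to "Not found" and need a new row.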
Even this solution is not perfect, as the authors of a browser can emit any user-agent string that they want. So multiple browsers can (and do) emit the same user-agent string.