Splunk Search

Custom Apache log REGEX

kubowler99
New Member

Splunk noob REGEX question.

I'm attempting to customize the REGEX for the ootb Apache extraction. I've got it working for the most part, but I'm unable to get it to parse the referer, useragent, and cookie from the logs. They all get parsed as the 'other' field.

REGEX: ^[[nspaces:clientip]]\s++[[nspaces:dummyip1]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[nspaces:bytes]?[[all:other]]

Log Sample:

208.20.251.27 205.141.201.135 - - [03/Jan/2012:09:14:59 -0600] "POST /web/member/webflow.sf HTTP/1.1" 200 7726 "https://oururl.com/web/member/loginWebflow.sf?_flowExecutionKey=_c79820C50-FDCF-6ECA-7444-A686DB586C..." "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2)" "__utma=242832726.1910715443.1325603689.1325603689.1325603689.1; __utmb=242832726.1.10.1325603689; __utmc=242832726; __utmz=242832726.1325603689.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); mbox=check#true#1325603751|session#1325603690349-869173#1325605551; JSESSIONID=0000LoZUujL4bBUy5HkrMitepvr:16i215taj; BIGipServer_MemberFront_DMZ_WAS_prodmember=2798226893.17440.0000; __utmv=1.DIV%3D%3ASEG%3D; __utma=1.1819453781.1325603691.1325603691.1325603691.1; __utmb=1; __utmc=1; __utmz=1.1325603691.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); remainingTime=null"

Any assistance would be greatly appreciated.

Tags (2)
0 Karma

RubenOlsen
Path Finder

Given your example data is in the form of key=value, you probably do not need to create field extractions for these values as Splunk will do this automatically.

However, if you can amend the logging of the utm-keys to enclose the values with " (i.e. utma="242832726.1910715443.1325603689.1325603689.1325603689.1") - then you really do not need to create any explicit field extractions as this will ensure that the complete value is used.

0 Karma

lguinn2
Legend

When you set up the File or Directory input, under More Settings, using the "choose from list" option or the "manual" option to set the source type. I suggest that you use

access_combined_wcookie

This is a pre-existing Splunk sourcetype, with field extractions for Apache data. I have used access_combined a bunch, and it definitely extracts the referer and useragent fields (along with clientip, status, uri, etc.)
There are three built-in choices for Apache: access_combined_wcookie, access_combined and access_common.
Even if none of these is an exact match, you can set the sourcetype to the best fit - and then just do the additional field extractions that you need.

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...