Custom Apache log REGEX

kubowler99
New Member

Splunk noob REGEX question.

I'm attempting to customize the REGEX for the out-of-the-box Apache extraction. I've got it working for the most part, but I'm unable to get it to parse the referer, useragent, and cookie from the logs. All three end up in the 'other' field.

REGEX: ^[[nspaces:clientip]]\s++[[nspaces:dummyip1]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[nspaces:bytes]?[[all:other]]

Log Sample:

208.20.251.27 205.141.201.135 - - [03/Jan/2012:09:14:59 -0600] "POST /web/member/webflow.sf HTTP/1.1" 200 7726 "https://oururl.com/web/member/loginWebflow.sf?_flowExecutionKey=_c79820C50-FDCF-6ECA-7444-A686DB586C..." "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; .NET CLR 1.1.4322; .NET CLR 2.0.50727; .NET CLR 3.0.04506.30; .NET CLR 3.0.4506.2152; .NET CLR 3.5.30729; InfoPath.2)" "__utma=242832726.1910715443.1325603689.1325603689.1325603689.1; __utmb=242832726.1.10.1325603689; __utmc=242832726; __utmz=242832726.1325603689.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); mbox=check#true#1325603751|session#1325603690349-869173#1325605551; JSESSIONID=0000LoZUujL4bBUy5HkrMitepvr:16i215taj; BIGipServer_MemberFront_DMZ_WAS_prodmember=2798226893.17440.0000; __utmv=1.DIV%3D%3ASEG%3D; __utma=1.1819453781.1325603691.1325603691.1325603691.1; __utmb=1; __utmc=1; __utmz=1.1325603691.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none); remainingTime=null"

Any assistance would be greatly appreciated.


RubenOlsen
Path Finder

Given that your example data is in key=value form, you probably do not need to create field extractions for these values, as Splunk will extract them automatically.

However, if you can change the logging so that the values of the utm keys are enclosed in double quotes (e.g. utma="242832726.1910715443.1325603689.1325603689.1325603689.1"), then you really do not need any explicit field extractions, because the quoting ensures that the complete value is used.


lguinn2
Legend

When you set up the File or Directory input, under More Settings, use either the "choose from list" option or the "manual" option to set the source type. I suggest that you use

access_combined_wcookie

This is a pre-existing Splunk sourcetype, with field extractions for Apache data. I have used access_combined a bunch, and it definitely extracts the referer and useragent fields (along with clientip, status, uri, etc.)
There are three built-in choices for Apache: access_combined_wcookie, access_combined and access_common.
Even if none of these is an exact match, you can set the sourcetype to the best fit - and then just do the additional field extractions that you need.
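For the sample line in the question, those additional extractions could look roughly like the sketch below. It extends the regex from the question with three optional quoted groups placed before the catch-all, written as plain named capture groups. The stanza name custom-access-extractions and the sourcetype name my_apache are hypothetical. The sketch also assumes that the bytes token was meant to be [[nspaces:bytes]] and that the referer, user agent, and cookie values never contain an embedded double quote.

```ini
# transforms.conf (stanza name is hypothetical)
[custom-access-extractions]
# Same prefix as the regex in the question, then optional "referer" "useragent" "cookie".
# Assumption: none of the three quoted values contains a double quote.
REGEX = ^[[nspaces:clientip]]\s++[[nspaces:dummyip1]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[^"]*+)"(?:\s++"(?<useragent>[^"]*+)"(?:\s++"(?<cookie>[^"]*+)")?+)?+)?[[all:other]]

# props.conf (sourcetype name is hypothetical)
[my_apache]
REPORT-access = custom-access-extractions
```

Because the quoted groups are nested and optional, a line that stops after the byte count should still match, and anything the groups do not consume still lands in the other field.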
