Splunk Search

Why is my props.conf and transforms.conf configuration not extracting fields from access_combined logs with a vhost?

lukas_loder
Communicator

Hi

I have a problem with my access_combined logs, which have a vhost at the beginning, like this:

www.domain.com:80 10.60.50.40 - - [04/Nov/2015:11:14:26 +0100] "GET /path/to/file/custom/flexslider.css HTTP/1.1" 200 1663 "http://www.domain.com/" "Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"

When I index these logs, the access_combined fields are not extracted.
I already tried to create a new transforms.conf and props.conf.

I'm indexing those logs with sourcetype=webserver_access_combined

Props.conf

[webserver_access_combined]
pulldown_type = true 
maxDist = 28
MAX_TIMESTAMP_LOOKAHEAD = 128
REPORT-access = vhost-access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[
category = Web
description = National Center for Supercomputing Applications (NCSA) combined format HTTP web server logs (can be generated by apache or other web servers)

Transforms.conf

[vhost-access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)  
# Note: referer is misspelled on purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:vhost]]\s++[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

I have those configurations on my indexer servers. I also see the logs with the correct sourcetype, but the field extraction doesn't work.

Does somebody have an idea why it doesn't work?

Thanks!


woodcock
Esteemed Legend

Your REGEX is crazy; try this one:

REGEX=^(?<vhost>\S+)\s+(?<clientip>\S+)\s++(?<ident>\S+)\s+(?<user>\S+)\s+\[(?<req_time>[^\]]+)\]\s+"(?<access_request>[^"]+)"\s+(?<status>\S+)\s+(?<bytes>\S+)\s+"(?<referrer>[^"]+)"\s+"(?<user_agent>[^"]+)"
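As a quick sanity check outside Splunk, the pattern above can be run against the sample event from the question. This is only a sketch: it is a Python port, so it assumes `(?P<name>...)` in place of `(?<name>...)` and plain `\s+` in place of the possessive `\s++`, and the names `SAMPLE`, `VHOST_ACCESS`, and `extract_fields` are made up for illustration.

```python
import re

# Sample event copied from the question.
SAMPLE = (
    'www.domain.com:80 10.60.50.40 - - [04/Nov/2015:11:14:26 +0100] '
    '"GET /path/to/file/custom/flexslider.css HTTP/1.1" 200 1663 '
    '"http://www.domain.com/" '
    '"Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko"'
)

# Python port of the REGEX above (assumption: (?P<name>...) group syntax
# and non-possessive \s+, since this is Python's re, not Splunk's PCRE).
VHOST_ACCESS = re.compile(
    r'^(?P<vhost>\S+)\s+(?P<clientip>\S+)\s+(?P<ident>\S+)\s+(?P<user>\S+)\s+'
    r'\[(?P<req_time>[^\]]+)\]\s+"(?P<access_request>[^"]+)"\s+'
    r'(?P<status>\S+)\s+(?P<bytes>\S+)\s+'
    r'"(?P<referrer>[^"]+)"\s+"(?P<user_agent>[^"]+)"'
)


def extract_fields(event):
    """Return a dict of the named groups, or None if the event does not match."""
    match = VHOST_ACCESS.match(event)
    return match.groupdict() if match else None


if __name__ == "__main__":
    fields = extract_fields(SAMPLE)
    if fields is None:
        print("no match")
    else:
        for name, value in fields.items():
            print(f"{name} = {value}")
```

Note that `[^"]+` needs at least one character between the quotes, so an event with an empty `""` referer or user agent would not match this pattern at all.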

hagjos43
Contributor

Did you build out your extractions and confirm them in something like regex101? I copied your example log and your extractions and they did not match. I made a start, and for the first few fields it would look more like this: ^(?<vhost>\S+):(?<clientport>\d+)\s(?<clientip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s

Also, you'll want your extractions to take place at search time in your props.conf, like this:

EXTRACT-blah = ^(?<vhost>\S+):(?<clientport>\d+)\s(?<clientip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s
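To see what this prefix extraction captures, here is a rough Python port tried against the start of the sample event. It is a sketch under assumptions: the pattern is anchored with `^` and the dots in the IP part are escaped as `\.`, group names use Python's `(?P<name>...)` syntax, and `STRICT`, `LOOSE`, and `split_prefix` are illustrative names only. The `clientport` name is kept from the post, although in the sample that number (80) follows the vhost.

```python
import re

# Anchored prefix pattern with literal dots in the IP address.
STRICT = re.compile(
    r'^(?P<vhost>\S+):(?P<clientport>\d+)\s'
    r'(?P<clientip>\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s'
)

# Same pattern with unescaped dots, where "." matches any character.
LOOSE = re.compile(
    r'^(?P<vhost>\S+):(?P<clientport>\d+)\s'
    r'(?P<clientip>\d{1,3}.\d{1,3}.\d{1,3}.\d{1,3})\s'
)


def split_prefix(event, pattern=STRICT):
    """Return (vhost, clientport, clientip), or None if the prefix does not match."""
    match = pattern.match(event)
    if match is None:
        return None
    return match.group("vhost"), match.group("clientport"), match.group("clientip")


if __name__ == "__main__":
    good = 'www.domain.com:80 10.60.50.40 - - [04/Nov/2015:11:14:26 +0100]'
    bad = 'www.domain.com:80 10x60x50x40 - - [04/Nov/2015:11:14:26 +0100]'
    print(split_prefix(good))         # strict pattern on a well-formed event
    print(split_prefix(bad))          # strict pattern rejects the malformed IP
    print(split_prefix(bad, LOOSE))   # loose pattern accepts it anyway
```

The comparison is the reason for escaping the dots: with a bare `.`, a string such as `10x60x50x40` is still accepted as a client IP.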

lukas_loder
Communicator

I just used the original that was in transforms.conf, like this:

REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

and tried to change this one... so this isn't the correct way?


hagjos43
Contributor

Based on what I'm seeing, that won't work. To see if your regex works, do something like this:

Your Search | rex "^(?<vhost>\S+)\s+(?<clientip>\S+)\s++(?<ident>\S+)\s+(?<user>\S+)\s+\[(?<req_time>[^\]]+)\]\s+\"(?<access_request>[^\"]+)\"\s+(?<status>\S+)\s+(?<bytes>\S+)\s+\"(?<referrer>[^\"]+)\"\s+\"(?<user_agent>[^\"]+)\""