
Search-time field extraction doesn't work

cam343
Path Finder

Hello,
We are attempting to resolve a problem where data hasn't been assigned the correct source type.
We have attempted to resolve this by performing search time field extractions but nothing seems to work.

The sourcetype has been identified as: www_website_com_au_access_log-2
The source is: /var/log/httpd/www_website_com_au_access_log

In props.conf I have tried:

[source::/var/log/httpd/www_website_com_au_access_log]

rename=access-common

I have tried:
[source::/var/log/httpd/www_website_com_au_access_log]

sourcetype=access-common

I have tried:
[source::/var/log/httpd/www_website_com_au_access_log]

TRANSFORMS-fix_ae = fix_access_extractions

With the complementing transforms.conf

[fix_access_extractions]
# matches access-common or access-combined apache logging formats
# Extracts: clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled in purpose because that is the "official" spelling for "HTTP referer"

REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[nspaces:bytes]?[[all:other]]
FORMAT = sourcetype::access_common
DEST_KEY = MetaData:Sourcetype

Yet when I search on source=/var/log/httpd/www_website_com_au_access_log, no useful fields are returned.

Thanks in advance
Cam

SAMPLE DATA:

192.168.x.x (192.168.x.x) www.website.com - - [23/May/2013:17:05:44 +8000] "GET /images/external/website_logo.png HTTP/1.1" 304 - "-" "Mozilla/4.0 (compatible; MSIE 7.0; Windows NT 6.1; WOW64; Trident/5.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C; .NET4.0E; InfoPath.2; MSOffice 12)" 21832 TLSv1 AES128-SHA

Please see http://pastebin.com/zYBgyhhn for raw question


MuS
SplunkTrust

Hi cam343

You're mixing things up here: you are setting up an index-time field extraction, not a search-time one. This means only newly indexed events will have those fields, not the older events.

But maybe you should test your field extraction in the search app, one field at a time, and proceed until you get what you want, like:

 ...  | rex "(?<clientip>^(\d{3}\.){2}x\.x)"

This matches the first IP in your log data and creates a new field called clientip in your search results (a rex capture-group name cannot contain a colon, so the nspaces: prefix from transforms.conf is dropped here). This way you can build up the regex and then use it in transforms.conf to have the fields extracted at index time for any new event.
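
For events that are already indexed, a regex built this way could also be applied at search time with an EXTRACT- setting in props.conf. The following is only a sketch written against the single sample line in the question. The class name access_fields and the field names paren_ip and vhost are made up for illustration, and it assumes the stanza is deployed where search-time props are read (typically the search head).

```
# props.conf (search-time; sketch only, tested against nothing but the sample line)
[source::/var/log/httpd/www_website_com_au_access_log]
# paren_ip and vhost are hypothetical names for the two extra tokens
# that follow the client IP in the sample event.
EXTRACT-access_fields = ^(?<clientip>\S+)\s+\((?<paren_ip>[^)]+)\)\s+(?<vhost>\S+)\s+(?<ident>\S+)\s+(?<user>\S+)\s+\[(?<req_time>[^\]]+)\]\s+"(?<method>\S+)\s+(?<uri>\S+)\s+(?<version>[^"]+)"\s+(?<status>\d{3})\s+(?<bytes>\S+)
```

Because EXTRACT- is evaluated at search time, no re-indexing should be needed for the fields to show up on older events.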

As always, the docs are a good place to start:
http://docs.splunk.com/Documentation/Splunk/5.0.2/Knowledge/Addfieldsatsearchtime
http://docs.splunk.com/Documentation/Splunk/5.0.2/Data/Configureindex-timefieldextraction

hope this helps

cheers,
MuS


Ayn
Legend

I'm a bit confused. I don't see any statement at all telling Splunk to apply search-time field extractions. You've got a TRANSFORMS statement there, but that is index-time, not search-time.
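
To illustrate the difference, a search-time counterpart would use REPORT- rather than TRANSFORMS- in props.conf. The sketch below shows two alternatives. The first assumes the stock access-extractions transform from Splunk's default transforms.conf is available; the class name access is arbitrary. The second relies on rename, which the props.conf spec documents for sourcetype stanzas rather than source:: stanzas, and on the pretrained sourcetype name access_common (underscore, not hyphen). Both are assumptions to verify against your version. Note too that the sample event has extra tokens after the client IP, so the stock regex may not match it unchanged.

```
# props.conf (search-time; sketch only)

# Alternative 1: apply the stock Apache extractions to this source.
[source::/var/log/httpd/www_website_com_au_access_log]
REPORT-access = access-extractions

# Alternative 2: rename the mis-assigned sourcetype at search time,
# so the search-time settings of access_common apply instead.
[www_website_com_au_access_log-2]
rename = access_common
```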
