How to extract and create new sourcetypes at index time from logs in syslog format?

dhavamanis
Builder

Can you please help us extract fields (as the access_log format does, with all of its fields) from logs in the pattern below? The logs arrive in syslog format through a syslog-ng forwarder. Currently no fields are extracted automatically after indexing the way they are for access_log, and the events are assigned the default syslog sourcetype. We are trying to create a new sourcetype and assign it so that the fields are populated automatically.

Sample Log :

varnishncsa bal-1234 1.48.1.2 - - [22/Aug/2014:15:04:45 +0000] "GET http://www.test.com/error HTTP/1.1" 404 30041 "http://www.test.com/test" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.78.2 (KHTML, like Gecko) Version/7.0.6 Safari/537.78.2" 0.090000868 miss pass request_id="v-a732c002-2a0d-11e4-88b7-12313d2d8c3b" "-"
varnishncsa bal-1234 1.2.7.8 - - [22/Aug/2014:15:04:45 +0000] "GET http://www.test.com/?page=2 HTTP/1.1" 302 26 "-" "Mozilla/5.0 (compatible; MSIE 8.0; WOW64; Windows NT 5.1; Trident/4.0)" 0.420004606 miss miss request_id="v-a6fd5804-2a0d-11e4-838f-12313d2d8c3b" "-"
varnishncsa bal-1234 6.7.5.22 - - [22/Aug/2014:15:04:45 +0000] "GET http://www.test.com/ HTTP/1.1" 200 38087 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6" 0.000000000 hit hit request_id="v-8225c840-2a0d-11e4-8d96-12313d2d8c3b" "-"
varnishncsa bal-1234 5.2.17.9 - - [22/Aug/2014:15:04:44 +0000] "GET http://www.test.com/ HTTP/1.1" 302 26 "-" "Mozilla/5.0 (compatible; MSIE 8.0; WOW64; Windows NT 5.1; Trident/4.0)" 1.140012741 miss miss request_id="v-a6758a00-2a0d-11e4-945a-12313d2d8c3b" "-"
varnishncsa bal-1234 3.2.71.1 - - [22/Aug/2014:15:04:43 +0000] "GET http://www.test.com/5800722 HTTP/1.1" 404 30221 "http://www.test.com/" "Akamai-SiteSnapshot/6.9.0.2" 1.720019102 miss miss request_id="v-a616eefa-2a0d-11e4-87a4-12313d2d8c3b" "-"

Current config (with dynamic index routing; we need to define a new sourcetype and assign it within the config so that the fields are populated automatically):

inputs.conf


[tcp-ssl:5140]
sourcetype = syslog

props.conf

[syslog]
TRANSFORMS-idx_routing = generic_idx_routing

transforms.conf


[generic_idx_routing]
REGEX = ^(\S+)\s*
FORMAT = $1
DEST_KEY = _MetaData:Index
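
To make the effect of this routing transform concrete, here is a minimal Python sketch of what the REGEX captures. It assumes the raw event starts exactly as in the sample lines above; `routed_index` is a hypothetical helper name, and the code only imitates the regex match, not Splunk's index-time pipeline.

```python
import re
from typing import Optional

# Same pattern as the REGEX line in [generic_idx_routing].
IDX_ROUTING = re.compile(r"^(\S+)\s*")

def routed_index(raw_event: str) -> Optional[str]:
    """Return the text that FORMAT = $1 would write to _MetaData:Index."""
    match = IDX_ROUTING.match(raw_event)
    return match.group(1) if match else None

# First sample event from the post, truncated after the bytes field.
sample = ('varnishncsa bal-1234 1.48.1.2 - - [22/Aug/2014:15:04:45 +0000] '
          '"GET http://www.test.com/error HTTP/1.1" 404 30041')

print(routed_index(sample))  # varnishncsa
```

So with this config the first whitespace-delimited token of each event becomes the index name, which means an index with that name has to exist.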

More Details :

Even if I force an existing sourcetype by uploading the data, the fields are not extracted, because each entry is prefixed with values like "varnishncsa bal-1234 " (please refer to the sample entries above). Please provide a matching REGEX for a new entry in transforms.conf.

props.conf :

[acquia_varnish_log]
NO_BINARY_CHECK = 1
REPORT-access = access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[
pulldown_type = 1

transforms.conf :

[access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled on purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]
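
One way to see why this stock pattern finds nothing on these events: it is anchored at the start of the line and expects the client IP first, but here two extra tokens come first. The Python sketch below is only an approximation. `STOCK_LIKE` and `strip_prefix` are hypothetical names, and the `\S+`, bracket, and quote patterns are rough stand-ins for Splunk's `nspaces`, `sbstring`, and `access-request` modular regexes, not their real definitions.

```python
import re

# Rough stand-in for the leading part of [access-extractions].
STOCK_LIKE = re.compile(
    r'^(?P<clientip>\S+)\s+(?P<ident>\S+)\s+(?P<user>\S+)\s+'
    r'\[(?P<req_time>[^\]]*)\]\s+"(?P<request>[^"]*)"\s+'
    r'(?P<status>\S+)\s+(?P<bytes>\S+)'
)

def strip_prefix(line: str, tokens: int = 2) -> str:
    """Drop the first `tokens` whitespace-delimited tokens from the line."""
    return line.split(None, tokens)[tokens]

# First sample event from the post, truncated after the bytes field.
sample = ('varnishncsa bal-1234 1.48.1.2 - - [22/Aug/2014:15:04:45 +0000] '
          '"GET http://www.test.com/error HTTP/1.1" 404 30041')

# With the prefix in place, the third token is an IP where "[" is expected.
print(STOCK_LIKE.match(sample))  # None

# Without the prefix, the same pattern lines up with the event.
match = STOCK_LIKE.match(strip_prefix(sample))
print(match.group("clientip"), match.group("status"), match.group("bytes"))
```

This suggests two options: remove the prefix before extraction, or extend the pattern with two leading captures for the prefix tokens.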
1 Solution

dhavamanis
Builder

We created the REGEX shown below, and it is working fine.

transforms.conf :

[acquia-access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: logfilename, nodename, clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled on purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:logfilename]]\s++[[nspaces:nodename]]\s++[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]]?[[all:other]]

[generic_idx_routing]
REGEX = ^varnishncsa
FORMAT = varnishncsa
DEST_KEY = _MetaData:Index

[generic_sourcetype_routing]
REGEX = .
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::acquia_access_combined

props.conf

[syslog]
TRANSFORMS-idx_routing = generic_idx_routing
TRANSFORMS-sourcetype_routing = generic_sourcetype_routing
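
As a rough check of the extraction pattern in this solution, the Python sketch below imitates it on one of the sample events from the question. `ACQUIA_LIKE` and `extract` are hypothetical names, and the sub-patterns are simplified assumptions standing in for Splunk's modular regexes (`nspaces`, `sbstring`, `access-request`, `all`), so it shows the shape of the match rather than Splunk's exact behaviour.

```python
import re
from typing import Dict, Optional

# Two leading captures for the syslog-ng prefix, then the access-log fields.
ACQUIA_LIKE = re.compile(
    r'^(?P<logfilename>\S+)\s+(?P<nodename>\S+)\s+'
    r'(?P<clientip>\S+)\s+(?P<ident>\S+)\s+(?P<user>\S+)\s+'
    r'\[(?P<req_time>[^\]]*)\]\s+'
    r'"(?P<method>\S+)\s+(?P<uri>\S+)\s+(?P<version>[^"]*)"\s+'
    r'(?P<status>\S+)\s+(?P<bytes>\S+)(?P<other>.*)'
)

def extract(raw_event: str) -> Optional[Dict[str, str]]:
    """Return the named captures for one event, or None if it does not match."""
    match = ACQUIA_LIKE.match(raw_event)
    return match.groupdict() if match else None

# Third sample event from the question, unmodified.
sample = ('varnishncsa bal-1234 6.7.5.22 - - [22/Aug/2014:15:04:45 +0000] '
          '"GET http://www.test.com/ HTTP/1.1" 200 38087 "-" '
          '"Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2) '
          'Gecko/20100115 Firefox/3.6" 0.000000000 hit hit '
          'request_id="v-8225c840-2a0d-11e4-8d96-12313d2d8c3b" "-"')

fields = extract(sample)
for name in ("logfilename", "nodename", "clientip", "method", "status", "bytes"):
    print(name, "=", fields[name])
```

The post does not show it, but a search-time extraction stanza like [acquia-access-extractions] is normally referenced from a REPORT-<name> line in the props.conf stanza of the sourcetype that should use it, here presumably acquia_access_combined.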


somesoni2
Revered Legend

See the props.conf and transforms.conf files in the $SPLUNK_HOME\etc\system\default directory. They contain the sourcetype definitions and the necessary field extractions for access_log type files. Reusing those settings in your new sourcetype may be what you need.
