How to extract and create new sourcetypes at index time from logs in syslog format?

dhavamanis
Builder

Can you please help us extract fields at index time, as happens for the access_log format, from logs matching the pattern below? The logs arrive in syslog format through a syslog-ng forwarder. Currently they are assigned the default syslog sourcetype, and none of the access_log fields are extracted automatically after indexing. We are trying to create a new sourcetype and assign it so that the fields are populated automatically.

Sample Log :

varnishncsa bal-1234 1.48.1.2 - - [22/Aug/2014:15:04:45 +0000] "GET http://www.test.com/error HTTP/1.1" 404 30041 "http://www.test.com/test" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_4) AppleWebKit/537.78.2 (KHTML, like Gecko) Version/7.0.6 Safari/537.78.2" 0.090000868 miss pass request_id="v-a732c002-2a0d-11e4-88b7-12313d2d8c3b" "-"
varnishncsa bal-1234 1.2.7.8 - - [22/Aug/2014:15:04:45 +0000] "GET http://www.test.com/?page=2 HTTP/1.1" 302 26 "-" "Mozilla/5.0 (compatible; MSIE 8.0; WOW64; Windows NT 5.1; Trident/4.0)" 0.420004606 miss miss request_id="v-a6fd5804-2a0d-11e4-838f-12313d2d8c3b" "-"
varnishncsa bal-1234 6.7.5.22 - - [22/Aug/2014:15:04:45 +0000] "GET http://www.test.com/ HTTP/1.1" 200 38087 "-" "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.2) Gecko/20100115 Firefox/3.6" 0.000000000 hit hit request_id="v-8225c840-2a0d-11e4-8d96-12313d2d8c3b" "-"
varnishncsa bal-1234 5.2.17.9 - - [22/Aug/2014:15:04:44 +0000] "GET http://www.test.com/ HTTP/1.1" 302 26 "-" "Mozilla/5.0 (compatible; MSIE 8.0; WOW64; Windows NT 5.1; Trident/4.0)" 1.140012741 miss miss request_id="v-a6758a00-2a0d-11e4-945a-12313d2d8c3b" "-"
varnishncsa bal-1234 3.2.71.1 - - [22/Aug/2014:15:04:43 +0000] "GET http://www.test.com/5800722 HTTP/1.1" 404 30221 "http://www.test.com/" "Akamai-SiteSnapshot/6.9.0.2" 1.720019102 miss miss request_id="v-a616eefa-2a0d-11e4-87a4-12313d2d8c3b" "-"

Current config (with dynamic index routing; we need to define a new sourcetype and assign it within this config so that the fields are populated automatically):

inputs.conf


[tcp-ssl:5140]
sourcetype = syslog

props.conf

[syslog]
TRANSFORMS-idx_routing = generic_idx_routing

transforms.conf


[generic_idx_routing]
REGEX = ^(\S+)\s*
FORMAT = $1
DEST_KEY = _MetaData:Index
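
For the sample events, the capture group in generic_idx_routing yields varnishncsa as the index name, so an index with that name has to exist on the indexer. Events routed to an undefined index are typically dropped, or sent to lastChanceIndex if one is configured. A minimal sketch of a matching indexes.conf stanza follows; the paths assume the default $SPLUNK_DB layout and are not taken from the poster's environment.

```ini
# indexes.conf (sketch; paths assume the default $SPLUNK_DB layout)
[varnishncsa]
homePath   = $SPLUNK_DB/varnishncsa/db
coldPath   = $SPLUNK_DB/varnishncsa/colddb
thawedPath = $SPLUNK_DB/varnishncsa/thaweddb
```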

More Details :

Even if I force an existing sourcetype by uploading the data, the fields are not extracted, because every entry is prefixed with values like "varnishncsa bal-1234 " (see the sample entries above). Please provide a matching REGEX for the new entry in transforms.conf.

props.conf :

[acquia_varnish_log]
NO_BINARY_CHECK = 1
REPORT-access = access-extractions
SHOULD_LINEMERGE = False
TIME_PREFIX = \[
pulldown_type = 1

transforms.conf :

[access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled on purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]
1 Solution

dhavamanis
Builder

We created the REGEX below and it is working fine.

transforms.conf :

[acquia-access-extractions]
# matches access-common or access-combined apache logging formats
# Extracts: logfilename, nodename, clientip, clientport, ident, user, req_time, method, uri, root, file, uri_domain, uri_query, version, status, bytes, referer_url, referer_domain, referer_proto, useragent, cookie, other (remaining chars)
# Note: referer is misspelled on purpose because that is the "official" spelling for "HTTP referer"
REGEX = ^[[nspaces:logfilename]]\s++[[nspaces:nodename]]\s++[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

[generic_idx_routing]
REGEX = ^varnishncsa
FORMAT = varnishncsa
DEST_KEY = _MetaData:Index

[generic_sourcetype_routing]
REGEX = .
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::acquia_access_combined
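
Note that REGEX = . matches every event that enters through the [syslog] stanza, so any non-varnish syslog traffic on the same input would also be relabelled. If that is a concern, one possible variant (an assumption, not part of the accepted answer) anchors the transform on the varnishncsa prefix instead:

```ini
# transforms.conf (hypothetical variant of the stanza above)
[generic_sourcetype_routing]
# Only rewrite the sourcetype for events that start with the varnishncsa tag
REGEX = ^varnishncsa\s
DEST_KEY = MetaData:Sourcetype
FORMAT = sourcetype::acquia_access_combined
```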

props.conf

[syslog]
TRANSFORMS-idx_routing = generic_idx_routing
TRANSFORMS-sourcetype_routing = generic_sourcetype_routing
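
The configuration above assigns the sourcetype acquia_access_combined but does not show the stanza that ties the field extraction to it. A minimal sketch of what that search-time stanza might look like, assuming the extraction is attached with REPORT- in the same way as in the [acquia_varnish_log] stanza from the question:

```ini
# props.conf (sketch; stanza name taken from the FORMAT line of generic_sourcetype_routing)
[acquia_access_combined]
# Search-time field extraction using the transform defined in transforms.conf
REPORT-access = acquia-access-extractions
```

Because the sourcetype is rewritten during parsing, index-time settings such as TIME_PREFIX or SHOULD_LINEMERGE placed under [acquia_access_combined] would generally not take effect for these events; they would still belong under [syslog]. The REPORT- line is applied at search time, so it needs to be present on the search head.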

somesoni2
SplunkTrust

See the props.conf and transforms.conf files in the $SPLUNK_HOME\etc\system\default directory. They contain the sourcetype definitions and the field extractions needed for access_log-type files. Reusing those settings in your new sourcetype may be what you need.
