How to write the regex for transforms.conf to extract fields and assign the proper sourcetype for my sample log format?

dhavamanis
Builder

I need your help.

We have logs in the format below and need to assign a sourcetype and extract the fields. Can you please provide a working regex to include in transforms.conf?

2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1

fields:

timestamp
elb
client
backend
request_processing_time
backend_processing_time
response_processing_time
elb_status_code
backend_status_code
received_bytes
sent_bytes
request
user_agent
ssl_cipher
ssl_protocol

I have tried this, but somehow it's not working for me:

transforms.conf:

[s3-access-extractions]
REGEX = ^[[nspaces:req_time]]\s++[[nspaces:elb]]\s++[[nspaces:client]]\s++[[sbstring:backend]]\s++[[nspaces:request_processing_time]]\s++[[nspaces:backend_processing_time]]\s++[[nspaces:response_processing_time]]\s++[[nspaces:elb_status_code]]\s++[[nspaces:backend_status_code]]\s++[[nspaces:received_bytes]]\s++[[nspaces:sent_bytes]]\s++[[access-request]](?:\s++[[qstring:useragent]]\s++[[nspaces:ssl_cipher]]\s++[[nspaces:ssl_protocol]]

props.conf

[s3_access_combined]
REPORT-access = s3-access-extractions
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6NZ
EVAL-date_hour = strftime(_time,"%H")
EVAL-date_mday = strftime(_time,"%d")
EVAL-date_minute = strftime(_time,"%M")
EVAL-date_month = strftime(_time,"%m")
EVAL-date_second = strftime(_time,"%S")
EVAL-date_wday = strftime(_time,"%A")
EVAL-date_year = strftime(_time,"%Y")
category = Custom
pulldown_type = true

[rule::s3_access_combined]
sourcetype = s3_access_combined
MORE_THAN_75 = ^\S+ \S+ \S+ \S* ?\[[^\]]+\] "[^"]*" \S+ \S+ \S+ "[^"]*"$

woodcock
Esteemed Legend

Forget transforms.conf for now and try this:

props.conf

[s3_access_combined]
EXTRACT-s3-access-extractions = ^(?<req_time>[\S]+)\s+(?<elb>[\S]+)\s+(?<client>[\S]+)\s+(?<backend>[\S]+)\s+(?<request_processing_time>[\S]+)\s+(?<backend_processing_time>[\S]+)\s+(?<response_processing_time>[\S]+)\s+(?<elb_status_code>[\S]+)\s+(?<backend_status_code>[\S]+)\s+(?<received_bytes>[\S]+)\s+(?<sent_bytes>[\S]+)\s+"(?<access_request>[^"]+)"\s+"(?<useragent>[^"]+)"\s+(?<ssl_cipher>[\S]+)\s+(?<ssl_protocol>[\S]+)
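
On the sourcetype half of the question: the extraction above only applies to events that already carry the s3_access_combined sourcetype, and the MORE_THAN_75 pattern in the question expects a bracketed `[...]` timestamp after the first few tokens, which this log format does not have, so that rule cannot match the sample line. Below is a rough sketch of a rule pattern shaped like the sample event. It assumes every line has the same 15 space-separated fields as the one sample posted here, and that the sourcetype is not already set explicitly on the input, so treat it as a starting point rather than a tested config.

```
# props.conf (sketch, based only on the single sample event in this thread)
[rule::s3_access_combined]
sourcetype = s3_access_combined
# Assumed shape: ISO timestamp ending in Z, three tokens, three timings,
# four integers, two quoted strings, then cipher and protocol.
MORE_THAN_75 = ^\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}\.\d+Z \S+ \S+ \S+ [\d.]+ [\d.]+ [\d.]+ \d+ \d+ \d+ \d+ "[^"]*" "[^"]*" \S+ \S+
```

If the real logs contain lines where a field is missing or formatted differently, the pattern would need loosening accordingly.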

dhavamanis
Builder

Thank you so much. I have added the following to transforms.conf and it's working fine:

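REGEX = ^(?<req_time>[\S]+)\s+(?<elb>[\S]+)\s+(?<client>[\S]+)\s+(?<backend>[\S]+)\s+(?<request_processing_time>[\S]+)\s+(?<backend_processing_time>[\S]+)\s+(?<response_processing_time>[\S]+)\s+(?<elb_status_code>[\S]+)\s+(?<backend_status_code>[\S]+)\s+(?<received_bytes>[\S]+)\s+(?<sent_bytes>[\S]+)\s+[[access-request]]\s+[[qstring:useragent]]\s+(?<ssl_cipher>[\S]+)\s+(?<ssl_protocol>[\S]+)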
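
For anyone landing here later, this is roughly how a transforms.conf version pairs with props.conf. It is only a sketch: it reuses the stanza names from the original question, and it assumes plain named capture groups in place of the `[[access-request]]` and `[[qstring:useragent]]` modular regexes, so the whole request line lands in a single access_request field. With named groups, a FORMAT line should not be needed for a search-time extraction.

```
# transforms.conf (sketch; stanza name taken from the original question)
[s3-access-extractions]
REGEX = ^(?<req_time>\S+)\s+(?<elb>\S+)\s+(?<client>\S+)\s+(?<backend>\S+)\s+(?<request_processing_time>\S+)\s+(?<backend_processing_time>\S+)\s+(?<response_processing_time>\S+)\s+(?<elb_status_code>\S+)\s+(?<backend_status_code>\S+)\s+(?<received_bytes>\S+)\s+(?<sent_bytes>\S+)\s+"(?<access_request>[^"]+)"\s+"(?<useragent>[^"]+)"\s+(?<ssl_cipher>\S+)\s+(?<ssl_protocol>\S+)

# props.conf (the REPORT class points at the transforms.conf stanza above)
[s3_access_combined]
REPORT-access = s3-access-extractions
```

The stanza name after `REPORT-access =` has to match the transforms.conf stanza header exactly, hyphens included.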

jnussbaum_splun
Splunk Employee

Put this in props.conf, under the s3_access_combined sourcetype:

[s3_access_combined]
EXTRACT-elb,client,backend,request_processing_time,backend_processing_time,response_processing_time,elb_status_code,backend_status_code,received_bytes,sent_bytes,request,user_agent,ssl_cipher,ssl_protocol = ^[^ \n]* (?P<elb>[^ ]+)[^ \n]* (?P<client>[^ ]+)[^ \n]* (?P<backend>[^ ]+)\s+(?P<request_processing_time>[^ ]+)[^ \n]* (?P<backend_processing_time>\d+\.\d+)\s+(?P<response_processing_time>\d+\.\d+)\s+(?P<elb_status_code>[^ ]+)[^ \n]* (?P<backend_status_code>\d+)[^ \n]* (?P<received_bytes>\d+)[^ \n]* (?P<sent_bytes>[^ ]+)[^ \n]* "(?P<request>[^"]+)"\s+"(?P<user_agent>[^"]+)[^"\n]*"\s+(?P<ssl_cipher>[^ ]+)\s+(?P<ssl_protocol>.+)

AnilPujar
Path Finder

What if the sequence of the fields changes, as in the second event below?
2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1

2015-08-07T18:59:32.388226Z pnews-api Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0)
