Splunk Search

How to write the regex for transforms.conf to extract fields and assign the proper sourcetype for my sample log format?

dhavamanis
Builder

Need your help,

We have this below format of log and need to assign sourcetype to extract the fields, can you please provide the working regex to include this in transforms.conf

2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1

fields:

timestamp
elb
client
backend
request_processing_time
backend_processing_time
response_processing_time
elb_status_code
backend_status_code
received_bytes
sent_bytes
request
user_agent
ssl_cipher
ssl_protocol

I have tried this, seems somehow its not working for me,

transforms.conf:

[s3-access-extractions]
REGEX = ^[[nspaces:req_time]]\s++[[nspaces:elb]]\s++[[nspaces:client]]\s++[[sbstring:backend]]\s++[[nspaces:request_processing_time]]\s++[[nspaces:backend_processing_time]]\s++[[nspaces:response_processing_time]]\s++[[nspaces:elb_status_code]]\s++[[nspaces:backend_status_code]]\s++[[nspaces:received_bytes]]\s++[[nspaces:sent_bytes]]\s++[[access-request]](?:\s++[[qstring:useragent]]\s++[[nspaces:ssl_cipher]]\s++[[nspaces:ssl_protocol]]

props.conf

[s3_access_combined]
REPORT-access = s3-access-extractions
SHOULD_LINEMERGE = false
TIME_FORMAT = %Y-%m-%dT%H:%M:%S.%6NZ
EVAL-date_hour = strftime(_time,"%H")
EVAL-date_mday = strftime(_time,"%d")
EVAL-date_minute = strftime(_time,"%M")
EVAL-date_month = strftime(_time,"%m")
EVAL-date_second = strftime(_time,"%S")
EVAL-date_wday = strftime(_time,"%A")
EVAL-date_year = strftime(_time,"%Y")
category = Custom
pulldown_type = true

[rule::s3_access_combined]
sourcetype = s3_access_combined
MORE_THAN_75 = ^\S+ \S+ \S+ \S* ?\[[^\]]+\] "[^"]*" \S+ \S+ \S+ "[^"]*"$
0 Karma
1 Solution

woodcock
Esteemed Legend

Forget transforms.conf for now and try this:

props.conf

[s3_access_combined]
EXTRACT-s3-access-extractions = ^(?<req_time>[\S]+)\s+(?<elb>[\S]+)\s+(?<client>[\S]+)\s+(?<backend>[\S]+)\s+(?<request_processing_time>[\S]+)\s+(?<backend_processing_time>[\S]+)\s+(?<response_processing_time>[\S]+)\s+(?<elb_status_code>[\S]+)\s+(?<backend_status_code>[\S]+)\s+(?<received_bytes>[\S]+)\s+(?<sent_bytes>[\S]+)\s+"(?<access_request>[^"]+)"\s+"(?<useragent>[^"]+)"\s+(?<ssl_cipher>[\S]+)\s+(?<ssl_protocol>[\S]+)

View solution in original post

woodcock
Esteemed Legend

Forget transforms.conf for now and try this:

props.conf

[s3_access_combined]
EXTRACT-s3-access-extractions = ^(?<req_time>[\S]+)\s+(?<elb>[\S]+)\s+(?<client>[\S]+)\s+(?<backend>[\S]+)\s+(?<request_processing_time>[\S]+)\s+(?<backend_processing_time>[\S]+)\s+(?<response_processing_time>[\S]+)\s+(?<elb_status_code>[\S]+)\s+(?<backend_status_code>[\S]+)\s+(?<received_bytes>[\S]+)\s+(?<sent_bytes>[\S]+)\s+"(?<access_request>[^"]+)"\s+"(?<useragent>[^"]+)"\s+(?<ssl_cipher>[\S]+)\s+(?<ssl_protocol>[\S]+)

dhavamanis
Builder

Thank you so much, i have added the below in transforms.conf and its working fine,

REGEX = ^(?[\S]+)\s+(?[\S]+)\s+(?[\S]+)\s+(?[\S]+)\s+(?[\S]+)\s+(?[\S]+)\s+(?[\S]+)\s+(?[\S]+)\s+(?[\S]+)\s+(?[\S]+)\s+(?[\S]+)\s+[[access-request]]\s+[[qstring:useragent]]\s+(?[\S]+)\s+(?[\S]+)
0 Karma

jnussbaum_splun
Splunk Employee
Splunk Employee

Put in props.conf , with sourcetype s3_access_combined

[s3_access_combined]
EXTRACT-elb,client,backend,request_processing_time,backend_processing_time,response_processing_time,elb_status_code,backend_status_code,received_bytes,sent_bytes,request,user_agent,ssl_cipher,ssl_protocol = ^[^ \n]* (?P[^ ]+)[^ \n]* (?P[^ ]+)[^ \n]* (?P[^ ]+)\s+(?P[^ ]+)[^ \n]* (?P\d+\.\d+)\s+(?P\d+\.\d+)\s+(?P[^ ]+)[^ \n]* (?P\d+)[^ \n]* (?P\d+)[^ \n]* (?P[^ ]+)[^ \n]* "(?P[^"]+)"\s+"(?P[^"]+)[^"\n]*"\s+(?P[^ ]+)\s+(?P.+)

AnilPujar
Path Finder

what if the fields sequence changes..
2015-08-07T18:59:32.388226Z pnews-api 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0) Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1

2015-08-07T18:59:32.388226Z pnews-api Gecko/20100101 Firefox/24.0" ECDHE-RSA-AES128-SHA TLSv1 1.1.2.1:5681 10.4.0.81:8081 0.000049 0.002743 0.000021 200 200 0 686 "GET https://xyz.xyz.com:443/news-content/ HTTP/1.1" "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:24.0; GomezAgent 3.0)

0 Karma
Get Updates on the Splunk Community!

Introducing the Splunk Community Dashboard Challenge!

Welcome to Splunk Community Dashboard Challenge! This is your chance to showcase your skills in creating ...

Get the T-shirt to Prove You Survived Splunk University Bootcamp

As if Splunk University, in Las Vegas, in-person, with three days of bootcamps and labs weren’t enough, now ...

Wondering How to Build Resiliency in the Cloud?

IT leaders are choosing Splunk Cloud as an ideal cloud transformation platform to drive business resilience,  ...