Need help understanding how Transform "access-extractions" works

Kozanic
Path Finder

Hi all, I'm hoping one of you might be able to provide some assistance.

We have an app that produces logs in Extended Common web format. The sourcetype we are using is currently linked to the access-extractions transform, but it does not extract all the required fields.

I have tried a number of different approaches to get the required values using regex, but due to the nature of the logs, it feels like I might need a large number of regex entries to capture all variations.

After figuring out that we were using the access-extractions transform, I thought a better approach would be to edit it to suit. However, I'm still pretty new to regex and not really sure what the regex in this transform is actually doing or how it works.

A sample of the logs we are working with:

10.x.x.x www.blah.au - [20/Aug/2018:08:06:19 +1000] "GET /ebs/picmi/picmirepository.nsf/PICMI?OpenForm&t=PI&k=D&r=http%3A%2F%2Fwww.assediomoral.org%2Findex.php%2Fspip.php%3Farticle106 HTTP/1.1" 200 53245 "http://a.bla.es/?u=https%3A%2F%2Fwww.ebs.tga.gov.au%2Febs%2Fpicmi%2Fpicmirepository.nsf%2FPICMI%3FOpenForm%26t%3DPI%26k%3DD%26r%3Dhttp%253A%252F%252Fwww.assediomoral.org%252Findex.php%252Fspip.php%253Farticle106" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/64.0.3282.189 Safari/537.36 Vivaldi/1.95.1077.60" 422 "" "d:/Lotus/Domino/data/ebs/picmi/picmirepository.nsf"

10.x.x.x www.blah.au "107831_67744" [20/Aug/2018:08:06:19 +1000] "GET /ebs/lm/lmdrafts.nsf/xAgentUpdateValidationMonitoring.xsp?documentId=7D35903C63DAEB54CA2582C000426C09&dojo.preventCache=1534716380650 HTTP/1.1" 200 78 "https://www.ebs.tga.gov.au/ebs/LM/LMDrafts.nsf/GenApp.xsp?documentId=7d35903c63daeb54ca2582c000426c09&action=editDocument" "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/11.1.2 Safari/605.1.15" 31 "_ga=GA1.3.644697231.1517015993; _gid=GA1.3.1541615874.1534641115; DomAuthSessId=A004127B4D088BDBD4B14B7E1BF0928B; WelcomeDialogLM=1; SessionID=9E1B7E03146C77042992C7B008ABB7DB303BC2AD" "d:/Lotus/Domino/data/ebs/lm/lmdrafts.nsf"

10.x.x.x www.blah.au - [20/Aug/2018:08:06:15 +1000] "GET /ebs/picmi/picmirepository.nsf/PICMI?OpenForm&t=PI&k=D&r=http%3A%2F%2Fwww2.ogs.state.ny.us%2Fhelp%2Furlstatusgo.html%3Furl%3Dhttp%253A%252F%252Fpedagogie.ac-toulouse.fr%252Feco-golfech%252Fspip.php%253Farticle129 HTTP/1.1" 200 53566 "https://www.apemsa.es/web/guest/analisis-de-agua/-/asset_publisher/7OQq/content/dureza?redirect=https%3A%2F%2Fwww.ebs.tga.gov.au%2Febs%2Fpicmi%2Fpicmirepository.nsf%2FPICMI%3FOpenForm%26t%3DPI%26k%3DD%26r%3Dhttp%253A%252F%252Fwww2.ogs.state.ny.us%252Fhelp%252Furlstatusgo.html%253Furl%253Dhttp%25253A%25252F%25252Fpedagogie.ac-toulouse.fr%25252Feco-golfech%25252Fspip.php%25253Farticle129" "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/66.0.3359.170 Safari/537.36,gzip(gfe)" 282 "" "d:/Lotus/Domino/data/ebs/picmi/picmirepository.nsf"

The particular fields we are after are the last three in each event, which represent the time taken to process the request, the cookie header, and the translated URL.

Regex from access-extractions:

^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++"(?<referer>[[bc_domain:referer_]]?+[^"]*+)"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]

I'm assuming I need to update the last part of this, "[[all:other]]". However, I have tried running the regex in both the GUI search box and regex101, and neither seems able to work with it, so I'm struggling to understand how to update it correctly.
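
For anyone reproducing this outside Splunk: the [[name:field]] tokens appear to be Splunk's modular regex references to other stanzas in transforms.conf, which would explain why regex101 cannot parse the string as-is. As a rough standalone sketch (not the transform itself), the three trailing fields can be matched in Python with an ordinary regex anchored at the end of the event. The field names time_taken, cookie and translated_url are hypothetical, the sample event is a shortened version of the first log line above, and the pattern assumes neither the cookie nor the path contains an embedded double quote.

```python
import re

# Trailing layout taken from the sample events: a number, a quoted cookie
# header (possibly empty), and a quoted translated path at the end of the line.
# Group names are hypothetical, chosen only for this illustration.
TAIL = re.compile(
    r'\s(?P<time_taken>\d+)\s+"(?P<cookie>[^"]*)"\s+"(?P<translated_url>[^"]*)"\s*$'
)


def extract_tail(event):
    """Return the last three fields of an event as a dict, or None if absent."""
    match = TAIL.search(event)
    return match.groupdict() if match else None


# Shortened version of the first sample event (URL, referer and user agent trimmed).
sample = (
    '10.x.x.x www.blah.au - [20/Aug/2018:08:06:19 +1000] '
    '"GET /ebs/picmi/picmirepository.nsf/PICMI?OpenForm HTTP/1.1" 200 53245 '
    '"http://a.bla.es/" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)" '
    '422 "" "d:/Lotus/Domino/data/ebs/picmi/picmirepository.nsf"'
)

if __name__ == "__main__":
    print(extract_tail(sample))
```

Because the pattern is anchored to the end of the event, the earlier status and bytes numbers are not picked up by mistake. Whether the same tail pattern can be added to the transform in place of "[[all:other]]" is something I have not verified.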
