Getting Data In

How to modify access_combined field definitions to include spaces in uri field?

orion44
Communicator

I recently discovered the access_combined field definitions don't properly parse the uri fields if it includes a space. I understand the reasoning as spaces are largely regarded as invalid and should be escaped with %20 – however that shouldn't have any bearing on parsing the result in Splunk.

How can I modify the access_combined field definitions via transforms.conf to include spaces in the uri field?

Example event with spaces in the uri:

1.1.1.1 80 - [07/Aug/2019:21:43:37 +0000] "GET /demo_bin/resource.php?command= space and another space and some more spaces  in between HTTP/1.1" 400 583 "-" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)" "-" "-"
0 Karma
1 Solution

woodcock
Esteemed Legend

Here are the details:

| makeresults 
|  eval _raw = "1.1.1.1 80 - [07/Aug/2019:21:43:37 +0000] \"GET /demo_bin/resource.php?command= space and another space and some more spaces  in between HTTP/1.1\" 400 583 \"-\" \"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)\" \"-\" \"-\""
| rename COMMENT AS "This is 'access-extractions' from '/opt/splunk/etc/system/local/transforms.conf'"
| rename COMMENT AS "^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++\"(?<referer>[[bc_domain:referer_]]?+[^\"]*+)\"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]"
| rename COMMENT AS "This is 'access-request' from '/opt/splunk/etc/system/local/transforms.conf'"
| rename COMMENT AS "\s*+[[reqstr:method]]?(?:\s++[[bc_uri]](?:\s++[[reqstr:version]])*)?\s*+"
| rename COMMENT AS "This is 'bc_uri' from '/opt/splunk/etc/system/local/transforms.conf'"
| rex "(?<uri>[[bc_domain:uri_]]?+(?<uri_path>[[uri_root]]?[[uri_seg]]*(?<file>[^\s\?/]+)?)(?:\?(?<uri_query>[^\s]*))?)"
| rex mode=sed "s/.*\"\w+\s+//"
| rename COMMENT AS "Let's modify it to fix it..."
| rex "(?<uri>[[bc_domain:uri_]]?+(?<uri_path>[[uri_root]]?[[uri_seg]]*(?<file>[^\s\?/]+)?)(?:\?(?<uri_query>[^\s]*(?:\s+[^\"]+))?)?)"

So you need to create an updated definition for bc_uri along the lines of what I did in the last line above and put it someplace where the transforms.conf will have global scope preferences.

View solution in original post

woodcock
Esteemed Legend

Here are the details:

| makeresults 
|  eval _raw = "1.1.1.1 80 - [07/Aug/2019:21:43:37 +0000] \"GET /demo_bin/resource.php?command= space and another space and some more spaces  in between HTTP/1.1\" 400 583 \"-\" \"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)\" \"-\" \"-\""
| rename COMMENT AS "This is 'access-extractions' from '/opt/splunk/etc/system/local/transforms.conf'"
| rename COMMENT AS "^[[nspaces:clientip]]\s++[[nspaces:ident]]\s++[[nspaces:user]]\s++[[sbstring:req_time]]\s++[[access-request]]\s++[[nspaces:status]]\s++[[nspaces:bytes]](?:\s++\"(?<referer>[[bc_domain:referer_]]?+[^\"]*+)\"(?:\s++[[qstring:useragent]](?:\s++[[qstring:cookie]])?+)?+)?[[all:other]]"
| rename COMMENT AS "This is 'access-request' from '/opt/splunk/etc/system/local/transforms.conf'"
| rename COMMENT AS "\s*+[[reqstr:method]]?(?:\s++[[bc_uri]](?:\s++[[reqstr:version]])*)?\s*+"
| rename COMMENT AS "This is 'bc_uri' from '/opt/splunk/etc/system/local/transforms.conf'"
| rex "(?<uri>[[bc_domain:uri_]]?+(?<uri_path>[[uri_root]]?[[uri_seg]]*(?<file>[^\s\?/]+)?)(?:\?(?<uri_query>[^\s]*))?)"
| rex mode=sed "s/.*\"\w+\s+//"
| rename COMMENT AS "Let's modify it to fix it..."
| rex "(?<uri>[[bc_domain:uri_]]?+(?<uri_path>[[uri_root]]?[[uri_seg]]*(?<file>[^\s\?/]+)?)(?:\?(?<uri_query>[^\s]*(?:\s+[^\"]+))?)?)"

So you need to create an updated definition for bc_uri along the lines of what I did in the last line above and put it someplace where the transforms.conf will have global scope preferences.

woodcock
Esteemed Legend

It is not common for spaces to even exist so you would have to post some sample raw events if you'd like anybody to help.

0 Karma

orion44
Communicator

Updated question to include example event

0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer Certification at ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...