Splunk Search

Search extract grabbing more than tested/verified.

nocostk
Communicator

I'm extracting part of one line from a multi-line event. When I test the extraction, everything returns as it should. However, when I run a search and view the extracted field, the rest of the multi-line event shows up in it. Any insights on this?

Here is the event:

Date: 2011-01-05 13:48:49
Request made by: wwwrun   /opt/apache2/bin/httpd -k start
Actual request: db_auth  /usr/bin/perl /home/db_auth/db_auth dashboard host
==============================================================

This is the regex used for extraction:

(?i)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+) 

In testing it extracts correctly:

dbauth_request="/opt/apache2/bin/httpd -k start"

In searching, however, this is being extracted:

    dbauth_request="/opt/apache2/bin/httpd -k start
    Actual request: db_auth  /usr/bin/perl /home/db_auth/db_auth dashboard host
    =============================================================="
1 Solution

Lowell
Super Champion

I assume you are comparing interactive field extraction using "|rex" with a permanent field extraction set up in transforms.conf or props.conf. Is that correct? Your exact situation wasn't clearly spelled out.

Try one of these:

(?i)^Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+)$
(?im)[\r\n]Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]

If I wasn't feeling lazy I'd stick it in my regex tool, but that takes all the fun out of it. Best of luck.
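As a rough illustration of why the same regex can capture different amounts of text, here is a minimal Python sketch. It assumes the event uses "\n" line endings and that the search-time engine behaves as if "." can match line breaks (emulated here with re.DOTALL). The thread does not confirm that this is what Splunk does, so treat it as one plausible explanation only. The helper name extract is hypothetical.

```python
import re

# Sample event from the question; "\n" line endings are an assumption.
EVENT = (
    "Date: 2011-01-05 13:48:49\n"
    "Request made by: wwwrun   /opt/apache2/bin/httpd -k start\n"
    "Actual request: db_auth  /usr/bin/perl /home/db_auth/db_auth dashboard host\n"
    "=============================================================="
)

# Original regex from the question: greedy ".+" with no explicit terminator.
GREEDY = r"(?i)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+)"
# Second suggestion from the answer: lazy ".+?" that must stop at a line break.
LAZY = r"(?im)[\r\n]Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]"


def extract(pattern, text, flags=0):
    """Return the dbauth_request capture, or None if the pattern does not match."""
    match = re.search(pattern, text, flags)
    return match.group("dbauth_request") if match else None


# "." stops at "\n" by default, so the greedy capture ends with the line.
greedy_default = extract(GREEDY, EVENT)
# If "." may match "\n" (assumed here), the greedy capture runs to the end of the event.
greedy_dotall = extract(GREEDY, EVENT, re.DOTALL)
# The lazy capture stops at the first line break either way.
lazy_dotall = extract(LAZY, EVENT, re.DOTALL)

print(repr(greedy_default))
print(repr(greedy_dotall))
print(repr(lazy_dotall))
```

Under these assumptions, the greedy pattern reproduces both the "test" result and the over-long search result, while the lazy pattern with an explicit [\r\n] terminator returns only the rest of the line.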

Lowell
Super Champion

BTW, I think the interactive field extraction (IFX) tool uses a Python search command (and therefore the Python regex engine), whereas Splunk uses PCRE for built-in field extractions (as well as the rex search command). So it is conceivable to get a subtle regex flavor difference like this when using IFX (although they normally don't seem subtle when you're looking at them). It's also possible something else was going on.

Lowell
Super Champion

Yeah. I do see that I made a typo on the second one. You don't need the [\r\n] before Request, but it may be helpful to tell the regex engine to always expect the word Request to be the start of a line. Glad you have a working solution.
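To illustrate the point about expecting Request at the start of a line, here is a small Python sketch. It uses the multiline anchor ^ instead of a leading [\r\n] (a swapped-in but closely related technique, which also works when the line is the first one in the event). The event text, including the "Note:" line, is hypothetical and only there to show the difference.

```python
import re

# Same pattern with and without a line-start anchor (multiline mode via the "m" flag).
UNANCHORED = re.compile(
    r"(?im)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]"
)
ANCHORED = re.compile(
    r"(?im)^Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]"
)

# Hypothetical event in which the phrase also appears in the middle of an earlier line.
EVENT = (
    "Note: last request made by: nobody /bin/false\n"
    "Request made by: wwwrun   /opt/apache2/bin/httpd -k start\n"
)

# Without the anchor, the first (mid-line) occurrence wins.
unanchored_value = UNANCHORED.search(EVENT).group("dbauth_request")
# With the anchor, only a line that starts with "Request" can match.
anchored_value = ANCHORED.search(EVENT).group("dbauth_request")

print(repr(unanchored_value))
print(repr(anchored_value))
```

In this sketch the unanchored pattern picks up the mid-line occurrence, while the anchored one skips it and matches the intended line.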

nocostk
Communicator

These are search-time extractions. I'm just using the 'extract field' tool from my search results. That tool has a 'test' button that applies the regex.

Of your two suggestions I tried the former yesterday with the same results. I then tried a modified version of the latter today and it is extracting correctly. Thanks!

(?im)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]
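For anyone reusing this pattern, the following Python sketch checks how it behaves with different line endings. It assumes the capture group is named dbauth_request, as in the earlier regexes (the forum rendering appears to have dropped the name). Python's re module stands in for Splunk's engine here, so the results are indicative only.

```python
import re

# The working regex, with the group name assumed to be dbauth_request.
PATTERN = re.compile(
    r"(?im)Request\s+made\s+by:\s+\w+\s+(?P<dbauth_request>.+?)[\r\n]"
)

LINE = "Request made by: wwwrun   /opt/apache2/bin/httpd -k start"
NEXT = "Actual request: db_auth  /usr/bin/perl /home/db_auth/db_auth dashboard host"


def extract(text):
    """Return the dbauth_request capture, or None if the pattern does not match."""
    match = PATTERN.search(text)
    return match.group("dbauth_request") if match else None


unix_value = extract(LINE + "\n" + NEXT + "\n")       # LF line endings
windows_value = extract(LINE + "\r\n" + NEXT + "\r\n")  # CRLF line endings
last_line_value = extract(LINE)                        # no trailing line break

print(repr(unix_value))
print(repr(windows_value))
print(repr(last_line_value))
```

Because the capture is lazy and must be followed by \r or \n, it stops before either kind of line break. One caveat visible in the sketch: if the target line is the last one in the event and has no trailing line break, this pattern does not match at all.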
