Splunk Search

Extracting and concatenating regex captured groups in a single transform / extraction

althomas
Communicator

Hi all,

I'm trying to get pivots working with a user's data, but I'm having issues getting the fields auto-extracted prior to use in the pivots.

In our example, the user has decided to include commas in the response time log message. I want to have this extracted out as an integer, but I'm not having much luck.

Example:

rex field=message "Took (?<response_time_ms>\S+) ms" | rex mode=sed field=response_time_ms "s/,//g" | where response_time_ms > 1000
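
For anyone who wants to sanity-check this search-time logic outside Splunk, here is a minimal Python sketch of the same idea: capture the token after "Took", strip the commas, and compare as an integer. The function name `parse_response_time` and the sample messages are made up for illustration, and Python's `re` module only stands in for `rex` here, so treat it as an approximation rather than a description of what Splunk does internally.

```python
import re

def parse_response_time(message):
    """Return the response time in ms as an int, or None if the message does not match."""
    # Same capture as the rex command: any non-whitespace run between "Took " and " ms".
    match = re.search(r"Took (\S+) ms", message)
    if match is None:
        return None
    # Equivalent of the sed step s/,//g: drop every comma before converting.
    # Note: int() raises ValueError if the token is not purely digits and commas.
    return int(match.group(1).replace(",", ""))

# Hypothetical sample messages, not taken from real logs.
messages = ["Took 100 ms", "Took 1,100 ms", "Took 12,345,678 ms"]
slow = [m for m in messages if (parse_response_time(m) or 0) > 1000]
print(slow)
```

Because every comma is removed, this approach is not limited to a single thousands separator.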

Doing this at search time is straightforward enough, but I was wondering if there was a way to do it automagically, like so:

transforms.conf

[my_response_time]
FORMAT = response_time_ms::$1$2
REGEX = [Tt]ook (?:(\d+),){0,1}(\d+) ms
SOURCE_KEY = message

props.conf

[my_sourcetype]
REPORT-my_response_time = my_response_time

Is this possible in any way? With the configuration above, response_time_ms just gets the literal value "$1$2" rather than the contents of the two capture groups.

Cheers!!

Best regards,
Alex

0 Karma
1 Solution

micahkemp
Champion

From the transforms.conf docs:

  * At index time only, you can use FORMAT to create concatenated fields:
    * Example: FORMAT = ipaddress::$1.$2.$3.$4

If you want concatenated fields at search time, you'll have to combine a props.conf/transforms.conf extraction with an eval (the eval can also live in props.conf, as an EVAL- setting).

althomas
Communicator

After fiddling for a bit, I've managed to find a solution to this which will extract it out automatically for me:

transforms.conf

[response_time_extract]
REGEX = Took (?:(?<resp_time_1>\d+),){0,1}(?<resp_time_2>\d+) ms

props.conf

[test]
REPORT-test_field_extr = response_time_extract
EVAL-response_time_ms = if(isnull(resp_time_1),resp_time_2,resp_time_1 . resp_time_2)

The data looks (sort of) like this:

100
500
1,100
2,300

The transform always extracts the final group of digits (resp_time_2) and extracts the thousands group (resp_time_1) only when the value contains a comma. The EVAL then concatenates the two when both exist; otherwise it uses resp_time_2 on its own.
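
As a rough way to check this behaviour against the sample values, here is a small Python sketch that mimics the REPORT extraction and the EVAL concatenation. It is an approximation under a few assumptions: Python spells named groups `(?P<name>...)` instead of `(?<name>...)`, the "Took ... ms" wrapper around each sample value is assumed from the regex, the helper name `response_time_ms` is hypothetical, and the final `int()` conversion is added for convenience (the EVAL itself only concatenates).

```python
import re

# Same pattern as the transform, with Python's (?P<name>...) named-group syntax.
PATTERN = re.compile(
    r"Took (?:(?P<resp_time_1>\d+),){0,1}(?P<resp_time_2>\d+) ms"
)

def response_time_ms(message):
    """Mimic the extraction plus the EVAL; return None when the regex does not match."""
    match = PATTERN.search(message)
    if match is None:
        return None
    high = match.group("resp_time_1")  # None when there is no thousands group
    low = match.group("resp_time_2")
    # Same branch as if(isnull(resp_time_1), resp_time_2, resp_time_1 . resp_time_2)
    return int(low if high is None else high + low)

samples = ["Took 100 ms", "Took 500 ms", "Took 1,100 ms", "Took 2,300 ms"]
results = [response_time_ms(s) for s in samples]
print(results)  # [100, 500, 1100, 2300]
```

One thing the sketch makes visible: the pattern allows at most one comma, so a value such as "Took 1,234,567 ms" does not match at all and `response_time_ms` returns None for it. If response times can reach a million milliseconds, the regex would need another optional group.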

0 Karma

micahkemp
Champion

Excellent. You should convert this comment to an answer and accept it.

0 Karma

althomas
Communicator

Tried to but failed. I've just moved it underneath yours instead and accepted yours (as it is correct as well!).

0 Karma