Splunk Search

Trim field for operations

msarro
Builder

Hey everyone, I have an event type containing two fields that I need to trim. They're currently in this format:

02/09/11 at 2:56 PM","eth 0/1","NNNNNNNNNNNN","NNNNNNNNNNNN","72 ms","0","0 ms","0","4.200","G711U"

The fields I need trimmed are the "72 ms" and "0 ms". I just need the " ms" portion to be removed so I can start doing mathematical operations. The best would be removing this at index time, however I can also do it at search time. I tried using the field extractor but the regex it generates doesn't seem to make any sense.

This is what it generated: (?i)^(?:[^"]*"){9}(?P[^ ]+)

Tags (1)
0 Karma
1 Solution

David
Splunk Employee
Splunk Employee

If this comes in CSV format, it might be best to use Splunk's wizardry to extract the fields early on:

http://www.splunk.com/base/Documentation/latest/Data/Extractfieldsfromfileheadersatindextime

If that's not an option (or so you can get up and running first), the following regex should do the trick, for the particular event you've pasted:

YourSearch | rex field=_raw "\"(?<Field1>\d*) ms\",\"\d*\",\"(?<Field2>\d*) ms" 
           | stats avg(Field1), avg(Field2)

You can take that regular expression and move it into props.conf as well:

[my_sourcetype]
EXTRACT-extract_two_fields = "(?<Field1>\d*) ms","\d*","(?<Field2>\d*) ms

(I don't believe you need to escape the quotes in props.conf -- you may need to put those escapes back in, if not).

Should you have Splunk pull out the field names (as per the first link), you'd end up with ResponseTime="72 ms" and Jitter="0 ms" (or whatever those represent). You could then pull out the numbers by doing:

YourSearch | rex field=ResponseTime "(?<NumResponseTime>\d*)" 
           | rex field=Jitter "(?<NumJitter>\d*)"

Let me know if that answers your question.

View solution in original post

David
Splunk Employee
Splunk Employee

If this comes in CSV format, it might be best to use Splunk's wizardry to extract the fields early on:

http://www.splunk.com/base/Documentation/latest/Data/Extractfieldsfromfileheadersatindextime

If that's not an option (or so you can get up and running first), the following regex should do the trick, for the particular event you've pasted:

YourSearch | rex field=_raw "\"(?<Field1>\d*) ms\",\"\d*\",\"(?<Field2>\d*) ms" 
           | stats avg(Field1), avg(Field2)

You can take that regular expression and move it into props.conf as well:

[my_sourcetype]
EXTRACT-extract_two_fields = "(?<Field1>\d*) ms","\d*","(?<Field2>\d*) ms

(I don't believe you need to escape the quotes in props.conf -- you may need to put those escapes back in, if not).

Should you have Splunk pull out the field names (as per the first link), you'd end up with ResponseTime="72 ms" and Jitter="0 ms" (or whatever those represent). You could then pull out the numbers by doing:

YourSearch | rex field=ResponseTime "(?<NumResponseTime>\d*)" 
           | rex field=Jitter "(?<NumJitter>\d*)"

Let me know if that answers your question.

David
Splunk Employee
Splunk Employee

The Rex uses Perl Compatible Regular Expressions (PCRE).. at is at least very similar. You might get into issues where you have to escape things (such as a quote), but that you can generally figure out by starting with something simple, and building up with a little trial and error. The authoritative source for PCRE is http://perldoc.perl.org/perlre.html , and if you google PCRE tutorial, or Perl Regular Expressions Tutorial, you might find a few things that are a bit more "how do I get started." This one looks promising: http://www.troubleshooters.com/codecorn/littperl/perlreg.htm

0 Karma

msarro
Builder

Do you happen to have a good tutorial for the rex syntax, specifically for trimming/removing extra characters? I'd very much appreciate it.

0 Karma

David
Splunk Employee
Splunk Employee

Excellent. I'm glad to hear that.

0 Karma

msarro
Builder

The insert into props.conf worked like a bell. Thanks!

0 Karma
Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...