Splunk Search

Using regex to capture exactly 20 characters

hharvey
Explorer

I need to create a field extraction that extracts the first 20 characters ONLY from an error log; I've got the regex that extracts the full error:

rex "\#[\w0-9\W]{9}\:\s(?P<error>[^\\*]+)"

Is there a regex that will capture only the first 20 characters as the field error? Here are the logs in question, followed by an example of the field data I am trying to extract.

I feel like I may be able to use the substr function with eval, but I'm not sure of the correct format. This doesn't seem to work:

rex "\#[\w0-9\W]{9}\:\s(?P<error>[^\\*]+)" | top 100 error | eval error=substr("error", 1, 20)

s1-sn701:2012-08-14 09:55:09,723 INFO  [STDOUT] [ERROR] 2012-08-14 09:55:09           LP::ThisController - #aWMfOOXSL: EAL: ASYNC: in async payment, could not create items, api returned 320
s1-sn903:2012-08-14 07:01:34,169 INFO  [STDOUT] [ERROR] 2012-08-14 07:01:34           LP::OfferController - #dN'Fi<<Od: Error decoding or storing lat/long, exception was 'undefined method `[]' for nil:NilClass'
s1-sn902:2012-08-14 01:33:23,562 INFO  [STDOUT] [ERROR] 2012-08-14 01:33:23           UI::ReportController - #fm7e(n$2J: API returned 952 error for report data
s1-sn902:2012-08-14 01:11:31,431 INFO  [STDOUT] [ERROR] 2012-08-14 01:11:31           LP::ThisController - #9['?rp`fY: PAYKEY from payment data is blank or missing on item page
s1-sn902:2012-08-14 01:11:31,430 INFO  [STDOUT] [ERROR] 2012-08-14 01:11:31           LP::ThisController - #9['?rp`fY: PAYKEY from session is blank or missing on item page
s1-sn902:2012-08-14 00:15:16,746 INFO  [STDOUT] [ERROR] 2012-08-14 00:15:16           LP::ThisController - #Xq5Bez;vF: Attempting to purchase item that is expired
s1-sn701:2012-08-13 23:55:22,969 INFO  [STDOUT] [ERROR] 2012-08-13 23:55:22           LP::OfferController - #\)F3XjY_v: PAYKEY is blank or missing on item page
s1-sn701:2012-08-13 23:29:31,458 INFO  [STDOUT] [ERROR] 2012-08-13 23:29:31           LP::ThisController - #z|gXWQY1S: EAL: ASYNC: in async payment could not create items, api returned 320
s1-sn902:2012-08-13 12:40:13,350 INFO  [STDOUT] [ERROR] 2012-08-13 12:40:13           UI::Rails - #ErS;=x*'): Failed to get [1]https://aurl.url.com/v1/85/pp/accounting/  [2]betsy@betsyklein.com/
s1-sn902:2012-08-13 12:40:13,349 INFO  [STDOUT] [ERROR] 2012-08-13 12:40:13           UI::Rails - #ErS;=x*'): ["classpath:/META-INF/jruby.home/lib/ruby/1.8/uri/common.rb:436:in `split'"
s1-sn902:2012-08-13 12:40:13,347 INFO  [STDOUT] [ERROR] 2012-08-13 12:40:13           UI::Rails - #ErS;=x*'): -----------------------------
s1-sn902:2012-08-13 12:40:13,346 INFO  [STDOUT] [ERROR] 2012-08-13 12:40:13           UI::Rails - #ErS;=x*'): bad URI(is not URI?): [3]https://aurl.url.com/85/bills/pp/accounting/  [4]uname@aurl.com/
s1-sn902:2012-08-13 12:40:13,346 INFO  [STDOUT] [ERROR] 2012-08-13 12:40:13           UI::Rails - #ErS;=x*'): Oops, an error occured!

Example of data I want to extract as the error field:

EAL: ASYNC: in async
Error decoding or st
API returned 952 err
PAYKEY from payment 
PAYKEY from session 
Attempting to purcha
PAYKEY is blank or m
EAL: ASYNC: in async
Failed to get [1]htt
["classpath:/META-IN
--------------------
bad URI(is not URI?)
Oops, an error occur
1 Solution

kristian_kolb
Ultra Champion

Hmm, that was a bit hard to read... 🙂

First, do you NEED the full error message? If not, you can just alter the rex to capture up to 20 characters:

rex "\#[\w0-9\W]{9}:\s(?P<ERROR>[^\\*]{1,20})"

Also, you could probably make it a bit easier on the eye like this;

rex "\#.{9}:\s(?P<ERROR>.{20})"

if the messages themselves are always at least 20 characters long (a shorter message would not match {20} at all).
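For anyone who wants to check the two patterns outside Splunk, here is a minimal sketch in Python, whose re module accepts the same (?P<name>...) named-group syntax. The two events are copied from the question. The helper name extract is only for illustration and is not a Splunk function, and since rex uses PCRE rather than Python's engine, treat this as an approximation of what rex does.

```python
import re

# Two sample events copied from the question.
events = [
    "s1-sn902:2012-08-14 01:33:23,562 INFO  [STDOUT] [ERROR] 2012-08-14 01:33:23"
    "           UI::ReportController - #fm7e(n$2J: API returned 952 error for report data",
    "s1-sn902:2012-08-14 00:15:16,746 INFO  [STDOUT] [ERROR] 2012-08-14 00:15:16"
    "           LP::ThisController - #Xq5Bez;vF: Attempting to purchase item that is expired",
]

# The two patterns from the answer, written as Python raw strings.
up_to_20 = re.compile(r"\#[\w0-9\W]{9}:\s(?P<ERROR>[^\\*]{1,20})")
exactly_20 = re.compile(r"\#.{9}:\s(?P<ERROR>.{20})")


def extract(pattern, event):
    """Return the ERROR group for the first match, or None if nothing matches."""
    match = pattern.search(event)
    return match.group("ERROR") if match else None


for event in events:
    print(repr(extract(up_to_20, event)), repr(extract(exactly_20, event)))
```

Both patterns give the same 20 characters for these events. Keep in mind that the capture-group name becomes the field name and that field names are case-sensitive in Splunk, so ERROR and error are different fields.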

Hope this helps,

Kristian
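If the full message is needed as well, another option is to keep the original capture and truncate a copy of it afterwards, which is what the substr attempt in the question was aiming for. As far as I can tell, that attempt fails for two reasons: in eval, double quotes make "error" a string literal rather than a field reference, and the eval runs after top has already grouped on the full message. Something like eval short_error=substr(error, 1, 20) placed before top should behave as intended; short_error is a made-up field name. The Python sketch below mirrors that idea with a slice, using one event from the question.

```python
import re

# The question's original pattern: captures the whole message after the 9-character ID.
full_message = re.compile(r"\#[\w0-9\W]{9}:\s(?P<error>[^\\*]+)")

# One sample event copied from the question.
event = (
    "s1-sn902:2012-08-14 01:11:31,431 INFO  [STDOUT] [ERROR] 2012-08-14 01:11:31"
    "           LP::ThisController - #9['?rp`fY: PAYKEY from payment data is blank or missing on item page"
)

match = full_message.search(event)
error = match.group("error")   # the full message
short_error = error[:20]       # first 20 characters, analogous to substr(error, 1, 20)

print(repr(error))
print(repr(short_error))
```

This keeps both values available, at the cost of one extra step compared with limiting the capture in the regex itself.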

kristian_kolb
Ultra Champion

Well, you used that construct in the beginning - the {9} 🙂
Since you said EXACTLY 20 characters, it's probably more correct to use {20} instead of {1,20}, but that's your decision.
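The difference between the two quantifiers only shows up when a message is shorter than 20 characters. Here is a small Python sketch of that case. The short event is hypothetical and not taken from the logs above, the long one is from the question, and both patterns are simplified to use . for the message characters.

```python
import re

up_to_20 = re.compile(r"\#.{9}:\s(?P<ERROR>.{1,20})")
exactly_20 = re.compile(r"\#.{9}:\s(?P<ERROR>.{20})")

# Hypothetical event whose message is only 9 characters long.
short_event = "UI::Rails - #abcdefghi: Disk full"
# Tail of a real event from the question (message longer than 20 characters).
long_event = r"LP::OfferController - #\)F3XjY_v: PAYKEY is blank or missing on item page"


def extract(pattern, event):
    """Return the ERROR group for the first match, or None if nothing matches."""
    match = pattern.search(event)
    return match.group("ERROR") if match else None


for event in (short_event, long_event):
    print(repr(extract(up_to_20, event)), repr(extract(exactly_20, event)))
```

With {20} the short event yields no field at all, while {1,20} returns whatever is there. Which behavior is preferable depends on whether short messages should be counted.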

/k


hharvey
Explorer

Thanks Kristian! Adding {1,20} did it; I just didn't realize that was an option in regex.

I agree, my post was pretty hard to read through. Sorry!
