Splunk Search

Delimited field extracts always result in rex errors

JosephHobbs
Path Finder

I recently started trying to set up some field extractions for a few of our events. In this case, the logs are pipe delimited and contain only a few segments. What I've found is that most of these attempts result in a rex error about limits in limits.conf.

For example, take this record:

2022-02-03 11:45:21,732 |xxxxxxxxxxxxxxx.xxxxxx.com~220130042312|<== conn[SSL/TLS]=274107 op=26810 MsgID=26810 SearchResult {resultCode=0, matchedDN=null, errorMessage=null} ### nEntries=1 ### etime=3 ###

When I attempt to use a pipe delimited field extract (for testing) the result is this error:

[Screenshot of the error message]

When I toss this regex (from the error) into regex101 (https://regex101.com/r/IswlNh/1) it tells me it requires 2473 steps, which is well above the default 1000 for depth_limit...  How is it that an event with 4 segments delimited by pipe is so bad?

I realize there are 2 limits (depth_limit/match_limit) in play here and I can increase them, but nowhere can I find recommended values to use as a sanity check. I also realize I can optimize the regex, but as I am setting this up via the UI using the delimited option, I don't have access to the regex at creation time. Not to mention, many of my users are using this option because they are not regex gurus...

So my big challenge/question is... Where do I go from here? My users are going to use the delimited option, which evidently generates some seriously inefficient regex under the covers. Do I increase my limit(s), and if so, what is a sane/safe value? Is there something I'm missing?

Thanks!


ITWhisperer
SplunkTrust

Can you use the fact that pipes are the delimiter character?

^(?P<field1>[^\|]+)\s\|(?P<field2>[^\|]+)\|(?P<field3>.*)

https://regex101.com/r/MLYmkL/1
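
As a quick sanity check outside Splunk, here is a minimal Python sketch that applies the pattern above to the sample event from the question. It assumes Python's `re` module treats this particular pattern the same way Splunk's regex engine does; the names `PATTERN`, `EVENT` and `extract_fields` are only illustrative.

```python
import re

# The pattern from the post above, copied verbatim.
PATTERN = re.compile(r"^(?P<field1>[^\|]+)\s\|(?P<field2>[^\|]+)\|(?P<field3>.*)")

# Sample event from the original question.
EVENT = ("2022-02-03 11:45:21,732 |xxxxxxxxxxxxxxx.xxxxxx.com~220130042312|"
         "<== conn[SSL/TLS]=274107 op=26810 MsgID=26810 SearchResult "
         "{resultCode=0, matchedDN=null, errorMessage=null} "
         "### nEntries=1 ### etime=3 ###")


def extract_fields(event):
    """Return the named groups as a dict, or None if the event does not match."""
    match = PATTERN.match(event)
    return match.groupdict() if match else None


fields = extract_fields(EVENT)
print(fields)
```

Because each field is a negated character class (`[^\|]+`), the engine cannot run past a pipe, so there is very little backtracking compared with a pattern built from greedy wildcards.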

 


JosephHobbs
Path Finder

No doubt the regex can be improved significantly as you demonstrated.  I guess my challenge is, how do I tell my users that the OOB delimited option just doesn't work and that they now have to go learn regex to extract their fields?

At the end of the day, I see 3 possibilities...

  • I'm doing something wrong...
  • The default limits are just too low and I should increase them (to what?)...
  • Splunk's delimited parsing UI just generates really inefficient regex...

 


ITWhisperer
SplunkTrust

If you don't want to use rex, you could use makemv

| makeresults
| eval _raw="2022-02-03 11:45:21,732 |xxxxxxxxxxxxxxx.xxxxxx.com~220130042312|<== conn[SSL/TLS]=274107 op=26810 MsgID=26810 SearchResult {resultCode=0, matchedDN=null, errorMessage=null} ### nEntries=1 ### etime=3 ###"
| makemv _raw delim="|"
| eval field1=mvindex(_raw,0)
| eval field2=mvindex(_raw,1)
| eval field3=mvindex(_raw,2)

I think the issue with your rex is that it contains a few greedy matches, so it keeps backing up and restarting the match, hence the high number of steps.
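
For comparison, the split-and-index idea behind the makemv search can be sketched in Python with no regex at all. This is only an illustration of the technique, not how Splunk implements makemv; `split_event` is a hypothetical helper name.

```python
def split_event(event, delim="|"):
    """Split an event on a literal delimiter; no regex and no backtracking involved."""
    return event.split(delim)


# Sample event from the original question.
EVENT = ("2022-02-03 11:45:21,732 |xxxxxxxxxxxxxxx.xxxxxx.com~220130042312|"
         "<== conn[SSL/TLS]=274107 op=26810 MsgID=26810 SearchResult "
         "{resultCode=0, matchedDN=null, errorMessage=null} "
         "### nEntries=1 ### etime=3 ###")

parts = split_event(EVENT)

# Index into the pieces, mirroring the mvindex calls in the search above.
field1, field2, field3 = parts[0], parts[1], parts[2]
print(repr(field1), repr(field2), repr(field3))
```

In this sketch the first piece keeps the space that precedes the first pipe, so it may be worth checking whether the SPL version needs a trim on field1 as well.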


JosephHobbs
Path Finder

The point here being, that's not my regex.  That was generated by the Splunk UI when I tried to create a field extract using 'delimiting with pipe'...  The only reason I have that regex in hand is because the error message included it...


ITWhisperer
SplunkTrust

OK, I see. Sometimes, Splunk is too clever for its own good! 😀


JosephHobbs
Path Finder

Yeah. I feel like it must have something to do with how the UI handles these. It seems to be using regex, and it's a bit overzealous with that regex. Configuring the same extraction as a delimited extraction on the back end works fine without issues...
