Splunk Search

Why is index-time field extraction not searchable?

dottom
Path Finder

I'm double posting, original issue posted here: http://www.splunk.com/support/forum:SplunkGeneral/4378

When I use double-quotes in my index-time field extractions, the meta-data is not searchable. I've seen this problem on 4.0.11 and 4.1.3.

Sample text:

results=AA,BB,CC CC,DD

Transforms.conf without double-quotes:

REGEX = ^results=(.*?),(.*?),(.*?),(.+)$
FORMAT = key1::$1 key2::$2 key3::$3 key4::$4
WRITE_META = true

Transform.conf with double-quotes:

REGEX = ^results=(.*?),(.*?),(.*?),(.+)$
FORMAT = key1::"$1" key2::"$2" key3::"$3" key4::"$4"
WRITE_META = true

Results:

If you use the first transforms.conf without the double-quotes, there are two problems:

  • The value for key3 (with a space) is not captured correctly. This is in the documentation which says to use double-quotes.

  • The fields extracted on 4.1.3 are incorrect for key4. Instead of having a field "key4" it has "CC key4". I don't recall seeing this behavior in 4.0.x.

However, if you use the second transforms.conf with the double-quotes:

  • The meta-data is not searchable, i.e. search for "key1=AA" fails.


UPDATE 6/15/2010

Here are my conf files so you can replicate this issue. I also have a screenshot below.

inputs.conf:

[monitor:///var/log/test]
disabled = 0
sourcetype = mytest

props.conf:

[mytest]
TRANSFORMS-test = extract-fields

fields.conf:

[key1]
INDEXED = true

[key2]
INDEXED = true

[key3]
INDEXED = true

[key4]
INDEXED = true

transforms.conf:

[extract-fields]
REGEX = ^results=(.*?),(.*?),(.*?),(.+)$
FORMAT = key1::"$1" key2::"$2" key3::"$3" key4::"$4"
WRITE_META = true

screenshot:

In this screenshot, notice that the values are indeed extracted and show up in the search result. However, searching for "key1=AA" (or any other key=value) returns no results.

http://dottom.com/public/images/screenshot_8jd49x4d.png

Tags (2)

parallaxed
Path Finder

I should add that you're getting no results for the second conf, which kind of backs that up. The first transforms.conf is valid. If you think there's nothing wrong with your regex, try splitting the capture in to 2 separate transforms and see if you can get it to work that way?

0 Karma

parallaxed
Path Finder

As with a lot of Splunk quirks, I don't see this documented (http://www.splunk.com/base/Documentation/latest/Admin/Transformsconf), so I'm not certain you need those quotes, or that it's even valid syntax in the latest version. Space-escaping is mentioned in that document, but only in relations to FIELDS= capturing, which is used alongside auto-kv/delims extraction (which is not what you're doing).

0 Karma

dottom
Path Finder

Yes, all the fields are defined in fields.conf

You need the double-quotes in transforms.conf when the regular expression backreference captures a value with a space in it.

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...