How to write regex to ignore first line and set each additional line as value of a field at index time?

essklau
Path Finder

Folks, I don't understand why this is killing me, but it is.

In short, I want to, at index time,
1) ignore first line
2) set each additional line as a value of the field "datahere"

I have the below data which is a poorly obscured script output.

C:\Superfiend\howdydoody>bingotime.tst lots of spaces and silliness ending with double quotes"
DDD32\Jump Start
DDD32\Stacktime
DDD32\Is this thing on
DDD32\welcome to duluth
DDD32\junky nonsense
FTYA

The first line begins with "C:" and ends with that double quote ("), and is always the same. The subsequent lines are each a value of the field, "datahere". The number of lines after the first is variable.

I would like, at index time, to extract:

datahere=DDD32\Jump Start
datahere=DDD32\Stacktime
datahere=DDD32\Is this thing on
datahere=DDD32\welcome to duluth
datahere=DDD32\junky nonsense
datahere=FTYA

Is this possible? Please advise.

Thanks,

e

1 Solution

somesoni2
SplunkTrust

Try this for your props.conf

[bingolog]
EXTRACT-datahere = (?<datahere>.*)
NO_BINARY_CHECK = 1
SEDCMD-firstline = s/^C:.*//g
SHOULD_LINEMERGE = false

This should remove the first line starting with "C:" and add the field datahere with the value of each remaining line (search-time field extraction).
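
To see what those three settings are meant to do to the sample data, here is a rough emulation in Python, run outside Splunk. It is only a sketch under assumptions: splitting on line breaks stands in for SHOULD_LINEMERGE = false, dropping the emptied first event is my assumption rather than documented Splunk behaviour, and Python spells the named group (?P<datahere>...) where the config uses (?<datahere>...).

```python
import re

# Sample script output from the question (raw strings keep the backslashes).
raw_output = "\n".join([
    r'C:\Superfiend\howdydoody>bingotime.tst lots of spaces and silliness ending with double quotes"',
    r"DDD32\Jump Start",
    r"DDD32\Stacktime",
    r"DDD32\Is this thing on",
    r"DDD32\welcome to duluth",
    r"DDD32\junky nonsense",
    r"FTYA",
])

# Stand-in for SHOULD_LINEMERGE = false: treat every line as its own event.
events = raw_output.splitlines()

# Stand-in for SEDCMD-firstline = s/^C:.*//g: blank out events starting with "C:".
events = [re.sub(r"^C:.*", "", event) for event in events]

# Stand-in for EXTRACT-datahere = (?<datahere>.*): capture the whole event.
# Assumption: the emptied first event is skipped here.
datahere = [
    re.match(r"(?P<datahere>.*)", event).group("datahere")
    for event in events
    if event
]

for value in datahere:
    print(f"datahere={value}")
```

Each remaining line ends up as one datahere value, which is the list the question asks for.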

anwarmian
Communicator

Great job!!!!!

essklau
Path Finder

Beautiful! This worked.

martin_mueller
SplunkTrust

If that's all one big event then you should be able to use props.conf's SEDCMD:

SEDCMD-firstline = s/^[^\n\r]+[\n\r]+//
SEDCMD-keys = s/^/datahere=/g

I didn't test these... also, this would stick to search-time extraction. Is there a specific reason for you to want index-time extraction?

anwarmian
Communicator

SEDCMD works well during search time

essklau
Path Finder

Martin, the caret for the start of the line makes sense, and it did make it into the config file, but for whatever reason it matched at the start of every character instead of the start of the line. The same thing happened with rex commands in sed mode. Weird.

kristian_kolb
Ultra Champion

SEDCMD is done at index time, and the idea behind @martin_mueller's config is to remove the first line and prepare the subsequent lines to be natively parsed by Splunk by changing them into key-value pairs.

From your comment it looks like it didn't work.

martin_mueller
SplunkTrust

Yup, the extraction of the datahere field happens at search time while the sed replacements happen at index time.

As for your second comment, that sounds as if the caret (= regex anchor to "start of line") didn't copy over into your config files. Also, the C:\... line shouldn't survive the first SEDCMD.

essklau
Path Finder

The second line results in
datahere=Cdatahere=:datahere=\datahere=S.....
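
For comparison, the same substitutions can be tried in Python, outside Splunk. This is only an illustration of regex anchor semantics, not a statement about how Splunk's sed mode is implemented: it shows what the two SEDCMD lines are intended to do, and that an empty pattern (a missing caret) applied globally reproduces the prefix-before-every-character output. The shortened sample and the variable names are mine.

```python
import re

# Shortened sample: the "C:" line plus a few data lines.
raw = "\n".join([
    r'C:\Superfiend\howdydoody>bingotime.tst lots of spaces and silliness ending with double quotes"',
    r"DDD32\Jump Start",
    r"DDD32\Stacktime",
    r"FTYA",
])

# Intent of s/^[^\n\r]+[\n\r]+//: drop the first line and its line break.
body = re.sub(r"^[^\n\r]+[\n\r]+", "", raw, count=1)

# Without multiline mode, ^ matches only at the very start of the text.
first_only = re.sub(r"^", "datahere=", body)

# With multiline mode, ^ matches at the start of every line.
every_line = re.sub(r"^", "datahere=", body, flags=re.MULTILINE)

# An empty pattern matches before every character (and at the end).
# Applied to the unmodified text, this looks like the reported output.
every_char = re.sub(r"", "datahere=", raw)

print(first_only)
print(every_line)
print(every_char[:45])
```

In this emulation, only the multiline variant prefixes each line once, so whether the caret is honoured as a line anchor is the thing to check in the actual config.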

essklau
Path Finder

SEDCMD in props.conf.spec says it is only used at index time, but you say this would stick to search-time extraction?
