Multiple fields per event with different data

Karunamon
Explorer

I'm trying to load a large-ish collection of file upload reports into Splunk. Our application generates a report file which contains lines that look something like this:

"folder1/folder2/aa1da494-7879-476d-8f49-699fadfb3390/rep_156883_6d4f1dba35a867f381cbb7a5fa1928c1_M30_700.tar" 152504823B 0B completed"

There will be anywhere from one to hundreds of these lines in the file, each one referencing a different file name. What I'm trying to do is capture these lines using a field extraction. My problem is that only the first line is actually being captured. If I search for the second or any later line using the extraction I defined, nothing is found.

The regex being used for the fields is pretty simple:

(?im)(?<=^")(?P<FIELDNAME>.+?)(?=")

Props.conf for the file:

#etc/system/local/props.conf
[TransferManifest]
BREAK_ONLY_BEFORE = ## Transfer manifestation
MUST_BREAK_AFTER = Total transferred bytes: \d*
NO_BINARY_CHECK = 1
SHOULD_LINEMERGE = true
TIME_PREFIX = ## Start:
pulldown_type = 1
TRUNCATE=900000
MAX_EVENTS=90000

(The absurd values for truncate and max_events reflect the occasional large size of these files, mostly composed of lines like the one mentioned above)

The inputs.conf for these files:

#etc/apps/search/local/inputs.conf
[monitor:///mnt/apps/upload_reports]
disabled = false
followTail = 0
source = UploadReports
host = uploadserver01.company.com
index = upload_reports
sourcetype = TransferManifest
whitelist = *.txt

I've defined MV_ADD and REPEAT_MATCH as True in my system/local/transforms.conf.

Any idea what I'm missing here?

jeff
Contributor

Try this instead:

(?im)"(?<FIELDNAME>[^"]+)".+[\r\n]{0,2}

If all of the lines in your log are captured in a single event, then ^ marks the beginning of your record. You may also try (?im-s), which explicitly turns off single-line mode...

Karunamon
Explorer

Thanks Jeff, that's what I ended up needing. I was approaching this completely the wrong way 🙂

Karunamon
Explorer

Aha. That might be part of the problem, as I've been running this through the standard field extractions UI (manager/search/data/props/extractions)

jeff
Contributor

Where are you defining your regex? If you are doing multivalued extraction at search time, you'll need to define that in transforms.conf (and refer to it in props.conf) on your search head.
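
For reference, a search-time multivalue extraction of the kind described here might look like the sketch below. The stanza name transfer_manifest_files is hypothetical, FIELDNAME is the placeholder used earlier in this thread, and the regex is copied from the accepted answer, so treat this as a starting point under those assumptions rather than a tested configuration.

```
# transforms.conf on the search head
# (stanza name "transfer_manifest_files" is a hypothetical example)
[transfer_manifest_files]
# Regex from the accepted answer; the named group becomes the field name.
REGEX = (?im)"(?<FIELDNAME>[^"]+)".+[\r\n]{0,2}
# Keep every match as a value of a multivalue field, not just the first one.
MV_ADD = true

# props.conf on the search head, under the sourcetype from the original post
[TransferManifest]
# REPORT-<class> points at the transforms.conf stanza defined above.
REPORT-transfer_manifest_files = transfer_manifest_files
```

After a change like this, the btool command mentioned elsewhere in the thread can help confirm which copy of the stanza the search head is actually reading.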

Karunamon
Explorer

No output - there are no transforms defined yet on this instance.

jeff
Contributor

Shoot... thought that might have been it. I ran into something similar way back in the day. REPEAT_MATCH only applies at index time, so it's a non-issue. What does the output from

splunk cmd btool transforms list <your stanza> --debug

on your search head look like?

Karunamon
Explorer

Trying both of those, there doesn't appear to be any change. The first item in each event is still the only one captured in the field.
