Splunk Field Extraction Error with Edited Regex

adumbrys
Explorer

I have logs that contain a long series of pipe-delimited fields.

My issue is that some fields have no value: instead of a placeholder character standing in for a NULL field, the field is simply left blank.

For example, this log records various information about site visitors, and some fields are left blank depending on the device and the parts of the website visited.

|wired|||||/||00005wcuu-jSbW_AypQB1ZDLdjH:180ds1m45|Search|||Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/34.0.1847.137 Safari/537.36|

But when I extend the regex to cover this case, I still receive "the generated regex was unable to match all examples".

Here is the regex I wrote in an attempt to extract the fields:

^(?P<FIELDNAME1>[^\|]+)\|(?P<FIELDNAME2>[^\|]+)\|(?P<FIELDNAME3>[^\|]+)\|(?P<FIELDNAME4>[^\|]+)\|(?P<FIELDNAME5>[^\|]+)\|(?P<FIELDNAME6>[^\|]+)\|(?P<FIELDNAME7>[^\|]+)\|(?P<FIELDNAME8>[^\|]+)\|(?P<FIELDNAME9>[^\|]+)\|(?P<FIELDNAME10>[^\|]+)\|(?P<FIELDNAME11>[^\|]+)\|(?P<FIELDNAME12>[^\|]+)

None of the log events will contain pipes within the fields, so I thought it would be simple enough to tell Splunk that anything (even nothing) between two pipes is a field.

Any suggestions are greatly appreciated!

kristian_kolb
Ultra Champion

One problem is that you use the one-or-more quantifier (the plus sign) for your non-pipe character classes, so an empty field can never match. Use the zero-or-more quantifier (the asterisk) or, perhaps better still, don't use regex at all.

Have you looked into REPORT instead of EXTRACT? This allows you to make your extractions by specifying a delimiter, e.g.

props.conf

[your_sourcetype]
REPORT-www = extract_weblog_fields

transforms.conf

[extract_weblog_fields]
DELIMS = "|"
FIELDS = field1, field2, field3, field4

Just name the fields appropriately. Read more on REPORT and FIELD EXTRACTION in the docs.
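
To make the idea of delimiter-based extraction concrete, here is a minimal Python sketch that splits the sample event from the question on the pipe character and pairs each position with a name. It only illustrates positional splitting and is not how Splunk implements DELIMS. The names field1 through field14 and the helper split_event are hypothetical, and how Splunk itself treats empty positions or a leading delimiter should be checked against the transforms.conf documentation.

```python
# Sample event copied from the question; it starts and ends with a pipe.
EVENT = (
    "|wired|||||/||00005wcuu-jSbW_AypQB1ZDLdjH:180ds1m45|Search|||"
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/34.0.1847.137 Safari/537.36|"
)

# Hypothetical placeholder names, one per position in the split event.
FIELD_NAMES = [f"field{i}" for i in range(1, 15)]


def split_event(event, names, delim="|"):
    """Split on the delimiter and pair each value with a name by position."""
    values = event.split(delim)  # empty fields come back as empty strings
    return dict(zip(names, values))


fields = split_event(EVENT, FIELD_NAMES)
for name, value in fields.items():
    print(f"{name} = {value!r}")
```

Because the sample event begins with a pipe, the first position is empty, "wired" lands in the second, and the user agent is the thirteenth value. The field list has to account for that offset.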

/K

adumbrys
Explorer

Can I do this when I don't have Splunk installed locally and am just accessing it through the browser?

I definitely need to do more reading on this. I suppose I could create the config files and submit them to my administrator?

richgalloway
SplunkTrust

By using '+' in your field descriptions you're telling the regex engine that there must be at least one character between pipes. Try using '*'.
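
The difference can be checked outside Splunk. The sketch below is a Python approximation, not Splunk's own extraction engine: it rebuilds the 12-group pattern from the question with each quantifier and tries it against the sample event. The helper build_pattern is hypothetical; the group names and the event come from the question.

```python
import re

# Sample event copied from the question.
EVENT = (
    "|wired|||||/||00005wcuu-jSbW_AypQB1ZDLdjH:180ds1m45|Search|||"
    "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/34.0.1847.137 Safari/537.36|"
)


def build_pattern(quantifier, count=12):
    """Join `count` named groups of non-pipe characters with literal pipes."""
    groups = [f"(?P<FIELDNAME{i}>[^|]{quantifier})" for i in range(1, count + 1)]
    return re.compile("^" + r"\|".join(groups))


plus_match = build_pattern("+").match(EVENT)  # requires 1+ chars per field
star_match = build_pattern("*").match(EVENT)  # allows empty fields

print("with +:", plus_match)
print("with *:", star_match.groupdict() if star_match else None)
```

On this sample the '+' version finds no match at all, while the '*' version matches with empty strings for the blank fields. Note that the event starts with a pipe, so FIELDNAME1 is empty and twelve groups stop short of the user agent; more groups would be needed to reach it.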

adumbrys
Explorer

Thank you! This caught the majority of the delimited fields. Now I just need to make sure each event in the logs is in exactly the same format.

I'm still getting a few small errors: one pair of pipes isn't being caught properly, and the user agent string in the last field is cut off after the first few letters.

But most importantly, I'm not getting the same problems as before, which is progress!
