Splunk Search

Regex for multi-value field in which some values are listed and then aren't

jwalzerpitt
Influencer

I am trying to create a regex for a multivalue field (Message) in which some values are sometimes listed and sometimes not, depending on the event. We are ingesting Shibboleth logs via _json format, and I am trying to extract three values from the Message field: URL, username, and src_ip (in bold in each event).

There are three different events for Shibboleth.
[Screenshot: the three Shibboleth event types]

Is it possible to create a regex that would apply to all three events?

I have one regex that covers the first event and extracts the three fields.

[Screenshot: the regex for the first event]

Thx


inventsekar
Ultra Champion

The username and src_ip extractions look good. Maybe combine those two in a single rex and use a separate rex for the URL.

(The screenshot is fine for reading.) Could you please paste the logs and your rex as text so that we can test it?


jwalzerpitt
Influencer

I've been struggling with the frustrating code-tag markdown: I selected the code button, which adds the tick marks to the beginning and end of the code, but the page still yells at me when I go to post it.


inventsekar
Ultra Champion

Posting it in a comment would be difficult. Please post it as a separate answer, or edit your question and add the text.


jwalzerpitt
Influencer

I believe I figured it out - I had to create three separate regexes, one for each field, and when evaluating, I did not see any Non-Matches for each regex. Regexes are as follows:

^(?:[^|\n]*\|){13}(?P<…>[^|]+)
^(?:[^|\n]*\|){3}(?P<…>[^|]+)
^(?:[^|\n]*\|){8}(?P<…>[^|]+)
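
For anyone reading along, here is a rough Python sketch of how this positional pattern behaves: skip N pipe-terminated fields, then capture the next non-empty one. The helper name `nth_field_pattern`, the capture name `value`, and the sample line are all made up for illustration; this is not a real Shibboleth event, and Splunk's own engine is PCRE rather than Python's `re`.

```python
import re

def nth_field_pattern(skip: int, name: str) -> re.Pattern:
    """Build a regex that skips `skip` pipe-terminated fields,
    then captures the next field if it is non-empty."""
    # {{{skip}}} renders as a literal quantifier such as {3}.
    return re.compile(rf"^(?:[^|\n]*\|){{{skip}}}(?P<{name}>[^|]+)")

# Hypothetical pipe-delimited line (field index 5 is empty on purpose).
sample = "f0|f1|f2|f3|f4||f6"

present = nth_field_pattern(3, "value").search(sample)
empty = nth_field_pattern(5, "value").search(sample)
after_empty = nth_field_pattern(6, "value").search(sample)

print(present.group("value"))      # the field after three pipes
print(empty)                       # None: an empty field produces no match
print(after_empty.group("value"))  # counting still works past the empty field
```

The point of the sketch is that an empty field simply yields no match for that one regex, which is why three independent regexes cope with events where some values are missing.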


sundareshr
Legend

It would be safer to create three separate regexes. That way, extraction is not affected by minor changes to the log format. I would suggest creating three separate field extraction rules in the field extraction UI.

*UPDATED*

props.conf
[stanza_name]
REPORT-extract_mv_fields = extract_url, extract_src_ip, extract_user

transforms.conf
[extract_url]
REGEX=(?<url>http[^\|]+)
MV_ADD=true

[extract_src_ip]
REGEX=(?<src_ip>\d+\.\d+\.\d+\.\d+)
MV_ADD=true

[extract_user]
REGEX=\|{4}(?<user>\w+)\|{5}"
MV_ADD=true
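
To illustrate why content-based patterns like the URL and IP regexes above tolerate fields moving around, here is a small Python sketch. It is only an approximation: Python needs `(?P<name>...)` instead of `(?<name>...)`, `findall` is used as a rough stand-in for what MV_ADD=true does with repeated matches, and both sample messages (including the `example.org` URL and the `192.0.2.10` address) are invented, not real Shibboleth events.

```python
import re

# Same patterns as the transforms, rewritten in Python's named-group syntax.
URL_RE = re.compile(r"(?P<url>http[^|]+)")
SRC_IP_RE = re.compile(r"(?P<src_ip>\d+\.\d+\.\d+\.\d+)")

def extract(message: str) -> dict:
    """Return every match per field as a list (loosely like a multivalue field)."""
    return {
        "url": URL_RE.findall(message),
        "src_ip": SRC_IP_RE.findall(message),
    }

# Hypothetical messages: the logout line has no URL and the IP sits in a different position.
login = "20240101T000000Z|https://sp.example.org/shibboleth|jdoe|192.0.2.10|"
logout = "20240101T000500Z||jdoe||192.0.2.10|"

print(extract(login))
print(extract(logout))
```

One caveat with this approach: the IP pattern matches any four dot-separated digit groups, so something like a dotted version number in the message would also be picked up.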

jwalzerpitt
Influencer

I tried to do three separate regexes, but Splunk yelled when I tried to reuse the extracted field names (url, username, src_ip) for the second regex (logout with username).

Thx


sundareshr
Legend

Try the updated answer.
