Regex to extract fields with different format

nuaraujo
Path Finder

Hello all,

I need your help writing a regex that can extract fields from some messages.

Example 1
USER: user1 UPDATED CUSTOMER - 123456. Added new user. New user was added.

What I want to extract from this message:
username: user1
operation: UPDATED CUSTOMER (always two words in uppercase)
customer_id: 123456 (always preceded by "-" and ending with ".") (not available in all messages)
comment: Added new user. New user was added

Example2
USER: user2 ADDED COUNTRY with identifier: Germany

What I want to extract from this message:
username: user2
operation: ADDED COUNTRY (always two words in uppercase)
comment: with identifier: Germany (in this message I do not have customer_id field)

I am using the following regex, which is far from accurate. It works for the first example but not for the second, because it always expects the " - " that precedes the customer ID:

 ... | rex field=message    "^USER: (?P<username>.+?) (?P<operation>[A-Z].+?) - (?P<id>.+?)\. (?P<message>.*)"

The final result I am looking for:
| username | operation        | id     | message                            |
| user1    | UPDATED CUSTOMER | 123456 | Added new user. New user was added |
| user2    | ADDED COUNTRY    |        | with identifier: Germany           |

Can someone help me build a general regex, please?
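
To make the failure easy to reproduce outside Splunk, here is a minimal sketch that runs the pattern above against both example messages using Python's `re` module. This is only an approximation: Splunk's `rex` uses a PCRE-style engine, and the names `ORIGINAL`, `EXAMPLE_1` and `EXAMPLE_2` are made up for illustration. The pattern itself is copied unchanged from the question.

```python
import re

# Pattern copied from the question; Python's re accepts the (?P<name>...) syntax.
ORIGINAL = re.compile(
    r"^USER: (?P<username>.+?) (?P<operation>[A-Z].+?) - (?P<id>.+?)\. (?P<message>.*)"
)

# The two example messages from the question.
EXAMPLE_1 = "USER: user1 UPDATED CUSTOMER - 123456. Added new user. New user was added."
EXAMPLE_2 = "USER: user2 ADDED COUNTRY with identifier: Germany"

match_1 = ORIGINAL.match(EXAMPLE_1)
match_2 = ORIGINAL.match(EXAMPLE_2)

# Example 1 contains " - " and a "." after the ID, so every group is filled.
print(match_1.groupdict() if match_1 else None)

# Example 2 has no " - ", so the mandatory literal cannot match and the
# whole pattern fails; no field is extracted at all.
print(match_2.groupdict() if match_2 else None)
```

The second message fails as a whole because the " - " and the ID group are mandatory, which suggests the ID part needs to be optional.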


javiergn
Super Champion

Hi @nuaraujo,

Try the following regex instead. I've tested it in my lab with your two examples and it seems to work fine. Note that I am assuming your customer ID is a number, so you might need to tweak that if that's not the case.

^USER: (?P<username>\S+)\s+(?P<operation>[A-Z]+ [A-Z]+)(\s+\-\s+(?P<customerid>\d+)\.)?\s+(?P<message>.*)

Thanks,
J
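
As a quick check outside Splunk, the regex in this answer can be exercised with Python's `re` module. This is a hedged sketch, not a Splunk search: the helper name `extract` and the constant `ACCEPTED` are hypothetical, and Python's engine is only assumed to behave like Splunk's for this particular pattern.

```python
import re

# Regex from the answer above, split over two lines for readability only.
ACCEPTED = re.compile(
    r"^USER: (?P<username>\S+)\s+(?P<operation>[A-Z]+ [A-Z]+)"
    r"(\s+\-\s+(?P<customerid>\d+)\.)?\s+(?P<message>.*)"
)


def extract(message):
    """Return the named groups as a dict, or None if the message does not match."""
    match = ACCEPTED.match(message)
    return match.groupdict() if match else None


# The two example messages from the question.
with_id = extract(
    "USER: user1 UPDATED CUSTOMER - 123456. Added new user. New user was added."
)
without_id = extract("USER: user2 ADDED COUNTRY with identifier: Germany")

print(with_id)
print(without_id)
```

The optional group `( ... )?` around the dash and the digits is what lets the second message match. In Python the missing `customerid` comes back as `None`; in Splunk the field would simply not be created for that event.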


nuaraujo
Path Finder

Thanks 🙂

lloydknight
Builder

Hello @nuaraujo

Try something like this:

... | rex field=message "USER:\s(?<username>.+?)\s(?<operation>\w+\s\w+)(?:\s\-\s(?<id>\d+)\.)?\s(?<message>.*)"

Hope it helps!

p_gurav
Champion

Can you try:

| rex field=message    "^USER: (?P<username>.+?) (?P<operation>\w+) (?P<comment>.*)" | rex field=comment "CUSTOMER - (?P<id>[^\.]+)"

nuaraujo
Path Finder

Thanks @p_gurav.

Your suggestion is already a good solution. However, could you help me get two words for "operation"? With your suggestion, I am only getting one. Even so, BIG THANK YOU for your quick reply.

gcusello
SplunkTrust

Hi nuaraujo,
try this:

| rex field=message    "^USER: (?P<username>.+?) (?P<operation>[A-Z]+ [A-Z]+) (?<comment>.*)"
| rex field=comment "- (?<id>\d+)\. (?<message>.*)"

This way you get all the fields.

Bye.
Giuseppe
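
The two-step idea above (extract a broad `comment` first, then pull `id` and `message` out of it) can be sketched in Python to see what each step produces. This is an illustration under assumptions, not a Splunk search: `extract_two_stage`, `STAGE_1` and `STAGE_2` are hypothetical names, and the `(?<name>...)` groups are rewritten as `(?P<name>...)` because Python's `re` only accepts the latter.

```python
import re

# Stage 1 mirrors the first rex: username, two-word operation, and the rest.
STAGE_1 = re.compile(
    r"^USER: (?P<username>.+?) (?P<operation>[A-Z]+ [A-Z]+) (?P<comment>.*)"
)
# Stage 2 mirrors the second rex, applied to the extracted comment only.
STAGE_2 = re.compile(r"- (?P<id>\d+)\. (?P<message>.*)")


def extract_two_stage(message):
    """Apply both stages in order; later fields are added only when they match."""
    fields = {}
    first = STAGE_1.search(message)
    if first is None:
        return fields
    fields.update(first.groupdict())
    second = STAGE_2.search(fields["comment"])
    if second is not None:
        fields.update(second.groupdict())
    return fields


# The two example messages from the question.
row_1 = extract_two_stage(
    "USER: user1 UPDATED CUSTOMER - 123456. Added new user. New user was added."
)
row_2 = extract_two_stage("USER: user2 ADDED COUNTRY with identifier: Germany")

print(row_1)
print(row_2)
```

Note that for a message without a customer ID the second step does not match, so in this sketch the free text stays in `comment` and no `message` key is produced for that row.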
