Splunk Search

How do I use rex to extract a field that may contain an ampersand?

saulverde
Path Finder

I have a non-standardized field in one of the logs that we pull. I am building an inline rex string to extract the field. The string below extracts every value except one that it should also capture. That entry contains an ampersand "&".

regex: \[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w.{1,3}])\]

Data not extracted with this regex: D - External Subnets - AT&T

I have tried the following:

\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+&\w])\]
\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+"&"\w])\]
\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+\&\w])\]
\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+\\&\w])\]
\[Site:\s(?<site>\w\s[[-\s\w+\s]+|[-\s\w+\s\w+\s]+]-[\s\w+|\s\w+\\\&\w])\] - I tried this after researching some perl coding suggestions

I believe this is because the ampersand is used to repeat the previously matched pattern. I'm not sure how to escape the ampersand so it reads as a literal value. I also haven't been able to find any reference for a specific character sequence to use in place of the ampersand to search for it.

Thanks for any help you can offer.
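
A note on the ampersand itself: in Perl-compatible regular expressions, the flavor rex uses, "&" is not a metacharacter and matches itself without any escaping. A likelier reason the patterns above miss "AT&T" is that \w only matches letters, digits, and underscore. The following is a minimal sketch of both points, using Python's re module as a stand-in for Splunk's regex engine (an assumption made only so the example can be run outside Splunk):

```python
import re

value = "AT&T"

# "&" has no special meaning in a regex pattern, so it matches itself unescaped.
literal_match = re.search(r"AT&T", value)

# \w covers only letters, digits, and underscore, so \w+ cannot span the "&".
word_only = re.fullmatch(r"\w+", value)

# Adding "&" to a character class lets the whole value match.
with_ampersand = re.fullmatch(r"[\w&]+", value)

print(literal_match is not None)  # True
print(word_only is None)          # True
print(with_ampersand.group(0))    # AT&T
```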

1 Solution

richgalloway
SplunkTrust

This should do the job. It will catch everything between "[Site: " and "]".

"\[Site:\s*(?P<site>.*)\]"
---
If this reply helps you, Karma would be appreciated.

saulverde
Path Finder

Thank you.

When I use that, it also pulls in the adjacent field, which is the IP address, so I end up with far too many unique field values.
Sample data with adjacent field:
[Site: V - A - VLAN 213 - Full] [XXX.XXX.XXX.XXX]
Field values being pulled now:
V - A - VLAN 213 - Full] [xxx.xxx.xxx.xxx


richgalloway
SplunkTrust

Making the quantifier less greedy should fix that.

"\[Site:\s*(?P<site>.*?)\]"
---
If this reply helps you, Karma would be appreciated.
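
To illustrate why the trailing "?" matters: a greedy ".*" consumes as much as it can and backtracks only to the last "]" in the event, while the lazy ".*?" stops at the first "]" that lets the match succeed. Here is a small sketch of the difference on the sample event from this thread, again assuming Python's re module behaves like Splunk's engine for these two patterns:

```python
import re

# Sample event from the thread, including the adjacent IP field.
event = "[Site: V - A - VLAN 213 - Full] [XXX.XXX.XXX.XXX]"

# Greedy: .* runs to the end of the event, then backs up to the LAST "]".
greedy = re.search(r"\[Site:\s*(?P<site>.*)\]", event)

# Lazy: .*? grows one character at a time and stops at the FIRST "]".
lazy = re.search(r"\[Site:\s*(?P<site>.*?)\]", event)

print(greedy.group("site"))  # V - A - VLAN 213 - Full] [XXX.XXX.XXX.XXX
print(lazy.group("site"))    # V - A - VLAN 213 - Full
```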

saulverde
Path Finder

That worked perfectly. Thanks. I'll look up the trailing "?" to see why that solved the problem, but thank you very much for your help.


richgalloway
SplunkTrust

Can you supply some sample data? If not, what terminates the Site field?

---
If this reply helps you, Karma would be appreciated.

saulverde
Path Finder

The closing square bracket is the termination of the value in the log.

Here are a couple of examples. Like I said, the field doesn't have a standardized naming convention, so I did my best with the regex above, which catches everything except the value that includes the ampersand.

Sample data that I need to extract:
[Site: V - A - VLAN 213 - Full]
[Site: D - External Subnets - AT&T]
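
Because the closing square bracket terminates the value, one common way to bound the capture is a negated character class, "[^\]]+", which matches any run of characters other than "]" and therefore accepts the ampersand without special handling. This pattern is an illustrative alternative, not one taken from the replies in this thread, and the sketch below assumes Python's re module as a stand-in for Splunk's engine:

```python
import re

# Hypothetical alternative pattern: capture everything up to the first "]".
SITE_PATTERN = re.compile(r"\[Site:\s*(?P<site>[^\]]+)\]")

# The two sample values from the thread.
samples = [
    "[Site: V - A - VLAN 213 - Full]",
    "[Site: D - External Subnets - AT&T]",
]

sites = [SITE_PATTERN.search(sample).group("site") for sample in samples]

for site in sites:
    print(site)
```

Since the character class can never cross a "]", this form does not rely on greedy or lazy behavior to stop at the right place.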
