Splunk Search

How can I use regex with wildcard patterns in a search to capture a host field?

martin_smith
Engager

Can simple regular expressions be used in searches?

I'm trying to capture a fairly simple pattern for the host field. For example a host name might be T1234SWT0001 and I'd like to capture any device with "T" + "four digits" + "SWT" + "anything". I think the regex would be something along the lines of T\d\d\d\dSWT.

1 Solution

Glenn
Builder

AFAIK, you unfortunately can't do regex-style matching in the initial part of the search (i.e. the bit before the first "|" pipe). This is probably because Splunk looks up "tokens" in the index using string matching (or substring matching when you use the non-regex * wildcard). Splunk only accepts the * wildcard here; see http://docs.splunk.com/Documentation/Splunk/6.3.1/Search/Usethesearchcommand#Keywords.2C_phrases.2C_...

So, if you want to match with a regular expression, you need to take the approach of searching for all data before the pipe, and then filtering after the pipe with the regex command. In your case, this would be:

index=myindex your search terms | regex host="^T\d{4}SWT.*"

^ anchors this match to the start of the host value (this assumes that "T" will always be the first character in the host field; if not, remove the caret "^" from the regex)
T is your literal character "T" match
\d{4} matches exactly four digits (\d)
S is your literal character "S" match
W is your literal character "W" match
T is your literal character "T" match
.* matches zero or more of any character. It is technically not required (and slightly less efficient), because the regex command matches anywhere in the field value, so the search still matches without it.
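
If you want to sanity-check the pattern outside Splunk, here is a minimal Python sketch. It assumes Python's re module behaves like Splunk's regex engine for a pattern this simple, which should hold for ^, \d{4} and literal characters, but it is not a substitute for testing in Splunk itself. The function name and every host name except T1234SWT0001 are made up for illustration.

```python
import re

# Same pattern as in the search above, minus the optional trailing ".*".
HOST_PATTERN = re.compile(r"^T\d{4}SWT")


def host_matches(host: str) -> bool:
    """Return True if the host starts with T + four digits + SWT."""
    # re.search is unanchored, so only the "^" ties the match
    # to the start of the value.
    return HOST_PATTERN.search(host) is not None


# Hypothetical host names; only T1234SWT0001 comes from the question.
sample_hosts = [
    "T1234SWT0001",   # T + four digits + SWT + suffix
    "T123SWT0001",    # only three digits
    "XT1234SWT0001",  # does not start with T
    "T1234SWT",       # nothing after SWT
    "T12345SWT01",    # five digits
]

matching = [h for h in sample_hosts if host_matches(h)]
print(matching)
```

Only T1234SWT0001 and T1234SWT are kept. The second one shows why the trailing .* is optional: the pattern is satisfied as soon as "SWT" has been matched, whatever follows.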

regex command doc: http://docs.splunk.com/Documentation/Splunk/6.3.1/SearchReference/Regex
