Splunk Search

How can I use regex with wildcard patterns in a search to capture a host field?

martin_smith
Engager

Can simple regular expressions be used in searches?

I'm trying to capture a fairly simple pattern for the host field. For example, a host name might be T1234SWT0001, and I'd like to capture any device with "T" + four digits + "SWT" + anything. I think the regex would be something along the lines of T\d\d\d\dSWT.

1 Solution

Glenn
Builder

AFAIK you unfortunately can't do regex-style matching in the initial part of the search (i.e. the bit before the first "|" pipe). This is probably because Splunk looks up "tokens" in the index using string matching (or substring matching when you use the non-regex * wildcard). Splunk only accepts the * wildcard here; see http://docs.splunk.com/Documentation/Splunk/6.3.1/Search/Usethesearchcommand#Keywords.2C_phrases.2C_...

So, if you want to match with a regular expression, you need to retrieve the candidate events before the pipe and then filter them after the pipe with the regex command. In your case, this would be:

index=myindex your search terms | regex host="^T\d{4}SWT.*"

^ anchors the match to the start of the host value. This assumes that "T" is always the first character of the host field; if not, remove the caret "^" from the regex.
T is your literal character "T" match
\d{4} matches exactly four digits (\d is any single digit)
S is your literal character "S" match
W is your literal character "W" match
T is your literal character "T" match
.* matches zero or more of any character. It is technically not required (and slightly less efficient), because the regex command does not have to match the whole value, so the pattern still matches without it.
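
If you want to sanity-check the pattern outside Splunk first, here is a minimal Python sketch. It assumes Python's re module treats this particular pattern the same way as the PCRE engine behind the regex command, and the host names are made up for illustration. It also checks that the trailing .* makes no difference to which hosts match.

```python
import re

# Same pattern as in the search above, without the optional trailing ".*".
HOST_PATTERN = re.compile(r"^T\d{4}SWT")
# Variant with the trailing ".*", to confirm it selects the same hosts.
HOST_PATTERN_WITH_TAIL = re.compile(r"^T\d{4}SWT.*")

# Hypothetical host names, invented for this example.
sample_hosts = [
    "T1234SWT0001",   # T + four digits + SWT + suffix
    "T1234SWT",       # nothing after SWT
    "T123SWT0001",    # only three digits
    "XT1234SWT0001",  # does not start with T
    "T12345SWT01",    # five digits
]


def matches(host: str) -> bool:
    # re.search is unanchored, so only the explicit ^ ties the match
    # to the start of the value.
    return HOST_PATTERN.search(host) is not None


matching = [h for h in sample_hosts if matches(h)]

# True if both patterns agree on every sample host.
same_with_tail = all(
    (HOST_PATTERN_WITH_TAIL.search(h) is not None) == matches(h)
    for h in sample_hosts
)

print(matching)
print(same_with_tail)
```

Only the first two sample hosts pass, and dropping the ^ would let XT1234SWT0001 through as well.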

regex command doc: http://docs.splunk.com/Documentation/Splunk/6.3.1/SearchReference/Regex
