Splunk Search

How to optimize a regex function of concurrent consonants?

twisterdavemdCM
New Member

I'm trying to calculate a potential risk score from the number of concurrent consonants in a domain name. (e.g. egorklwqyrjvbsxvhvcws.com is rarely a domain that people intentionally browse... 🙂

So I'm psudo-coding for Splunk in my mind, and I'm envisioning a mess of PCRE regex for assessment criterion that's going to thrash our forwarders and indexers.

Is there a better way to implement the following structure?:

Set (Consonant_Risk_value) = 0%

IF Rex(domain_name)/([bcdfghjklmnpqrstvwxyz]{5})/i OR Rex(domain_name)/([bcdfghjklmnpqrstvwxyz]{6})/I
THEN set (Consonant_Risk_value) = 40%

ELSE 

IF Rex(domain_name)/([bcdfghjklmnpqrstvwxyz]{7})/i OR Rex(domain_name)/([bcdfghjklmnpqrstvwxyz]{8})/I

THEN set (Consonant_Risk_value) = 60%

ELSE

IF Rex(domain_name)/([bcdfghjklmnpqrstvwxyz]{>8})/i

THEN set (Consonant_Risk_value) = 80% 
0 Karma

richgalloway
SplunkTrust
SplunkTrust

For something similar, check out the ut_shannon() function in the URL Toolbox app (https://splunkbase.splunk.com/app/2734/#/details).

---
If this reply helps you, Karma would be appreciated.
0 Karma

woodcock
Esteemed Legend

Like this:

... | eval Consonant_Risk_value=case((match(domain_name, "[bcdfghjklmnpqrstvwxyz]{9,})/i")), "80%",
                                     ((match(domain_name, "[bcdfghjklmnpqrstvwxyz]{7})/i")) OR
                                      (match(domain_name, "[bcdfghjklmnpqrstvwxyz]{8})/I"))), "60%",
                                     ((match(domain_name, "[bcdfghjklmnpqrstvwxyz]{5})/i")) OR
                                      (match(domain_name, "[bcdfghjklmnpqrstvwxyz]{6})/I"))), "40%",
                                     true(), "0%")

P.S. Have you heard about Shannon Entropy?
https://www.splunk.com/blog/2016/04/21/when-entropy-meets-shannon/

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...