Splunk Search

Another Noobie question: how to search for terms / keywords that are "close" to each other?

ks5752
Engager

Hi,

I've searched the forum and haven't found any guidance on this question. Either I'm approaching the task incorrectly, or I'm not using the right search keywords. Anyway, here goes with the question...

I'd like to find records that contain specific keywords within close proximity of each other. For instance, take the two phrases below:

WHEN IN THE COURSE OF HUMAN EVENTS IT BECOMES NECESSARY FOR ONE PEOPLE TO DISSOLVE

and

WHEN THE EVENTS ARE POSTED AND THE PEOPLE REACT, IT BECOMES A MESS

When matching on the keywords EVENTS and BECOMES, I would like to control which records are returned based on how close the two keywords are to each other.

PSEUDO CODE

FIND RECORDS WHERE "EVENTS" IS WITHIN 2 WORDS OF "BECOMES"

That way, the query would match the first phrase but not the second.

or

PSEUDO CODE

FIND RECORDS WHERE "EVENTS" IS GREATER THAN 3 WORDS APART FROM "BECOMES"

That variant would then match the second phrase.

Any guidance would be appreciated.

Thanks.

martin_mueller
SplunkTrust

To quote xkcd... stand back, I know regular expressions!

... | regex _raw="EVENTS(\W+\w+){0,2}\W*BECOMES"

... | regex _raw="EVENTS(\W+\w+){3,}\W*BECOMES"

The first one allows for zero to two words between the two keywords; the second requires at least three.
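As a quick sanity check outside Splunk, the two patterns above can be run against the sample phrases from the question. This is only a sketch using Python's `re` module, on the assumption that it treats these particular constructs (`\W`, `\w`, `{m,n}`) the same way Splunk's regex engine does; the helper name `matches` is made up for this illustration.

```python
import re

# The two sample phrases from the question.
first = "WHEN IN THE COURSE OF HUMAN EVENTS IT BECOMES NECESSARY FOR ONE PEOPLE TO DISSOLVE"
second = "WHEN THE EVENTS ARE POSTED AND THE PEOPLE REACT, IT BECOMES A MESS"

# Same patterns as in the two regex commands above.
near = re.compile(r"EVENTS(\W+\w+){0,2}\W*BECOMES")  # at most two words in between
far = re.compile(r"EVENTS(\W+\w+){3,}\W*BECOMES")    # at least three words in between

def matches(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None

for name, pattern in (("near", near), ("far", far)):
    print(name, matches(pattern, first), matches(pattern, second))
# near True False
# far False True
```

The first phrase has one word (IT) between the keywords, so only the "near" pattern finds it; the second has seven, so only the "far" pattern does.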

gkanapathy
Splunk Employee

Oh, it is quite important to add that the tokens "EVENTS" and "BECOMES" should be in the base search, or the search will be wastefully inefficient. For example:

sourcetype=presidential_speeches EVENTS BECOMES | regex ...

Otherwise you will retrieve events that lack the words you're interested in and then have to filter them out with regex. Specifying the terms in the base search avoids retrieving events/documents that don't contain the terms of interest at all. This is very similar to how Splunk does phrase and key-value searching.

gkanapathy
Splunk Employee

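The final \W* should probably be \W+ to avoid the last word being merged into BECOMES: as written, \W* can match nothing, so a word that merely ends in BECOMES would still match. Also, for text searches in general, it probably makes sense to specify (?i) so the match is case-insensitive.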
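To illustrate the point about the trailing separator, here is a small sketch, again in Python's `re` rather than Splunk itself, comparing the pattern as posted with a variant that requires at least one non-word character before BECOMES and adds (?i). The sample strings and the names `loose`, `strict` and `hit` are hypothetical, invented only for this check.

```python
import re

# Pattern as originally posted: the trailing \W* may match nothing at all.
loose = re.compile(r"EVENTS(\W+\w+){0,2}\W*BECOMES")
# Adjusted variant: \W+ forces a separator before BECOMES, (?i) ignores case.
strict = re.compile(r"(?i)EVENTS(\W+\w+){0,2}\W+BECOMES")

# Hypothetical sample texts, made up for this illustration.
merged = "THE EVENTS IT UNBECOMES"
lower = "when in the course of human events it becomes necessary"

def hit(pattern, text):
    """Return True if the pattern is found anywhere in the text."""
    return pattern.search(text) is not None

print(hit(loose, merged))   # True: "UN" is consumed as a word, then BECOMES follows directly
print(hit(strict, merged))  # False: no separator directly before BECOMES
print(hit(strict, lower))   # True: (?i) lets the lowercase text match
```

Neither variant guards the outer edges of the keywords (for example a longer word that ends in EVENTS), so adding \b word boundaries around them may also be worth considering.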

yannK
Splunk Employee

brilliant.
