Splunk Search

Another Noobie question: how to search for terms / keywords that are "close" to each other?

ks5752
Engager

Hi,

I've been searching around the forum and have been unable to find any guidance on this question. I figure that I am either approaching the task incorrectly or not using the right search keywords. Anyway, here goes with the question...

I'd like to be able to find records that have specific keywords that are within close proximity of each other. For instance with the phrase below...

WHEN IN THE COURSE OF HUMAN EVENTS IT BECOMES NECESSARY FOR ONE PEOPLE TO DISSOLVE

and

WHEN THE EVENTS ARE POSTED AND THE PEOPLE REACT, IT BECOMES A MESS

When matching on the keywords EVENTS and BECOMES, I would like to be able to control which records are returned based on how close these two keywords are to each other...

PSEUDO CODE

FIND RECORDS WHERE "EVENTS" IS WITHIN 2 WORDS OF "BECOMES"

that way, the query would identify the first phrase and not the second phrase...

or

PSEUDO CODE

FIND RECORDS WHERE "EVENTS" IS GREATER THAN 3 WORDS APART FROM "BECOMES"

that variant would then identify the second phrase....

Any guidance would be appreciated.

Thanks.


martin_mueller
SplunkTrust

To quote xkcd... stand back, I know regular expressions!

... | regex _raw="EVENTS(\W+\w+){0,2}\W*BECOMES"

... | regex _raw="EVENTS(\W+\w+){3,}\W*BECOMES"

The first one allows for zero to two words between the two keywords, the second requires at least three.
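The behaviour of the two patterns can be checked outside Splunk. The sketch below is only an illustration in Python, on the assumption that Python's `re` module treats these particular patterns the same way as the regex command does for plain ASCII text like this. The names `NEAR`, `FAR` and `phrases` are made up for the example.

```python
import re

# The two patterns from the answer above, copied unchanged.
NEAR = re.compile(r"EVENTS(\W+\w+){0,2}\W*BECOMES")  # 0 to 2 words in between
FAR = re.compile(r"EVENTS(\W+\w+){3,}\W*BECOMES")    # 3 or more words in between

# The two sample phrases from the question.
phrases = [
    "WHEN IN THE COURSE OF HUMAN EVENTS IT BECOMES NECESSARY FOR ONE PEOPLE TO DISSOLVE",
    "WHEN THE EVENTS ARE POSTED AND THE PEOPLE REACT, IT BECOMES A MESS",
]

# One boolean per phrase: does the pattern match anywhere in it?
near_hits = [bool(NEAR.search(p)) for p in phrases]
far_hits = [bool(FAR.search(p)) for p in phrases]

print(near_hits)  # [True, False]
print(far_hits)   # [False, True]
```

In the first phrase only one word (IT) sits between the keywords, so only the {0,2} pattern matches. In the second phrase seven words sit between them, so only the {3,} pattern matches.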

gkanapathy
Splunk Employee

Oh, it is quite important to add that the tokens "EVENTS" and "BECOMES" should be in the base search, or the search will be wastefully inefficient. So e.g.:

sourcetype=presidential_speeches EVENTS BECOMES | regex ...

otherwise you will retrieve events without the words you're interested in, and have to filter them with regex. By specifying the terms in the base search, you will avoid retrieving events/documents that don't contain the terms of interest at all. This is very similar really to how Splunk does phrase and key-value searching.

gkanapathy
Splunk Employee

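The final \W* should probably be \W+ to avoid the last word being merged into BECOMES (with \W*, a word such as UNBECOMES would still match). Also, for text searches in general, it probably makes sense to specify (?i) at the start of the pattern so that the match is case-insensitive.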
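The merging problem is easy to reproduce. The following is a rough Python sketch, again assuming Python's `re` behaves like Splunk's regex engine for these patterns. It compares the original pattern with a variant that requires at least one non-word character (`\W+`) before BECOMES and adds `(?i)`. The names `loose`, `strict` and the test strings are invented for the example.

```python
import re

# Original pattern: the final \W* may match nothing at all.
loose = re.compile(r"EVENTS(\W+\w+){0,2}\W*BECOMES")
# Variant: at least one separator before BECOMES, case-insensitive.
strict = re.compile(r"(?i)EVENTS(\W+\w+){0,2}\W+BECOMES")

# BECOMES appears here only as the tail of a longer word.
merged = "EVENTS IT UNBECOMES"
lowercase = "when in the course of human events it becomes necessary"

loose_on_merged = bool(loose.search(merged))      # matches "UN" + "BECOMES"
strict_on_merged = bool(strict.search(merged))    # no separator, so no match
strict_on_lower = bool(strict.search(lowercase))  # (?i) ignores case
loose_on_lower = bool(loose.search(lowercase))    # case-sensitive, no match

print(loose_on_merged, strict_on_merged)  # True False
print(strict_on_lower, loose_on_lower)    # True False
```

Note that neither pattern anchors the left side of EVENTS, so a word that merely ends in EVENTS would still count. Adding \b on both sides of each keyword would be one way to tighten that.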


yannK
Splunk Employee

brilliant.
