I am running into an issue with the add-on "Fuzzy Search for Splunk". I am trying to use it to find malicious process names that are similar to a legitimate one, but the add-on can't seem to parse through hyphens and spaces. The search below gives me a 100% match with "legit-unique-services.exe" and "legit unique services.exe". There are a number of legitimate processes named like this.
| fuzzy wordlist="services.exe" compare_field=process_name
Is there anything I can do to fix this? Or does the add-on have to be updated to handle this?
Most likely the issue you are running into is with the "delims" option. From the add-on README:
Delims accepts a regex string, escaped splunk style, and defaults to
(\\\\|/|\s+|;|-)
So if you were to pass in a different delimiter to the command like:
| fuzzy wordlist="list.exe" delims="(\\\\)" compare_field=process_name
You may get better results.
Thanks mate. Won't using that delimiter mean that we just exclude process names that have hyphens/spaces in them entirely? Is there a way to include them and get them to match?
I looked at the code again to make sure I wasn't speaking out of turn and it basically works like this:
>>> import re
>>> pattern='(\\\\|/|\s+|;|-)'
>>> testdata='this-is-a-test.txt'
>>> matches=re.split(pattern, testdata)
>>> matches
['this', '-', 'is', '-', 'a', '-', 'test.txt']
>>> pattern='(\\\\)'
>>> testdata='this-is-a-test.txt'
>>> matches=re.split(pattern, testdata)
>>> matches
['this-is-a-test.txt']
The delims value is just a splitter, not a filtering mechanism despite the bad variable naming I used in the script. I'll have to rename that later on... At any rate, modifying the delims to not include the hyphen should solve your issue.
The delims option tells the command how to split each input value before it is compared against your wordlist. For example, given the input 'this-is-my-process.exe', the default value splits it into multiple words and compares your wordlist to each word: "this", "is", "my", and "process.exe".
By changing the delims value, you can change this behavior so that "this-is-my-process.exe" is evaluated as a whole word.
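To make the effect concrete, here is a minimal Python sketch of the split-then-compare behaviour described above. It is not the add-on's code: the helper names `tokens` and `best_ratio` are made up for illustration, `difflib.SequenceMatcher` is only a stand-in for whatever scoring the add-on really uses, and `PATH_ONLY_DELIMS` is a hypothetical alternative pattern that drops the hyphen and whitespace.

```python
import re
from difflib import SequenceMatcher

# README default, written here as a Python raw string.
DEFAULT_DELIMS = r"(\\|/|\s+|;|-)"
# Hypothetical alternative: split on path separators and ';' only.
PATH_ONLY_DELIMS = r"(\\|/|;)"


def tokens(value, delims):
    """Split value on delims, dropping empty pieces and the delimiter
    pieces that re.split keeps when the pattern has a capturing group."""
    return [p for p in re.split(delims, value)
            if p and not re.fullmatch(delims, p)]


def best_ratio(word, value, delims):
    """Highest similarity between word and any token of value.
    SequenceMatcher is an assumed stand-in scorer, not the add-on's."""
    return max(SequenceMatcher(None, word, tok).ratio()
               for tok in tokens(value, delims))


name = "legit-unique-services.exe"

# Default delims: the hyphens split the name, so "services.exe"
# becomes its own token and matches the wordlist entry exactly.
print(tokens(name, DEFAULT_DELIMS))
print(best_ratio("services.exe", name, DEFAULT_DELIMS))

# Without '-' and whitespace in delims, the name stays one token,
# so the comparison is against the whole string and scores below 1.0.
print(tokens(name, PATH_ONLY_DELIMS))
print(best_ratio("services.exe", name, PATH_ONLY_DELIMS))
```

If the add-on behaves like this sketch, the 100% match comes from the "services.exe" token alone, and nothing is excluded when the hyphen is removed from delims: the full name is simply scored as one word. Following the escaping style shown earlier in the thread, the corresponding option would presumably be delims="(\\\\|/|;)", though that exact value is an assumption and worth testing.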
The command shouldn't be excluding anything based on the provided delims. I'll check the code later to confirm this.