Identify predictor fields

eregon · 03-26-2024

Good morning fellow Splunkthiasts!

I have an index receiving 100k+ events per minute, all with the same sourcetype; approximately 100 fields are known in this dataset. Some of these events are duplicated, while others are unique. My aim is to understand the duplication and be able to explain exactly which events get duplicated.

I am detecting duplicates using this SPL:

index="myindex" sourcetype="mysourcetype" | eventstats count AS duplicates BY _time, _raw

Now I need to identify which fields, or which combination of fields, make the difference, i.e. under what circumstances an event is ingested twice.

I tried the predict command. It does produce new values for the "duplicates" field, but it does not disclose the rule by which it makes the decision. In other words, I am not interested in the prediction itself; I want to know the predictors.

Is something like that possible in SPL?

jason_hotchkiss · 03-26-2024

Have you considered this article? https://community.splunk.com/t5/Splunk-Search/How-do-I-find-all-duplicate-events/m-p/9764
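
Beyond finding the duplicates, one rough way to see which field values go along with duplication is to turn the count from the original search into a 0/1 flag and compare the duplicate rate per value of a candidate field. This is only a sketch, not a tested solution: is_dup, events, dup_events and dup_pct are names made up here, and host is just an assumed first candidate that you would swap for each field (or comma-separated combination of fields) you suspect.

```spl
index="myindex" sourcetype="mysourcetype"
| eventstats count AS duplicates BY _time, _raw
| eval is_dup=if(duplicates>1, 1, 0)
| stats count AS events, sum(is_dup) AS dup_events BY host
| eval dup_pct=round(100 * dup_events / events, 2)
| sort - dup_pct
```

A value whose dup_pct sits well above the others is a candidate explanation. At 100k+ events per minute, eventstats over _raw can be expensive, so it is probably worth starting with a short time range. The Search Reference also documents an analyzefields command (classfield=is_dup) that scores how well numeric fields predict a class field; I have not checked how useful it is on data like yours.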
