I would like to compare two field values and return a new field with a percent match between the two.
Current search:
index=dlp severity="1:High" sender!="N/A"
| table _time, sender, recipients, Filename, Count, severity, incident_id, policy
| sort -_time
For example, if part of my search returns
sender: John.Smith@Coolcompany.com
Recipients: JohnSmith546@mail.com
I would like a new field named PercentMatch to return
PercentMatch: 80% (or whatever the actual calculation may be)
The goal is to help determine when users are sending emails to their own personal accounts. Thank you.
Have you thought about a workaround using the cluster command?
"The cluster command groups events together based on how similar they are to each other"
https://docs.splunk.com/Documentation/SplunkCloud/6.6.0/SearchReference/Cluster
Assuming _time is a unique identifier per mail, I could think of something like:
| makeresults
| eval sender="John.Smith@Coolcompany.com"
| eval recipients="JohnSmith546@mail.com"
| eval combined = sender + "," + recipients
| makemv delim="," combined
| stats values(combined) as combined BY _time
| stats count BY combined, _time
| cluster labelonly=true t=0.1 match=ngramset field=combined
| stats values(combined), dc(cluster_label) BY _time
This compares both addresses and gives them the same cluster_label if they are similar. A final dc(cluster_label)=1 means that it might be the same person.
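To give a feel for what match=ngramset is comparing, here is a minimal Python sketch of trigram-set similarity. The cluster documentation describes ngramset as comparing sets of trigrams (3-character substrings); how Splunk scores those sets internally is not stated in this thread, so the Jaccard measure and the function names below are assumptions for illustration only.

```python
def trigrams(text: str) -> set:
    """Return the set of all 3-character substrings of the lowercased text."""
    text = text.lower()
    return {text[i:i + 3] for i in range(len(text) - 2)}


def trigram_similarity(a: str, b: str) -> float:
    """Jaccard similarity of the two trigram sets, between 0.0 and 1.0.

    Assumption: Jaccard is used here for illustration; it is not
    confirmed to be the measure the cluster command uses.
    """
    ta, tb = trigrams(a), trigrams(b)
    if not ta and not tb:
        return 1.0  # two strings too short to have trigrams count as equal
    return len(ta & tb) / len(ta | tb)


if __name__ == "__main__":
    print(trigram_similarity("John.Smith@Coolcompany.com", "JohnSmith546@mail.com"))
```

Under this reading, a low threshold such as t=0.1 would put two addresses in the same cluster even when only a small share of their trigrams overlap.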
I don't think something like this (comparing two strings for similarity) is natively available. You might have to create a custom search command to achieve it. Have a look at the following post
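As a rough idea of the calculation such a custom command could perform, here is a minimal sketch using Python's standard difflib module. It is not a Splunk API: the function names local_part and percent_match are hypothetical, comparing only the part before the "@" is an assumption about what matters for this use case, and the wiring into an actual custom search command is not shown.

```python
from difflib import SequenceMatcher


def local_part(address: str) -> str:
    """Return the part before the '@', lowercased, keeping only letters and digits."""
    name = address.split("@", 1)[0].lower()
    return "".join(ch for ch in name if ch.isalnum())


def percent_match(sender: str, recipient: str) -> float:
    """Similarity of the two local parts as a percentage from 0 to 100."""
    ratio = SequenceMatcher(None, local_part(sender), local_part(recipient)).ratio()
    return round(ratio * 100, 1)


if __name__ == "__main__":
    # Example values taken from the question
    print(percent_match("John.Smith@Coolcompany.com", "JohnSmith546@mail.com"))
```

For the example addresses in the question this gives roughly 86%, since "johnsmith" is fully contained in "johnsmith546". Stripping the domain and punctuation first keeps differences such as "Coolcompany.com" versus "mail.com" from lowering the score.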