Splunk Search

How to write a search for the most occurring field value per record and calculate their probability percentage?

sshkaya3344
Engager
Device Table1 Table2 Table3 Table4 Table5
Name1   XP      XP    XP     XP     XP 
Name2    7      7    XP     Null    XP
Name3   XP      XP    XP     7     2000 
Name4   XP     Null   Null  Null    XP 
Name5   7       7      7    Null   Null

I am trying to pull the best probable match per record (the best match would be the most occuring string per record).
and also would like to calculate the probability percentage of those matches. I tried several commands but not getting idea how to get there. Need help please.

Tags (3)
1 Solution

martin_mueller
SplunkTrust
SplunkTrust

Here's a thought:

| stats count | eval device = "Name1 XP XP XP XP XP
Name2 7 7 XP  XP
Name3 XP XP XP 7 2000
Name4 XP    XP
Name5 7 7 7   " | makemv delim="
" device | mvexpand device | rex field=device "(?<device>\S+) (?<table1>\S+)? (?<table2>\S+)? (?<table3>\S+)? (?<table4>\S+)? (?<table5>\S+)?"
| foreach table* [eval match = mvappend(match , <<FIELD>>)] | top device match | dedup device | sort device

The first six lines set up dummy data, the last line is the actual search. Here's the result for your example:

device  match  count  percent
Name1      XP      5  100.000000
Name2      XP      2  40.000000
Name3      XP      3  60.000000
Name4      XP      2  40.000000
Name5       7      3  60.000000

I didn't take care to resolve the conflict for device Name2 where 7 and XP are equally likely. Is that relevant to your use case?
Similarly, I treated null values as invalid rather than letting device Name4 be voted null.

View solution in original post

martin_mueller
SplunkTrust
SplunkTrust

Here's a thought:

| stats count | eval device = "Name1 XP XP XP XP XP
Name2 7 7 XP  XP
Name3 XP XP XP 7 2000
Name4 XP    XP
Name5 7 7 7   " | makemv delim="
" device | mvexpand device | rex field=device "(?<device>\S+) (?<table1>\S+)? (?<table2>\S+)? (?<table3>\S+)? (?<table4>\S+)? (?<table5>\S+)?"
| foreach table* [eval match = mvappend(match , <<FIELD>>)] | top device match | dedup device | sort device

The first six lines set up dummy data, the last line is the actual search. Here's the result for your example:

device  match  count  percent
Name1      XP      5  100.000000
Name2      XP      2  40.000000
Name3      XP      3  60.000000
Name4      XP      2  40.000000
Name5       7      3  60.000000

I didn't take care to resolve the conflict for device Name2 where 7 and XP are equally likely. Is that relevant to your use case?
Similarly, I treated null values as invalid rather than letting device Name4 be voted null.

martin_mueller
SplunkTrust
SplunkTrust

Sure, it just takes a slightly different route than a simple dedup.

... | foreach table* [eval match = mvappend(match , <<FIELD>>)] | top device match
    | eventstats max(count) as max by device | where count==max | fields - max | sort device

Same until the top.

sshkaya3344
Engager

Thank you so much for the quick reply and it's quite helpful. Yes, in equally likely case can I display both matches?

Regards,
Sanju
,Thank you so much for the quick reply and it's quite helpful. Yes, in equally likely case, can we display both matches??

Regards,
Sanju

0 Karma
Get Updates on the Splunk Community!

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...