Comments and answers for "Compare two values from extracted fields - if match increment counter"
https://answers.splunk.com/answers/559800/compare-two-values-from-extracted-fields-if-match.html
The latest comments and answers for the question "Compare two values from extracted fields - if match increment counter"
Comment by DalJeanis on DalJeanis's comment
https://answers.splunk.com/comments/561744/view.html
@splunk_95 - try this. This is a count of all items in A where there was a match the same day in B.
index="..." source="log a" OR source="log b"
| bin _time span=1d
| eval matchvalue = if( source="log a",A,B)
| stats values(source) as source, count(eval(source="log a")) as CountA, count(eval(source="log b")) as CountB by _time matchvalue
| where mvcount(source)>1
| stats sum(CountA) as count by _time
| timechart span=1d sum(count) as count
Wed, 09 Aug 2017 20:13:58 GMT
DalJeanis
Comment by splunk_95 on splunk_95's comment
https://answers.splunk.com/comments/561701/view.html
hey :)
thanks for the reply.
So essentially I feel "rename B as A" is not working; the search seems to fail at "where count > 1". Looking through the values of A, the count is 1 for each of the top 10 values. There is also no sign of an increase in the number of 'A' events after the renaming.
Do you reckon sorting it by source like the other example would be better?
I feel the renaming isn't exactly doing what we would like.
Ideally, it would be great if I could get the count of matched values from both logs into a timechart instead of a table.
Wed, 09 Aug 2017 16:56:53 GMT
splunk_95
Comment by 3no on 3no's comment
https://answers.splunk.com/comments/561627/view.html
index="..." (source="log a" OR source="log b") // show the data
| rename B as A // rename field B to field A
| dedup A, source // keep the unique values of A per source (so you know which are originally from A and which are originally from B)
| stats count by A // count by field A
| where count > 1 // keep only the values of A whose count is greater than 1, because if a value was in both A and B its count should be 2
| table A // show these values
| stats count // return the count
Try from the beginning and add one command at a time to see if each gives you the correct values (by command, I mean everything that comes after a pipe "|"). The "//" notes are only explanations; leave them out when you run the search.
And let me know how it goes :)
3no
Wed, 09 Aug 2017 08:24:53 GMT
3no
Comment by splunk_95 on splunk_95's comment
https://answers.splunk.com/comments/560969/view.html
Hi
The search you suggested below didn't seem to work... what would be the best way to debug it?
Tue, 08 Aug 2017 15:05:04 GMT
splunk_95
Comment by 3no on 3no's comment
https://answers.splunk.com/comments/560009/view.html
Yes, my bad: A and B are not in the same event (as DalJeanis said).
How about trying it this way:
index="..." (source="log a" OR source="log b") | rename B as A | dedup A, source | stats count by A | where count > 1 | table A | stats count
3no
Thu, 03 Aug 2017 09:10:09 GMT
3no
Comment by splunk_95 on splunk_95's comment
https://answers.splunk.com/comments/559317/view.html
Thanks for your reply.
I apologize for the confusion.
So my definition of a match is "an event in log A which is equivalent to an event in log B".
For example (assume each event in both logs is always 5 digits):
log A :
A= 12345
A= 23456
A= 34567
A= 12345
Suppose log B:
B=54321
B=98765
B=34567
B=12345
B=12345
So for non-unique matches I should get `CountMatch` equal to 3.
For unique matches (i.e. a matched pair only counts if that value has not already been matched), I should get `CountMatch` equal to 2 for the example above. I tried to understand the code above but I don't think it quite does that (please correct me if I'm wrong).
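To make the two definitions concrete, here is a minimal Python sketch (not SPL) that works through the sample values above. The function name `count_matches` is made up for illustration. It assumes a "non-unique" match means every log A event whose value appears anywhere in log B, and a "unique" match means each distinct value that appears in both logs.

```python
def count_matches(log_a, log_b):
    """Return (non_unique, unique) match counts for two lists of values."""
    values_in_b = set(log_b)
    # Non-unique: every event in log A whose value also occurs in log B.
    non_unique = sum(1 for value in log_a if value in values_in_b)
    # Unique: each distinct value present in both logs counts once.
    unique = len(set(log_a) & values_in_b)
    return non_unique, unique


# Sample values copied from the example in this comment.
log_a = ["12345", "23456", "34567", "12345"]
log_b = ["54321", "98765", "34567", "12345", "12345"]
print(count_matches(log_a, log_b))  # (3, 2)
```

Under these assumptions the sample data gives 3 non-unique matches (12345 twice, 34567 once) and 2 unique matches.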
Also does the fact there may be a different number of events in both logs make a difference to the code in your comment?
Many, many thanks - I have had a lot of problems with this - your help is really appreciated.
Wed, 02 Aug 2017 20:16:43 GMT
splunk_95
Comment by DalJeanis on DalJeanis's comment
https://answers.splunk.com/comments/559887/view.html
Every record that reaches the end of the code is exactly one unique match, so `| stats count by _time` is one way, or `| timechart span=1d count` is another.
----------
If you need to know non-unique matches, then you need to define what you mean. If there are 4 A records and 5 B records, do you want the non-unique match number to be 4, 5, 8, 9 or 20? I'll assume 9 for this code, so the meaning of "match" is "records in either file that were matched in the other file".
index="..." source="log a" OR source="log b"
| bin _time span=1d
| eval matchvalue = if( source="log a",A,B)
| stats values(source) as source, count(eval(source="log a")) as CountA, count(eval(source="log b")) as CountB by _time matchvalue
| where mvcount(source)>1
| eval CountMatch = CountA+CountB
| stats count as DistinctMatchCount, sum(CountMatch) as TotalMatchCount by _time
| untable _time series count
| timechart span=1d sum(count) by series
Wed, 02 Aug 2017 18:05:41 GMT
DalJeanis
Comment by splunk_95 on splunk_95's answer
https://answers.splunk.com/comments/559884/view.html
Hi, thanks for your suggestion.
I'm a little unclear as to how I could get a count of the number of matches.
Ideally I would put the number of matches onto a timechart (so one column would be matches and another would be unique matches, `dc(matches)` for example).
From the code you wrote, how would I get the count of matches where A==B? Just `stats count(source) by _time matchvalue`?
I have tried `stats count(matchvalue)` but that didn't seem to work.
Wed, 02 Aug 2017 17:43:30 GMT
splunk_95
Answer by DalJeanis
https://answers.splunk.com/answering/559836/view.html
This puts the value of `A` or `B` into a single field `matchvalue` so you can stats them together. We `bin` the `_time` at the 1 day level, and use the value of `source` as an easy proxy for remembering whether it is `A` or `B`. If there are two different sources, then we know we found both of them.
index="..." source="log a" OR source="log b"
| bin _time span=1d
| eval matchvalue = if( source="log a",A,B)
| stats values(source) as source by _time matchvalue
| where mvcount(source)>1
| timechart span=1d count
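For readers who find the grouping easier to follow outside SPL, the same logic can be sketched in Python. This is only an illustration: the function name `distinct_matches_per_day`, the dictionary layout of the events, and the dates are all assumptions made up for the example, not anything Splunk provides.

```python
from collections import defaultdict


def distinct_matches_per_day(events):
    """Count values seen in both sources on the same day."""
    sources_seen = defaultdict(set)
    for event in events:
        # Same idea as: eval matchvalue = if(source="log a", A, B)
        if event["source"] == "log a":
            matchvalue = event.get("A")
        else:
            matchvalue = event.get("B")
        # Group by (day, matchvalue) and remember which sources had it.
        sources_seen[(event["day"], matchvalue)].add(event["source"])

    per_day = defaultdict(int)
    for (day, _value), sources in sources_seen.items():
        # Same idea as: where mvcount(source) > 1
        if len(sources) > 1:
            per_day[day] += 1
    return dict(per_day)


# Hypothetical events; "day" stands in for _time after bin span=1d.
events = [
    {"day": "2017-08-01", "source": "log a", "A": "12345"},
    {"day": "2017-08-01", "source": "log a", "A": "23456"},
    {"day": "2017-08-01", "source": "log b", "B": "12345"},
    {"day": "2017-08-02", "source": "log a", "A": "34567"},
    {"day": "2017-08-02", "source": "log b", "B": "98765"},
]
print(distinct_matches_per_day(events))  # {'2017-08-01': 1}
```

Only 12345 appears in both sources on the same day, so the first day has one match and the second day has none.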
----------
Updated to include the `timechart` line.
Wed, 02 Aug 2017 15:12:55 GMT
DalJeanis
Comment by DalJeanis on DalJeanis's answer
https://answers.splunk.com/comments/559833/view.html
No, that's going to check each individual event to see whether the values of A and B on that event match. Since A and B come from different sources, no single event has both fields, so match will never be anything other than 0.
Wed, 02 Aug 2017 15:05:39 GMT
DalJeanis
Comment by splunk_95 on splunk_95's comment
https://answers.splunk.com/comments/559817/view.html
Awesome, thanks! Just as an extension: if I wanted to compare only the unique values of A against the values of B, is that possible?
So if
A= 12345
A= 23456
A= 23489
A= 12345 (This event would not be compared against the values of B)
Also, the SPL doesn't seem to be working. I checked the extracted fields and can see matching values in both A and B, but match seems to return a value of zero. Any idea how best to debug?
Wed, 02 Aug 2017 14:13:43 GMT
splunk_95
Comment by 3no on 3no's comment
https://answers.splunk.com/comments/559815/view.html
Yes, it will check every value of A to see whether it equals a value of B (same as foreach). If it matches, "match" is set to 1, otherwise 0. Then you sum "match" to know how many occurrences you have.
The span=1d means it will sum "match" over one day, so if you run your search over a week you'll get 7 values (one for each day). I'm not sure I understand the last part of your question.
Wed, 02 Aug 2017 14:03:06 GMT
3no
Comment by splunk_95 on splunk_95's answer
https://answers.splunk.com/comments/559812/view.html
Thank you for the reply, though I would like to learn exactly how this answer works (for my Splunk development).
Does that eval command check A against every instance of B? Sorry if that is a silly question. I just can't see what logic makes it check that, kind of like 'foreach' in C#.
Also, another criterion I had was that this only considers the events over a day. So if you only place that threshold on the timechart it should be fine, i.e. some sort of 'span=1d' or '_time' is not needed near the eval command?
Wed, 02 Aug 2017 13:46:12 GMT
splunk_95
Answer by 3no
https://answers.splunk.com/answering/559810/view.html
index="..." source="log a" OR source="log b" | eval match=if(A==B,1,0) | timechart span=1d sum(match)
Wed, 02 Aug 2017 13:37:48 GMT
3no