Splunk Search

How to compare events from two sources to find outliers in data?

bcusick
Communicator

Hi,

I'm trying to compare events from two sources to show where the outliers are (they "should" be the same, but we know that there are discrepancies).

I can compare "number of rows/events" easily with a "chart count by source" command, but I also want to check the integrity of the field values.

Basically, each event has 10 fields (same ten fields in each source). How do I check that they are the same, and return some kind of message/raw event/field value if they are not the same?

Thanks.

0 Karma

somesoni2
SplunkTrust

Maybe something like this:

```
source=source1 | eval source1Data=Field1."##".Field2."##"...<<all 10 fields concatenated>>."##".Field10 | appendcols [search source=source2 | eval source2Data=Field1."##".Field2."##"...<<all 10 fields concatenated>>."##".Field10]
| eval result=if(source1Data=source2Data,"Matched","Unmatched")
```
0 Karma

bcusick
Communicator

Would it be possible to do this with the raw data, meaning using the field "_raw"? This runs, but the results are incorrect because the sources are different. Here's that example:

```
source="D:\Bluesheets\ExtractFromFES.csv" | eval FESdata=_raw | appendcols [search source="D:\Bluesheets\SentToReg.csv" | eval Regdata=_raw] | eval result=if(FESdata=Regdata,"Matched","Unmatched") | table result
```

0 Karma

SierraX
Communicator

OK, it's maybe a bit late, but for future searchers: the other answers are too complicated, unhelpful, or wrong.
```
index=main sourcetype="test2" | stats values(source) as sources by _raw | eval sources=if(mvcount(sources)>1,"match","no match")
```
Beware of events with linecount>1 in the main search; these could produce a false "no match".
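
To list only the events that appear in a single source, the same idea can be narrowed with a filter. This is a minimal sketch that reuses the index and sourcetype from the search above; `sourceCount` is a name introduced here for illustration.

```
index=main sourcetype="test2"
| stats values(source) as sources dc(source) as sourceCount by _raw
| where sourceCount=1
| table sources _raw
```

Each remaining row is a raw event found in only one source, with the `sources` column showing which one. It only works if matching events are byte-for-byte identical in `_raw`.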

0 Karma

somesoni2
SplunkTrust

Can you post your query, and maybe some sample data?

0 Karma

bcusick
Communicator

This keeps telling me I have mismatched "]" but I checked multiple times to ensure it's correct. Could the fact that my fields contain "." and "-" have anything to do with this?

0 Karma

somesoni2
SplunkTrust

Since I used appendcols, it will compare source1 event1 with source2 event1. It would fail in cases where the number of rows differs between the sources.
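
A comparison that does not depend on row position could group by a shared key instead. The sketch below assumes that the TRANSACTION_ID field mentioned elsewhere in this thread is unique per row within each source. `source1`, `source2`, and `Field1` through `Field3` are placeholders, and `rowData`, `sourceCount`, and `variantCount` are names introduced for illustration.

```
(source=source1 OR source=source2)
| eval rowData=Field1."##".Field2."##".Field3
| stats dc(source) as sourceCount dc(rowData) as variantCount by TRANSACTION_ID
| eval result=if(sourceCount=2 AND variantCount=1,"Matched","Unmatched")
| where result="Unmatched"
```

A TRANSACTION_ID is reported as "Unmatched" if it is missing from one source or if the concatenated values differ. If any concatenated field is null, `rowData` is null as well, so `variantCount` falls to 0 and the row is also reported.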

0 Karma

bcusick
Communicator

This looks like it will work. Will provide an update tomorrow. Will this know to compare source1 event1 with source2 event1?

0 Karma

bcusick
Communicator

This is for reporting to regulators, so everything should be EXACTLY the same: same timestamp, same field order, etc. I want to be able to check whether any field is different (I can pivot on the field TRANSACTION_ID for everything else). All field names are static, and yes, if the row counts are the same (which they should be), I should be able to compare row1.source1 to row1.source2.
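
To see which field differs for a given TRANSACTION_ID, one option is to count distinct values per field across both sources. This is only a sketch: `source1`, `source2`, `Field1`, and `Field2` are placeholders, and the `*_variants` names are made up for illustration. It would need one `dc()` term per field being checked.

```
(source=source1 OR source=source2)
| stats dc(source) as sourceCount dc(Field1) as Field1_variants dc(Field2) as Field2_variants by TRANSACTION_ID
| where sourceCount<2 OR Field1_variants>1 OR Field2_variants>1
```

Any `*_variants` column greater than 1 points to a field whose values disagree between the sources. A field that is empty in one source and populated in the other still counts as 1, so that case would need a separate check.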

0 Karma

somesoni2
SplunkTrust

Do both sources have timestamps, and do they differ? What should be the order of rows/events for the field comparison: first row of source1 with first row of source2? Are the field names static?

0 Karma