Splunk Dev

Splunk DB Connect - Duplicate events in Rising column

janmichael
New Member

I setup 3 DB inputs as part of our requirement to ingest DB logs. Problem is we're encountering duplicate events upon ingesting data to splunk. I can observe the duplicate events when there's no available data to be fetched from DB, Splunk DB connect up still use the last checkpoint value and keeps re-ingesting the data until it fetch new data in DB.

Here's sample internal logs for how checkpoint value generates in 1 of the DB setup. Any idea where the issue and how to fix it.
$SPLUNK_HOME/var/lib/splunk/modinputs/server/splunk_app_db_connect

{"value":"2020-03-02 00:42:39.307","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T11:45:00.019+11:00"}
{"value":"2020-03-02 00:47:17.17","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T11:50:00.084+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:05:00.164+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:15:00.018+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:20:00.624+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:25:00.017+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:35:00.423+11:00"}
{"value":"2020-03-02 01:37:59.38","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:50:00.108+11:00"}

Thanks.

0 Karma

zeel_ajadiya
Loves-to-Learn Lots

Has this issue been resolved, and if so, what was the solution?

0 Karma

Richfez
SplunkTrust
SplunkTrust

My first guess:

When you set up the input, did you set it up to

WHERE abc < ? ORDER BY abc DSC

Or did you

WHERE abc <= ? ORDER BY abc DSC

This would do it, and it's the only thing I can think of off hand. You use >= (or <=, whatever) instead of strictly > or <.

0 Karma

janmichael
New Member

Hi rich,

It's currently set as WHERE login_time > ? ORDER BY login_time

I already tried order by in DESC, but keep having duplicate events also. I implemented multiple DB inputs with diff connection type before but it's my first time to encounter duplicate events in rising column.

I'm using the MS SQL as connection type now.

0 Karma

Richfez
SplunkTrust
SplunkTrust

Well then, second guess. The timestamp in MSSQL is higher precision than what Splunk is ... recovering? Pulling in? Receiving? Whatever term you want to use.

Can you confirm that the timestamp, when viewed in SQL Server Management Studio, is actually 2020-03-02 00:58:00.783 Because maybe it's actually, like 2020-03-02 00:58:00.783445 and the timestamp specifier on the input needs to be made more precise?

If that's not it, I may be out of simple ideas. Or maybe not, let's see what happens. 🙂

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...