Splunk Dev

Splunk DB Connect - Duplicate events in Rising column

janmichael
New Member

I setup 3 DB inputs as part of our requirement to ingest DB logs. Problem is we're encountering duplicate events upon ingesting data to splunk. I can observe the duplicate events when there's no available data to be fetched from DB, Splunk DB connect up still use the last checkpoint value and keeps re-ingesting the data until it fetch new data in DB.

Here's sample internal logs for how checkpoint value generates in 1 of the DB setup. Any idea where the issue and how to fix it.
$SPLUNK_HOME/var/lib/splunk/modinputs/server/splunk_app_db_connect

{"value":"2020-03-02 00:42:39.307","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T11:45:00.019+11:00"}
{"value":"2020-03-02 00:47:17.17","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T11:50:00.084+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:05:00.164+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:15:00.018+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:20:00.624+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:25:00.017+11:00"}
{"value":"2020-03-02 00:58:00.783","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:35:00.423+11:00"}
{"value":"2020-03-02 01:37:59.38","appVersion":"3.1.4","columnType":93,"timestamp":"2020-03-02T12:50:00.108+11:00"}

Thanks.

0 Karma

zeel_ajadiya
Loves-to-Learn Lots

Has this issue been resolved, and if so, what was the solution?

0 Karma

Richfez
SplunkTrust
SplunkTrust

My first guess:

When you set up the input, did you set it up to

WHERE abc < ? ORDER BY abc DSC

Or did you

WHERE abc <= ? ORDER BY abc DSC

This would do it, and it's the only thing I can think of off hand. You use >= (or <=, whatever) instead of strictly > or <.

0 Karma

janmichael
New Member

Hi rich,

It's currently set as WHERE login_time > ? ORDER BY login_time

I already tried order by in DESC, but keep having duplicate events also. I implemented multiple DB inputs with diff connection type before but it's my first time to encounter duplicate events in rising column.

I'm using the MS SQL as connection type now.

0 Karma

Richfez
SplunkTrust
SplunkTrust

Well then, second guess. The timestamp in MSSQL is higher precision than what Splunk is ... recovering? Pulling in? Receiving? Whatever term you want to use.

Can you confirm that the timestamp, when viewed in SQL Server Management Studio, is actually 2020-03-02 00:58:00.783 Because maybe it's actually, like 2020-03-02 00:58:00.783445 and the timestamp specifier on the input needs to be made more precise?

If that's not it, I may be out of simple ideas. Or maybe not, let's see what happens. 🙂

0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer Certification at ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...