Handling DBconnect data duplicacy using some unique key

joydeep741 · 03-28-2019

I am using DBConnect to pull data from a database into Splunk.

My DB has 900 rows.
Say my query runs at 7am, pulls 890 records, and stores them in an index called db_index.

Now I realize the query did not return all 900 records, so I need to re-run it and store the results in the same index.

But doing so gives me duplicate events in that index.

I want to add a unique field to each event every time my query runs, so that I can segregate the data by that key.
Example:
The query that ran at 9am should have a key 111
The query that ran at 11am should have a key 222

sduff_splunk · 03-28-2019

DBConnect supports a rising column: a field whose value increases in some manner each time a row is added to the DB. On each run, DBConnect fetches only the rows whose value in that column is greater than the last value it saw, then remembers the new last value for next time.

https://docs.splunk.com/Documentation/DBX/3.1.4/DeployDBX/Createandmanagedatabaseinputs#Choose_input...
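
To make the checkpoint idea concrete, here is a minimal sketch that imitates the rising-column behaviour outside of DBConnect, using Python's built-in sqlite3 module. It is not how DBConnect is implemented; the table name `orders`, the column `id`, and the helper `fetch_new_rows` are all made up for illustration.

```python
import sqlite3


def fetch_new_rows(conn, checkpoint):
    """Return rows whose id is greater than checkpoint, plus the new checkpoint."""
    rows = conn.execute(
        "SELECT id, payload FROM orders WHERE id > ? ORDER BY id ASC",
        (checkpoint,),
    ).fetchall()
    # If nothing new arrived, keep the old checkpoint unchanged.
    new_checkpoint = rows[-1][0] if rows else checkpoint
    return rows, new_checkpoint


# Hypothetical table with an ever-increasing id column (the "rising column").
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, payload TEXT)")
insert_sql = "INSERT INTO orders (id, payload) VALUES (?, ?)"

# First run: only 890 of the eventual 900 rows exist yet.
conn.executemany(insert_sql, [(i, f"row-{i}") for i in range(1, 891)])
first_batch, checkpoint = fetch_new_rows(conn, 0)

# The remaining 10 rows show up later; the second run picks up only those.
conn.executemany(insert_sql, [(i, f"row-{i}") for i in range(891, 901)])
second_batch, checkpoint = fetch_new_rows(conn, checkpoint)

print(len(first_batch), len(second_batch), checkpoint)
```

Because the second run starts from the stored checkpoint, re-running the query never re-reads rows that were already fetched, which is what avoids the duplicates.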

If that doesn't work for you, you can always look at the _indextime of the events and discard any events with an index time earlier than the last time the query ran.
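
As a rough illustration of that filter, the sketch below works on plain Python dictionaries rather than on real Splunk events. The epoch values, the `row_id` field, and the helper `drop_older_than` are invented for the example; only the `_indextime` field name comes from Splunk.

```python
def drop_older_than(events, cutoff_epoch):
    """Keep only events indexed at or after cutoff_epoch (epoch seconds)."""
    return [e for e in events if e["_indextime"] >= cutoff_epoch]


# Hypothetical events: two from an earlier run, two from the re-run.
events = [
    {"_indextime": 1553756400, "row_id": 1},
    {"_indextime": 1553756401, "row_id": 2},
    {"_indextime": 1553763600, "row_id": 1},
    {"_indextime": 1553763602, "row_id": 2},
]

# Assumed start time of the most recent query run.
last_run_start = 1553763600

latest = drop_older_than(events, last_run_start)
print([e["row_id"] for e in latest])
```

This only works if you know when the last run started, and it keeps a single run's worth of data rather than labelling every run.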

joydeep741 · 03-28-2019

I don't see an apt field to be marked as the rising column.

And _indextime is different for each event. I want a field that is common to all the data that came in when a particular scheduled search ran.
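
One way to picture such a common field: if the start time of every run is known, each event's index time can be mapped to the most recent run that started at or before it, and that run's start time then serves as the shared key. The sketch below is plain Python, not a DBConnect or Splunk feature. The helper `run_key` and all numbers are hypothetical, and it assumes every run finishes indexing before the next one starts.

```python
from bisect import bisect_right


def run_key(index_time, run_starts):
    """Return the start time of the latest run at or before index_time.

    run_starts must be sorted ascending. Returns None if the event was
    indexed before the first known run.
    """
    pos = bisect_right(run_starts, index_time)
    if pos == 0:
        return None
    return run_starts[pos - 1]


# Hypothetical run start times (epoch seconds) and per-event index times.
run_starts = [1000, 8200]
index_times = [1000, 1003, 1010, 8200, 8204]

keys = [run_key(t, run_starts) for t in index_times]
print(keys)  # [1000, 1000, 1000, 8200, 8200]
```

Events from the same run end up with the same key even though their individual index times differ.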

