I have a DB Connect 2 MySQL database that gets updated on a regular basis. How can I get Splunk to reflect the current status of the database versus the originally indexed version of the database?
If you are indexing data from your database into Splunk, you need to set a rising column that Splunk can use to find new data in your database. Your other option is to re-index the entire database periodically. Either way, you then run the input on a schedule. There's more detail in the link below.
From the docs:
Batch Input: A batch input is an input that invokes the same query each time and returns all results.
Rising Column: A rising column input finds the new records you want and returns only those records with each query. When you create a rising column input type, you must specify the rising column. You can specify as rising column any column whose value increases >over time, such as a timestamp or sequential ID. For example, a rising column could be last_update, employee_id, customer_id, transaction_id and so on. The rising column is how DB Connect keeps track of what records are new.
http://docs.splunk.com/Documentation/DBX/2.2.0/DeployDBX/Createandmanagedatabaseinputs
Splunk's datastore is a poor place to synchronously replicate the state of a database table. That's because once an event is indexed in Splunk it cannot be updated. One possible solution is to take snapshots of the table using a batch input. You can then refer to the relevant snapshot in your searches. If it's just the most recent snapshot you need, consider the lookup or query functions of DB Connect to access the current state of the table.
Can you explain what you're trying to do at a little higher level?
If you are indexing data from your database into Splunk, you need to set a rising column that Splunk can use to find new data in your database. Your other option is to re-index the entire database periodically. Either way, you then run the input on a schedule. There's more detail in the link below.
From the docs:
Batch Input: A batch input is an input that invokes the same query each time and returns all results.
Rising Column: A rising column input finds the new records you want and returns only those records with each query. When you create a rising column input type, you must specify the rising column. You can specify as rising column any column whose value increases >over time, such as a timestamp or sequential ID. For example, a rising column could be last_update, employee_id, customer_id, transaction_id and so on. The rising column is how DB Connect keeps track of what records are new.
http://docs.splunk.com/Documentation/DBX/2.2.0/DeployDBX/Createandmanagedatabaseinputs
Thanks for the response. I am already using the Rising Column approach, but my issue is that this approach only looks for new records to index. Once a record has been indexed, I'm trying to figure out how to change it if a field within the corresponding record in the database gets updated by the user. For example, let's say the Rising Column was tracked to 'Invoice_ID' and there is a field within each record called 'Invoice_Number'. If someone makes a mistake on the 'Invoice_Number' entry and later corrects the mistake through a database update function, how can I get Splunk to recognize that the data within the indexed record has been changed, even though the 'Invoice_ID' is still the same for that changed record?
Ah. Yes that is a problem. That really would only work if you also have an "updated" column with a timestamp value in it, then you could track that instead.
Yeah, it's a tricky one. So... is there any solution for this situation. It seems to me that databases get updated all of the time, right? There should be some way of accounting for changed field values, right?
It depends on the db vendor and what the developer implemented in their table schema. You might want to take a look at the following:
MS SQL rowversion column:
http://www.codeproject.com/Articles/698025/Rowversion-datatype-in-SQL-Server-Track-which-rows
Oracle ORA_ROWSCN:
https://docs.oracle.com/cd/B19306_01/server.102/b14200/pseudocolumns007.htm
MySQL Timestamp:
Also if you are willing to explore something beyond dbconnect, there is the mysql app. I'm not sure if it includes a binary log reader or not, but worth a look.