Best way to delete all data in an index every night?

joesrepsolc
Communicator

I have an index (a few million rows) that I need to delete and re-index every night from a DB input. The data doesn't give me a good rising column to use (or I would use one), and the team that uses the DB also back-dates data, which makes it no fun to search for updates.

Today I'm using "| delete" in a scheduled search against that index, then running the DB Connect input a few minutes later. Is this the best way to do it? I'm looking for any suggestions on how best to completely remove the data in an index and reload it. No other automation tools (Puppet, Chef, etc.) are available to me at this time.
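
For reference, the nightly purge described above can be sketched as a scheduled search like the one below. The index name nightly_db_data is a placeholder rather than the real index, and the account that owns the scheduled search is assumed to hold the can_delete role, which the delete command requires.

```
index=nightly_db_data
| delete
```

The search time range has to cover every event in the index, including the back-dated ones, or some rows will survive the purge.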

Thanks!

Joe


niketn
Legend

@joesrepsolc if you need to refresh the DB data pulled into Splunk, the best way would be to use dbxquery to fetch the data and update a KV Store collection, then accelerate the collection and replicate it to the indexers to support querying across millions of rows. (PS: the collection also should not have too many columns.)
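
A minimal sketch of that approach as a nightly scheduled search follows. Everything named here is an assumption for illustration: my_db_connection is a hypothetical DB Connect connection, the SQL and its columns are made up, and my_db_snapshot is a hypothetical lookup definition that would need to exist already and point at a KV Store collection.

```
| dbxquery connection="my_db_connection" query="SELECT id, name, status, updated_at FROM my_table"
| outputlookup my_db_snapshot
```

Because outputlookup overwrites the lookup by default (append=false), each run replaces the previous night's rows, so no separate delete step is needed. The data can then be read back with | inputlookup my_db_snapshot.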

delete is a bad way to remove data from an index: it only makes the events unsearchable, and they still occupy space on your indexers. Refer to the doc: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Delete

Also, indexing a few million records every day and then purging them seems a bad use of license.

Check out one of the older answers for details on KV Store acceleration and index replication of KV Store: https://answers.splunk.com/answers/432770/scaling-kv-store-performance.html


joesrepsolc
Communicator

Thanks for the response niketnilay...

I am aware that it's not good practice: it does not free up the actual space in the index, and it uses more license every day that I reload the complete dataset. But we couldn't come up with another way (yet).

I haven't used the KV Store solution to date and need to learn more about it. I'm also unsure that it is a good fit for this volume of records. I'm trying to read more about the limitations, maximums, and use cases for KV Store before going to that solution. I don't feel that it will do much for me at this time.
