Best way to delete all data in an index every night?

joesrepsolc
Communicator

I have an index (a few million rows) that I need to delete and re-index every night from a DB input. The data doesn't offer a good rising column (or I would use one), and the team that uses the DB also back-dates data in there, which makes it no fun to search for updates.

Today I'm using "| delete" in a scheduled search against that index, then running the DB Connect input a few minutes later. Is this the best way to do this? I'm looking for suggestions on how best to completely remove the data in an index and reload it. No other automation tools (Puppet, Chef, etc.) are available to me at this time.
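For reference, the purge step described above might look roughly like the following. This is a minimal sketch, not the exact search in use: the index name my_db_index is a hypothetical placeholder, and the account running the search needs a role with the delete_by_keyword capability (for example, can_delete).

```
index=my_db_index
| delete
```

To clear the whole index, the scheduled search's time range has to cover every event in it (for example, All time), and the schedule has to finish before the DB Connect input starts.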

Thanks!

Joe

niketn
Legend

@joesrepsolc if you need to refresh DB data pulled into Splunk, a better way would be to use dbxquery to fetch the data and update a KV Store collection, then accelerate the collection and replicate it to the indexers so that queries across millions of rows perform well. (PS: the collection also should not have too many columns.)
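As a rough illustration of that idea, a nightly scheduled search could pull the table with dbxquery and write it to a KV Store-backed lookup. This is only a sketch under assumptions: my_db_connection, my_table, and my_kv_lookup are hypothetical names, and my_kv_lookup is assumed to already be defined as a KV Store lookup.

```
| dbxquery connection="my_db_connection" query="SELECT * FROM my_table"
| outputlookup my_kv_lookup
```

Because outputlookup replaces the existing lookup contents unless append=true is set, each run would overwrite the previous night's data, with no separate delete step. Check the dbxquery documentation for row limits (the maxrows argument) before relying on this for a few million rows.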

delete is a bad way to remove searchable data from an index, because the data still occupies space on your indexers. Refer to the doc: https://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Delete
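One way to observe this is to compare the on-disk size of the index's buckets before and after running delete. A minimal sketch, assuming a hypothetical index name my_db_index:

```
| dbinspect index=my_db_index
| stats sum(sizeOnDiskMB) AS size_on_disk_mb
```

If delete only masks events from search, as the doc describes, the reported size should not drop after the purge.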

Also, indexing a few million records every day and then purging them seems a bad use of license.

Check out one of the older answers for details on KV Store acceleration and replication of the KV Store to indexers: https://answers.splunk.com/answers/432770/scaling-kv-store-performance.html

____________________________________________
| makeresults | eval message= "Happy Splunking!!!"

joesrepsolc
Communicator

Thanks for the response niketnilay...

I am aware that it's not a good practice: it does not free the actual space in the index, and it uses more license every day that I reload the complete dataset. But we couldn't come up with another way (yet).

I've not used the KV Store solution to date and need to learn more about it. I'm unsure that it is a good fit for this volume of records, so I'm trying to read more about the limitations, maximums, and use cases for KV Store before going to that solution. I'm not feeling that it will do much for me at this time.
