Can I delete specific data from an index?

dc99dc99
New Member

I know this has been asked before, but I'm hoping that I've misunderstood how deletion works.

The situation is that we have a single main index holding 500,000,000 events, and 300,000,000 of those are the result of someone accidentally writing the Windows security logs from their production machines into it.

We're extremely low on disk space, and in lieu of getting more provisioned, which is problematic, I hoped I might be able to remove those entries from the index somehow.

I know I can run a delete, but I understand this only hides the events from searches and won't free the disk space they occupy. I also realise I can delete a whole index using the CLI, or delete data from an index based on an expiry strategy.

Can I remove data from an index that's mixed with other data from the same time period, or am I completely stuck? Perhaps I can move the data we want to keep to a new index and then delete the erroneous data. Am I permanently stuck with those 300,000,000 junk events?

Please help

David


martin_mueller
SplunkTrust

The only standard way of removing data, other than deleting an entire index, is to let it cross the per-index age or size thresholds (by default 500 GB and roughly six years), and delete indeed doesn't free up disk space... but you knew that already 🙂
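To illustrate why the age threshold can't pick out the junk selectively, here is a minimal Python sketch of the rule, assuming a bucket is only frozen once its newest event is older than the index's `frozenTimePeriodInSecs`. The function name and its arguments are hypothetical and only model the behaviour; this is not a Splunk API.

```python
import time

# Default frozenTimePeriodInSecs in indexes.conf (roughly six years).
DEFAULT_FROZEN_SECS = 188_697_600

def bucket_is_expired(latest_event_epoch, now_epoch=None,
                      frozen_secs=DEFAULT_FROZEN_SECS):
    """Return True if the newest event in a bucket is older than the retention period.

    Hypothetical helper: it models the age check for one whole bucket.
    """
    if now_epoch is None:
        now_epoch = time.time()
    return now_epoch - latest_event_epoch > frozen_secs
```

Because the check applies to the bucket as a whole, wanted and unwanted events from the same time period age out together.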

In theory you could manually delete single buckets, if and only if that bucket contains nothing but undesired events... however, that's likely a risky procedure and certainly needs working backups to be feasible.
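The "nothing but undesired events" check could be sketched in Python roughly as follows. It assumes you have already obtained per-bucket event counts broken down by sourcetype; the input rows, the bucket IDs and the sourcetype name used in the example are made up for illustration. The function only reports candidate buckets and deletes nothing.

```python
from collections import defaultdict

def purely_unwanted_buckets(rows, unwanted_sourcetypes):
    """Return the bucket IDs whose events all belong to unwanted sourcetypes.

    rows: iterable of (bucket_id, sourcetype, event_count) tuples (hypothetical input).
    """
    total = defaultdict(int)      # all events per bucket
    unwanted = defaultdict(int)   # unwanted events per bucket
    for bucket_id, sourcetype, count in rows:
        total[bucket_id] += count
        if sourcetype in unwanted_sourcetypes:
            unwanted[bucket_id] += count
    # A bucket qualifies only if every one of its events is unwanted.
    return sorted(b for b, n in total.items() if n > 0 and unwanted[b] == n)

# Example with made-up counts: only bucket "b2" holds nothing but junk.
example_rows = [
    ("b1", "WinEventLog:Security", 900),
    ("b1", "access_combined", 100),
    ("b2", "WinEventLog:Security", 5000),
    ("b3", "access_combined", 700),
]
candidates = purely_unwanted_buckets(example_rows, {"WinEventLog:Security"})
```

Any bucket that shows even one event you want to keep stays off the list, which is why data mixed within the same time period usually leaves few or no candidates.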

Moving data to a new index selectively... I don't know of a way to do that. You could of course re-index from raw data.


martin_mueller
SplunkTrust

In addition to separating indexes and introducing temporary indexes for testing purposes, I avoid using the default/main index entirely in production environments. That way any data added carelessly without specifying an index can safely be dropped by cleaning the index.


dc99dc99
New Member

I thought that might be the case. We'll be more careful about separating out indexes in future.
