Hi,
I have some CSV files which were indexed, but a proportion of the events were corrupted in the index. Each file has up to 1 million records. Is there a way to ask Splunk to re-index these files and only index the events that it doesn't currently have? Each event has a unique record ID field.
An easier way might be to delete the events you have in your index now, clean the fishbucket, and just let Splunk reindex them.
Yes. This is what I do.
Run the search that returns the events you need to delete. I assume you don't want to delete the entire index. If you do, run the command below with the name of the index you wish to wipe out, then clean _thefishbucket. Otherwise, run your search to find your events, then append "| delete" to it.
cd to the Splunk\bin directory. Type splunk stop. Then type splunk clean eventdata -index _thefishbucket
Then type splunk start. The rest is automatic, assuming you have fixed the files.
Clean the fishbucket?
You might want to make a report of the record IDs you have in Splunk, then cull those from your input file. Then use splunk add oneshot to import the file (or some other method).
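If it helps, the culling step could be scripted rather than done by hand. This is only a rough sketch, and it assumes a few things that aren't in your post: that you have exported the record IDs already in Splunk to a CSV file, that both files have a header row, and that the ID column is called record_id (a made-up name, change it to whatever your field is). The file names are placeholders too.

```python
import csv


def load_known_ids(export_path, id_field="record_id"):
    """Read the record IDs exported from Splunk into a set.

    Assumes the export is a CSV with a header row containing id_field
    (a hypothetical column name).
    """
    with open(export_path, newline="") as f:
        return {row[id_field] for row in csv.DictReader(f)}


def write_missing_records(source_path, output_path, known_ids, id_field="record_id"):
    """Copy only the rows whose ID is not in known_ids.

    Returns the number of rows written to output_path.
    """
    written = 0
    with open(source_path, newline="") as src, \
            open(output_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        # Stream row by row so a 1-million-record file is never held in memory.
        for row in reader:
            if row[id_field] not in known_ids:
                writer.writerow(row)
                written += 1
    return written


if __name__ == "__main__":
    # Placeholder file names, not anything Splunk produces by default.
    ids = load_known_ids("ids_in_splunk.csv")
    count = write_missing_records("original.csv", "missing_only.csv", ids)
    print(f"{count} records still need indexing")
```

The resulting file contains only the records Splunk doesn't have, so that is the one you would hand to splunk add oneshot. Keeping the IDs in a set makes each lookup cheap, which matters at a million rows per file.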
I was kind of hoping for something a little less manual....