Currently, we want to delete some events (that is, all events with a certain sourcetype in a defined range in 2016) from Splunk. Normally, deleting with ... | delete
works fine, and almost all events could be deleted successfully. However, for some single days the delete query hangs and leaves thousands of events undeleted. The index, sourcetype etc. are all the same, but the events just won't be deleted 😞
What we observe is that when we run the search index=myindex sourcetype=mytype earliest="03/27/2016:00:00:00" latest="03/28/2016:00:00:00" | delete
we get
INFO: 0 events successfully deleted
INFO: 0 events successfully deleted
INFO: 0 events successfully deleted
INFO: 0 events successfully deleted
...
to the bitter end. But index=myindex sourcetype=mytype earliest="03/27/2016:00:00:00" latest="03/28/2016:00:00:00" | stats count
immediately gives the result
INFO: Your timerange was substituted based on your search string
count
239343
Now we are totally distressed. Does anybody know how to get Splunk to delete these events?
P.S.
Unfortunately, cleaning the whole index is not an option.
Did you check the permissions of the files in the buckets?
Perhaps splunk ran as root for a while and was later corrected to run as the splunk user... so some files might still be owned by root?
If so, the easy solution is to stop splunk on the indexer(s) and run chown -Rf splunk. /opt/splunk,
assuming you don't keep your data in other places.
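A minimal sketch of that check-and-fix sequence, assuming the default install path /opt/splunk and that splunkd runs as a user named splunk (adjust both to your setup; the existence guard is only there so the snippet is safe to run anywhere):

```shell
SPLUNK_HOME=/opt/splunk   # assumption: default install path
SPLUNK_USER=splunk        # assumption: the user splunkd runs as

if [ -d "$SPLUNK_HOME" ]; then
  # Stop Splunk before touching bucket files.
  "$SPLUNK_HOME/bin/splunk" stop
  # Any root-owned leftovers from a run as root show up here.
  find "$SPLUNK_HOME" ! -user "$SPLUNK_USER" -print
  # Reset ownership recursively, then restart.
  chown -R "$SPLUNK_USER:$SPLUNK_USER" "$SPLUNK_HOME"
  "$SPLUNK_HOME/bin/splunk" start
else
  echo "No Splunk install found at $SPLUNK_HOME"
fi
```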
Thank you for your answer,
all the files have the correct user and permissions. To be absolutely sure, I copied the complete bucket to a backup, removed the bucket (after shutting Splunk down) and copied the backup back.
After the restart the events are in place again, but still not deletable, like before.
restarting splunk usually resolves it
Wow, between that and the fsck I really thought one of them would have solved the issue.
Hello,
As per delete command documentation (https://docs.splunk.com/Documentation/Splunk/6.6.0/SearchReference/Delete)
Note: The delete command does not work if your events contain a field named index aside from the default index field that is applied to all events. If your events do contain an additional index field, you can use eval before invoking delete, as in this example:
index=fbus_summary latest=1417356000 earliest=1417273200 | eval index = "fbus_summary" | delete
So try this query instead:
index=myindex sourcetype=mytype earliest="03/27/2016:00:00:00" latest="03/28/2016:00:00:00" | eval index = "myindex" | delete
Regards
Can you suggest another way to delete the index buckets using the delete query? I tried the ways you suggested above. Can you provide a solution for this?
Thank you, but this doesn't work either; the ... | delete
gives no error, it just reports that it deleted 0 events.
As mentioned, most events could be deleted, except these 239343, and all events have the same index field.
To be sure, I tested it as in your recommendation, and also ran a ` ... | stats count by index `. Everything is okay.
You're welcome. Could you please provide a sample log line and your Splunk version, so I can try to reproduce the issue?
Regards
We are running Splunk version 6.4.4 on 6 indexers and 4 search heads.
Here is an example event:
1459108799815, revisit_creationtime=1459108799815, cookie_value="01m401s5v5i2w9izrz", leadout_click_bokey="IrmmSwDt8tX2HXLBVRWhpA", leadout_shop_id="9701", leadout_type="OFFER", leadout_provider="EBYDE", leadout_click_position=2, leadout_affiliate="ipc-android", root_category_id="3626", category_id="26491", product_type="nonVaried", product_id="4019314", product_name="Shimano CN-HG95", manufacturer_name="Shimano", tracetime=1459105200, redirect_to="http://rover.ebay.com/rover/1/707-53477-19255-0/1?ff3=4&pub=5574635388&toolid=10001&campid=533777055...", leadouts=1, reloadblocked_leadouts=0, checkouts=0, loggedin_leadouts=0, loggedin_checkouts=0, page_template="GoToShop", analyze_begin=1459105200, analyze_end=1459108800, kpi_type=session_object_lo
Thanks, I ingested the event and was able to delete it normally, so I believe the issue is with the buckets.
You can run the following command to get the distribution of buckets across the indexers, with the corresponding filesystem path of each bucket. Then you can check the permissions and ownership of the buckets to see if something is wrong.
| dbinspect index=myindex
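As a sketch, you could table the most relevant dbinspect output fields per bucket (bucketId, path, state, eventCount and splunk_server are standard dbinspect fields; pick whichever you need):

```
| dbinspect index=myindex
| table bucketId, path, state, eventCount, splunk_server
```

The path column is what you would then check on the filesystem of each indexer.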
Regards
Hi aakwah,
all rights on the buckets seem to be okay. All files in the involved buckets are owned by the splunk user and have read and write permissions. Of course the directories are also executable. 🙂
Unfortunately, in Splunk 6.4.4 the dbinspect command doesn't have the corruptonly option yet 😞
Should I run splunk cmd fsck?
Best Regards
I have run ./splunk cmd splunkd fsck scan --all-buckets-one-index --index-name=myindex
and got "No issues found" many times. I think the buckets are okay.
Is Splunk's fsck comparable to the Linux fsck?
Best regards
Marco
Hello Marco,
I believe that Splunk's fsck handles the metadata of the buckets.
Did you run the fsck command on all 6 indexers?
I think it would be a good idea now to capture the splunkd.log events on the indexers and on the search head during the execution of the delete query; maybe there is a clear error message.
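A sketch of such a capture over the internal index (log_level and component are standard fields on splunkd events; the 15-minute window is an assumption, run it right after the delete attempt):

```
index=_internal sourcetype=splunkd log_level=ERROR earliest=-15m
| stats count by host, component
```

Any component that only shows up on the problematic indexers would be a good starting point.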
Regards
Dear aakwah,
I did run the fsck on all indexers.
I have also examined the log files, on the search head where I executed the query and also on the indexers, once directly in the files and also via Splunk with index=_internal ... nothing conspicuous 😞
I'm so distressed, I think there is nothing left but to delete the involved buckets ... 😞
Thank you and best regards
Marco
It is really weird. Final thoughts from my side:
- Try to run the delete query on the indexers directly
- Try to make the time range smaller, one hour for example, or try to delete a single event
- Finally, submit a case to Splunk support 🙂
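The two narrowing suggestions above could look like this (a sketch; the cookie_value is taken from the sample event earlier in the thread, any sufficiently unique field works):

```
index=myindex sourcetype=mytype earliest="03/27/2016:00:00:00" latest="03/27/2016:01:00:00" | delete

index=myindex sourcetype=mytype cookie_value="01m401s5v5i2w9izrz" | delete
```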
Regards
Thank you, I also tried to delete single events, but that doesn't work either.
I also copied the complete bucket to a backup, removed the bucket (after shutting Splunk down) and copied the backup back.
After the restart I can still retrieve the events with a query, but can't delete them, same as before.
best regards
Marco
is that an indexer cluster?
no, it is not a cluster