How do I delete events in one index that exist in another?

kmcarrol
Path Finder

I've read up on delete and am familiar with the implications, but I'm having trouble figuring out how to mark events for deletion that are found in another index. The idea is very simple, but doesn't work. I'm basically trying to build a master index of unique IDs based on a daily incremental update of changes and additions. Similarly, I have a log file that indicates deleted records and I'd like to join those log results and pipe to delete to clean out my reference index.

  index=pgbs | join type=inner Id [search index=pgbs-incremental] | delete
  index=pgbs | join type=inner Id [search index=pgbs-audit extracted_EventType="Delete Entity"] | delete

Unfortunately it seems that delete cannot be invoked after join...

  Error in 'delete' command: This command cannot be invoked after the non-streaming command 'join'.

woodcock
Esteemed Legend

You can only pipe raw events to delete, so try this:

  index=pgbs [search index=pgbs-incremental | fields Id] | delete

And this:

  index=pgbs [search index=pgbs-audit extracted_EventType="Delete Entity" | fields Id] | delete

Be aware that delete does not remove anything from disk or free up space; all it does is prevent the events from ever showing up in search results.

lguinn2
Legend

Try this instead:

  index=pgbs [search index=pgbs-incremental | fields Id]

If it returns the events you expect, then add the | delete on the end. The limitation here is that subsearches return at most 10,000 results by default, so a single run can match at most 10,000 Id values. You can work around this by running the search multiple times, choosing a smaller time range each time.
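
A minimal sketch of that preview-then-delete workflow, assuming the Id field is extracted in both indexes. The one-day window (earliest=-1d@d latest=@d) is only an illustrative choice. The first search counts how many distinct Id values the subsearch would return for that window, so you can check it stays under the subsearch limit. The second search previews how many events in pgbs would be marked for deletion.

```
index=pgbs-incremental earliest=-1d@d latest=@d
| stats dc(Id) AS distinct_ids

index=pgbs [search index=pgbs-incremental earliest=-1d@d latest=@d | dedup Id | fields Id]
| stats count
```

If both numbers look right, replace "| stats count" in the second search with "| delete". The dedup in the subsearch keeps repeated Id values from using up the result limit.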

kmcarrol
Path Finder

Thanks! It isn't obvious to me why the syntax works, but it does. The alternative was that dedup would have to sift through more and more and more events. Thanks again!

Thanks to lguinn as well, who mentioned the same and added the reminder of the subsearch limits. I had considered the same and I have incorporated your guidance into my process documentation.

woodcock
Esteemed Legend

I told you why it does or does not work. You can only delete events, which means that you cannot delete non-events. Think about it: once you pass events into a transforming, non-streaming command, you are no longer working with events.

kmcarrol
Path Finder

Yes, sorry. I understood your explanation why deleting after a join wasn't valid. What I didn't initially understand is why "index=pgbs [search index=pgbs-incremental | fields Id]" finds the records that I'm looking for since there isn't an explicit match on the Id field. I think the answer is that this syntax creates a free text search on the ID values, right? And if I happened to have another field "Previous ID" or "Reference ID" or even just a completely unrelated field with a random match, it would delete that record too, right?

woodcock
Esteemed Legend

kmcarrol
Path Finder

Holy cow! I know I'm very new to Splunk but I can't believe I haven't seen that yet, especially with all the reading up I did on the join command. That certainly allays my fears. Thanks for the reference!

lguinn2
Legend

Also, take a look at the search job inspector when you run the search. It will often show you the "expansion" of the subsearch, along with a lot of other useful info about your search performance.
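
You can also see the expansion directly by running the subsearch on its own with format on the end. This is a sketch under the assumption that Id is an extracted field; the Id values mentioned below are hypothetical.

```
index=pgbs-incremental | dedup Id | fields Id | format
```

This returns a single field named search whose value looks something like ( ( Id="1001" ) OR ( Id="1002" ) ). The outer search therefore matches on the Id field specifically, not as free text against every field. As far as I know, a subsearch only returns bare values (free-text terms) if the field it returns is named search or query.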

kmcarrol
Path Finder

As I think about the syntax, is this a free text search that could potentially match the returned list of IDs against fields in pgbs other than ID?
