
How to efficiently resend data into Splunk with a REST API that will replace old data if it has been updated

kdanielsobrien
Explorer

Hi,

I am looking to resend data to Splunk in the most efficient way. I want to resend data into Splunk with a REST API that will replace old data if it has been updated. I don't want to resend all of the data, only anything that has changed.

The goal is to not give Splunk more data than it needs.
I search the data over per-minute time ranges, so even over the course of 5-10 minutes, resending all of it would be a lot of data when most of it is repeated events.

I'm very new to all of this so I was looking for some guidance on where to start or helpful links to get started.


woodcock
Esteemed Legend

You can only do that if you store the data in a Lookup File in Splunk. If you do this, you would update it like this:

Some search to pull in new data here (could be dbxquery or something else)
| some SPL to transform the data and ensure that a distinct key field such as "host" exists
| inputlookup append=true YourLookupFileHere.csv
| dedup host
| outputlookup YourLookupFileHere.csv
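
To make the ordering in the search above concrete, here is a minimal Python sketch of the same merge logic outside Splunk. It relies on dedup keeping the first row it sees for each key, which is why the fresh rows come before the appended lookup rows. The function name merge_lookup and the sample rows are hypothetical and only for illustration.

```python
def merge_lookup(new_rows, existing_rows, key="host"):
    """Mimic: <new rows> | inputlookup append=true | dedup <key>."""
    merged = {}
    # New rows are processed first, so they win over the stale copy
    # already stored in the lookup (dedup keeps the first occurrence).
    for row in list(new_rows) + list(existing_rows):
        merged.setdefault(row[key], row)
    return list(merged.values())


# Hypothetical sample data: the lookup as stored, and one updated row.
existing = [
    {"host": "web01", "status": "down"},
    {"host": "db01", "status": "up"},
]
new = [{"host": "web01", "status": "up"}]

result = merge_lookup(new, existing)
for row in result:
    print(row)
```

Only the row for web01 is replaced; the row for db01 carries over from the existing lookup unchanged.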

gcusello
SplunkTrust

Hi @kdanielsobrien,
this may sound like a strange question, but are you sure that you need Splunk?
Splunk works differently from a database:

  • ingested data cannot be modified (not even through the REST API),
  • you always have all the data you ingested (for the retention period), not only the data you want, and you cannot delete it,
  • if you use the delete command, the data remains in the buckets (for the retention period) but is no longer searchable;
  • for this reason, Splunk is commonly used for compliance, where it fits better than a DB!

Anyway, the first question is: why do you want this approach? To save storage, or for some other reason?

If you still want to do this, you could create a summary index and populate it every day with all the correct data you want ( https://docs.splunk.com/Documentation/Splunk/8.0.0/Knowledge/Usesummaryindexing ).

Ciao.
Giuseppe


kdanielsobrien
Explorer

Hi Giuseppe,

I guess I want to filter what data is being sent to Splunk. For example, I send all the data to Splunk for a 10-minute time span. After I have sent it, a few minutes' worth of that data is replaced with updated values. I only want to resend the new/updated data to Splunk for the few minutes that have changed.

I want to filter what is being sent because I would waste many GB if I resent all of the data from a time span just to update a few events in Splunk search.
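
The filtering described here could happen on the sending side, before anything reaches Splunk. Below is a minimal Python sketch of that idea, under several assumptions: each event has a stable key field (called id here, which is hypothetical), and the sender keeps a fingerprint of the last version it sent for each key. Only events that pass the filter would then be forwarded, for example through the HTTP Event Collector. This does not replace anything in Splunk: the updated event is indexed next to the old one, so searches would still need something like dedup or stats latest(...) by the key to show only the newest version.

```python
import hashlib
import json


def fingerprint(event):
    """Stable hash of an event's content (keys sorted for determinism)."""
    blob = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(blob).hexdigest()


def select_changed(events, seen, key="id"):
    """Return only the events that are new or differ from what was last sent.

    `seen` maps key -> fingerprint and is updated in place. In a real
    sender it would have to be persisted between runs.
    """
    changed = []
    for event in events:
        digest = fingerprint(event)
        if seen.get(event[key]) != digest:
            seen[event[key]] = digest
            changed.append(event)
    return changed


# Hypothetical sample data: the same time span collected twice.
seen = {}
first_batch = [
    {"id": "a", "minute": "12:00", "value": 10},
    {"id": "b", "minute": "12:01", "value": 20},
]
second_batch = [
    {"id": "a", "minute": "12:00", "value": 10},  # unchanged, skipped
    {"id": "b", "minute": "12:01", "value": 25},  # updated, resent
]

sent_first = select_changed(first_batch, seen)
sent_second = select_changed(second_batch, seen)
print(len(sent_first), len(sent_second))
```

The first pass sends both events; the second pass sends only the event whose value changed.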


starcher
Influencer

That isn’t how Splunk works. It’s not a database, so it has no feature like that.
