Getting Data In

Automatically remove events older than one year

Branden
Builder

I have a requirement to have data older than one year removed from Splunk. By "older than one year", I mean the event itself has to be older than one year, regardless of when it was indexed.

In my indexes.conf file, I set:

[main]
frozenTimePeriodInSecs = 31536000

31536000 seconds should be one year.

And yet it's showing the earliest events (185,000 of them) as July 18, 2010 (today is August 15, 2011). It was my expectation that the earliest event would be August 15, 2010. Tomorrow's earliest event would be August 16, 2010, etc...

How can I instruct Splunk to automatically purge events older than one year?

Thanks!


dwaddle
SplunkTrust

Splunk removes (freezes) data a whole bucket at a time. It can't freeze a bucket until the newest event within that bucket is older than frozenTimePeriodInSecs. You could use the dbinspect search command ( http://docs.splunk.com/Documentation/Splunk/latest/SearchReference/Dbinspect ) to examine your buckets and see how large a time range each bucket covers. That will at least give you an idea of how long past a year you can expect the oldest event in a bucket to stick around.
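To make the freeze rule concrete, here is a minimal Python sketch of the check described above. It is not Splunk code; the function name is made up for illustration, and the bucket's newest-event date (August 20, 2010) is an assumption, since the question only gives the earliest event (July 18, 2010).

```python
from datetime import datetime, timedelta

FROZEN_TIME_PERIOD_IN_SECS = 31536000  # value from the question (365 days)

def bucket_is_freezable(newest_event, now, frozen_secs=FROZEN_TIME_PERIOD_IN_SECS):
    """Hypothetical helper: a bucket qualifies for freezing only once its
    NEWEST event is older than the retention period."""
    return (now - newest_event) > timedelta(seconds=frozen_secs)

now = datetime(2011, 8, 15)        # "today" in the question
oldest = datetime(2010, 7, 18)     # earliest event seen in the question
newest = datetime(2010, 8, 20)     # assumed newest event in the same bucket

# The bucket stays because its newest event is only 360 days old,
# even though its oldest event is well past one year.
print(bucket_is_freezable(newest, now))   # False
# A bucket whose newest event was July 18, 2010 would already qualify.
print(bucket_is_freezable(oldest, now))   # True
```

Under that assumption, the July 18 events would linger until the August 20 event crosses the one-year mark.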

By default, buckets are limited by a time range (maxHotSpanSecs) and a bucket data size (maxDataSize). If either limit is exceeded, Splunk rolls the bucket from hot to warm.

You could tune the value of maxHotSpanSecs to be the shortest amount of time you might consider doing archiving - say 1 day (86,400 seconds). You still will not get exact archiving - but you minimize how long "archivable" data stays around simply because it sits in a bucket that also holds much newer data.
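A rough back-of-the-envelope sketch of that trade-off in Python: if a bucket spans at most maxHotSpanSecs, its oldest event can outlive the retention period by roughly that span. The function name is hypothetical, the 90-day figure is assumed to be the default maxHotSpanSecs (7776000 seconds), and the estimate ignores late-arriving data and the delay before Splunk actually acts on an eligible bucket.

```python
SECONDS_PER_DAY = 86400

def worst_case_age_days(frozen_secs, bucket_span_secs):
    """Hypothetical estimate: age of the oldest event at the moment the
    bucket's newest event finally crosses the retention cutoff."""
    return (frozen_secs + bucket_span_secs) / SECONDS_PER_DAY

one_year = 31536000  # frozenTimePeriodInSecs from the question

# Buckets limited to one day: oldest event leaves at about 366 days.
print(worst_case_age_days(one_year, 86400))              # 366.0
# Buckets allowed to span 90 days (assumed default): about 455 days.
print(worst_case_age_days(one_year, 90 * SECONDS_PER_DAY))  # 455.0
```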

If you need a more precise archiving capability -- say something that lets you stand up to lawyer scrutiny -- then I would suggest filing an enhancement request.

The whole notion of buckets and such is understandably difficult to relate to less technical people. A good analogy for explaining to your nontechnical people would be the paper banker's boxes. Each banker's box has a range of dates written on the box -- and without going through the whole box you can't discard individual documents. So, you have to keep some things in the box a little longer than you might have wanted just because they're in the same box as something a few days newer.

dwaddle
SplunkTrust

see update, lemme know if it helps or not


Branden
Builder

Got dbinspect to work.... honestly, I'm not quite sure what to do with the information there.
It seems like there has to be an easier way to do this.


dwaddle
SplunkTrust

Precise-to-the-minute, no. If you can plan your bucket boundaries well, then you can get pretty close -- like rounded to the day. For dbinspect, run a search over all time of "| dbinspect index=main"
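As for what to do with the dbinspect output: the useful number is how many days each bucket spans. Here is a small Python sketch of that calculation, assuming the results were exported as rows with bucketId, startEpoch and endEpoch fields (the field names in recent Splunk versions; older releases may label them differently). The bucket IDs and epoch values below are made up.

```python
def bucket_spans(rows):
    """Return (bucketId, span in days) pairs, widest bucket first."""
    spans = [
        (r["bucketId"], (int(r["endEpoch"]) - int(r["startEpoch"])) / 86400)
        for r in rows
    ]
    return sorted(spans, key=lambda s: s[1], reverse=True)

# Made-up sample rows standing in for exported dbinspect results.
rows = [
    {"bucketId": "main~2~EXAMPLE", "startEpoch": "1282262400", "endEpoch": "1282348800"},
    {"bucketId": "main~1~EXAMPLE", "startEpoch": "1279411200", "endEpoch": "1282262400"},
]

for bucket_id, days in bucket_spans(rows):
    print(bucket_id, days)
```

The widest span is roughly how far past one year the oldest event in that bucket can be expected to stay.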


Branden
Builder

So are you saying there's no real way to do it? I was hoping for precise "1 year" cut-off.

I'm playing around with dbinspect like you suggested, but it only outputs "no events found"; not sure what I'm supposed to get out of it.
