Index time based retention - based on indexed time or event time?

Runals
Motivator

This information is probably in one of the docs, but I didn't find it in anything I've read just now. Under normal circumstances, current data rolls in and rolls out based on any number of parameters, such as frozenTimePeriodInSecs. What happens when you ingest a bunch of historical data, though, and how does that affect retention? If retention is strictly size based, that is one thing, but time-based retention seems to be another. My gut says it would be based on indexed time, but I'm not sure how historical data and timestamps play into bucket creation.

sowings
Splunk Employee

It's based upon the event time.

A bucket (the constituent unit of an index; read more here) spans a range of time. This range is set by the event times of the events in that bucket. A bucket is a candidate for rotation (this includes hot to warm, warm to cold, and cold to frozen) when it is the oldest bucket "in scope"(*). "Oldest" by this definition is judged by the newest event time in the bucket. So a bucket can contain events from 2010 and then a single event from June 21, 2013, and it won't be a candidate for the time-based rules until frozenTimePeriodInSecs seconds after June 21, 2013.
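To make the 2010/2013 example concrete, here is a minimal sketch of the time-based check as described above. It is not Splunk's actual implementation; the function name `is_freeze_candidate` and the 90-day retention value are hypothetical, chosen only for illustration.

```python
from datetime import datetime, timedelta, timezone


def is_freeze_candidate(bucket_latest_event, frozen_time_period_secs, now):
    """Return True once the bucket's NEWEST event is older than the retention period.

    Hypothetical helper: the oldest events in the bucket play no role in the check.
    """
    return now - bucket_latest_event > timedelta(seconds=frozen_time_period_secs)


# A bucket holding events from 2010 plus a single event from June 21, 2013.
latest_event = datetime(2013, 6, 21, tzinfo=timezone.utc)
retention_secs = 90 * 24 * 3600  # assumed 90-day frozenTimePeriodInSecs

early = datetime(2013, 7, 1, tzinfo=timezone.utc)   # 10 days after the newest event
late = datetime(2013, 10, 1, tzinfo=timezone.utc)   # 102 days after the newest event

print(is_freeze_candidate(latest_event, retention_secs, early))
print(is_freeze_candidate(latest_event, retention_secs, late))
```

Under these assumptions the 2010 events stay searchable until the whole bucket ages out, because only the newest event time is compared against the retention period.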

Note also that the most restrictive rule applies, so if an index is nowhere near full, but the time-based rule says it's time to go, then the bucket will be frozen (consider the _internal index; it has a max size of 500GB, but a retention time period of only 28 days).
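The "most restrictive rule" point can be sketched the same way: whichever limit trips first wins. This is a simplified model, not Splunk's code; `should_freeze_oldest_bucket` is a hypothetical name, and the figures are the approximate _internal values quoted above (500GB treated as 500,000 MB for maxTotalDataSizeMB, and 28 days for frozenTimePeriodInSecs).

```python
def should_freeze_oldest_bucket(index_size_mb, max_total_size_mb,
                                newest_event_age_secs, frozen_time_period_secs):
    """Hypothetical model: the oldest bucket is frozen if EITHER limit is exceeded."""
    over_size = index_size_mb > max_total_size_mb
    over_age = newest_event_age_secs > frozen_time_period_secs
    return over_size or over_age


MAX_SIZE_MB = 500_000            # roughly 500GB, as quoted for _internal
RETENTION_SECS = 28 * 24 * 3600  # 28 days, as quoted for _internal

# Index is only 20GB, but the oldest bucket's newest event is 30 days old.
small_but_old = should_freeze_oldest_bucket(20_000, MAX_SIZE_MB, 30 * 24 * 3600, RETENTION_SECS)

# Index is only 20GB and the oldest bucket's newest event is 5 days old.
small_and_recent = should_freeze_oldest_bucket(20_000, MAX_SIZE_MB, 5 * 24 * 3600, RETENTION_SECS)

print(small_but_old, small_and_recent)
```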

(*) Scope can be an entire volume spanning multiple indexes (with volume:foo directives), a single index, or a bucket state within an index, such as "warm buckets".

chimbudp
Contributor

Indexed data keeps the original timestamp of the events coming into Splunk.
So every event is tied to its event time, not its indexed time.
Later, the data moves from Hot -> Warm -> Cold -> Frozen (based on the indexes.conf settings).
To search historical data that has been frozen, we need to restore the archived data to the thawed path (see the Splunk documentation on restoring archived data); we can then see the historical events with their historical timestamps.

sowings
Splunk Employee

As a follow-up to this, note that thawed data lives outside of any retention policy whatsoever. The buckets therein must be managed manually.

immortalraghava
Path Finder

What happens when the old data is in a hot bucket? Does this

"This range is set by the event time
of the events in that bucket."

still apply here? The folder name does not show the time range for a hot bucket the way it does for warm buckets.
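For context on the folder naming referred to here: warm and cold bucket directories are typically named db_<newest_time>_<oldest_time>_<id>, with both times as epoch seconds of event time, while hot bucket directories are named like hot_v1_<id> and carry no times in the name. A small sketch that reads the range from a directory name, assuming that convention (the helper `bucket_time_range` and the sample names are hypothetical):

```python
import re
from datetime import datetime, timezone

# Assumed convention: db_<newest_epoch>_<oldest_epoch>_<id>, optionally followed by more fields.
BUCKET_DIR = re.compile(r"^db_(\d+)_(\d+)_(\d+)")


def bucket_time_range(dirname):
    """Return (oldest, newest) event times as UTC datetimes, or None if the name has no times."""
    match = BUCKET_DIR.match(dirname)
    if match is None:
        return None  # e.g. a hot bucket such as hot_v1_12
    newest, oldest = int(match.group(1)), int(match.group(2))
    return (datetime.fromtimestamp(oldest, tz=timezone.utc),
            datetime.fromtimestamp(newest, tz=timezone.utc))


warm_range = bucket_time_range("db_1371772800_1262304000_12")
hot_range = bucket_time_range("hot_v1_12")

print(warm_range)
print(hot_range)
```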
