What happens when an indexer encounters an event with a timestamp older than the configured retention period?

immortalraghava
Path Finder

I ran into a scenario where I could not determine how the Splunk indexer behaves.
The retention period for an index is configured as 6 years.
I tried sending some logs that are older than the configured retention period.

Sometimes the logs get into the index and sometimes they don't. (I run a simple search to count the events.)
(The log file did reach the indexer; that part is tested, and I find entries in metrics.log.)

What could be the reason for this intermittent behavior? At which stage does the filtering based on the retention period take place?
Will these old events also go through the hot, warm, and frozen states of a bucket?
Any clarifications would be helpful.

Thanks

mayurr98
Super Champion

Depending on how consistent the timestamps in your data are, you may occasionally receive an event today with a very old timestamp, say Dec 2013 (possibly due to a bug, incorrect logging, or timestamp parsing). A data bucket is frozen only when the latest event (highest timestamp) in the bucket is older than your retention period. If the old data was received recently, it will be stored in a bucket whose latest event is within the retention period, so that bucket will not be rolled to frozen. All Splunk queries, reports, and dashboards will then show the earliest timestamp in the index as Dec 2013, even if your retention is, for example, only 1 year.
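
To make the time-based rule concrete, here is a minimal Python sketch of the check described above. It is only an illustration, not Splunk's actual code: the `Bucket` class and `is_freezable` function are hypothetical names, and the only thing taken from the post is the rule that a bucket is frozen when its latest event is older than the retention period (frozenTimePeriodInSecs).

```python
from dataclasses import dataclass

SECONDS_PER_YEAR = 365 * 24 * 60 * 60


@dataclass
class Bucket:
    """Hypothetical stand-in for an index bucket (times in epoch seconds)."""
    earliest_time: int  # timestamp of the oldest event in the bucket
    latest_time: int    # timestamp of the newest event in the bucket


def is_freezable(bucket: Bucket, now: int, frozen_time_period_in_secs: int) -> bool:
    # Only the newest event matters: the bucket qualifies for freezing
    # once even its latest event has aged past the retention period.
    return bucket.latest_time < now - frozen_time_period_in_secs


now = 1_700_000_000                # assumed "current time" for the example
retention = 6 * SECONDS_PER_YEAR   # 6-year retention, as in the question

# Old events mixed into a bucket that also holds recent events.
mixed = Bucket(earliest_time=now - 8 * SECONDS_PER_YEAR, latest_time=now - 60)
# A bucket that holds only events older than the retention period.
old_only = Bucket(earliest_time=now - 8 * SECONDS_PER_YEAR,
                  latest_time=now - 7 * SECONDS_PER_YEAR)

print(is_freezable(mixed, now, retention))     # False
print(is_freezable(old_only, now, retention))  # True
```

Under this rule, the same 8-year-old event is kept or dropped depending on which other events share its bucket, which is one possible source of behavior that looks intermittent.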

My suggestion would be to also enforce data retention based on total index size (maxTotalDataSizeMB) in addition to the retention period (frozenTimePeriodInSecs). This way you can start rolling data buckets to frozen before you run out of space. See this for more details:

https://docs.splunk.com/Documentation/Splunk/6.5.2/Indexer/Setaretirementandarchivingpolicy#Freeze_d...
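
As a rough illustration of the size-based retention suggested above, the sketch below freezes the oldest buckets until the index fits under a size limit. This is an assumption about the general behavior, not Splunk's implementation: `buckets_to_freeze` and the tuple layout are hypothetical, and the sketch assumes "oldest" means the bucket with the lowest latest-event timestamp.

```python
from typing import List, Tuple

# (bucket name, latest event time in epoch seconds, size in MB) - hypothetical layout
BucketInfo = Tuple[str, int, int]


def buckets_to_freeze(buckets: List[BucketInfo], max_total_data_size_mb: int) -> List[str]:
    """Return the names of buckets to freeze, oldest first, until the
    total size is at or below the limit."""
    total = sum(size for _, _, size in buckets)
    to_freeze = []
    # Walk buckets from oldest to newest by their latest event time.
    for name, _latest, size in sorted(buckets, key=lambda b: b[1]):
        if total <= max_total_data_size_mb:
            break
        to_freeze.append(name)
        total -= size
    return to_freeze


index = [("db_a", 100, 400), ("db_b", 300, 400), ("db_c", 200, 400)]

print(buckets_to_freeze(index, 900))  # ['db_a']
print(buckets_to_freeze(index, 500))  # ['db_a', 'db_c']
```

Combining this with the time-based check means a bucket can leave the index for either reason, whichever condition is met first.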

Also have a look at this answer:
https://answers.splunk.com/answers/511747/why-is-the-retention-policy-not-working-on-certain.html

Let me know if this helps!

immortalraghava
Path Finder

Thanks for the answer, but what I actually hit was this. I just found it:

https://answers.splunk.com/answers/31961/what-is-a-hot-quar-v1-directory-vs-standard-hot-v1.html

Even with quarantined buckets I find some inconsistencies. Sometimes old data, older than quarantinePastSecs, gets into an ordinary hot bucket. Maybe someone from Splunk could clarify this. There are comments on the accepted answer that are still not addressed.
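
For reference, the behavior one would expect from the quarantine settings can be sketched as follows. This is a simplified model under stated assumptions, not the indexer's real logic: `bucket_kind` is a hypothetical function, and the default values used for quarantinePastSecs (77760000, about 900 days) and quarantineFutureSecs (2592000, 30 days) are the ones I believe indexes.conf documents, so verify them against your version.

```python
def bucket_kind(event_time: int, now: int,
                quarantine_past_secs: int = 77_760_000,
                quarantine_future_secs: int = 2_592_000) -> str:
    """Classify an event by timestamp: events too far in the past or
    future are expected to land in a quarantine hot bucket."""
    too_old = event_time < now - quarantine_past_secs
    too_new = event_time > now + quarantine_future_secs
    return "quarantine" if (too_old or too_new) else "hot"


now = 1_700_000_000          # assumed "current time" for the example
day = 24 * 60 * 60

print(bucket_kind(now - 6 * 365 * day, now))  # quarantine
print(bucket_kind(now - day, now))            # hot
print(bucket_kind(now + 60 * day, now))       # quarantine
```

If old events show up in an ordinary hot bucket, comparing their parsed timestamps against these thresholds is a reasonable first check, since the classification depends on the timestamp Splunk extracted rather than the one you intended to send.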

mescober_splunk
Splunk Employee

@immortalraghavan how did you check whether the event went to a quarantine bucket or to a normal hot bucket? If it was by search, events in the quarantine bucket are still returned when searching for that log.

The only reason the data wouldn't be searchable after being indexed is that its bucket was frozen under the retention policy (either size-based or time-based).

immortalraghava
Path Finder

The data has not rolled; it is still in the hot quarantine bucket. The bucket was not there before I sent the old data. That's how I confirmed that my current ingest created it. But the data is not showing up in the results. Is there any other way I could check?
