I have an index "main" and several sources associated with this index. The index has reached its size limit (150MB), but when I look for the earliest event, the result differs between the sources.
Example:
source1 - first time event is August/2015 (50005771 events)
source2 - first time event is January/2016 (127797272 events)
source3 - first time event is March/2016 (982610866 events)
source4 - first time event is March/2016 (60681838 events)
To get the first event time I used the search below.
| metadata type=sources index=main | convert ctime(firstTime) | convert ctime(lastTime) | convert ctime(recentTime)
Why doesn't Splunk have data going back to August/2015 for sources 2, 3 and 4? Shouldn't all the sources have the same first event time?
Splunk freezes data from your index in whole buckets, based on the youngest event in each bucket, so the tail end of your index has a "fuzzy edge". Depending on which bucket a given source's data landed in, some data from source1 may be retained for much longer than other data from source2.
I'm guessing there is a bucket with some old data from source1 and some newer data, so the newer data in the bucket stops the bucket from being frozen until other buckets with older youngest events are frozen first.
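To make the "fuzzy edge" concrete, here is a minimal Python sketch of that freezing behaviour. It is not Splunk code: the bucket names, sizes, timestamps, and the helpers `freeze_until_under` and `earliest_per_source` are all made up for illustration, and it ignores hot/warm state entirely. It only assumes that whole buckets are dropped, the one with the oldest "youngest event" first, until the index fits under its size cap.

```python
from dataclasses import dataclass


@dataclass
class Bucket:
    name: str
    size_mb: int
    # (source, "YYYY-MM") pairs; hypothetical sample data, not real events.
    events: list

    @property
    def newest(self):
        # The youngest event in the bucket decides when it can be frozen.
        return max(ts for _, ts in self.events)


def freeze_until_under(buckets, max_total_mb):
    """Drop whole buckets, oldest 'newest event' first, until under the cap."""
    kept = sorted(buckets, key=lambda b: b.newest)
    while kept and sum(b.size_mb for b in kept) > max_total_mb:
        kept.pop(0)
    return kept


def earliest_per_source(buckets):
    """Earliest surviving event time for each source."""
    first = {}
    for b in buckets:
        for source, ts in b.events:
            first[source] = min(ts, first.get(source, ts))
    return first


buckets = [
    Bucket("b1", 50, [("source2", "2015-08"), ("source2", "2015-10")]),
    Bucket("b2", 50, [("source3", "2015-09"), ("source3", "2015-12")]),
    # Mixed bucket: old source1 data sits next to recent source1 data.
    Bucket("b3", 50, [("source1", "2015-08"), ("source1", "2016-04")]),
    Bucket("b4", 50, [("source2", "2016-01"), ("source3", "2016-03")]),
]

kept = freeze_until_under(buckets, max_total_mb=100)
print([b.name for b in kept])      # ['b4', 'b3']
print(earliest_per_source(kept))
# {'source2': '2016-01', 'source3': '2016-03', 'source1': '2015-08'}
```

In this toy run the August/2015 events of source1 survive only because they share bucket b3 with April/2016 events, while the equally old data of source2 and source3 sat in buckets that were frozen first.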
There is no per-source limit within an index.
I have never had a problem with that.
Makes sense, hot buckets don't get frozen. First they need to roll to warm, either after a restart, when the bucket size is reached, when the bucket span is reached, or when too many hot buckets are open.
Thank you for your answer.
Using the search below I was able to find the ID of the bucket holding the old data. It is a hot bucket.
index=myindex | eval BID = replace(_cd, "(\d+):\d+", "\1") | top BID
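For anyone wondering what that eval does: the regex keeps only the digits before the colon in `_cd`, and `top` then counts events per bucket ID. A rough Python equivalent, assuming `_cd` values have the `<bucket>:<offset>` shape that the regex implies (the sample values and the `bucket_id` helper are made up):

```python
import re
from collections import Counter


def bucket_id(cd: str) -> str:
    # Mirrors replace(_cd, "(\d+):\d+", "\1"): keep the digits before the colon.
    return re.sub(r"(\d+):\d+", r"\1", cd)


# Hypothetical _cd values, one per event.
sample_cd = ["57:1043302", "57:1043410", "12:88"]

# Roughly what `top BID` reports: bucket IDs ordered by event count.
counts = Counter(bucket_id(cd) for cd in sample_cd)
print(counts.most_common())  # [('57', 2), ('12', 1)]
```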