Deployment Architecture

Do I even need cold storage if I am using the same disk?

daniel333
Builder

All,

Is there any value in having cold storage for my indexes if I am using the same disk? Why not leave everything in warm storage?

0 Karma

woodcock
Esteemed Legend

Only if you have a "data must become unsearchable after X days" policy.

0 Karma

s2_splunk
Splunk Employee

The distinction between HOT/WARM and COLD purely exists to allow customers to choose the fastest disk they can afford to hold recent data, while using cheaper storage for long term data.
If you don't do that, you don't really need COLD.

Take a look here for some more detail.

lfedak_splunk
Splunk Employee

Hey @daniel333, here's more info about buckets:

How "bucket spread" affects search performance

When you search for something in your indexed data, Splunk gives you the results of your search in reverse chronological order. We assume you want information about what's happening most recently first, with older results arriving later. Splunk first looks in the hot bucket, then the warm buckets, then cold. The frozen db is never searched.

If you search for "fflanda" in your index, Splunk looks to see if it's in db-hot first. If it is, Splunk then looks at the timestamp of the event that "fflanda" was found in, and the range of time covered by db-hot. Based on that, Splunk decides whether to show you that result right away, or to look in the warm buckets to see if there are any more recent results than that. It will look in every warm bucket (and then in every cold one) that has a range that includes the timestamp of the event in which "fflanda" was found.

In the case of the first example, the "standard" bucket setup, Splunk will immediately know that there are no results for "fflanda" that are more recent than the one it found in db-hot, and begin giving you your results immediately.

However, in the second example's case, Splunk will look in every warm bucket because it might contain a more recent result than right now--the "bucket span" extends to the future, so finding a more recent event than right now is possible. And Splunk will therefore wait to display any of your search results to you until it has finished searching every bucket that could yield such a result.

This is how the "spread" in time of the data in your buckets affects search performance. How you 'tune' your buckets can make a big difference to your search experience.
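To make the overlap check concrete, here is a rough Python sketch. It is not Splunk's actual code, just a toy model of the rule described above: a warm or cold bucket has to be checked when its time range includes the timestamp of the event found in db-hot. The `Bucket` class, the bucket names, and all the timestamps are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Bucket:
    """Hypothetical stand-in for a bucket: a name plus the time range it covers."""
    name: str
    earliest: int  # timestamp of the oldest event in the bucket
    latest: int    # timestamp of the newest event in the bucket


def buckets_to_check(buckets, event_ts):
    """Return the names of buckets whose time range includes event_ts."""
    return [b.name for b in buckets if b.earliest <= event_ts <= b.latest]


# Assumed "tight" layout: every warm/cold bucket covers its own slice of time.
tight = [
    Bucket("warm_2", 200, 299),
    Bucket("warm_1", 100, 199),
    Bucket("cold_1", 0, 99),
]

# Assumed "spread" layout: badly timestamped events stretch the warm ranges,
# one of them past the present (taken here to be around 400).
spread = [
    Bucket("warm_1", 100, 500),
    Bucket("warm_2", 150, 360),
    Bucket("cold_1", 0, 99),
]

event_ts = 350  # timestamp of the event found in the hot bucket

print(buckets_to_check(tight, event_ts))   # [] -> results can be shown right away
print(buckets_to_check(spread, event_ts))  # ['warm_1', 'warm_2'] -> must be searched first
```

In the tight layout nothing overlaps the hot bucket's event, so nothing delays the first results. In the spread layout two warm buckets have to be searched before anything can be displayed, which is the slowdown described above.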

0 Karma