All Apps and Add-ons

RT search creating extremely large (20GB+) dispatchtmp artifacts on the local file system despite SH_POOL being used.

the_wolverine
Champion

Our SH_POOL is set up correctly: it works, and SH_POOL is populated properly by all of our Splunk activity. However, when I load the Distributed Indexing Performance view of the SoS app (version 3.1.0), the artifact in the local $SPLUNK_HOME/var/run/splunk/dispatchtmp/ directory grows extremely large and causes us to run out of disk space on this search head.

Why is anything being written to local? It should be written to SH_POOL.

1 Solution

hexx
Splunk Employee

Hi there, Tina!

Why is anything being written to local in a POOLING scenario?

This is a search optimization. When a search operator needs a temporary on-disk backing store, it uses dispatchtmp, which is always on the local disk, instead of the dispatch directory, which lives on NFS in the case of a search-head pool.
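
If you want to see which search jobs are behind the disk consumption, a search along the lines of the sketch below (run on the affected search head, against the standard search jobs REST endpoint) can help. Treat it as an illustrative starting point rather than an official SoS search, and note that the diskUsage field reports the job's dispatch footprint; I'm not certain it covers dispatchtmp in every version, so cross-check with du on the file system:

    | rest /services/search/jobs splunk_server=local
    | eval diskUsageMB = round(diskUsage / 1024 / 1024, 1)
    | table sid, author, isRealTimeSearch, runDuration, ttl, diskUsageMB
    | sort - diskUsageMB

Sorting by diskUsageMB puts the biggest jobs at the top, and the sid tells you which dispatch artifact to look for on disk.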

We discovered that this large artifact was caused by SoS app version 3.1 (and prior?): the real-time nature of the "Real-time measured indexing rate and latency" panel (the top panel in the Distributed Indexing Performance view) caused an extremely large dispatch artifact to be created if the user allowed the panel to keep running.

Indeed, this is due to the nature of this search, which attempts to assess the indexing latency and throughput rate of all incoming data. This means that we have to do a couple of things that can be very expensive in large-scale deployments (see the sketch after the list below):

  • Search all the data with "index=* OR index=_*"
  • Use an open-ended real-time window without any constraints on _time, with the "real-time (all time)" time range
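
For context, the shape of that search is roughly what the following sketch shows. This is not the exact SoS search, just an illustration of why it is expensive: every event from every index flows through it, and the open real-time window means it never finalizes on its own:

    index=* OR index=_*
    | eval latency = _indextime - _time
    | stats count, avg(latency) AS avg_latency, perc95(latency) AS p95_latency BY index

Run with the "real-time (all time)" time range, a search of this shape keeps accumulating state in its dispatch artifact for as long as the panel stays open, which is what eventually fills dispatchtmp.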

That is precisely why we modified this view to no longer run this search on load, but instead to warn the user about the risk of running it for a long time and to only run it if the user explicitly asks for it.

the_wolverine
Champion

Thanks, Octavio! I will use your response to convince the team that we need to allocate more storage to our SHs.

the_wolverine
Champion
  1. Why is anything being written to local in a POOLING scenario?

  2. We discovered that this large artifact was caused by SoS app version 3.1 (and prior?): the real-time nature of the "Real-time measured indexing rate and latency" panel (the top panel in the Distributed Indexing Performance view) caused an extremely large dispatch artifact to be created if the user allowed the panel to keep running.

In version 3.2, the real-time search was removed and a Run button was added, along with the following disclaimer:

Caution: This search can be resource intensive and should not run indefinitely. Use the search controls on the right to cancel, pause, or finalize the search.
