All Apps and Add-ons

RT search creating extremely large (20GB+) dispatchtmp artifacts on the local file system despite SH_POOL being used.

the_wolverine
Champion

Our SH_POOL is set up correctly: it works, and SH_POOL is populated properly by all of our Splunk activity. However, when I load the Distributed Indexing Performance view of the SoS app (version 3.1.0), the artifact in the local $SPLUNK_HOME/var/run/splunk/dispatchtmp/ directory grows extremely large and causes us to run out of disk space on this search head.

Why is anything being written to local? It should be written to SH_POOL.

1 Solution

hexx
Splunk Employee

Hi there, Tina!

Why is anything being written to local in a POOLING scenario?

This is a search optimization. When a search operator needs a temporary on-disk backing store, it uses dispatchtmp, which is always on the local disk, instead of the dispatch directory, which lives on NFS in the case of a search-head pool.
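
If you want to see which search jobs are behind the disk consumption, a search along the lines of the sketch below (run on the affected search head, against the standard search jobs REST endpoint) can help. Treat it as an illustrative starting point rather than an official SoS search, and note that the diskUsage field reports the job's dispatch footprint; I'm not certain it covers dispatchtmp in every version, so cross-check with du on the file system:

    | rest /services/search/jobs splunk_server=local
    | eval diskUsageMB = round(diskUsage / 1024 / 1024, 1)
    | table sid, author, isRealTimeSearch, runDuration, ttl, diskUsageMB
    | sort - diskUsageMB

Sorting by diskUsageMB puts the biggest jobs at the top, and the sid tells you which dispatch artifact to look for on disk.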

We discovered that this large artifact was caused by SoS app version 3.1 (and prior?): the real-time nature of the "Real-time measured indexing rate and latency" panel (the top panel in the Distributed Indexing Performance view) caused an extremely large dispatch artifact to be created if the user allowed the panel to keep running.

Indeed, this is due to the nature of this search, which attempts to assess the indexing latency and throughput rate of all incoming data. This means that we have to do a couple of things that can be very expensive in large-scale deployments (see the sketch after the list below):

  • Search all the data with "index=* OR index=_*"
  • Use an open-ended real-time window without any constraints on _time, with the "real-time (all time)" time range
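
For context, the shape of that search is roughly what the following sketch shows. This is not the exact SoS search, just an illustration of why it is expensive: every event from every index flows through it, and the open real-time window means it never finalizes on its own:

    index=* OR index=_*
    | eval latency = _indextime - _time
    | stats count, avg(latency) AS avg_latency, perc95(latency) AS p95_latency BY index

Run with the "real-time (all time)" time range, a search of this shape keeps accumulating state in its dispatch artifact for as long as the panel stays open, which is what eventually fills dispatchtmp.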

That is precisely why we modified this view to no longer run this search on load, but instead to warn the user about the risk of running it for a long time and to only run it if the user explicitly asks for it.

the_wolverine
Champion

Thanks, Octavio! I will use your response to convince the team that we need to allocate more storage to our SHs.

the_wolverine
Champion
  1. Why is anything being written to local in a POOLING scenario?

  2. We discovered that this large artifact was caused by SoS app version 3.1 (and prior?): the real-time nature of the "Real-time measured indexing rate and latency" panel (the top panel in the Distributed Indexing Performance view) caused an extremely large dispatch artifact to be created if the user allowed the panel to keep running.

In version 3.2, the real-time search was removed and a Run button was added, along with the following disclaimer:

Caution: This search can be resource intensive and should not run indefinitely. Use the search controls on the right to cancel, pause, or finalize the search.
