Splunk Search

Why is my dashboard panel search using up so much disk space?

drewg33
Engager

I am having trouble with the search behind a dashboard panel. The job takes up too much of my disk quota (about 350 MB when run over a 24-hour period), and because it pushes me over the quota, other jobs queue up.

Obviously I can increase my disk quota, but I would rather figure out why this job is such a disk hog in the first place and fix that. From what I can see, it should only be storing 10 rows of a table with a handful of columns each.

Is anyone able to explain why this search would use so much disk space or suggest any improvements?

index="proxylogs" | stats sum(bytes_from_client) as BytesFromClient, distinct_count(client_ip) as DistinctClient by domain | where BytesFromClient > 10000000 AND DistinctClient < 40 | eval Upload(GB)=BytesFromClient/1073741824 | fields domain, Upload(GB) | sort 10 - Upload(GB)

martin_mueller
SplunkTrust

I'm guessing the by domain clause in your stats has very high cardinality, which makes the temporary search results huge. Solving high-cardinality problems is inherently hard. Also check how large the result set is after the where, since large sorts can use temporary files as well. This may be indicated in search.log, which is accessible through the job inspector. To find out what specifically uses up the space, look at the contents of $SPLUNK_HOME/var/run/splunk/dispatch/<search id>.
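
To make that last step concrete, here is a minimal Python sketch (not a Splunk tool) that lists the largest files under a job's dispatch directory. The function names file_sizes and total_bytes are made up for illustration, and the directory path is whatever you pass on the command line; nothing here assumes particular file names inside the dispatch directory.

```python
import sys
from pathlib import Path


def file_sizes(job_dir):
    """Return (relative_path, size_in_bytes) for every file under job_dir, largest first."""
    root = Path(job_dir)
    sizes = []
    for path in root.rglob("*"):
        if path.is_file():
            sizes.append((path.relative_to(root).as_posix(), path.stat().st_size))
    # Largest files first, so the disk hogs appear at the top.
    sizes.sort(key=lambda item: item[1], reverse=True)
    return sizes


def total_bytes(sizes):
    """Sum the sizes returned by file_sizes."""
    return sum(size for _, size in sizes)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Pass the dispatch directory of the job you want to inspect as the first argument.
    report = file_sizes(sys.argv[1])
    for name, size in report[:10]:
        print(f"{size:>12}  {name}")
    print(f"{total_bytes(report):>12}  total")
```

Run it against the job's directory under $SPLUNK_HOME/var/run/splunk/dispatch/ and compare the top entries with what search.log reports.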


somesoni2
Revered Legend

One option is to use summary indexing to pre-calculate the summary over a smaller period, say 1 hour, and then run your query on the summarized data. More information:

http://docs.splunk.com/Documentation/Splunk/6.0.5/Knowledge/Usesummaryindexing

https://wiki.splunk.com/Community:Summary_Indexing
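
As a rough illustration of the idea (plain Python, not Splunk code), the sketch below pre-aggregates events per hour and then merges the hourly summaries. The function names and the sample events are hypothetical. It also shows one caveat for this particular query: byte sums can simply be added across hours, but distinct client counts cannot, so each hourly summary has to keep the set of client IPs per domain rather than just a count.

```python
from collections import defaultdict


def summarize_hour(events):
    """Aggregate one hour of (domain, client_ip, bytes_from_client) events per domain."""
    summary = defaultdict(lambda: {"bytes": 0, "clients": set()})
    for domain, client_ip, nbytes in events:
        entry = summary[domain]
        entry["bytes"] += nbytes
        entry["clients"].add(client_ip)
    return dict(summary)


def merge_summaries(hourly_summaries):
    """Combine hourly summaries into {domain: (total_bytes, distinct_clients)}."""
    total = defaultdict(lambda: {"bytes": 0, "clients": set()})
    for summary in hourly_summaries:
        for domain, entry in summary.items():
            total[domain]["bytes"] += entry["bytes"]
            # Union the client sets; adding per-hour counts would double-count
            # clients that appear in more than one hour.
            total[domain]["clients"] |= entry["clients"]
    return {d: (e["bytes"], len(e["clients"])) for d, e in total.items()}


# Hypothetical sample data: two hours of proxy events.
hour1 = [("a.example", "10.0.0.1", 100), ("a.example", "10.0.0.2", 50)]
hour2 = [("a.example", "10.0.0.1", 25), ("b.example", "10.0.0.3", 5)]
merged = merge_summaries([summarize_hour(hour1), summarize_hour(hour2)])
```

Note that the per-domain client sets still grow with cardinality, so summarizing spreads the work over time rather than making the high-cardinality problem disappear.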

