I noticed that I have some buckets with many tsidx files. I know that the hot buckets are being optimized on a regular basis by splunk. But I would like to optimize manually the warm and cold buckets.
see http://docs.splunk.com/Documentation/Splunk/latest/Indexer/Optimizeindexes
you can run splunk optimize manually with
splunk-optimize bucket_folder
usage:./splunk-optimize -d|--directory <dir>
[-h|--help]
[-m|--mode <normal|force|all> (defaults to normal)]
[-v|--verbose]
[-s|--min-src-count <number> (defaults to 8)]
[-i|--iterations-max <number> (defaults to unlmitted)]
[-b|--lex-tpb <number> (default 64 - merged lexicon terms per block)]
[-x|--max-allowed-size <number> (max allowed extra disk space required, in bytes)]
[-p]--page-size <bytes> (memory allocation size, default: 1048576 (1MB), minimum: 16384)]
and you can retrieve the list of all the buckets with a lot of tsidx files with this script
`
#!/bin/bash
# find_stidx.sh script for listing the buckets with too many tsidx
# required the base folder as argument
#settings
tsidx_limit=10
verbose=1 # display the count per folder
#verbose=0 # just display list of folders
include_hot=1 # look in hot and warm and cold buckets
if [ $# -lt 1 ]; then
echo 1>&2 "usage : $0
exit 2
fi
hot_bucket_list=""
db_bucket_list=""
base_folder=$1
# get the list of the buckets folders
if [ $include_hot -eq 1 ] ; then
hot_bucket_list=find $base_folder -name "hot_*"
fi
bucket_list=find $base_folder -name "db_*"
bucket_list="$hot_nucket_list $bucket_list"
# count the tsidx
if [ $verbose -eq 1 ] ; then
echo "list of buckets with more than $tsidx_limit tsidx files"
fi
for bucket in $bucket_list ; do
count=find $bucket -name "*.tsidx" | wc -l
if [ "$count" -gt "$tsidx_limit" ] ; then
if [ $verbose -eq 1 ] ; then
echo "$count tsidx in $bucket"
else
echo "$bucket"
fi
fi
done
`
Thank you
How to use script?
$SPLUNK_HOME/bin/splunk-optimize -d /h_data/splunk/splunk/var/lib/splunk/idx_3/db/<your bucket with many tsidx files here>
my index name is idx_3
directory is
/h_data/splunk/splunk/var/lib/splunk/idx_3/db
How to use?