Excessive session lock files using up inodes

rbaxley
Engager

We have Splunk running on a Linux server, and it keeps crashing with low-disk-space errors. I've traced the cause: the server is not out of disk space, it is running out of inodes. There is an extraordinarily large number (hundreds of thousands) of session-***** and session-*****.lock files in /opt/splunk/var/run/splunk, going back 2-3 months. Why isn't Splunk purging these old files? Any advice?
I can post log files on request.
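For anyone wanting to confirm the same diagnosis, here is a minimal sketch that compares inode usage with byte usage and counts the session files. The function name count_session_files is hypothetical, the directory is the one from this post, and -maxdepth assumes GNU or BSD find.

```shell
#!/bin/bash
# Sketch only: count_session_files is a hypothetical helper, not part of Splunk.

count_session_files() {
    # Print the number of regular files named session* directly inside directory $1.
    local dir="$1"
    find "$dir" -maxdepth 1 -type f -name 'session*' | wc -l | tr -d ' '
}

# Example invocations (not run automatically):
#   df -i /opt/splunk    # inode usage; IUse% near 100% points to inode exhaustion
#   df -h /opt/splunk    # byte usage, for comparison
#   count_session_files /opt/splunk/var/run/splunk
```

If df -i shows the filesystem close to 100% while df -h still shows free space, the problem is the number of files rather than their size.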

Thanks all

1 Solution

Lucas_K
Motivator

Apart from just deleting these files, did you find any follow-up solution to this?

I've run into the same issue (v4.3.3).

Edit: This appears to have been a known bug (SPL-48237), listed in the 4.3.4 release notes -> http://docs.splunk.com/Documentation/Splunk/latest/ReleaseNotes/4.3.4

The release notes describe it as: "Every request to splunkweb is creating a session lock file in var/run/splunk, ending up in DOS. (SPL-48237)"

So the solution is either to upgrade to 4.3.4 or to implement a script that runs find -exec rm on files older than the last Splunk restart date.
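A minimal sketch of the second option, assuming the modification time of a reference file marks the last restart. The function name purge_sessions_older_than is hypothetical, and using /opt/splunk/var/run/splunk/splunkd.pid as the reference file is an assumption to verify on your own installation.

```shell
#!/bin/bash
# Sketch only: purge_sessions_older_than is a hypothetical helper.
# Assumption: the reference file's mtime corresponds to the last Splunk restart.

purge_sessions_older_than() {
    # $1 = session directory, $2 = reference file marking the last restart
    local dir="$1" ref="$2"
    # Refuse to run if either path is missing.
    [ -d "$dir" ] && [ -e "$ref" ] || return 1
    # "! -newer" matches files whose mtime is not later than the reference file's.
    find "$dir" -maxdepth 1 -type f -name 'session*' ! -newer "$ref" -exec rm -f {} +
}

# Example invocation (not run automatically; the pid file path is an assumption):
#   purge_sessions_older_than /opt/splunk/var/run/splunk /opt/splunk/var/run/splunk/splunkd.pid
```

Replacing -exec rm -f {} + with -print first shows what would be deleted without removing anything.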

Edit 2: We've found that even after upgrading to 4.3.4, these files are still not purged correctly.


johnhsu
Observer

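To remove tons of session files, run this from within the directory that contains them:
ll | grep session | awk '{print "rm "$8" "}' | csh

Note:
"$8" -- column 8 of the ll (ls -l) output, which holds the file name on my system. The column number depends on the ls output format, so check it before running.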

Thanks
Sincerely
John Hsu
johnthsu@hotmail.com


Lucas_K
Motivator

OK, just an update on this.

I've checked all our search head instances (both 4.3.3 and 4.3.4) and found large numbers of files in var/run/splunk (upwards of 3 million on some instances). However, we are using F5 LTMs in front of them, with health monitors set to check every 5 seconds. As a result there are 24 new session files per minute (12 per minute from each of 2 F5 LTMs). These are pre-authenticated sessions, so they contain very little information and are easy to distinguish from current user logins.
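The accumulation rate is easy to work out. At 2 monitors polling every 5 seconds, that is 24 files per minute or 34,560 per day, so 3 million files would take roughly 87 days to build up if nothing is purged. A small sketch of the arithmetic (the function name session_files_per_day is hypothetical, and it assumes one new session file per health check):

```shell
#!/bin/bash
# Sketch only: assumes each health check creates exactly one session file.

session_files_per_day() {
    # $1 = number of health monitors, $2 = check interval in seconds
    local monitors="$1" interval="$2"
    # 86400 seconds per day divided by the interval gives checks per monitor per day.
    echo $(( monitors * 86400 / interval ))
}

# Example invocation (not run automatically):
#   session_files_per_day 2 5
```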

I now run the following scripts every 5 minutes. Once the files have been deleted, search head web interface performance also seems to improve (I have no metrics on this, however, so maybe it's a placebo effect? 😉). If you are going to use this, check the session and lock files and see whether you can work out which ones are valid sessions and which are stale. Adjust the following find commands as required.

For Splunk 4.3.4

As 4.3.4 has the lock file fix, we only need to clean up the remaining session files.
This command deletes only the very small session files (~60 bytes). All our users (except for the admin account) connect using SSO, so the legitimate session files are quite large by comparison, which makes size a good way to filter out bogus sessions.

find /opt/splunk/var/run/splunk/ -name 'session*' -mmin +60 -type f -size -65c -exec rm {} \; 
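Before deleting anything, it may be worth listing what the same filters would match. A minimal dry-run sketch, where the function name list_stale_sessions is hypothetical and the defaults of 65 bytes and 60 minutes mirror the command above:

```shell
#!/bin/bash
# Sketch only: list_stale_sessions is a hypothetical helper that prints, but never deletes.

list_stale_sessions() {
    # $1 = session directory
    # $2 = size limit in bytes (files smaller than this match), default 65
    # $3 = minimum age in minutes, default 60
    local dir="$1" max_bytes="${2:-65}" min_age="${3:-60}"
    find "$dir" -name 'session*' -type f -mmin "+$min_age" -size "-${max_bytes}c" -print
}

# Example invocation (not run automatically):
#   list_stale_sessions /opt/splunk/var/run/splunk/ | head
```

Inspecting a few of the listed files confirms whether the size threshold really separates health-check sessions from real logins in your environment.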

For Splunk 4.3.3

4.3.3 has no lock file fix, so the script needs to delete the lock files as well.
This script does the same as the 4.3.4 command, then finds the orphaned .lock files and deletes those too.

#!/bin/bash

# L.K. Temporary script to delete session files that Splunk does not clean up (see bug ref: SPL-48237)
# Modified for Splunk 4.3.3

# The following deletes session files older than 1 hour that are smaller than 65 bytes,
# i.e. unauthenticated web logins (F5 polling script etc.)

SESSION_PATH="/opt/splunk/var/run/splunk/"

# Delete only small session files, but not lock files, as they *may* be tied to existing logged-in sessions
/usr/bin/find "$SESSION_PATH" -name 'session*' ! -name 'session*.lock' -mmin +60 -type f -size -65c -exec rm -v {} \;

# Find orphan lock files without a matching session file.
# NUL-delimited names avoid word-splitting problems with unusual file names.
/usr/bin/find "$SESSION_PATH" -name 'session*.lock' -mmin +60 -type f -size -65c -print0 |
while IFS= read -r -d '' file
do
        # Strip the .lock suffix to get the path of the matching session file.
        session_file="${file%.lock}"
        # Check whether the matching session file exists.
        if [ ! -f "$session_file" ]
        then
                #echo " safe to delete lock file : $file"
                /bin/rm -v "$file"
        fi
done
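
To run a cleanup script like this every 5 minutes, a crontab entry along these lines is one option. The script path is hypothetical, and the */5 step syntax assumes a cron implementation that supports it (such as Vixie cron or cronie).

```
# m    h  dom mon dow  command
*/5    *  *   *   *    /opt/splunk/bin/scripts/clean_sessions.sh >/dev/null 2>&1
```

Install it in the crontab of a user that has permission to delete files in var/run/splunk.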

I hope this is useful to someone that's run into the same issue.
