It appears the UNIX 4.5 app does not take into account file system caching when reporting free memory. For example, if I run etc/apps/unix/bin/vmstat.sh:
-bash-3.2$ ./vmstat.sh
memTotalMB memFreeMB memUsedMB memFreePct memUsedPct pgPageOut swapUsedPct pgSwapOut cSwitches interrupts forks processes threads loadAvg1mi
96165 50977 45187 53.0 47.0 2240619655 0.0 0 3534174241 855326036 56348253 286 3384 0.78
Splunk would tell me I have only 53% free memory on the server. However, the free command tells a more accurate story:
-bash-3.2$ free -m
             total       used       free     shared    buffers     cached
Mem:         96165      45184      50980          0       1848      22945
-/+ buffers/cache:      20390      75774
Swap:         8189          0       8189
So in reality I have 78% free memory if you consider the file system cache. Has anyone else noticed this?
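As a quick check of that figure, here is a small sketch that redoes the arithmetic with the numbers from the free -m output above. The variable names are just for illustration and are not part of vmstat.sh.

```shell
# Values (in MB) copied from the `free -m` output above.
total=96165; free=50980; buffers=1848; cached=22945

# Free memory as the app reports it: only the "free" column.
raw_pct=$(awk -v t="$total" -v f="$free" \
  'BEGIN {printf "%.1f", 100.0*f/t}')

# Free memory when buffers and page cache are counted as reclaimable.
adj_pct=$(awk -v t="$total" -v f="$free" -v b="$buffers" -v c="$cached" \
  'BEGIN {printf "%.1f", 100.0*(f+b+c)/t}')

echo "free only:              ${raw_pct}%"   # 53.0%
echo "free + buffers + cache: ${adj_pct}%"   # 78.8%
```

That lines up with the 53% the app shows versus the roughly 78% that free implies.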
To wrap this up, for SPL-52295 the final fix was implemented as follows:
PARSE_1='/total memory$/ {memTotalMB=$1/1024} /free memory$/ {memFreeMB+=$1/1024} /buffer memory$/ {memFreeMB+=$1/1024} /swap cache$/ {memFreeMB+=$1/1024}'
The next version of the unix app, due out later this year, will contain this fix.
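To see what that PARSE_1 does, here is a minimal sketch that runs it against a few lines shaped like vmstat -s output. The sample values are made up (not from a real host), and the trailing END block is added only to print the result; the real script wires PARSE_1 together with its other awk fragments.

```shell
PARSE_1='/total memory$/ {memTotalMB=$1/1024} /free memory$/ {memFreeMB+=$1/1024} /buffer memory$/ {memFreeMB+=$1/1024} /swap cache$/ {memFreeMB+=$1/1024}'

# Hypothetical excerpt of `vmstat -s` output, values in KB.
sample='      1048576 K total memory
       262144 K free memory
       131072 K buffer memory
       393216 K swap cache'

# The END block here is for illustration only; it is not part of vmstat.sh.
result=$(printf '%s\n' "$sample" | awk "$PARSE_1 END {printf \"%d %d\", memTotalMB, memFreeMB}")
echo "$result"   # 1024 768
```

Because all three patterns accumulate into memFreeMB with +=, the reported free memory becomes free + buffers + cache (256 + 128 + 384 = 768 MB in this sample).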
You can edit the *NIX script to add memCacheMB and memCachePct. Just adjust the following lines in $SPLUNK_HOME/etc/apps/Splunk_TA_nix/bin/vmstat.sh:
HEADER='memTotalMB memFreeMB memUsedMB memCacheMB memFreePct memUsedPct memCachePct pgPageOut swapUsedPct pgSwapOut cSwitches interrupts forks processes threads loadAvg1mi waitThreads interrupts_PS pgPageIn_PS pgPageOut_PS'
HEADERIZE="BEGIN {print \"$HEADER\"}"
PRINTF='END {printf "%10d %10d %10d %10d %10.1f %10.1f %10.1f %10s %10.1f %10s %10s %10s %10s %10s %10s %10.2f %10.2f %10.2f %10.2f %10.2f\n", memTotalMB, memFreeMB, memUsedMB, memCacheMB, memFreePct, memUsedPct, memCachePct, pgPageOut, swapUsedPct, pgSwapOut, cSwitches, interrupts, forks, processes, threads, loadAvg1mi, waitThreads, interrupts_PS, pgPageIn_PS, pgPageOut_PS}'
DERIVE='END {memFreeMB+=memCacheMB; memUsedMB=memTotalMB-memFreeMB; memUsedPct=(100.0*memUsedMB)/memTotalMB; memFreePct=100.0-memUsedPct; memCachePct=(100.0*memCacheMB)/memTotalMB; swapUsedPct=swapUsed ? (100.0*swapUsed)/(swapUsed+swapFree) : 0; waitThreads=loadAvg1mi > cpuCount ? loadAvg1mi-cpuCount : 0}'
and
PARSE_1='/total memory$/ {memTotalMB=$1/1024} /free memory$/ {memFreeMB+=$1/1024} /buffer memory$/ {memFreeMB+=$1/1024} /swap cache$/ {memCacheMB=$1/1024}'
With these changes you get output like this:
# ./vmstat.sh
memTotalMB memFreeMB memUsedMB memCacheMB memFreePct memUsedPct memCachePct pgPageOut swapUsedPct pgSwapOut cSwitches interrupts forks processes threads loadAvg1mi waitThreads interrupts_PS pgPageIn_PS pgPageOut_PS
2832 2165 667 1388 76.4 23.6 49.0 2331754 0.5 3064 6377829 6892068 160152 367 546 0.09 0.00 247.50 0.00 0.00
[root@bogon bin]#
The current *NIX app version is 4.5. When will the next version be released?
Yes, this is in scope for the next release of the unix app.
Is there a workaround for this?
You can edit the *NIX script to get the desired result. Just adjust the following lines in $SPLUNK_HOME/etc/apps/unix/bin/vmstat.sh:
DERIVE='END {memUsedMB=memTotalMB-memFreeMB;
to
DERIVE='END {memUsedMB=memTotalMB-memFreeMB-memBuffMB-memCacheMB;
and
PARSE_1='/total memory$/ {memTotalMB=$1/1024} /free memory$/ {memFreeMB=$1/1024}'
to
PARSE_1='/total memory$/ {memTotalMB=$1/1024} /free memory$/ {memFreeMB=$1/1024} /buffer memory$/ {memBuffMB=$1/1024} /swap cache$/ {memCacheMB=$1/1024}'
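Here is a rough sketch of how this variant behaves, using the same kind of made-up vmstat -s lines (values in KB, not from a real host). DERIVE_PART is a hypothetical name that holds only the piece of DERIVE shown above plus a printf for display; the rest of the real DERIVE line is left out.

```shell
PARSE_1='/total memory$/ {memTotalMB=$1/1024} /free memory$/ {memFreeMB=$1/1024} /buffer memory$/ {memBuffMB=$1/1024} /swap cache$/ {memCacheMB=$1/1024}'

# Hypothetical: only the memUsedMB part of DERIVE, with a printf added for display.
DERIVE_PART='END {memUsedMB=memTotalMB-memFreeMB-memBuffMB-memCacheMB; printf "%d %d %d %d", memFreeMB, memBuffMB, memCacheMB, memUsedMB}'

# Made-up sample shaped like `vmstat -s` output, values in KB.
sample='      1048576 K total memory
       262144 K free memory
       131072 K buffer memory
       393216 K swap cache'

workaround=$(printf '%s\n' "$sample" | awk "$PARSE_1 $DERIVE_PART")
echo "$workaround"   # 256 128 384 256  (free, buffers, cache, used)
```

Note that in this version memFreeMB still holds only the raw free value, so memFreeMB and memUsedMB no longer add up to memTotalMB; buffers and cache are tracked separately.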
Awesome answer!
It would be very beneficial for the *NIX App to include free RAM+buffers+cache in the output of vmstat.sh as it more accurately reflects the amount of (mapped) memory available for applications to use instead of only free RAM (unmapped memory).
Free memory should be calculated (from vmstat -s) as the sum of "free memory", "buffer memory" and "swap cache" (or: free -k | awk 'NR==3 {print $4}'). Free RAM is almost always at or below 10% shortly after reboot, once applications and daemons have started; that is how Linux memory management normally works, so free RAM alone isn't an accurate reflection of the memory available to everything except the kernel itself.
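For illustration, this is what the awk one-liner picks out when fed the free -m output posted in the question (so the result is in MB here rather than KB). It assumes the older free layout with a "-/+ buffers/cache" line as row 3; newer procps versions drop that line, so the row and column numbers would need adjusting there.

```shell
# `free -m` output as posted in the question (older procps layout).
free_output='             total       used       free     shared    buffers     cached
Mem:         96165      45184      50980          0       1848      22945
-/+ buffers/cache:      20390      75774
Swap:         8189          0       8189'

# Row 3 is the "-/+ buffers/cache:" line; field 4 is free + buffers + cache.
avail=$(printf '%s\n' "$free_output" | awk 'NR==3 {print $4}')
echo "$avail"   # 75774
```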
Noted, I will make sure this is considered for a future release.
I'm not sure that we want to consider the file system cache as part of your free memory. I could be wrong though.
My fault, I edited my answer above.
I'm not talking about swap, I'm talking about the Linux file system buffer and cache.