On the search head that our SOC uses, I get the following health alert:
IOWait
Under unhealthy instances, it's listing our indexers. I ran top on one of them and I see the following:
top - 15:41:36 up 37 days, 11:50, 1 user, load average: 5.31, 6.58, 6.95
Tasks: 416 total, 1 running, 415 sleeping, 0 stopped, 0 zombie
%Cpu(s): 28.3 us, 2.5 sy, 0.0 ni, 66.2 id, 2.7 wa, 0.2 hi, 0.2 si, 0.0 st
MiB Mem : 31858.5 total, 311.6 free, 3699.4 used, 27847.5 buff/cache
MiB Swap: 4096.0 total, 769.0 free, 3327.0 used. 27771.1 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
984400 splunk 20 0 4475268 244140 36068 S 105.6 0.7 1:22.47 [splunkd pid=42128] search --id=remote_"Search Head FQDN"_scheduler__zzm+
796457 splunk 20 0 9232920 790724 36932 S 100.7 2.4 56:56.65 [splunkd pid=42128] search --id=remote_"Search Head FQDN"_scheduler__zzm+
895450 splunk 20 0 1281092 337308 32668 S 85.8 1.0 23:31.00 [splunkd pid=42128] search --id=remote_"Search Head FQDN"_1698412482.432+
Where it says "Search Head FQDN", that's just one of our search heads.
Of course, we started seeing this once we upgraded from 8.0.5 to 9.0.5.
Seeking guidance on this matter.
Many others have reported the same warning after upgrading. The IOWait alert appears to be too sensitive. Once you have confirmed the instances are healthy, consider adjusting the alert threshold or disabling the alert.
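If it helps with the "confirm the instances are healthy" part, here is a rough Python sketch that pulls the iowait ("wa") figure out of a top summary line like the one you pasted. The function name parse_cpu_line and the 10% limit are my own placeholders, not anything Splunk defines, and a single top snapshot is only a spot check, so sample it a few times under load before drawing conclusions.

```python
import re


def parse_cpu_line(line):
    """Parse a top '%Cpu(s):' summary line into {field: percent}."""
    # Keep only the part after the first colon, e.g. " 28.3 us, 2.5 sy, ..."
    _, _, rest = line.partition(":")
    # Each entry looks like "<number> <two-letter field>", e.g. "2.7 wa"
    pairs = re.findall(r"([\d.]+)\s+([a-z]{2})", rest)
    return {name: float(value) for value, name in pairs}


# The summary line from the top output posted above
cpu = parse_cpu_line(
    "%Cpu(s): 28.3 us, 2.5 sy, 0.0 ni, 66.2 id, 2.7 wa, 0.2 hi, 0.2 si, 0.0 st"
)

# Hypothetical limit chosen for illustration only; pick one that suits your hardware
HYPOTHETICAL_WA_LIMIT = 10.0

below_limit = cpu["wa"] < HYPOTHETICAL_WA_LIMIT
print("iowait:", cpu["wa"], "below limit:", below_limit)
```

For the line you posted, this reads wa as 2.7, which is below the placeholder limit.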