As per another topic here on Answers, I ran the following search:
index=_internal source=metrics.log group=queue | timechart perc95(current_size) by name
This confirms that my parsingqueue is almost always at 1000, which would probably explain why I have one splunkd process constantly using 100% of one of my four CPUs.
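For what it's worth, a variant of the same search that compares the queue's size against its configured maximum makes the saturation easier to read; this is a sketch using the standard current_size_kb and max_size_kb fields from metrics.log, so adjust if your version logs different fields:

index=_internal source=*metrics.log* group=queue name=parsingqueue | eval fill_pct=round(current_size_kb / max_size_kb * 100, 1) | timechart perc95(fill_pct)

A queue pinned near 100% here points to the parsing pipeline (line breaking, timestamping, regex) as the bottleneck rather than anything downstream.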
I am also receiving the following sequence of errors every ~300 ms in splunkd.log; it might be a coincidence, or it might be the cause.
02-22-2011 19:08:59.772 ERROR TcpInputProc - Received unexpected 68021378 byte message! from hostname=txxxxxxxxxx, ip=10.xxxxxxxx, port=45384
02-22-2011 19:08:59.772 INFO TcpInputProc - Hostname=txxxxxxxxxxxx closed connection
02-22-2011 19:08:59.855 INFO TcpInputProc - Connection in cooked mode from txxxxxxxxxxxx
02-22-2011 19:08:59.913 INFO TcpInputProc - Valid signature found
02-22-2011 19:08:59.913 INFO TcpInputProc - Connection accepted from txxxxxxxxxxx
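To quantify how often this error fires and which forwarders trigger it, something like the following should work against _internal; I'm assuming the component and log_level fields are extracted for sourcetype=splunkd as usual, and that the hostname=... key/value pair in the message is picked up by automatic field extraction:

index=_internal sourcetype=splunkd component=TcpInputProc log_level=ERROR "Received unexpected" | timechart span=1m count by hostname

If a single forwarder dominates the count, that narrows the problem to one sending host rather than the indexer itself.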
Is it possible that some input from a forwarder keeps getting reprocessed?
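(One way I could think of to verify reprocessing would be to look for exact duplicate events in the affected index; this is just a sketch, assuming the data lands in index=main, and it's expensive, so keep the time range tight:

index=main earliest=-1h | stats count by _time, host, source, _raw | where count > 1

If this returns rows, the same raw events are being indexed more than once.)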
Any pointers truly welcome.