Well, "guaranteed" is very difficult from a rigorous mathematical or computer science perspective. (That is, how could I ever prove that alerts will always fire?) So you have to define what you mean by "guaranteed". Is it "against the most likely scenarios" or "no matter what"?
Against the most likely scenarios, I would recommend opening the forwarder throughput throttle all the way. A properly designed deployment should be able to handle forwarders sending higher than their normal volume for a short time, ideally without greatly impacting host performance. (The assumption there is that your hosts are modern and have enough idle CPU and network bandwidth to absorb the burst.)
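As a sketch of what "opening the throttle" looks like: on a universal forwarder, the throughput limit lives in limits.conf under the [thruput] stanza, and a value of 0 removes the cap. (Check your version's defaults before copying this; the stanza below is the general pattern, not a tuned recommendation.)

```ini
# $SPLUNK_HOME/etc/system/local/limits.conf on the forwarder
[thruput]
# 0 = no throughput throttling; the default on a universal
# forwarder is a low KB/s cap, which is what causes lag
# during bursts. Restart the forwarder after changing this.
maxKBps = 0
```

If removing the cap entirely is too aggressive for your hosts, raising it to a generous fixed value accomplishes much of the same thing while keeping a safety ceiling.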
I would also suggest alerting on forwarder lag and on event volumes. If these are your likely "will break" scenarios, then seeing your lag double or your volumes jump by 50% gives you a hint that something is wrong and you need to check on it.
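A rough sketch of a "forwarder has gone quiet" search, which you could save as a scheduled alert. The 600-second threshold is an assumption; pick a value that reflects how stale you can tolerate a host being, and note that this only covers hosts that have reported into _internal at least once during the search window.

```
| tstats latest(_time) AS lastTime WHERE index=_internal BY host
| eval lagSeconds = now() - lastTime
| where lagSeconds > 600
| sort - lagSeconds
```

A similar search comparing today's event counts per host against a rolling baseline would cover the "volume dropped or spiked" side of the same problem.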
These are, however, just hedges against likely scenarios. If you are doing stock trading, for example, and a system failure could cause you to lose $500 million in minutes and almost go out of business ... then "guaranteed" takes on a whole new meaning.
There are other things to consider beyond the alert itself. For example, an alert delivered by email is hardly guaranteed. How does someone acknowledge that they received the alert and are acting on it?
You need to fully understand the "guaranteed" requirement. And if the need is a highly robust guarantee, then I would recommend engaging Splunk Professional Services to help figure out all the pieces.