What's the best way to alert when a universal forwarder can't connect to the deployment server? I am looking to build an alert for when a forwarder cannot get its configuration from the deployment server. The built-in alert in the deployment monitor isn't quite what we are looking for. Has anyone built their own that might be better?
Here are two that I use and that might help:
Application installation failures from the deployment manager:
index=_internal sourcetype=splunkd "action=Install result=Fail" | top limit=100 ip app | lookup dnslookup clientip as ip | table clienthost app ip
Splunk universal forwarders not talking to the deployment server:
| tstats count where index=_internal groupby host | fields host | table host | search NOT [search index=_internal host=ulpspl09* source="/opt/splunk/var/log/splunk/splunkd_access.log" sourcetype=splunkd_access | rex field=uri "/services/broker/phonehome/connection_[^_]+_[89][0-9]{3}_[^_]+(_[0-9][^_]+)?_(?P<hostname>[^_]+)_" | eval host=hostname | dedup host | table host] | lookup dnslookup clienthost AS host | search clientip!=''
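The above finds servers that are sending data to be indexed but are not talking to the deployment server: it lists every host seen in _internal, then excludes the hosts that appear in phonehome requests. host=ulpspl09* matches our deployment server, so substitute your own.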
Hmmm... I get: Error in 'rex' command: Encountered the following error while compiling the regex '/services/broker/phonehome/connection_[^]+[89][0-9]{3}[^]+([0-9][^]+)?(?P[^]+)_': Regex: unrecognized character after (?P
I did it a little differently in the end:
index=_internal sourcetype=splunkd component=DC:PhonehomeThread OR component=DC:DeploymentClient err=not_connected | stats count by host err component | where count >= 200
Sorry about that, try the updated version! I'm guessing I missed something during the copy/paste...
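For anyone else who hits that rex error: the regex in the error message has lost its underscores and the <hostname> part of the named group, which is why it fails to compile with "unrecognized character after (?P". One way to check the pattern on its own before running the full search is to feed it a single test value. This is only a sketch: the uri below is a made-up value shaped to fit the pattern (the IP, port, host names and trailing ID are placeholders, not real phonehome data).

```
| makeresults
| eval uri="/services/broker/phonehome/connection_10.1.2.3_8089_web01.example.com_web01_ABCD1234"
| rex field=uri "/services/broker/phonehome/connection_[^_]+_[89][0-9]{3}_[^_]+(_[0-9][^_]+)?_(?P<hostname>[^_]+)_"
| table uri hostname
```

If the pattern was pasted intact, hostname should come back as web01 for this sample. If rex still reports a compile error, compare the pasted regex character by character against the original, looking for missing underscores and angle brackets.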