Alerting

How do you alert on logs missing for multiple servers?

glb
New Member

For various reasons, I want to get alerts when my servers aren't forwarding their event logs to Splunk. I can do this for one server at a time by scheduling a search like index=myindex host=myhost | head 1 for some time window and then alert if there are no results.

Of course, there are many hosts on the network for which this would need to be done. Is there any way I could do this with one scheduled alert using a lookup with a list of the hosts? I'd be ok with the output being a report of the hosts that haven't forwarded logs in the specified window.

I'm a bit of a Splunk novice, so I've searched and found similar questions posted before, but none of the proposed solutions I could find worked correctly.

Thanks in advance!


somesoni2
Revered Legend

If you already have a lookup table with a list of the hosts you want to monitor, you can run a search like this to set up your alert (alerting if a host has not been seen in the last 30 minutes).

Efficient version

| tstats max(_time) as lastSeen WHERE index=myindex [| inputlookup yourhostLookup.csv | table host ] by host
| where lastSeen<relative_time(now(),"-30m")

Regular version

index=myindex [| inputlookup yourhostLookup.csv | table host ]
| stats max(_time) as lastSeen by host
| where lastSeen<relative_time(now(),"-30m")
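
One caveat with both versions: a host that sent nothing at all during the search time range never appears in the results, so the where clause has no row to flag. A minimal sketch of one way to cover that case is to append the lookup rows and give hosts without events a lastSeen of 0. It assumes the lookup column is named host and that its values match the indexed host values exactly, including case.

```spl
| tstats max(_time) as lastSeen WHERE index=myindex [| inputlookup yourhostLookup.csv | table host ] by host
| inputlookup append=true yourhostLookup.csv
| fields host lastSeen
| fillnull value=0 lastSeen
| stats max(lastSeen) as lastSeen by host
| where lastSeen<relative_time(now(),"-30m")
```

Hosts that come back with a lastSeen of 0 had no events anywhere in the search time range; the others were last seen more than 30 minutes ago. The search time range needs to be longer than 30 minutes (for example, the last 24 hours) for the second group to show up.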


glb
New Member

Wow, this seems to do what I want! I'm off to do some testing and verifying results, but this looks promising... I'll confirm later.


glb
New Member

This is working great, thanks again. One last question - the output for lastSeen is not a friendly format, and I can't figure out how to change it to something human-readable. I tried using eval() as I'd seen in a few examples, but it doesn't seem to work. I just end up with the list of hosts and an empty field where the time should be. Suggestions?


glb
New Member

Ah, this seems to work.

| eval time=strftime(lastSeen, "%H:%M:%S %m/%d/%Y")
| table host time
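
An alternative sketch, in case the list should stay sortable by time: fieldformat changes only how lastSeen is displayed and leaves the underlying epoch value numeric. The index and lookup names are the same placeholders used in the answer, not real names.

```spl
| tstats max(_time) as lastSeen WHERE index=myindex [| inputlookup yourhostLookup.csv | table host ] by host
| where lastSeen<relative_time(now(),"-30m")
| sort lastSeen
| table host lastSeen
| fieldformat lastSeen=strftime(lastSeen, "%H:%M:%S %m/%d/%Y")
```

The sort puts the hosts that have been silent longest at the top. If something downstream needs the formatted time as an actual string value, the eval version above is probably the safer choice.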


glb
New Member

That was one of the articles I'd found when trying to find a solution. Unfortunately, that doesn't give me the results I'm looking for. I tried just | metadata type=hosts, and still only see a handful of boxes (10). There are literally thousands of hosts sending logs to Splunk in the environment, but I only care about a specific subset.

Thanks
