I'm trying to detect when a server goes from an error state to operational on our load balancers for an email alert. The first part of the query looks for the last 'operational' message and the second (joined) part of the search looks for a non-operational message.
The problem I'm having is that t2 (the time from the second query) doesn't seem to get evaluated. I don't see it in the results, so I can't run the final search on it.
earliest=-30m@m | eval r=_raw|eval t1=_time | stats first(r) as fr1 by lb_server |search fr1="*operational*"| search fr1 | sort lb_server | join lb_server [search host="12.130.11.2" latest=-60m@m | eval t2=_time | eval r=_raw | stats first(r) as fr2 by lb_server| search fr2!="*operational*"] | search t1 > t2
Hopefully I'm doing something wrong that's easy to fix, or there is another better, stronger, faster way to do what I'm after.
Thanks
-Doug
Here's what the data looks like:
May 22 08:29:22 1.2.3.2 May 22 8:29:19 Primary NOTICE AlteonOS <slb>: Services are available for IP4 Virtual Server 3:1.2.4.142
May 22 08:29:22 1.2.3.2 May 22 8:29:19 Primary NOTICE AlteonOS <slb>: real server 1.2.3.41 operational
May 22 08:29:22 1.2.3.2 May 22 8:29:19 Primary NOTICE AlteonOS <slb>: real service 1.2.3.41:80 operational
May 22 08:19:52 1.2.3.2 May 22 8:19:49 Primary NOTICE AlteonOS <slb>: No services are available for IP4 Virtual Server 3:1.2.4.142
May 22 08:19:52 1.2.3.2 May 22 8:19:49 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.41
May 22 08:19:52 1.2.3.2 May 22 8:19:49 Primary ALERT AlteonOS <slb>: script 1 healthcheck failed on real server 1.2.3.41
May 22 07:11:50 1.2.3.2 May 22 7:11:47 Primary NOTICE AlteonOS <slb>: Services are available for IP4 Virtual Server 3:1.2.4.142
May 22 07:11:50 1.2.3.2 May 22 7:11:47 Primary NOTICE AlteonOS <slb>: real server 1.2.3.41 operational
May 22 07:11:50 1.2.3.2 May 22 7:11:47 Primary NOTICE AlteonOS <slb>: real service 1.2.3.41:80 operational
May 22 06:23:49 1.2.3.2 May 22 6:23:47 Primary NOTICE AlteonOS <slb>: No services are available for IP4 Virtual Server 3:1.2.4.142
May 22 06:23:49 1.2.3.2 May 22 6:23:47 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.42
May 22 06:23:49 1.2.3.2 May 22 6:23:47 Primary ALERT AlteonOS <slb>: script 1 healthcheck failed on real server 1.2.3.42
May 22 06:23:33 1.2.3.2 May 22 6:23:31 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.41
May 22 06:23:33 1.2.3.2 May 22 6:23:31 Primary ALERT AlteonOS <slb>: script 1 healthcheck failed on real server 1.2.3.41
May 21 17:21:39 1.2.3.2 May 21 17:21:33 Primary NOTICE AlteonOS <slb>: Services are available for IP4 Virtual Server 5:1.2.4.139
May 21 17:21:39 1.2.3.2 May 21 17:21:33 Primary NOTICE AlteonOS <slb>: real server 1.2.3.40 operational
May 21 17:21:39 1.2.3.2 May 21 17:21:33 Primary NOTICE AlteonOS <slb>: real service 1.2.3.40:80 operational
May 21 17:20:39 1.2.3.2 May 21 17:20:33 Primary NOTICE AlteonOS <slb>: No services are available for IP4 Virtual Server 5:1.2.4.139
May 21 17:20:39 1.2.3.2 May 21 17:20:33 Primary ALERT AlteonOS <slb>: script 1 healthcheck failed on real server 1.2.3.40
May 21 17:20:24 1.2.3.2 May 21 17:20:18 Primary ALERT AlteonOS <slb>: Script 1 failed on real(9): expect OK, received HTTP/1.1 302 Found
May 21 17:15:59 1.2.3.2 May 21 17:15:54 Primary NOTICE AlteonOS <slb>: real server 1.2.3.41 operational
May 21 17:15:59 1.2.3.2 May 21 17:15:54 Primary NOTICE AlteonOS <slb>: real service 1.2.3.41:80 operational
May 21 17:15:59 1.2.3.2 May 21 17:15:54 Primary NOTICE AlteonOS <slb>: Services are available for IP4 Virtual Server 3:1.2.4.142
May 21 17:15:59 1.2.3.2 May 21 17:15:54 Primary NOTICE AlteonOS <slb>: real server 1.2.3.42 operational
May 21 17:15:59 1.2.3.2 May 21 17:15:54 Primary NOTICE AlteonOS <slb>: real service 1.2.3.42:80 operational
May 21 17:15:26 1.2.3.2 May 21 17:15:20 Primary NOTICE AlteonOS <slb>: No services are available for IP4 Virtual Server 3:1.2.4.142
May 21 17:15:26 1.2.3.2 May 21 17:15:20 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.42
May 21 17:15:26 1.2.3.2 May 21 17:15:20 Primary ALERT AlteonOS <slb>: script 1 healthcheck failed on real server 1.2.3.42
May 21 17:15:25 1.2.3.2 May 21 17:15:19 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.41
May 21 17:15:25 1.2.3.2 May 21 17:15:19 Primary ALERT AlteonOS <slb>: script 1 healthcheck failed on real server 1.2.3.41
May 21 17:14:12 1.2.3.2 May 21 17:14:07 Primary NOTICE AlteonOS <slb>: real server 1.2.3.42 operational
May 21 17:14:12 1.2.3.2 May 21 17:14:07 Primary NOTICE AlteonOS <slb>: real service 1.2.3.42:80 operational
May 21 17:14:12 1.2.3.2 May 21 17:14:07 Primary NOTICE AlteonOS <slb>: Services are available for IP4 Virtual Server 3:1.2.4.142
May 21 17:14:12 1.2.3.2 May 21 17:14:07 Primary NOTICE AlteonOS <slb>: real server 1.2.3.41 operational
May 21 17:14:12 1.2.3.2 May 21 17:14:07 Primary NOTICE AlteonOS <slb>: real service 1.2.3.41:80 operational
May 21 17:13:41 1.2.3.2 May 21 17:13:35 Primary NOTICE AlteonOS <slb>: No services are available for IP4 Virtual Server 3:1.2.4.142
May 21 17:13:41 1.2.3.2 May 21 17:13:35 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.42
May 21 17:13:41 1.2.3.2 May 21 17:13:35 Primary ALERT AlteonOS <slb>: script 1 healthcheck failed on real server 1.2.3.42
May 21 17:13:40 1.2.3.2 May 21 17:13:34 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.41
May 21 17:13:40 1.2.3.2 May 21 17:13:34 Primary ALERT AlteonOS <slb>: script 1 healthcheck failed on real server 1.2.3.41
May 21 16:01:22 1.2.3.2 May 21 16:01:16 Primary NOTICE AlteonOS <slb>: Services are available for IP4 Virtual Server 5:1.2.4.139
May 21 16:01:22 1.2.3.2 May 21 16:01:16 Primary NOTICE AlteonOS <slb>: real server 1.2.3.40 operational
May 21 16:01:22 1.2.3.2 May 21 16:01:16 Primary NOTICE AlteonOS <slb>: real service 1.2.3.40:80 operational
May 21 15:21:35 1.2.3.2 May 21 15:21:30 Primary NOTICE AlteonOS <slb>: No services are available for IP4 Virtual Server 5:1.2.4.139
May 21 15:21:35 1.2.3.2 May 21 15:21:30 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.40
May 21 15:21:35 1.2.3.2 May 21 15:21:30 Primary ALERT AlteonOS <slb>: script 1 healthcheck failed on real server 1.2.3.40
May 21 15:21:20 1.2.3.2 May 21 15:21:15 Primary ALERT AlteonOS <slb>: Script 1 failed on real(9): expect OK, received HTTP/1.1 302 Found
You might be better off using the transaction search command.
... | transaction host startswith="some_nonoperational_message" endswith="operational_again"
I haven't seen your data, so that is just an example; adjust the startswith and endswith filter patterns as necessary.
The maxspan and maxevents options may also be of use to refine your transaction results.
dswanson, create a field that extracts the server (e.g. "server 1" or "server 2"), then use that field as the field argument at the start of the transaction command, so that events are grouped per server.
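To illustrate why grouping on an extracted server field helps, here is a minimal Python sketch of the pairing logic. It is not SPL and not how transaction is implemented; it only shows the idea. The regex, the function name recoveries and the variable names are hypothetical, and the event lines are taken from the sample data in the question, listed oldest first.

```python
import re

# Hypothetical pattern, modelled on the sample log lines in the question.
# It plays the role of the extracted "server" field.
SERVER_RE = re.compile(r"real server (\d+\.\d+\.\d+\.\d+)")


def recoveries(lines):
    """Yield (server, fail_line, recover_line) for each fail -> operational pair.

    Assumes the lines are in chronological order (oldest first).
    """
    open_failures = {}  # server -> first failure line not yet recovered
    for line in lines:
        match = SERVER_RE.search(line)
        if not match:
            continue  # lines without a "real server <ip>" part are ignored
        server = match.group(1)
        if line.rstrip().endswith("operational"):
            fail_line = open_failures.pop(server, None)
            if fail_line is not None:  # only report if a failure came first
                yield server, fail_line, line
        else:
            # Keep the earliest failure line for this server.
            open_failures.setdefault(server, line)


# Two servers fail and recover in an interleaved order.
events = [
    "May 21 17:15:25 1.2.3.2 May 21 17:15:19 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.41",
    "May 21 17:15:26 1.2.3.2 May 21 17:15:20 Primary ALERT AlteonOS <slb>: cannot contact real server 1.2.3.42",
    "May 21 17:15:59 1.2.3.2 May 21 17:15:54 Primary NOTICE AlteonOS <slb>: real server 1.2.3.42 operational",
    "May 21 17:15:59 1.2.3.2 May 21 17:15:54 Primary NOTICE AlteonOS <slb>: real server 1.2.3.41 operational",
]

for server, failed, recovered in recoveries(events):
    print(server, "|", failed[:15], "->", recovered[:15])
```

Because the open failures are tracked per server, the two outages can overlap without being mixed up, which is the effect that grouping the transaction on the server field is meant to achieve.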
I've used transaction before, but not in a case where transactions cross each other. For example:
server 1 fail
server 2 fail
server 2 succeed
server 1 succeed
Is it possible in this scenario (without creating an alert for each server)?
Thanks