I am trying to capture particular types of errors that occur in our logs.
I've searched for my key events in my base search and have set up a transaction, mainly to articulate the series of events.
The problem is there are a lot of useless events within the transaction and the ordering of the events determines the correlations.
My search is the following (italics added):
*index="logs" process=beamCommonProcess
"Transitioned to Error State" OR "SET_RANGE activity requested for beam supply point" OR "DISABLE_BEAM activity is complete" OR "ENABLE_BEAM activity requested" OR "beam allocated to" OR "beam deallocated from"
| transaction startswith="beam allocated to" endswith="beam deallocated from" maxspan=2h
| search "Transitioned to Error State"
An example of one resultant transaction is below (the key events are bolded😞
2017-03-08 10:33:09,744 Beam allocated to beam supply point FBTR3 with priority Normal and requested time of 50 seconds
2017-03-08 10:33:09,748 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=24.812775050692412, optical solution=4.0, gantry angle=90.0, offsetX=0.0, offsetY=0.0, cycling=true
2017-03-08 10:33:25,825 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=24.812775050692412, optical solution=4.0, gantry angle=90.0, offsetX=-0.0, offsetY=-0.0, cycling=true
2017-03-08 10:33:25,830 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:27,991 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=24.812775050692412, optical solution=4.0, gantry angle=90.0, offsetX=0.279205229210377, offsetY=2.6493079096466716, cycling=false
2017-03-08 10:33:29,020 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:30,086 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:32,294 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=24.02959946072066, optical solution=4.0, gantry angle=90.0, offsetX=0.30082541500755466, offsetY=2.736273984389093, cycling=false
2017-03-08 10:33:33,020 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:34,394 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:36,799 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=23.277319229974324, optical solution=4.0, gantry angle=90.0, offsetX=0.0810159272338496, offsetY=2.1952383846082184, cycling=false
2017-03-08 10:33:37,521 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:39,002 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:41,507 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=22.534185910219062, optical solution=4.0, gantry angle=90.0, offsetX=-0.11149386633876263, offsetY=1.1417808186791392, cycling=false
2017-03-08 10:33:42,521 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:43,630 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:46,295 ScanningController failure: NECU Transitioned to Error State
NECU Error: [0x0] _SynchronizationSGCUTimeout : Timeslice: 1846 Submap: 879
FCU Error: [0x0] _SynchronizationSGCUTimeout : Timeslice: 1846 Submap: 879
RCU Error: [0x2] Threshold Violation : Timeslice: 1844 Submap: 879
(Y_VOLT_SEC_FB: -0.313242 V MapThresholdLow: -8.984e-02 MapThresholdHigh: 1.087e-01)
SGCU Error: [0x10] _FilteringAbsolute : Timeslice: 1842 Submap: 879
(MIN_CHARGE_PRIM: 4.940e-11 C AbsoluteThresholdLow: 7.475e-11 AbsoluteThresholdHigh: 1.198e-10)
2017-03-08 10:33:47,518 DISABLE_BEAM activity is complete
2017-03-08 10:33:52,136 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:52,705 DISABLE_BEAM activity is complete
2017-03-08 10:33:54,032 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:33:54,407 DISABLE_BEAM activity is complete
2017-03-08 10:34:31,793 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:34:34,067 ScanningController failure: NECU Transitioned to Error State
NECU Error: [0x0] _SynchronizationSGCUTimeout : Timeslice: 163 Submap: 83
FCU Error: [0x0] _SynchronizationSGCUTimeout : Timeslice: 163 Submap: 83
RCU Error: [0x1] Threshold Violation : Timeslice: 162 Submap: 83
(X_VOLT_SEC_FB: -1.858963 V MapThresholdLow: -6.005e-02 MapThresholdHigh: 4.256e-01)
SGCU Error: [0x10] _FilteringAbsolute : Timeslice: 159 Submap: 83
(MIN_CHARGE_PRIM: 6.159e-10 C AbsoluteThresholdLow: 7.119e-10 AbsoluteThresholdHigh: 7.569e-10)
2017-03-08 10:34:35,060 DISABLE_BEAM activity is complete
2017-03-08 10:34:42,384 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:34:44,595 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=21.821170363454122, optical solution=4.0, gantry angle=90.0, offsetX=0.05364231043623657, offsetY=2.493617070171168, cycling=false
2017-03-08 10:34:45,527 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:34:46,704 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:34:49,402 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=21.117035011519356, optical solution=4.0, gantry angle=90.0, offsetX=0.1846705224276548, offsetY=1.2619887272283408, cycling=false
2017-03-08 10:34:50,124 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:34:51,611 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:34:54,406 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=20.4015946204072, optical solution=4.0, gantry angle=90.0, offsetX=-0.03781585653143657, offsetY=0.4402160277866902, cycling=false
2017-03-08 10:34:55,063 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:34:56,618 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:34:59,412 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=19.69582753255141, optical solution=4.0, gantry angle=90.0, offsetX=-0.27699257680997136, offsetY=0.7716407595956332, cycling=false
2017-03-08 10:35:00,124 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:01,626 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:04,424 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=19.039369951465112, optical solution=4.0, gantry angle=90.0, offsetX=0.20159363677784936, offsetY=0.5737016999952265, cycling=false
2017-03-08 10:35:05,125 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:06,631 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:09,423 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=18.411236854677558, optical solution=4.0, gantry angle=90.0, offsetX=0.08439601799291348, offsetY=1.4156546492137045, cycling=false
2017-03-08 10:35:10,064 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:11,647 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:14,430 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=17.791509400314748, optical solution=4.0, gantry angle=90.0, offsetX=0.32660726181329897, offsetY=2.348215386456051, cycling=false
2017-03-08 10:35:15,126 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:16,637 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:19,237 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=17.180290349981682, optical solution=4.0, gantry angle=90.0, offsetX=0.19449483549697788, offsetY=1.6954874810565246, cycling=false
2017-03-08 10:35:20,026 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:21,331 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:23,739 Start of ES/BTS Controller SET_RANGE activity requested for beam supply point FBTR3: range=16.577682833914057, optical solution=4.0, gantry angle=90.0, offsetX=0.22313796916331732, offsetY=1.1142850033366156, cycling=false
2017-03-08 10:35:24,528 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:25,943 Start of ES/BTS Controller ENABLE_BEAM activity requested for beam supply point FBTR3
2017-03-08 10:35:29,060 DISABLE_BEAM activity is complete
2017-03-08 10:35:34,531 Beam deallocated from beam supply point FBTR3
So for the example event above, the message "SET_RANGE..." that happened just prior to the message "Error State..." correlates to it. In other words, there are many "SET_RANGE..." messages in chronological order, then there is an "Error State..." - the last "SET_RANGE..." message matches that error event. Further as the example shows, you can have one "SET_RANGE.." event that has two subsequent "Error State..." events. You'll notice there are numerous fields in the "SET_RANGE..." message. I want to grab that information and match it to the correlated "Error State.." event.
My table would look something like this for the transaction above:
2017-03-08 10:33:46,295 | Error State | beam supply point FBTR3 | range=22.534185910219062 | gantry angle=90.0
2017-03-08 10:34:34,067 | Error State | beam supply point FBTR3 | range=22.534185910219062 | gantry angle=90.0
As you can see, there were two separate "Error State..." events for the same "SET_RANGE..." events. This is not always the case. Most of the time, there is a single "SET_RANGE..." event to match to a single "Error State..." event. Note, the events with "ENABLE_BEAM" and "DISABLE_BEAM" are needed to determine if the beam is enabled - I only want to capture "Error State..." events in which the beam is enabled to start with!
YOUR HELP IS MUCH APPRECIATED!
DO NOT use transaction
; try this...
This is your original base search:
index="logs" process=beamCommonProcess "Transitioned to Error State"
OR "SET_RANGE activity requested for beam supply point"
OR "DISABLE_BEAM activity is complete"
OR "ENABLE_BEAM activity requested"
OR "beam allocated to"
OR "beam deallocated from"
The following creates a field called state_change
for each event which has a value of ENABLED
, DISABLED
or null()
(which is no value).
| eval state_change=case(searchmatch("ENABLE_BEAM activity requested"), "ENABLED",
searchmatch("DISABLE_BEAM activity is complete"), "DISABLED",
true(), null())
| rename comment AS "transaction startswith=\"beam allocated to\" endswith=\"beam deallocated from\" maxspan=2h"
Now we need to keep all the "SET_RANGE..." events which have an "ENABLED" event before the next "DISABLED" event, or to put it another way, we need to drop all the "SET_RANGE..." events which have a "DISABLED" event before the next "ENABLED" event.
| reverse
| streamstats count(eval(searchmatch(SET_RANGE activity requested for beam supply point)) AS SessionID last(state_change) AS state
| search state="ENABLED"
Now for each "SET_RANGE..." event that remains, find the earliest "Error State" event as well as the number of "Error State" events and keep only those "SET_RANGE..." events that are thus associated with an "Error State" event.
| stats min(_time) AS _time count(eval(searchmatch("NECU Transitioned to Error State"))) AS error_count list(_raw) AS _raw BY sessionID
| search error_count>0
At this point you will probably need to do something like this to reconstitute the raw events and then do the necessary field extraction using rex
, and the group them again to get your final details.
| mvexpand(_raw)
| rex "YourRegexForFieldsRequiredHere"
| stats values(*) AS * BY sessionID
DO NOT use transaction
; try this...
This is your original base search:
index="logs" process=beamCommonProcess "Transitioned to Error State"
OR "SET_RANGE activity requested for beam supply point"
OR "DISABLE_BEAM activity is complete"
OR "ENABLE_BEAM activity requested"
OR "beam allocated to"
OR "beam deallocated from"
The following creates a field called state_change
for each event which has a value of ENABLED
, DISABLED
or null()
(which is no value).
| eval state_change=case(searchmatch("ENABLE_BEAM activity requested"), "ENABLED",
searchmatch("DISABLE_BEAM activity is complete"), "DISABLED",
true(), null())
| rename comment AS "transaction startswith=\"beam allocated to\" endswith=\"beam deallocated from\" maxspan=2h"
Now we need to keep all the "SET_RANGE..." events which have an "ENABLED" event before the next "DISABLED" event, or to put it another way, we need to drop all the "SET_RANGE..." events which have a "DISABLED" event before the next "ENABLED" event.
| reverse
| streamstats count(eval(searchmatch(SET_RANGE activity requested for beam supply point)) AS SessionID last(state_change) AS state
| search state="ENABLED"
Now for each "SET_RANGE..." event that remains, find the earliest "Error State" event as well as the number of "Error State" events and keep only those "SET_RANGE..." events that are thus associated with an "Error State" event.
| stats min(_time) AS _time count(eval(searchmatch("NECU Transitioned to Error State"))) AS error_count list(_raw) AS _raw BY sessionID
| search error_count>0
At this point you will probably need to do something like this to reconstitute the raw events and then do the necessary field extraction using rex
, and the group them again to get your final details.
| mvexpand(_raw)
| rex "YourRegexForFieldsRequiredHere"
| stats values(*) AS * BY sessionID
Woodcock:
Thank you so much for your help! I think you have things going in the right direction.
I made a slight adjustment to your suggested search in order to prevent errors from happening:
index="logs" process=beamCommonProcess
"Transitioned to Error State"
OR "SET_RANGE activity requested for beam supply point"
OR "DISABLE_BEAM activity is complete"
OR "ENABLE_BEAM activity requested"
OR "beam allocated to"
OR "beam deallocated from"
| eval state_change=case(searchmatch("ENABLE_BEAM activity requested"), "ENABLED",
searchmatch("DISABLE_BEAM activity is complete"), "DISABLED",
true(), null())
| rename comment AS "transaction startswith=\"beam allocated to\" endswith=\"beam deallocated from\" maxspan=2h"
| reverse
| streamstats count(eval(searchmatch("SET_RANGE activity requested for beam supply point"))) AS SessionID last(state_change) AS state
| search state="ENABLED"
| stats min(_time) AS _time count(eval(searchmatch("NECU Transitioned to Error State"))) AS error_count list(_raw) AS _raw BY sessionID
| search error_count>0
The search says it found 10,826 events but the Statistics tab says "No results found." Any ideas? Do you mind providing me a brief explanation of what each one of your lines is doing, please?
I have re-edited my search to explain the parts.