I'm trying to set up an alert for this use case:
When the request time taken for an API is above an X-second threshold for Y consecutive requests on a GET/POST/PUT request, send an alert.
The challenges I'm facing come from having multiple APIs, multiple HTTP methods, multiple time thresholds, and multiple consecutive-request thresholds. The thresholds are declared in a .csv file that anyone can easily update and then upload as a lookup table.
| api | GET_time_threshold_s | GET_count_consecutive_overtime_threshold | POST_time_threshold_s | POST_count_consecutive_overtime_threshold | PUT_time_threshold_s | PUT_count_consecutive_overtime_threshold |
| --- | --- | --- | --- | --- | --- | --- |
| OrdersApi | 0.5 | 7 | 0.8 | 5 | 1.5 | 3 |
So far I've come up with a solution that works only for a single API, but I'm unsure what the best solution with the least possible maintenance would be. I don't know how to pass a lookup table field to the `window` argument of the `streamstats` command, so I created a separate query that generates the search command.
Generate search query
| inputlookup api_lookup_with_thresholds.csv
| where api="OrdersApi"
| eval query="sourcetype=IIS host=\"Prod*\" api=\"OrdersApi\"
| eval time_taken_s = round(time_taken/1000, 3)
| lookup api_lookup_with_thresholds.csv api
| eval is_GET_time_over_threshold=if(cs_method=\"GET\" AND time_taken_s >= GET_time_threshold_s, 1, 0),
is_POST_time_over_threshold=if(cs_method=\"POST\" AND time_taken_s >= POST_time_threshold_s, 1, 0),
is_PUT_time_over_threshold=if(cs_method=\"PUT\" AND time_taken_s >= PUT_time_threshold_s, 1, 0)
| sort +_time
| streamstats window=" + GET_count_consecutive_overtime_threshold + " global=false sum(is_GET_time_over_threshold) as rolling_over_GET_threshold by api, cs_method
| streamstats window=" + POST_count_consecutive_overtime_threshold + " global=false sum(is_POST_time_over_threshold) as rolling_over_POST_threshold by api, cs_method
| streamstats window=" + PUT_count_consecutive_overtime_threshold + " global=false sum(is_PUT_time_over_threshold) as rolling_over_PUT_threshold by api, cs_method | table _time, api, cs_method, time_taken_s, rolling_over_GET_threshold, rolling_over_POST_threshold, rolling_over_PUT_threshold, is_GET_time_over_threshold, is_POST_time_over_threshold, is_PUT_time_over_threshold" | return $query
The result is a query like the one below, which targets only OrdersApi.
Monitor search query
sourcetype=IIS host="Prod*" api="OrdersApi"
| eval time_taken_s = round(time_taken/1000, 3)
| lookup api_lookup_with_thresholds.csv api
| eval is_GET_time_over_threshold=if(cs_method="GET" AND time_taken_s >= GET_time_threshold_s, 1, 0),
is_POST_time_over_threshold=if(cs_method="POST" AND time_taken_s >= POST_time_threshold_s, 1, 0),
is_PUT_time_over_threshold=if(cs_method="PUT" AND time_taken_s >= PUT_time_threshold_s, 1, 0)
| sort +_time
| streamstats window=7 global=false sum(is_GET_time_over_threshold) as rolling_over_GET_threshold by api, cs_method
| streamstats window=5 global=false sum(is_POST_time_over_threshold) as rolling_over_POST_threshold by api, cs_method
| streamstats window=3 global=false sum(is_PUT_time_over_threshold) as rolling_over_PUT_threshold by api, cs_method
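For reference, each `streamstats window=N sum(...)` above is just a sliding-window count of over-threshold requests per API and method; when that count equals the window size, all of the last N requests were over the threshold. A minimal Python sketch of that logic (the durations and thresholds here are hypothetical, not taken from the lookup):

```python
from collections import deque

def rolling_over_threshold(times_s, time_threshold_s, window):
    """Sliding-window count of requests over the time threshold,
    mirroring: streamstats window=N sum(is_over_threshold)."""
    recent = deque(maxlen=window)  # keeps only the last `window` 0/1 flags
    counts = []
    for t in times_s:
        recent.append(1 if t >= time_threshold_s else 0)
        counts.append(sum(recent))
    return counts

# e.g. PUT thresholds for OrdersApi: 1.5 s over 3 consecutive requests
counts = rolling_over_threshold([0.4, 1.6, 1.7, 1.8, 0.2], 1.5, 3)
# counts == [0, 1, 2, 3, 2]; the alert condition is counts[i] == 3,
# i.e. the rolling sum reaching the window size
alerts = [c == 3 for c in counts]
```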
Is there a way to execute the generated search command within another search? Is there a better way to solve this use case while keeping maintenance as low as possible? Should I consider using the API to generate all the searches automatically?
I'm trying to find a solution where uploading a new .csv file doesn't require updating all the search queries.
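On the "generate all the searches automatically" idea: one possible approach (a sketch only, using the column names from the lookup above; the template shows just the GET branch, and pushing each rendered query to a saved search via Splunk's REST interface is left out) is to read the CSV and render one monitor query per API row:

```python
import csv
import io

# Skeleton of the per-API monitor query; only the GET branch is shown,
# POST/PUT would be added analogously.
QUERY_TEMPLATE = """sourcetype=IIS host="Prod*" api="{api}"
| eval time_taken_s = round(time_taken/1000, 3)
| lookup api_lookup_with_thresholds.csv api
| eval is_GET_time_over_threshold=if(cs_method="GET" AND time_taken_s >= GET_time_threshold_s, 1, 0)
| sort +_time
| streamstats window={get_window} global=false sum(is_GET_time_over_threshold) as rolling_over_GET_threshold by api, cs_method"""

def render_queries(csv_text):
    """Render one monitor query per API row of the thresholds CSV."""
    queries = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        queries[row["api"]] = QUERY_TEMPLATE.format(
            api=row["api"],
            get_window=row["GET_count_consecutive_overtime_threshold"],
        )
    return queries

csv_text = (
    "api,GET_time_threshold_s,GET_count_consecutive_overtime_threshold\n"
    "OrdersApi,0.5,7\n"
)
queries = render_queries(csv_text)
# queries["OrdersApi"] now contains window=7 baked in; each rendered
# query could then be created/updated as a saved search programmatically
```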
As an alternative solution, I was thinking of saving the search above as a saved search with `api`, `get_window`, `post_window`, and `put_window` parameters and calling it from another search, one per API, but I couldn't read the values from the lookup table and pass them to the saved search.