Splunk IT Service Intelligence

Status over Time Multi-Value Alert Not working as expected

EricLloyd79
Builder

Perhaps I am just misunderstanding the concept behind Status over Time but I set a KPI to trigger if it is at Critical 90% of the time for the last 24 hours and when I open it to set that 90% (see screenshot) it shows that in the last 24 hours it was WELL below 90% but this notable event triggers over and over again non-stop almost.
Am I misunderstanding the concept behind an alert that triggers if the KPI is at Critical 90% of the last 60 mins?
alt text

0 Karma
1 Solution

MVREID
Path Finder

I am assuming you want to have the Notable Event when the condition has been Critical for ~21.5 hours of a 24 hour period. From your description that seems to be correct.

The real trouble with the multi-kpi editor seems to be related to a bug when it creates the actual search syntax. To see what I mean, click on the search you have created under the 'Correlation Searches' menu item under 'Configure'
Under "Actions' click Edit 'by Correlation Search Editor'.

Open that search by clicking 'Run Search' under the "Search" box.

Notice the line that says

| stats count as occurances latest(*) as * by alert_severity itsi_kpi_id itsi_service_id

occurances is spelled with an "a", but later a macro is called with a parameter of occurrence

| getPercentage(alert_period, occurrence)

That value doesn't exist.

That macro expands and is indeed looking for a value that would have been provided by the occurances variable created in the earlier stats statement, and additionally is core to the whole point of your intended multi-kpi search.

I have gotten around this by editing the search syntax and adding the statement

|eval occurrence=occurances

right after the stats statement that created the occurances variable.

Lets this run and I believe this will resolve your constant alerting condition.

Hope this was clear, feel free to ask for clarification.

View solution in original post

0 Karma

esnyder_splunk
Splunk Employee
Splunk Employee

Hi guys, thanks a lot for finding this. I've filed a bug and it should be fixed in the next ITSI release.

0 Karma

MVREID
Path Finder

I am assuming you want to have the Notable Event when the condition has been Critical for ~21.5 hours of a 24 hour period. From your description that seems to be correct.

The real trouble with the multi-kpi editor seems to be related to a bug when it creates the actual search syntax. To see what I mean, click on the search you have created under the 'Correlation Searches' menu item under 'Configure'
Under "Actions' click Edit 'by Correlation Search Editor'.

Open that search by clicking 'Run Search' under the "Search" box.

Notice the line that says

| stats count as occurances latest(*) as * by alert_severity itsi_kpi_id itsi_service_id

occurances is spelled with an "a", but later a macro is called with a parameter of occurrence

| getPercentage(alert_period, occurrence)

That value doesn't exist.

That macro expands and is indeed looking for a value that would have been provided by the occurances variable created in the earlier stats statement, and additionally is core to the whole point of your intended multi-kpi search.

I have gotten around this by editing the search syntax and adding the statement

|eval occurrence=occurances

right after the stats statement that created the occurances variable.

Lets this run and I believe this will resolve your constant alerting condition.

Hope this was clear, feel free to ask for clarification.

0 Karma
Get Updates on the Splunk Community!

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...

Welcome to the Splunk Community!

(view in My Videos) We're so glad you're here! The Splunk Community is place to connect, learn, give back, and ...