Suppress well-known events

kochera
Communicator

Hi,

We have the following scenario.

A log message indicates that a CPU fan has failed:

Mar 17 11:00:21 h045ap 2011-03-17 rmclomv <kern.err> [ID 431010 kern.error] CPU_FAN @ MB.P0.F0.RS has FAILED.

The event pops up in the dashboard. The system administrator opens a case with our HW supplier, and the replacement is scheduled for the next day. Meanwhile, our system monitoring script reports the fault every half hour. How do we suppress this event within Splunk until the CPU fan has been replaced?

cheers, Andy

David
Splunk Employee

One method that could work is a lookup that records whether an alert should be ignored. It would likely take a small amount of scripting to manage the alert state. You might use one search command to toggle an event's "triaged" flag (to steal ftk's phrasing) and then have two searches on your dashboard: one that uses the lookup to show untriaged events and drives the alert, and another that shows triaged events, so that you don't lose visibility of the events you're ignoring.

I'm thinking something along these lines: http://answers.splunk.com/questions/3982/correlate-and-tag-splunk-events-with-change-control-tickets (also, conveniently, from ftk).
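As a rough sketch of the untriaged search, assuming a lookup definition named triaged_events (hypothetical) with the columns host, location and triaged, and assuming the syslog sourcetype used elsewhere in this thread, the dashboard search might look something like this. The rex pattern and the field names component and location are illustrative and not taken from any existing configuration:

```
sourcetype=syslog "has FAILED"
| rex "(?<component>\S+) @ (?<location>\S+) has FAILED"
| lookup triaged_events host location OUTPUT triaged
| where isnull(triaged)
```

Events whose host and location have no row in the lookup come back with an empty triaged field and stay visible. Changing the last line to where isnotnull(triaged) should give the second panel, the one that lists events currently being ignored.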

ftk
Motivator

Lookups are a good call on this.
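Toggling the triaged state could be done from the search bar instead of an external script. This is only a sketch: the file name triaged_events.csv and its columns (host, location, triaged) are hypothetical, and makeresults is only available on newer Splunk versions. Run the first search to mark the fan as triaged, and the second to clear it once the fan has been replaced:

```
| makeresults
| eval host="h045ap", location="MB.P0.F0.RS", triaged="yes"
| table host location triaged
| outputlookup append=true triaged_events.csv

| inputlookup triaged_events.csv
| where NOT (host="h045ap" AND location="MB.P0.F0.RS")
| outputlookup triaged_events.csv
```

The second search rewrites the whole file without the matching row, so it is worth checking the output of the first two lines before adding the outputlookup.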


nocostk
Communicator

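You could redirect those events to the bit bucket (the nullQueue). Note that this discards them at index time, so they are never indexed and cannot be searched later: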

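props.conf
[syslog]
TRANSFORMS-removefan = cleanfanalerts

transforms.conf
[cleanfanalerts]
# Dots are escaped so they match literally. The message ends in "FAILED."
# so the trailing period is allowed before the end of the line.
REGEX = (?m).+CPU_FAN\s+@\s+MB\.P0\.F0\.RS\s+has\s+FAILED\.?$
# Route matching events to the nullQueue, which drops them before indexing.
DEST_KEY = queue
FORMAT = nullQueue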

netwrkr
Communicator

It appears that in 4.2 you can "throttle" alerts.

More info here: http://www.splunk.com/base/Documentation/latest/User/Alertusecases
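For illustration, the throttle settings of a saved search might look roughly like the fragment below in savedsearches.conf. The stanza name and search string are made up for this example, the scheduling and alert action settings are left out, and the available alert.suppress options should be checked against the savedsearches.conf spec for your version:

```
# Hypothetical saved search; only the throttling settings are shown.
[CPU fan failure]
search = sourcetype=syslog "CPU_FAN" "has FAILED"
# Turn throttling on for this alert.
alert.suppress = 1
# Stay quiet for 24 hours after the alert fires.
alert.suppress.period = 24h
# Throttle per host, so a failure on another host still alerts.
alert.suppress.fields = host
```

Keep in mind that throttling only suppresses the alert actions. The events are still indexed and will still show up in a dashboard search unless that search filters them out.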


ftk
Motivator

You could create an event type based on the message and include a filter on the event type in your dashboard search. When creating an event type through Splunk Web, be sure to adjust the permissions to make it visible to the rest of your team.

For example, create an event type "triaged_error" for the fan failure on that particular host, and add NOT eventtype="triaged_error" to your dashboard search to hide this event type. After the fan is replaced, it is likely best to remove or disable the event type so that you don't accidentally filter out any untriaged errors in the future.

For example, the event type could look like this:

sourcetype="my_sourcetype" h045ap rmclomv "CPU_FAN @ MB.P0.F0.RS has FAILED"
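If you would rather define the event type in a config file than through the UI, the equivalent eventtypes.conf stanza would presumably look something like this. The stanza name follows the "triaged_error" example, and "my_sourcetype" is a placeholder for your actual sourcetype:

```
# eventtypes.conf
# Matches the fan failure on host h045ap only.
[triaged_error]
search = sourcetype="my_sourcetype" h045ap rmclomv "CPU_FAN @ MB.P0.F0.RS has FAILED"
```

A dashboard search such as sourcetype="my_sourcetype" "has FAILED" NOT eventtype="triaged_error" would then hide the fan event while still showing other failures.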

ftk
Motivator

I'd probably go with lookups then as David suggests.


kochera
Communicator

Hi, I'm not sure this is going to scale in a large monitoring environment. Basically, we would like to be able to suppress any kind of event from a context-sensitive menu.

andy
