Alerting

Button to pause then resume email alerting

proylea
Contributor

Hi Everyone
I have been asked to look into the possibility of having a button on the dashboard that allows the user to pause Splunk's email alerting during an outage, scheduled downtime, etc.

I am thinking along the lines of a dashboard button that fires a script, which would use sed to edit the specified savedsearches.conf file and change the email alert field to 0, then change it back to 1 when the button is pressed again.

I would like to know if anyone has done this and what the other options might be.

Kind Regards
Peter

proylea
Contributor

This is what I ended up doing.

Configure a dashboard panel with a time dropdown and a status reading

The panel code looks like this

<panel>
  <input type="dropdown" token="value1" searchWhenChanged="true">
    <label>Pause Email Alerting</label>
    <choice value="0">Resume</choice>
    <choice value="900">15 min</choice>
    <choice value="1800">30 min</choice>
    <choice value="3600">1 hour</choice>
    <choice value="7200">2 hours</choice>
    <choice value="14400">4 hours</choice>
    <choice value="28800">8 hours</choice>
    <choice value="57600">16 hours</choice>
    <choice value="86400">24 hours</choice>
    <choice value="172800">48 hours</choice>
    <default>0</default>
    <initialValue>0</initialValue>
  </input>
  <single>
    <title>Email Alerting Status</title>
    <search>
      <query>| inputlookup alert-pause-status.csv</query>
      <earliest>-1s</earliest>
      <latest>now</latest>
    </search>
    <option name="height">10</option>
    <option name="link.visible">false</option>
    <option name="refresh.time.visible">false</option>
    <option name="refresh.auto.interval">5</option>
  </single>
</panel>
<panel depends="$nothing1$">
  <single>  
    <search>
      <query>| alertemailoff $value1$</query>
      <earliest>-1s</earliest>
      <latest>now</latest>
    </search>
    <option name="height">10</option>
    <option name="link.visible">false</option>
    <option name="refresh.time.visible">false</option>
    <option name="refresh.auto.interval">15</option>
  </single>
</panel>
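
The choice values in the dropdown are pause durations in seconds, passed to the search command as $value1$. As a cross-check, here is a minimal Python sketch that derives the seconds from each label. The helper name `label_to_seconds` is hypothetical and not part of Splunk.

```python
import re

# Multipliers for the units used in the dropdown labels.
UNIT_SECONDS = {"min": 60, "hour": 3600, "hours": 3600}

def label_to_seconds(label: str) -> int:
    """Convert a label such as '15 min' or '2 hours' to seconds; 'Resume' maps to 0."""
    text = label.strip()
    if text.lower() == "resume":
        return 0
    match = re.fullmatch(r"(\d+)\s+(min|hours?)", text)
    if match is None:
        raise ValueError(f"unrecognised label: {label!r}")
    return int(match.group(1)) * UNIT_SECONDS[match.group(2)]

LABELS = ["Resume", "15 min", "30 min", "1 hour", "2 hours", "4 hours",
          "8 hours", "16 hours", "24 hours", "48 hours"]

if __name__ == "__main__":
    # Print one <choice> element per label with the computed value.
    for label in LABELS:
        print(f'<choice value="{label_to_seconds(label)}">{label}</choice>')
```

Generating the values this way keeps each label and its duration in step when new choices are added.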

Then I configure commands.conf in the app's local directory to point at a Perl script:

 [alertemailoff]
 filename = pause-splunk-alerting.pl
 type = perl

Create a file called alert-pause-status.csv in the app's lookups directory:

status
active

Then create a Perl script that edits the savedsearches.conf file and reloads the configuration.
Please forgive my very crude Perl script; if anyone wants to rewrite it and post the result, be my guest.

#!/usr/bin/perl
$file = "/opt/splunk/etc/apps/eproduct/local/savedsearches.conf";
$status = "/opt/splunk/etc/apps/eproduct/lookups/alert-pause-status.csv";
$statustas = "/opt/splunk/etc/apps/tasman/lookups/alert-pause-status.csv";
$pausefor = $ARGV[0];
$check = $$;
$pid = "/opt/splunk/etc/apps/eproduct/bin/pause-splunk-alerting.pid";

use POSIX qw( strftime );
my $finish = strftime("%a %H:%M", localtime(time + $pausefor));

if($pausefor!=0){
    if(`ps -ef | egrep -v "grep|$check" | grep pause-splunk-alerting.pl`){
        print "Process Exists Exiting\n";
        exit;
    }else{
        open(PID, "> $pid") || die "could not open '$pid'  $!";
        print PID "$$\n";
        close(PID);
        print "Starting Process\n";
    }
}else{
    open (IN, $pid) || die "Cannot open file ".$pid." for read";
    $killpid = <IN>;
    close IN;
    `kill $killpid`;
    print "Resume Alerting\n";
    `rm $pid`;
}


open (IN, $file) || die "Cannot open file ".$file." for read";
@lines=<IN>;
close IN;

open (OUT, ">", $file) || die "Cannot open file ".$file." for write";
foreach $line (@lines)
{
   $line =~ s/action.email = 1/action.email = false/ig;
   print OUT $line;
}
close OUT;

open (IN, $status) || die "Cannot open file ".$status." for read";
@lines=<IN>;
close IN;

open (OUT, ">", $status) || die "Cannot open file ".$status." for write";
foreach $line (@lines)
{
   $line =~ s/active/paused til $finish/ig;
   print OUT $line;
}
close OUT;

open (IN, $statustas) || die "Cannot open file ".$statustas." for read";
@lines=<IN>;
close IN;

open (OUT, ">", $statustas) || die "Cannot open file ".$statustas." for write";
foreach $line (@lines)
{
   $line =~ s/active/paused til $finish/ig;
   print OUT $line;
}
close OUT;

`/opt/splunk/bin/splunk _internal call /services/saved/searches/_reload -auth <splunk admin>:<password>`;

sleep($pausefor);

open (IN, $file) || die "Cannot open file ".$file." for read";
@lines=<IN>;
close IN;

open (OUT, ">", $file) || die "Cannot open file ".$file." for write";
foreach $line (@lines)
{
   $line =~ s/action.email = false/action.email = 1/ig;
   print OUT $line;
}
close OUT;

open (IN, $status) || die "Cannot open file ".$status." for read";
@lines=<IN>;
close IN;

open (OUT, ">", $status) || die "Cannot open file ".$status." for write";
foreach $line (@lines)
{
   $line =~ s/paused.*$/active/ig;
   print OUT $line;
}
close OUT;

open (IN, $statustas) || die "Cannot open file ".$statustas." for read";
@lines=<IN>;
close IN;

open (OUT, ">", $statustas) || die "Cannot open file ".$statustas." for write";
foreach $line (@lines)
{
   $line =~ s/paused.*$/active/ig;
   print OUT $line;
}
close OUT;

`/opt/splunk/bin/splunk _internal call /services/saved/searches/_reload -auth <splunk admin>:<password>`;

`rm $pid`;

Now when you select a time from the dropdown, it runs the Perl script, which changes "action.email = 1" to "action.email = false" in savedsearches.conf, changes the status file from "active" to "paused til ...", and reloads the config. It then sleeps for the selected time and puts everything back again.
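
For anyone who prefers Python, the edit-and-restore step described above can be sketched as a pair of text transformations. This is an illustration only, not the script from this post. The function names are hypothetical, and it works on strings so that file handling and the config reload stay out of the picture.

```python
import re

def set_email_action(conf_text: str, paused: bool) -> str:
    """Flip 'action.email = 1' to 'action.email = false' (or back) in conf text."""
    old, new = ("1", "false") if paused else ("false", "1")
    # Anchor to whole lines and escape the dot so that keys such as
    # 'action.email.to' are left untouched.
    pattern = re.compile(
        rf"^([ \t]*action\.email[ \t]*=[ \t]*){old}[ \t]*$", re.MULTILINE
    )
    return pattern.sub(rf"\g<1>{new}", conf_text)

def pause_status(status_text: str, finish: str) -> str:
    """Replace 'active' with 'paused til <finish>' in the status lookup."""
    return status_text.replace("active", f"paused til {finish}")

def resume_status(status_text: str) -> str:
    """Replace any 'paused ...' value with 'active' again."""
    return re.sub(r"paused.*$", "active", status_text, flags=re.MULTILINE)

if __name__ == "__main__":
    conf = "[My Alert]\naction.email = 1\naction.email.to = ops@example.com\n"
    paused = set_email_action(conf, paused=True)
    print(paused)
    print(set_email_action(paused, paused=False))
```

One caveat of this approach, in Perl or Python: on resume, every "action.email = false" is set back to 1, including alerts whose email action was already disabled before the pause. Recording which stanzas were changed would avoid that.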

You can also select Resume from the dropdown to put it all back and resume email alerts.

Crude but effective; any improvements are welcome.

Cheers

MuS
Legend

Nice !

I removed your admin password from the script 😉

proylea
Contributor

Yeah, those credentials would be hit and miss for sure, lol!

MuS
Legend

Hi proylea,

You can run a REST POST against this endpoint:

/servicesNS/nobody/search/alerts/alert_actions/email

and change it to some nonexistent value, as @woodcock mentioned. A working example would be:

curl -k -u admin:changeme https://localhost:8089/servicesNS/nobody/search/alerts/alert_actions/email -d mailserver=foobar

This can also be used in a custom search command or an external script; as always, it depends on your use case 😉
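
As a rough illustration of the external-script route, here is a minimal Python sketch that builds the same POST as the curl example using only the standard library. The function names are hypothetical, and the host, credentials and `foobar` value are the placeholders from the curl line. The `send` helper skips certificate verification, mirroring `curl -k`, which is an assumption suited to a test instance only.

```python
import base64
import ssl
import urllib.parse
import urllib.request

ENDPOINT = "/servicesNS/nobody/search/alerts/alert_actions/email"

def build_mailserver_request(base_url, user, password, mailserver):
    """Build (but do not send) a POST that sets the email action's mailserver."""
    body = urllib.parse.urlencode({"mailserver": mailserver}).encode("ascii")
    token = base64.b64encode(f"{user}:{password}".encode("utf-8")).decode("ascii")
    req = urllib.request.Request(base_url + ENDPOINT, data=body, method="POST")
    req.add_header("Authorization", f"Basic {token}")
    return req

def send(req):
    """Send the request without certificate verification (like curl -k)."""
    ctx = ssl.create_default_context()
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    with urllib.request.urlopen(req, context=ctx) as resp:
        return resp.status

if __name__ == "__main__":
    request = build_mailserver_request(
        "https://localhost:8089", "admin", "changeme", "foobar"
    )
    print(request.get_method(), request.full_url, request.data)
    # send(request)  # uncomment to actually change the setting
```

Posting the real mail server name back to the same endpoint would be the way to resume alerting under this approach.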

Hope this helps ...

cheers, MuS

proylea
Contributor

Yes, that would be OK, but I don't want to disable all email alerts, only the specific ones related to the apps.
I have a working solution now; it's crude, but it works.

MuS
Legend

Most of the time, I create a lookup table which holds server names and alert_status like:

server,alert_status
foo,active
boo,down

and use an automatic lookup. All the alerts are saved with the filter alert_status="active" in the base search. This way you can put only some servers into maintenance mode while the others stay active.
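
To illustrate the idea outside of Splunk, here is a minimal Python sketch of the same filtering logic. The function names and the sample events are hypothetical. In Splunk itself this is done by the automatic lookup plus the alert_status="active" filter in the search, not by code like this.

```python
import csv
import io

LOOKUP_CSV = """server,alert_status
foo,active
boo,down
"""

def load_status(csv_text):
    """Read the lookup into a dict of server -> alert_status."""
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    return {row["server"]: row["alert_status"] for row in reader}

def alertable(events, status_by_server):
    """Keep only events whose server is marked 'active' in the lookup.

    Servers missing from the lookup are dropped too, because they have no
    alert_status value to match against.
    """
    return [e for e in events if status_by_server.get(e["server"]) == "active"]

if __name__ == "__main__":
    events = [
        {"server": "foo", "message": "disk full"},
        {"server": "boo", "message": "disk full"},
    ]
    for event in alertable(events, load_status(LOOKUP_CSV)):
        print(event)
```

Changing a server's row from active to down is then the only step needed to silence its alerts, and no configuration reload is involved.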

cheers, MuS

proylea
Contributor

Yeah, I use similar lookups for alert acknowledge-and-clear functionality.

woodcock
Esteemed Legend

The easiest way is to misconfigure (break) the System Settings -> Email Alert Settings -> Mail Server Settings. For example, change Mail Host to something like ThisHostIsDeliberatelyBrokenUntilOutageIsFixed-ChangeItBackToASAP.

skoelpin
SplunkTrust

Cool idea!

proylea
Contributor

Actually, most monitoring and alerting systems have this as standard functionality; perhaps Splunk could consider adding it.
