Alerting

Create alert when volume of logs increases significantly from a particular host

mbond_illumina
Explorer

I have a problem with a server that keeps violating the Splunk indexing volume for the day as the volume of its logs increases hugely. I'd like to set an alert so that when a threshold is reached it sends out a notification, so I can investigate.


jtrucks
Splunk Employee

You could use one of two approaches: volume of data, or number of log entries.

Volume of data the empirical way:

earliest=-0d@d host=www* | eval b=len(_raw) | eval MB=b/1024/1024 | stats sum(MB)

Then alert based on a specific threshold. Running this every few minutes should be fast enough to alert on, though it will take longer to count as the day progresses.

Number of log entries:

earliest=-0d@d host=www* | stats count

Then alert on a specific threshold.

If you want to alert based on rate of change, you get to use the above searches with a timechart instead of stats and do some type of comparison. This gets into funky nested searches and comparisons, possibly even summary indexes to compare previous data to current data etc. That may be overkill for what you want to do, however.
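For reference, a rough sketch of the timechart variant. This is an illustrative example, not a tested search: it buckets today's volume by hour and uses `delta` to compute the change between consecutive buckets, so you could alert when the hourly change exceeds some value.

```
earliest=-0d@d host=www*
| eval MB=len(_raw)/1024/1024
| timechart span=1h sum(MB) as hourlyMB
| delta hourlyMB as changeMB
```

You could then add something like `| where changeMB > 50` and alert on result count.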

--
Jesse Trucks
Minister of Magic

jtrucks
Splunk Employee

Also, please mark Answered if this works for you 🙂

--
Jesse Trucks
Minister of Magic

jtrucks
Splunk Employee

Do a saved search like this for better output:

host=www* | eval b=len(_raw) | eval subMB=b/1024/1024 | stats sum(subMB) as MB

This way MB is the sum total, not the individual event data. Then set Time Range to:

Start time: -0d@d Finish time: now

Then Alert Condition custom:

where MB > 100

I had it send the results via email, inline, and the result is a table that just has:

   MB
187.343

Set the frequency to anything you want, like every 5 minutes or whatever works for you. If it takes more than 60 seconds to complete the search near the end of the day, don't do it every minute.

--
Jesse Trucks
Minister of Magic

mbond_illumina
Explorer

Thanks to you both for your answers. That works great. However, when I create my alert based on the search, what do I choose for the 'Trigger If' setting? I presume I need to enter a custom condition. Do I enter 'MB > 100'? If I chose 'Number of Results', wouldn't that just be the number of matched events rather than the sum(MB)?


kristian_kolb
Ultra Champion

Well, throttling the forwarder will reduce the load on the indexer and conserve license space, but it will fill up queues on the forwarder... not really any better. Potential loss of events.


jtrucks
Splunk Employee

Another thing to consider is throttling that machine's logging through syslog or the indexer or a forwarder in the middle.

--
Jesse Trucks
Minister of Magic

kristian_kolb
Ultra Champion

or if you have a laaarge amount of data and don't want to search it unnecessarily, you could probably combine inputlookup/outputlookup with a metadata search in between to append to the lookup...

Or you can install the Splunk Deployment Monitor app.

sdaniels
Splunk Employee

You can use a search such as this to look at volume for a particular host and then create an alert off the search based on a threshold. For example, this looks at the volume of data in MB from "yourhostname" over the last 2 hours.

index=_internal source=*metrics.log series="<yourhostname>" earliest=-2h | eval MB=kb/1024 | chart sum(MB) by series | sort -sum(MB)

If you want something a bit more complex you could look at using a standard deviation calculation similar to this:

http://splunk-base.splunk.com/answers/47920/comparing-standard-deviations
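As a rough illustration of that standard-deviation idea (an untested sketch, with illustrative field names): bucket the host's volume by day over the past week, compute the mean and standard deviation with `eventstats`, and keep only days that exceed the mean by more than two standard deviations.

```
earliest=-7d@d host=www*
| eval MB=len(_raw)/1024/1024
| timechart span=1d sum(MB) as dailyMB
| eventstats avg(dailyMB) as avgMB, stdev(dailyMB) as sdMB
| where dailyMB > avgMB + 2*sdMB
```

An alert set to trigger when this returns any results would fire only on statistically unusual days, rather than on a fixed threshold.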


jtrucks
Splunk Employee

The second option in my answer earlier shows how to count events, as well.

--
Jesse Trucks
Minister of Magic

sdaniels
Splunk Employee

You can do a search on the number of events (not log files), but since individual events come in from those logs, it's essentially the same thing. Here is an example search that gives me the count of events on one host in the last 2 hours.

sourcetype="access_combined" host="apache-1.splunk.com" earliest=-2h | stats count

mbond_illumina
Explorer

Thanks for your reply. I was really looking at counting the number of logs as opposed to the amount in MB. Is it possible to do that? I did try the example you gave, but it didn't give me a value for the MB for some reason.
