Hello everyone.
Is there a way (using Splunk 6.0) of configuring an alert that would send in real time or almost real time (I could "live" with a 20 mins. delay) an email if Splunk detects an event with a value that hadn't been seen before?
Basically, we have a set of hardware devices sending data to our servers. Those hardware devices are identified by their MAC address. Every time one of those devices sends data, a log message like the following is recorded:
2015-02-19_18:20:14 host=server01 LVL=I Device with mac=11:22:33:44:55:66 sent us the following data=Loren Ipsum dolor amet
As mentioned, this happens every single time a device sends data.
What I'd like to know is whether there's a way of creating an alert so that when a newly installed device (one that had never sent us data before) starts sending data, Splunk sends an email saying something like "We got data from a new Device (with MAC address EE:DD:CC:BB:AA:99)".
I saw this other thread (http://answers.splunk.com/answers/54604/finding-events-that-have-never-happened-before.html) that mentions a new feature that hadn't been released at the time of the thread. Maybe there's something new since the thread was published? The thread also mentions the Anomalousvalue (http://docs.splunk.com/Documentation/Splunk/6.2.1/SearchReference/Anomalousvalue) and Rare (http://docs.splunk.com/Documentation/Splunk/6.2.1/SearchReference/Rare) commands, but I haven't been able to figure out what to write in the Search bar to get it working. Maybe it's because at the time I was testing my searches there were already so many events that they weren't rare anymore?
I created a Real Time alert with the following search:
host=server* LVL=I "sent us the following data" rt=-61m
| stats min(_time) as first_time by mac
| where first_time > relative_time(now(), "-60m")
| eval first_time=strftime(first_time, "%Y/%m/%d %H:%M:%S")
| table mac, first_time
The idea was to do something like: "Fill up a window containing the events that occurred in the last 61 minutes and, if there's a match that happened in the last 60 minutes, send an alert." It didn't work. I'm aware that even if it had worked the way I thought it would, I would have received a lot of alerts in my inbox, but I didn't get any.
Am I on the right path? Is there a way of doing this?
Thank you in advance!
I'm going to borrow from my not-quite-correct answer here -- http://answers.splunk.com/answers/216358/how-to-create-an-hourly-alert-when-never-seen-befo.html#ans...
Lookup tables are a great way of doing this - much better than a realtime search.
We'll begin by running this search over "all time":
host=server* LVL=I "sent us the following data"
| stats min(_time) as first_time by mac
| outputlookup mac_tracker.csv
Now we have a starting point. We can schedule a search to run every 15 minutes or so, over a short recent time window, that does this:
host=server* LVL=I "sent us the following data"
| stats min(_time) as first_time by mac
| inputlookup append=t mac_tracker.csv
| stats min(first_time) as first_time by mac
| outputlookup mac_tracker.csv
Now your lookup table will get updated every 15 minutes with the first-seen time of any newly appearing MAC address. From here you can easily schedule a search against the lookup file, using something like:
| inputlookup mac_tracker.csv | where first_time >= now() - 1800
Set this up as an alert, and you have a simple way to alert on newly appearing MAC addresses essentially forever. Because we're using a lookup file to hold the state, even when indexed data ages off of Splunk you'll still have your long-collected list of MAC addresses.
Bonus points for using a KV-store lookup in Splunk 6.2 to make this perhaps even faster. 🙂
This is such a great question... not necessarily the topic (which is a good use case), but mainly in the sense that it shows BorrajaX has stated what is not known, what is known, what has been tried, and evidence of trying to find an answer on their own. Great job being thorough versus just half-assing a quick question, the kind that is so hard to answer on this site without requesting more info. Also, a well-explained answer by dwaddle.
Karma awarded to both BorrajaX and dwaddle for modeling what a good question-answer interaction should be on Splunk Answers.
Wow! Thanks a lot for your words. 🙂 It is also a great answer, indeed! Step by step, and explaining the whys and the hows.
I guess I was well trained by stackoverflow.
This is great. And much more efficient than I thought it would be.
Can you explain what your search does and how it actually checks for new "NEVER BEFORE SEEN" MAC addresses?
Well, it is less of a search for "never before seen" and more of an observation of "first time seen." I'm taking as a rule that "the first time a MAC has been seen will have the lowest value of _time".
The first search, run manually over all time, builds a great big flat file with the lowest value of _time for each MAC.
The second one, scheduled over a smaller window of time, looks for all MAC addresses seen in that smaller window and compares them to the big flat file. Any MAC address already in the flat file will be in there with a lower value of _time, and that lower value of _time will be kept. Any MAC that wasn't in the flat file at the start of this search will be added to it with the lowest value of _time seen during this one search. This second search runs repeatedly throughout the day, incrementally adding a few items at a time to the great big file we created in the first step.
Now, we can use that file to look up almost instantly the lowest _time for a given MAC address because we've already distilled it down to the simplest data points.
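To make the bookkeeping concrete, here is a rough Python sketch of the same "keep the lowest _time per MAC" idea, outside of Splunk. It is only an illustration of the logic, not what Splunk does internally: the function names merge_first_seen and newly_seen are made up for this example, the dict stands in for mac_tracker.csv, and timestamps are assumed to be epoch seconds.

```python
def merge_first_seen(tracker, window_events):
    """Merge one search window into the tracker, keeping the earliest time per MAC.

    tracker: dict of mac -> first_time (stand-in for mac_tracker.csv)
    window_events: iterable of (mac, event_time) pairs from the recent window
    """
    merged = dict(tracker)  # copy, so the previous state is left untouched
    for mac, event_time in window_events:
        # A MAC already in the tracker keeps its older (lower) time;
        # a MAC not yet in the tracker is added with its lowest time in this window.
        if mac not in merged or event_time < merged[mac]:
            merged[mac] = event_time
    return merged


def newly_seen(tracker, now, window_seconds=1800):
    """Return MACs whose first_time falls within the last window_seconds."""
    return sorted(mac for mac, first in tracker.items()
                  if first >= now - window_seconds)


tracker = {"11:22:33:44:55:66": 1000}
events = [
    ("11:22:33:44:55:66", 5000),   # known device, older time is kept
    ("EE:DD:CC:BB:AA:99", 5100),   # new device
    ("EE:DD:CC:BB:AA:99", 5050),   # same new device, earlier event wins
]
tracker = merge_first_seen(tracker, events)
print(newly_seen(tracker, now=5400))
```

The second function plays the role of the alert search: only a MAC whose first-seen time is recent gets reported, so a device that has been sending data for months never triggers it again.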
This is a variation on a series of techniques that @araitz posted to Splunk Blogs many years ago -- http://blogs.splunk.com/2011/01/11/maintaining-state-of-the-union/