Simple newbie question - "stats count per user" not alerting

stucky101
Engager

Gurus
I just started playing with Splunk, and after reading the alert how-to it looks like a real-time/rolling window alert is a good start.
I tested a simple "Failed password" scenario where more than 2 failed logins per 60 seconds should trigger an alert.
This works as expected for all usernames. If I have the same user fail to log on 3 times within 60 seconds it sends an email.
However, it also sends an email when 3 different users fail to log on within this timeframe. I'm pretty sure "stats count per user" is the answer here, but when I add that to "Failed password" in my search nothing triggers anymore. Not even when the same user fails 3 times within 60 seconds.
I believe there is a stats table that gets created as described here :

http://docs.splunk.com/Documentation/Splunk/latest/User/Alertusecases

This article makes it sound like I can just add this pipe to my query to make the alert aware of whether the same user fails or various users. Clearly I'm missing something.
I might have some misconception here but shouldn't I be able to view this table in the alert dashboard ? Where can I see the results in this table ?
The alert is "Failed password | stats count per user". As soon as I remove the pipe it starts working as before.

Any hint is appreciated.

Thx

stucky101
Engager

Ok it has to do with the hostname field. Let's start with a basic search

"Failed password"

this gets 367 results such as

Aug 14 16:53:17 hostname sshd[31840]: Failed password for invalid user test_tuesday from {srcip} port 56847 ssh2

When I try to get a stats counter for the user by changing this to

"Failed password" | stats count by user

for the same timeframe I get the following table

user1 count 8
user2 count 1

That's it. When I drill down on the first user I get the message and it looks almost like the other one but it's from a few weeks ago.

Jul 24 17:31:11 hostname.domain.com sshd[1329]: Failed password for invalid user user1 from 10.91.25.76 port 54427 ssh2

Then I noticed that this older message has the FQDN in it as the hostname and the newer ones don't.
I went through a short period where I was sending FQDNs in the syslogs but changed that back soon after. Now it appears that only messages with the FQDN in them are getting the user field extracted. This would explain why the 1 minute or even 7 days timeframe doesn't yield any results.
I double checked and the same thing is true for the other user.

Isn't the hostname usually short in standard syslog ? Unless a '.' is used as delimiter I don't see how that would affect the extraction of the user field.

Am I onto something ?
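
One way to test this hunch outside Splunk is to run an extraction pattern against both message forms. The sketch below is only an illustration: the regex `USER_RE` and the helper `extract_user` are assumptions, not the extraction Splunk actually applies to this data. A pattern anchored on the text after "Failed password for" finds the user whether the hostname is short or an FQDN, which suggests the extraction in use is keyed to some other part of the line.

```python
import re

# Hypothetical pattern, not Splunk's own extraction: capture the token after
# "Failed password for", skipping the optional "invalid user " prefix.
USER_RE = re.compile(r"Failed password for (?:invalid user )?(?P<user>\S+) from ")

def extract_user(line):
    """Return the user name found in an sshd failure line, or None if absent."""
    match = USER_RE.search(line)
    return match.group("user") if match else None

# The two sample messages quoted in this post (short hostname vs. FQDN).
short_host = ("Aug 14 16:53:17 hostname sshd[31840]: Failed password for "
              "invalid user test_tuesday from {srcip} port 56847 ssh2")
fqdn_host = ("Jul 24 17:31:11 hostname.domain.com sshd[1329]: Failed password for "
             "invalid user user1 from 10.91.25.76 port 54427 ssh2")

print(extract_user(short_host))
print(extract_user(fqdn_host))
```

Both lines yield a user name here, so a hostname-independent pattern of this kind would be one possible way to make the field available for every event.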

stucky101
Engager

Ayn

I had already removed the index and sourcetype and tried again.
As per my last post :

"I noticed that when I remove the "index=foo sourcetype=goo" part and test again the 3 events show up in the timeline. They still don't show in the results field or get emailed though."

Search : "Failed password" | stats count by user
Start time : rt-1m
End time : rt-0m
Condition : If custom condition is met
Custom condition search : search count > 2
Alert mode : once per search

This does show all 3 events in the linear scale, but in the area where you usually see the actual raw message it still says "No results found". I'm pretty sure this is why no email is triggered, since it would have no raw message to send, right ?
It looks like piping to the stats count removes the actual raw message and converts it to just a counter.
Are you saying I should get an alert whenever I see an event show up in the linear scale ?
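
The empty results area is consistent with how a grouped count behaves: it emits one row per distinct value of the grouping field and carries no raw text, and events that lack the field contribute no row at all. The Python sketch below mimics that behaviour on hypothetical event dictionaries; `stats_count_by` is an illustrative helper, not a Splunk API.

```python
from collections import Counter

def stats_count_by(events, field):
    """Mimic a grouped count: one row per value; events lacking the field are skipped."""
    counts = Counter(e[field] for e in events if e.get(field) is not None)
    return [{field: value, "count": n} for value, n in counts.items()]

# Three matching events whose `user` field was never extracted (hypothetical data).
unextracted = [{"_raw": "Failed password for invalid user bob"}] * 3
# The same events with the field populated.
extracted = [dict(e, user="bob") for e in unextracted]

print(stats_count_by(unextracted, "user"))  # no rows, so no row can satisfy count > 2
print(stats_count_by(extracted, "user"))    # a single row carrying user and count
```

Under that assumption, events can appear in the timeline while the results table stays empty, because the table only ever contains the aggregated rows.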

Damien_Dallimor
Ultra Champion

Is the "user" field being extracted properly ? Also check the case of the field name: field names are case sensitive (user, User, USER).

stucky101
Engager

Damien

Thanks for your reply. I have the following now :

Search : index=foo sourcetype=goo "Failed password" | stats count by user
Start time : rt-1m
End time : rt-0m
Condition : If condition is met
Custom condition search : search count > 2
Alert mode : once per search

When I try to log on to a system 3 times in 60 seconds and fail the dashboard doesn't show any events now and nothing gets emailed.
I noticed that when I remove the "index=foo sourcetype=goo" part and test again the 3 events show up in the timeline. They still don't show in the results field or get emailed though.

Damien_Dallimor
Ultra Champion

That's right, thanks Ayn 🙂

Ayn
Legend

The index=foo sourcetype=goo terms were just examples of what you could put in your search. As in, "let's say you have logs with sourcetype 'goo' in your index 'foo'." You have to modify the search terms to fit your situation. If searching for just "failed password" worked fine for you, change the search back to using only that term.

Damien_Dallimor
Ultra Champion

Create a scheduled search like :

index=foo sourcetype=goo "Failed Password" | stats count by user 

Select Condition -> "if custom condition is met".

And enter this as the Custom condition search :

search count > 2
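
The effect of counting per user before applying the threshold can be sketched in Python. The helper `users_over_threshold` and the sample user names are hypothetical; the sketch only shows that three failures spread over three users stay under the threshold, while three failures by one user exceed it.

```python
from collections import Counter

def users_over_threshold(failed_users, threshold=2):
    """Count failures per user and keep the users whose count exceeds the threshold."""
    counts = Counter(failed_users)
    return {user: n for user, n in counts.items() if n > threshold}

# One failed login each for three different users in the window: no row passes.
print(users_over_threshold(["alice", "bob", "carol"]))
# Three failed logins by the same user in the window: one row passes.
print(users_over_threshold(["alice", "alice", "alice"]))
```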

stucky101
Engager

I realized the example also pipes this to an actual table which I hadn't done so I tried this :

Failed password | stats count by user | table user

but still no alerts. Do I need to read the table manually ? Sorry if this is a stupid question...
