I have a log that records policy violations committed by users, covering a period of a few months. I would like to find users who commit violations repeatedly over that period.
If I run "timechart limit=10 span=1w count by User_Name useother=f usenull=f", I only get the users with the highest counts over time, not the repeat offenders.
Any pointers in this regard would be great.
Thanks
Below are some sample logs that I created just to clarify; the actual logs have many more fields. The fields I am interested in are the user name, the date/time, and the rule violated.
21/05/2013 10:00:15 user1 violated rule1
21/05/2013 08:09:15 user1 violated rule2
22/05/2013 10:00:15 user1 violated rule1
23/05/2013 08:09:15 user2 violated rule2
28/05/2013 10:00:15 user1 violated rule5
29/05/2013 08:09:15 user3 violated rule4
31/05/2013 10:00:15 user1 violated rule7
01/06/2013 08:09:15 user3 violated rule2
02/06/2013 10:00:15 user1 violated rule8
05/06/2013 08:09:15 user3 violated rule5
05/06/2013 10:00:15 user1 violated rule6
06/06/2013 08:09:15 user4 violated rule2
06/06/2013 08:09:15 user4 violated rule5
06/06/2013 08:09:15 user1 violated rule2
06/07/2013 08:09:15 user4 violated rule9
07/06/2013 08:09:15 user4 violated rule10
In the log above, user4 would qualify as one of the top violators, but that user's violations do not occur every week, whereas user1 is a repeat offender who violated rules at least 2 times a week. I need to be able to find this pattern and then plot the top repeat offenders over a period of time.
Depending on how you define a week, you could use the following two searches (and maybe use cphairs method to get date_week if you want to base your searches on that).
This will give you a list of offenders with the number of weeks in which more than 5 violations were detected (count), the total violations in those weeks, and the number of weeks your search spans (weeks). To keep only offenders who offended every week, you could add | where count=weeks:
|bucket _time span=7d | stats count(User_Name) as violations by User_Name,_time | where violations>5 | stats count sum(violations) as violations by User_Name | addinfo |eval weeks=round(((info_max_time-info_min_time)/86400) / 7,0) | fields User_Name,count,weeks,violations
This will produce a chart:
|bucket _time span=7d | stats count(User_Name) as violations by User_Name,_time | where violations>5 | chart sum(violations) over _time by User_Name
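To sanity-check the bucket / stats / where logic outside Splunk, here is a rough Python sketch run against the sample log from the question. It is only an illustration, not a replacement for the searches: the function name repeat_offenders and the min_per_week parameter are made up for this example, timestamps are assumed to be dd/mm/YYYY in UTC, and weeks are assumed to be fixed 7-day bins counted from the Unix epoch, which may not match how your Splunk instance aligns span=7d.

```python
from collections import Counter
from datetime import datetime, timezone

# Sample log lines copied from the question (dd/mm/YYYY HH:MM:SS user violated rule).
SAMPLE_LOG = """\
21/05/2013 10:00:15 user1 violated rule1
21/05/2013 08:09:15 user1 violated rule2
22/05/2013 10:00:15 user1 violated rule1
23/05/2013 08:09:15 user2 violated rule2
28/05/2013 10:00:15 user1 violated rule5
29/05/2013 08:09:15 user3 violated rule4
31/05/2013 10:00:15 user1 violated rule7
01/06/2013 08:09:15 user3 violated rule2
02/06/2013 10:00:15 user1 violated rule8
05/06/2013 08:09:15 user3 violated rule5
05/06/2013 10:00:15 user1 violated rule6
06/06/2013 08:09:15 user4 violated rule2
06/06/2013 08:09:15 user4 violated rule5
06/06/2013 08:09:15 user1 violated rule2
06/07/2013 08:09:15 user4 violated rule9
07/06/2013 08:09:15 user4 violated rule10
"""

WEEK_SECONDS = 7 * 86400  # assumed bin size, like span=7d


def repeat_offenders(log_text, min_per_week=2):
    """Return {user: (qualifying_weeks, violations_in_those_weeks)}.

    Hypothetical helper: counts violations per user per 7-day bin,
    keeps bins with at least min_per_week violations, then counts
    the surviving bins and sums their violations for each user.
    """
    per_user_week = Counter()
    for line in log_text.splitlines():
        if not line.strip():
            continue
        date_part, time_part, user = line.split()[:3]
        parsed = datetime.strptime(
            f"{date_part} {time_part}", "%d/%m/%Y %H:%M:%S"
        ).replace(tzinfo=timezone.utc)  # assumption: log times are UTC
        week = int(parsed.timestamp() // WEEK_SECONDS)
        per_user_week[(user, week)] += 1

    result = {}
    for (user, _week), violations in per_user_week.items():
        if violations >= min_per_week:
            weeks, total = result.get(user, (0, 0))
            result[user] = (weeks + 1, total + violations)
    return result


if __name__ == "__main__":
    for user, (weeks, total) in sorted(repeat_offenders(SAMPLE_LOG).items()):
        print(user, "weeks:", weeks, "violations:", total)
```

The sketch uses a threshold of 2 per week, taken from the question's description of user1; the searches above use violations>5, which none of the sample users reach. Where the week boundaries fall changes which weeks qualify, so adjust the binning to match your own definition of a week.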
There's not a built-in date_week field, but to roll your own this gives a reasonable approximation:
eval date_week=round(((_time/86400) % 365) / 7,0)
eval date_week=round(((_time/86400) % 365) / 7,0) | eventstats count by User_Name date_week | timechart span=7d avg(count) by User_Name
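To see how rough the date_week approximation is, this small Python sketch reproduces the same arithmetic and compares it with the ISO week number. The function name approx_date_week is made up for this example, and it assumes round() rounds halves up and that _time is in epoch seconds (UTC).

```python
from datetime import datetime, timezone


def approx_date_week(epoch_seconds):
    """Mirror round(((_time/86400) % 365) / 7, 0) from the eval above."""
    days_since_epoch = epoch_seconds / 86400
    # int(x + 0.5) rounds half up for non-negative x (assumed rounding rule).
    return int(((days_since_epoch % 365) / 7) + 0.5)


if __name__ == "__main__":
    # First sample event from the question, assumed to be UTC.
    event = datetime(2013, 5, 21, 10, 0, 15, tzinfo=timezone.utc)
    print("approximate week:", approx_date_week(event.timestamp()))
    print("ISO week:", event.isocalendar()[1])
```

The two numbers need not agree: taking the day count modulo 365 ignores leap days, so the offset from the real start of the year grows by one day for every leap year since 1970. The value is still fine as a grouping key within a single year, which is all the search above needs.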
stats count by User_Name will give me the top offenders. But I would like to find users who committed more than 5 violations every week and plot the top repeat offenders over a period of maybe 3 months.
I have edited the post above with some sample logs.
What pattern in the log qualifies as a repeat offender? You could use | stats count by User_Name to get the count over the entire period. What do the logs look like? Do you only want to see users who violate the same policy at least twice? A little more information would be great.