So, I have a search with a regex that pulls out 2 different fields, let's say user and client.
The URL is something like
{base_url}/user/{user_1}/hello.
This user field can hold hundreds of values: user_1, user_2, user_3, ...
I want to know how many times each "user" is hit on a daily basis for each client (there are 4 clients). And I only want the users with the most hits each day (top 5).
So, for every day, for every client, the top 5 users, with the count of course.
How do I do that?
I tried this:
My_search | bucket span=1d _time | stats count by _time client user | head 5
This gives me messed-up output. Any ideas?
Hi sp1711,
The obvious search is something like:
My_search | timechart values(client) AS client count by user limit=5
but this shows the top 5 globally, not the top 5 per day.
The problem with "per-day" is that every day could have 5 completely different top users, so for a month you may need 150 series.
If you really want to calculate per day, it's something more like:
My_search
| bin span=1d _time
| stats count by _time client user
| sort - _time count
| dedup 5 _time
This will give you, per day, the top 5 client, user, count groups.
Add this to graph/chart it:
| timechart span=1d values(client) AS client sum(count) by user limit=1000
Hope this helps ...
cheers, MuS
Hi MuS,
That really got me close to what I want. I tried your second search:
My_search
| bin span=1d _time
| stats count by _time client user
| sort - _time count
| dedup 5 _time
This gives me the top 5 users for each day, along with the client each one belongs to. It doesn't give me the top 5 users for every client. How do I tweak this to get the expected result?
Just change the stats to something like | stats count by client user _time so it matches your needs. The first field after the by statement is the sorting one.
Yes, I did try that before posting the comment. It only gives me the top 5 users for each day. It gives me:
|client|user|count|
|A|1|100|
|A|2|90|
|A|3|80|
|A|4|70|
|A|5|50|
It doesn't give the stats for the other clients B, C and D.
OK, instead of dedup 5 _time I did dedup 5 client, and this does the job. But I'm getting the data only for today, even if I select a date range of a month in the search. That's weird!
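A note on why only the latest day tends to survive here: dedup 5 client keeps the first 5 rows for each client across the whole result set, and because the rows are sorted newest day first, those rows typically all come from the most recent day. Listing both fields makes dedup keep 5 rows per day and client combination. A minimal sketch, assuming the field names used in this thread and leaving the rest of the earlier search unchanged:

```spl
My_search
| bin span=1d _time
| stats count by _time client user
| sort - _time count
| dedup 5 _time client
```

With 4 clients, this should return up to 20 rows per day (5 per client).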
Use the job inspector to verify what happens with the time range in the base search
So I checked that.
The component command.dedup has an input of 10,000 and an output of 10.
Which makes sense, because whatever date range I choose I only get 2 days' worth of results (top 5 each), which makes it 10. Is this some issue with a limit?
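One likely explanation, offered as a guess rather than a confirmed diagnosis: sort returns at most 10,000 results unless told otherwise, which would match the 10,000 rows reaching dedup. Since the sort is newest day first, the older days would be the ones cut off. Putting a count of 0 in front of the sort fields removes that cap. A sketch under that assumption, reusing the search from this thread (dedup lists both _time and client so that it keeps 5 rows per day and client):

```spl
My_search
| bin span=1d _time
| stats count by _time client user
| sort 0 - _time count
| dedup 5 _time client
```

If the job inspector then shows more than 10,000 rows going into dedup for a month-long range, the sort limit was the cause.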
what is the exact search command you're using now?
This is the search:
index="abc" tag=def sourcetype=access_combined "hello" | fields correlation_id | join correlation_id [search index="abc" tag=something sourcetype=access_combined "whatsup"] | rex "(?i)/users/(?P<user>[^/]+)" | rex field=req_host "^(?<client>[^.]*)"
The formatting got screwed up when I first posted this: it eats the named capture groups when I try to format the search. One of the regexes captures user and the other captures client.
Ohhh, you're using a subsearch... I'm no friend of them at all 😉 because you hit limits with them and they are not really fast. This is not related to this question, but look at this answer http://answers.splunk.com/answers/129424/how-to-compare-fields-over-multiple-sourcetypes-without-joi... and try to adapt your search to a single stats search.
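As a rough idea of what a single stats search could look like for the search posted above, here is a sketch, not a tested replacement. It assumes that both sets of events share correlation_id, that an event containing "whatsup" belongs to the second set and every other event to the first, and that the URL and req_host are present on at least one event of each pair. The names side and sides are made up for illustration.

```spl
index="abc" sourcetype=access_combined ((tag=def "hello") OR (tag=something "whatsup"))
| rex "(?i)/users/(?P<user>[^/]+)"
| rex field=req_host "^(?<client>[^.]*)"
| eval side=if(searchmatch("whatsup"), "whatsup", "hello")
| stats dc(side) AS sides values(user) AS user values(client) AS client min(_time) AS _time by correlation_id
| where sides=2
| bin span=1d _time
| stats count by _time client user
| sort 0 - _time count
| dedup 5 _time client
```

The where sides=2 step keeps only correlation_ids seen in both sets, which is what the join was doing, but without the subsearch limits.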
Thanks for the direction. 🙂