Hey everyone,
We currently have a query that tracks the top 100 users hitting our server in the past 24 hours. It looks something like this:
index=*ind* source=*src1.log sourcetype=serve type=INFO host=host1* HitServer=success userid=* | top limit=100 userid showperc=f
This query gets the job done, but it takes about 50 minutes every time.
Is there a better way to handle this massive amount of data?
Thanks!
Thank you for your awesome responses! None of them quite fixed the problem I had, but I did find something that worked for me.
I was playing around with my query and found that when I added "fields userid" before the top command, my query ran about twice as fast, because it only had to pull out that one field from the huge volume of data. I hope this helps someone!
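For anyone who wants to try it, this is the original query with "fields userid" inserted before top, as described above. Nothing else is changed, and the speedup will depend on your data:

```
index=*ind* source=*src1.log sourcetype=serve type=INFO host=host1* HitServer=success userid=* | fields userid | top limit=100 userid showperc=f
```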
Data model acceleration and report acceleration are good ways to solve this kind of problem in general. With report acceleration, you just save the search, click the accelerate button, and Splunk pre-calculates those details for you. Data model acceleration requires you to change your query a bit, but it is valuable if you also run other analyses on that dataset that you would like accelerated.
For more, check out report acceleration or data model acceleration at docs.splunk.com. There's also great detail on both topics at conf.splunk.com -- just check out the 2014 sessions and search the page for "acceleration" or "Data Model".
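To give a rough idea of what "change your query a bit" can look like with data model acceleration, here is a sketch using tstats. It assumes you have built and accelerated a data model over these events. The data model name Example_Server_Hits and its root dataset Hits are made up for illustration, and HitServer and userid are assumed to be fields in that dataset:

```
| tstats summariesonly=true count from datamodel=Example_Server_Hits.Hits where Hits.HitServer=success by Hits.userid
| sort 100 -count
```

With summariesonly=true the search reads only the pre-built acceleration summaries, so events that have not been summarized yet are not counted.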
I think you can use summary indexing.
Populate a summary index with the top userid values in a scheduled search that runs daily:
index=*ind* source=*src1.log sourcetype=serve type=INFO host=host1* HitServer=success userid=* | sitop userid
Save the search as "your_search_name".
Later, run this search:
index=summary search_name="your_search_name" | top limit=100 userid showperc=f
Hi riough,
I would suggest removing the top command from your search query. If you look carefully, top returns the count and percentage for each userid, so if userid has a very large number of distinct values, the search takes a long time to return.
My advice is to use the sort command instead of top. The sort command sorts the results by the given list of fields. Results missing a given field are treated as having the smallest or largest possible value of that field if the order is descending or ascending, respectively. For more information on the sort command, check the documentation (Splunk-6.1.1-SearchReference). You may do something like the query below.
thanks
............|sort 100 userid
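One note on the sort approach: sort on its own orders results by the field's values, so to rank users by number of hits a count has to be computed first. A minimal sketch, assuming the same base search as in the question, is to count with stats and then keep the 100 largest counts:

```
index=*ind* source=*src1.log sourcetype=serve type=INFO host=host1* HitServer=success userid=* | fields userid | stats count by userid | sort 100 -count
```

This should return the same userid and count columns as top limit=100 userid showperc=f. Whether it runs any faster on your data is something you would need to measure.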