Solved: What is the most efficient way to search for the t...

rlough · ‎03-18-2015

Hey everyone,

We currently have a query that tracks the top 100 users hitting our server in the past 24 hours. It looks something like this:

index=*ind* source=*src1.log sourcetype=serve type=INFO host=host1* HitServer=success userid=* | top limit=100 userid showperc=f

This query gets the job done, but it takes about 50 minutes every time.

Is there a better way to handle this massive amount of data?

Thanks!

rlough · ‎03-20-2015

Thank you for your awesome responses! None of them quite fixed the problem I had, but I did find something that worked for me.

I was playing around with my query and found that when I added "fields userid" before the top command, my query ran about twice as fast because it only had to pull out that one field from the huge sum of data. I hope this helps someone!
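For reference, the change described above would look roughly like this, assuming the rest of the original query is left untouched. The only addition is the "fields userid" step before top, which limits the fields carried forward to the one being counted:

```
index=*ind* source=*src1.log sourcetype=serve type=INFO host=host1* HitServer=success userid=*
| fields userid
| top limit=100 userid showperc=f
```

How much this helps will depend on the data; the "about twice as fast" figure is what was observed in this one environment.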

dveuve_splunk · ‎03-19-2015

Data model acceleration and report acceleration are good ways to solve this challenge in general. With report acceleration, you just save the search, click the accelerate button, and Splunk will pre-calculate all those details for you. Data model acceleration will require you to change your query a bit, but it is valuable if you also run other analyses on that dataset that you would like accelerated.

For more, check out report acceleration or data model acceleration at docs.splunk.com. There is also great detail on both topics at conf.splunk.com: check out the 2014 sessions and search the page for "acceleration" or "Data Model".
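As a rough sketch of what "change your query a bit" can mean with data model acceleration: the search is usually rewritten around the tstats command so that it reads from the accelerated summaries. The data model name Example_Server_Hits and its root dataset Hits are hypothetical names used only for illustration; the sketch also assumes the data model's constraints already encode the index, sourcetype, host, and HitServer=success filters from the original query.

```
| tstats summariesonly=true count from datamodel=Example_Server_Hits by Hits.userid
| rename Hits.userid AS userid
| sort 100 -count
```

With report acceleration, by contrast, the original top search would stay as it is and only be saved and accelerated.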

btt · ‎03-19-2015

I think you can use summary indexing.
Populate a summary index with the top userid values in a scheduled search that runs daily:

index=*ind* source=*src1.log sourcetype=serve type=INFO host=host1* HitServer=success userid=* | sitop userid

Save the search as "your_search_name".
Later, run this search:

index=summary search_name="your_search_name" | top limit=100 userid showperc=f

juvetm · ‎03-19-2015

Hi rlough,

I would suggest removing the top command from your search query. If you look carefully, it returns the count and percentage for each userid, so if userid has a large number of values, the search takes a long time to return.

Instead, I advise using the sort command, which sorts the results by the given list of fields. Results missing a given field are treated as having the smallest or largest possible value of that field if the order is descending or ascending, respectively. For more information on the sort command, check the Splunk 6.1.1 Search Reference documentation. You could do something like this:

... | sort 100 userid

Thanks
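One caveat on the sort suggestion: "sort 100 userid" orders results by the userid value itself, not by how often each user appears, so on its own it would not produce a top-users list. A minimal sketch that ranks by hit count, assuming that is the intent (this is not the poster's query), counts per user first and then sorts on that count:

```
index=*ind* source=*src1.log sourcetype=serve type=INFO host=host1* HitServer=success userid=*
| stats count by userid
| sort 100 -count
```

This still has to scan the same events as the top version, so any speedup over the original query is not guaranteed.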

What is the most efficient way to search for the top users hitting our servers?
