Reporting

Get top pages per time period

spammenot66
Contributor

Splunk export has a limitation of 10000 rows. How do i get 5 top pages (either by visit or pageviews) per day? Currently I'm using a data model. For example sake, lets say i there's more than 10000 visited pages for any given day. If i ran the query below with a pipe of |head 5 and ran the search for a full year, it wouldn't give me top 5 pages per day for a full year. Instead it would give me top 5 rows from all pages and days.

 |pivot MainSites HomePages dc(User) AS "DistinctUser"   dc(VisitSession) AS "DistinctVisits" sum(wtPageView) AS "TotalPageViews"  sum(Click) AS "TotalClicks"  SPLITROW TimePeriod_Daily  SPLITROW eventURL  | sort - "TotalPageViews"
0 Karma

spammenot66
Contributor

For anyone who is looking to do something similiar - get top x per y when dealing with a data model/pivot, you'll want to run a subsearch similiar to the one below. You'll get an extra column called sum(HomePages) which is the equivalent of total hits for the current time period.

|pivot MainSites HomePages SPLITROW TimePeriod_Daily  | appendcols  [pivot MainSites HomePages dc(User) AS "DistinctUser"   dc(VisitSession) AS "DistinctVisits" sum(wtPageView) AS "TotalPageViews"  sum(Click) AS "TotalClicks"  SPLITROW TimePeriod_Daily  SPLITROW eventURL  | sort - "TotalPageViews"| head 5]
0 Karma

alacercogitatus
SplunkTrust
SplunkTrust

I don't think that is going to do what you think it does. Using a subsearch with 2 pivot commands is not a good practice to start, since using datamodel is the preferred method any way.

0 Karma

alacercogitatus
SplunkTrust
SplunkTrust

Without knowing the specifics of your Data Model, I can give you a "normal search" that gives you what you want, and then perhaps you can apply it to your data model.

<find_your_web_logs> | bin span=1d _time | top limit=5 VisitSession by _time

This will split your data in 1 day chunks, and then do a top with the split by of _time, which then gives the top 5 per day.

Sticking it into datamodel land (using some assumptions):

|datamodel <datamodel> <object> search | bin span=1d _time | top limit=5 VisitSession by _time

spammenot66
Contributor

If i used | top limit=5 VisitSession by _time, my result will look something like
_time VisitSession count percent

This tends to be completely useless if i want the top 5 eventURLs per month for the span of 12 months.
Time Period eventURL DistinctUser DistinctVisits TotalPageViews TotalClicks
month1 URL1 10 50 100 0
month1 URL2 9 40 100 0
month1 URL3 7 31 100 30
month1 URL4 6 30 100 20
month1 URL5 5 20 100 10
month2 URL1 10 50 100 0
month2 URL2 9 40 100 0
month2 URL3 7 31 100 30
month2 URL4 6 30 100 20
month3 URL5 5 20 100 10

0 Karma
Get Updates on the Splunk Community!

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...

New in Observability Cloud - Explicit Bucket Histograms

Splunk introduces native support for histograms as a metric data type within Observability Cloud with Explicit ...