I was intrigued by a chart I saw the other day in an app. The app had a dashboard that compared last week's results vs. this week's results on the same chart. Unfortunately, I did not see the search behind the scenes.
I am looking for week-over-week and day-over-day charts comparing stock trade volume from the previous day or week with today's or this week's.
What would such a search look like? Again, the end goal for me is a line chart with two lines: one for yesterday's volume and one for today's volume.
Comparing week-over-week results used to be a pain in Splunk, requiring complex date calculations. No more. Now there is a better way.
I wrote a convenient search command called "timewrap" that does it all, for arbitrary time periods.
... | timechart count span=1h | timewrap d
That's it!
The answers to this question helped me plot the hourly averages of some value for each of the past four weeks.
Most previous answers suggested if/then logic based on the time of each event. With four weeks to distinguish, I chose a more direct calculation of the week number (1, 2, 3, or 4):
eval week = floor(5 - (relative_time(now(), "@w") - relative_time(_time, "@w")) / 604800)
That large divisor is the number of seconds in a week.
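To see the arithmetic behind that eval, here is a minimal Python sketch of the week-number calculation. It assumes UTC and ignores DST; the `snap_to_week` helper is a hypothetical stand-in for Splunk's `relative_time(t, "@w")`, which actually snaps in the search head's local time zone.

```python
WEEK = 604800  # seconds in a week, the large divisor from the search
DAY = 86400
# Epoch day 0 (1970-01-01) was a Thursday, so Sunday-midnight
# boundaries sit at epoch times offset by 3 days.
SUNDAY = 3 * DAY

def snap_to_week(epoch):
    """Stand-in for relative_time(t, "@w"): snap back to Sunday 00:00 UTC."""
    return (epoch - SUNDAY) // WEEK * WEEK + SUNDAY

def week_number(event_time, now):
    """eval week = floor(5 - (relative_time(now,"@w") - relative_time(t,"@w")) / 604800)"""
    return 5 - (snap_to_week(now) - snap_to_week(event_time)) // WEEK
```

With a search over `earliest=-4w@w latest=@w`, events in the most recent complete week land in week 4 and events four weeks back land in week 1.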
The complete search satisfied a few other requirements as well. Preparing the labels required more work than computing the averages:
... earliest=-4w@w latest=@w NOT date_wday="saturday" NOT date_wday="sunday" |
eval week = floor(5 - (relative_time(now(), "@w") - relative_time(_time, "@w")) / 604800) |
eval weekstart = relative_time(_time, "@w") + 86400 |
eval weekend = weekstart + 345600 |
convert timeformat="%m/%d" ctime(weekstart) ctime(weekend) |
eval weekname = "Overall Average;Week " + week + ": " + weekstart + "-" + weekend |
eval hour_gmt = substr("0" + tostring(date_hour), -2) |
makemv weekname delim=";" |
chart avg(Value) over hour_gmt by weekname
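The label arithmetic in those evals can be sketched in Python. This is a simplified UTC-only illustration, not the search itself: `weekstart` is the Monday after the Sunday week snap, `weekend` is four days later (the Friday), and both are formatted like `convert timeformat="%m/%d"`.

```python
import time

WEEK, DAY = 604800, 86400
SUNDAY = 3 * DAY  # epoch day 0 was a Thursday; Sunday boundaries are offset 3 days

def week_label(event_time):
    """Sketch of the label evals:
       eval weekstart = relative_time(_time, "@w") + 86400   (Monday)
       eval weekend   = weekstart + 345600                   (Friday)"""
    sunday = (event_time - SUNDAY) // WEEK * WEEK + SUNDAY  # relative_time(_time, "@w")
    weekstart = sunday + DAY
    weekend = weekstart + 4 * DAY
    fmt = lambda t: time.strftime("%m/%d", time.gmtime(t))
    return fmt(weekstart) + "-" + fmt(weekend)

# A Wednesday in the week starting Monday 2010-09-27:
print(week_label(1285459200 + 3 * DAY))  # 09/27-10/01
```

In the real search this string is then prefixed with "Overall Average;" and split with makemv, so each event counts toward both its own week's column and the overall-average column.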
This search yielded results like these (only the first four columns are shown here):
hour_gmt   Overall Average   Week 1: 09/27-10/01   Week 2: 10/04-10/08
00         123               110                   131
01         144               125                   163
02         135               133                   173
Many thanks to those who posted answers and comments. With your examples as a guide, I was able to develop this report in a short time.
I used appendcols for this one. Comments appreciated:
(broken) SYSCODE=LGN-* earliest=-1h@m latest=-0s@h | chart count(_raw) as today over _time | appendcols [ search SYSCODE=LGN-* earliest=-169h@m latest=-168h@m | eval _time=_time+60*60*24*7 | chart count(_raw) as lastweek over _time | fields lastweek ] | timechart span=1m sum(today) as today,sum(lastweek) as lastweek
(fixed) SYSCODE=LGN-* earliest=-1h@m | timechart span=1m count as today | appendcols [ search SYSCODE=LGN-* earliest=-169h@m latest=-168h@m | timechart span=1m count as lastweek | fields lastweek ] | timechart span=1m sum(today),sum(lastweek)
This searches for all LGN events in the last hour and in the same hour range exactly one week ago. It worked as planned, but was a bit slow for only 42k matching events.
_time today lastweek
6/30/10 1:34:00.000 PM 1692 1529
6/30/10 1:35:00.000 PM 1565 1448
6/30/10 1:36:00.000 PM 1497 1409
... etc
Fixed search added. Still slow, but non-broken is an improvement.
Thanks Nick. I'll try that. Also realized this isn't doing exactly what I thought it was. 😕 Not all time slots are shown. Still tweaking/learning.
I definitely would not do "chart count(_raw) over _time". First, count(_raw) is at least ten times more expensive than plain 'count' for the same end result. Second, it's not going to bin anything, so you'll get one row per distinct _time value. "timechart count" will be a lot faster and simpler.
Simeon and Sorkin's addinfo trick is a good one: addinfo adds two new fields, 'info_min_time' and 'info_max_time', which represent the absolute bounds of the main search's time range as epoch-time integers.
But here's another way that uses the relative_time() functionality in eval.
First, for all of these, in Time Range Picker > 'Custom Time' > Advanced search language, put -1d@d as 'earliest' and +1d@d as 'latest'.
That will give you a proper time range of yesterday through today, midnight to midnight.
1) <some search for your events> | eval marker = if (_time < relative_time(now(), "@d"), "yesterday", "today") | eval _time = if (marker=="yesterday", _time+24*60*60, _time) | timechart sum(volume) by marker
The first eval clause makes a determination for each event: yesterday's events get a new field called 'marker' with a value of "yesterday", and today's events get marker="today".
The second eval clause then uses that to actually shift yesterday's timestamps forward by a day so that they're now timestamp values that occurred today. After that point, from timechart's perspective we actually only have 1 day's worth of events. Timechart then just has the simple task of graphing that data split by 'marker'.
See the very useful 'relative_time' function for eval, which I used here: http://www.splunk.com/base/Documentation/latest/SearchReference/CommonEvalFunctions
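The shift trick above can be sketched in Python. This is a toy illustration, assuming UTC midnights and hourly bins; the dictionary of hour buckets is a hypothetical stand-in for what `timechart span=1h sum(volume) by marker` would produce.

```python
DAY = 86400

def wrap_day_over_day(events, now):
    """events: list of (epoch_time, volume) spanning yesterday and today.
    Tag each event "yesterday" or "today", then shift yesterday's
    timestamps forward one day so both series share the same x-axis."""
    midnight = now - now % DAY  # stand-in for relative_time(now, "@d"), UTC
    out = {"yesterday": {}, "today": {}}
    for t, vol in events:
        marker = "yesterday" if t < midnight else "today"
        if marker == "yesterday":
            t += DAY  # eval _time = _time + 24*60*60
        hour = (t - midnight) // 3600  # 1-hour bins, like timechart span=1h
        out[marker][hour] = out[marker].get(hour, 0) + vol
    return out
```

After the shift, both series fall in today's hour range, so charting them split by 'marker' overlays yesterday on today.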
2) If, on the other hand, you want to end up with a categorical x-axis like host instead of time, you can use the same trick and just pipe to chart instead of timechart at the end. Again, in this output I want each row to be a distinct host value:
<some search for your events> | eval marker = if (_time < relative_time(now(), "@d"), "yesterday", "today") | eval _time = if (marker=="yesterday", _time+24*60*60, _time) | chart sum(volume) over host by marker
However, there are often easier and/or more efficient ways to do this. Note that all the finicky eval I'm doing above is a somewhat ham-fisted way of doing what timechart and bin do very easily:
<some search for your events> | bin _time span=1h | convert ctime(_time) timeformat="%m/%d" | chart sum(volume) by _time
Also this is somewhat related to http://answers.splunk.com/questions/1286/is-there-a-way-to-get-a-result-that-looks-like-timechart-bu...
3) If the times you want to compare are of different lengths, and/or they're not contiguous, like 'last 24 hours' vs 'preceding 72 hours' vs 'last month', there is yet another very powerful way given as an answer to a similar question: http://answers.splunk.com/questions/1288/can-i-get-a-table-of-statistics-where-different-columns-rep...
I typically use a macro like this:
[week_over_week(2)]
args = data, metric
definition = $data$ earliest=-1w@w latest=@h | timechart span=1h $metric$ as metric | addinfo | eval marker = if(_time < info_min_time + 7*86400, "last week", "this week") | eval _time = if(_time < info_min_time + 7*86400, _time + 7*86400, _time) | chart median(metric) by _time marker
iseval = 0
It gives week over week behavior and can be invoked in search like:
`week_over_week("index=_internal group=per_index_thruput", "per_second(kb)")`
It can be extended to be day over day like this:
[day_over_day(2)]
args = data, metric
definition = $data$ earliest=-1d@d latest=@h | timechart span=15m $metric$ as metric | addinfo | eval marker = if(_time < info_min_time + 86400, "yesterday", "today") | eval _time = if(_time < info_min_time + 86400, _time + 86400, _time) | chart median(metric) by _time marker
iseval = 0
Like the previous one, the first argument to the macro is where to get the data, the second is what timechart aggregator to use.
If you were looking at a website, you might run:
`day_over_day("eventtype=pageview", "count")` <== page views per hour
`day_over_day("eventtype=pageview", "dc(clientip)")` <== visitors per hour
`day_over_day("sourcetype=access_combined status=404", "count")` <== 404 errors per hour
It's mathemagical:
... | timechart span=1h per_second(kb) as KBps | addinfo | eval marker = if(_time < info_min_time + 7*86400, "last week", "this week") | eval _time = if(_time < info_min_time + 7*86400, _time + 7*86400, _time) | chart median(KBps) by _time marker
Well, really it's a combination of math and programmatic functions to replace values.