I was intrigued by a chart I saw the other day in an app. The app had a dashboard that compared last week's results vs. this week's results on the same chart. Unfortunately, I did not see the search behind the scenes.
I am looking for week-over-week and day-over-day charts comparing stock trade volume from the previous day or week with today's or this week's.
What would such a search look like? Again, the end goal for me is a line chart with two lines: one for yesterday's volume and one for today's volume.
Comparing week-over-week results used to be a pain in Splunk, requiring complex date calculations. No more. Now there is a better way.
I wrote a convenient search command called "timewrap" that does it all, for arbitrary time periods.
... | timechart count span=1h | timewrap d
That's it!
The answers to this question helped me plot the hourly averages of some value for each of the past four weeks.
Most previous answers suggested if/then logic based on the time of each event. With four weeks to distinguish, I chose a more direct calculation of the week number (1, 2, 3, or 4):
eval week = floor(5 - (relative_time(now(), "@w") - relative_time(_time, "@w")) / 604800)
That large divisor is the number of seconds in a week.
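To see the arithmetic behind that eval, here is a minimal Python sketch of the week-number calculation. It assumes UTC and ignores DST; the `snap_to_week` helper is a hypothetical stand-in for Splunk's `relative_time(t, "@w")`, which actually snaps in the search head's local time zone.

```python
WEEK = 604800  # seconds in a week, the large divisor from the search
DAY = 86400
# Epoch day 0 (1970-01-01) was a Thursday, so Sunday-midnight
# boundaries sit at epoch times offset by 3 days.
SUNDAY = 3 * DAY

def snap_to_week(epoch):
    """Stand-in for relative_time(t, "@w"): snap back to Sunday 00:00 UTC."""
    return (epoch - SUNDAY) // WEEK * WEEK + SUNDAY

def week_number(event_time, now):
    """eval week = floor(5 - (relative_time(now,"@w") - relative_time(t,"@w")) / 604800)"""
    return 5 - (snap_to_week(now) - snap_to_week(event_time)) // WEEK
```

With a search over `earliest=-4w@w latest=@w`, events in the most recent complete week land in week 4 and events four weeks back land in week 1.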
The complete search satisfied a few other requirements as well. Preparing the labels required more work than computing the averages:
... earliest=-4w@w latest=@w NOT date_wday="saturday" NOT date_wday="sunday" |
eval week = floor(5 - (relative_time(now(), "@w") - relative_time(_time, "@w")) / 604800) |
eval weekstart = relative_time(_time, "@w") + 86400 |
eval weekend = weekstart + 345600 |
convert timeformat="%m/%d" ctime(weekstart) ctime(weekend) |
eval weekname = "Overall Average;Week " + week + ": " + weekstart + "-" + weekend |
eval hour_gmt = substr("0" + tostring(date_hour), -2) |
makemv weekname delim=";" |
chart avg(Value) over hour_gmt by weekname
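The label arithmetic in those evals can be sketched in Python. This is a simplified UTC-only illustration, not the search itself: `weekstart` is the Monday after the Sunday week snap, `weekend` is four days later (the Friday), and both are formatted like `convert timeformat="%m/%d"`.

```python
import time

WEEK, DAY = 604800, 86400
SUNDAY = 3 * DAY  # epoch day 0 was a Thursday; Sunday boundaries are offset 3 days

def week_label(event_time):
    """Sketch of the label evals:
       eval weekstart = relative_time(_time, "@w") + 86400   (Monday)
       eval weekend   = weekstart + 345600                   (Friday)"""
    sunday = (event_time - SUNDAY) // WEEK * WEEK + SUNDAY  # relative_time(_time, "@w")
    weekstart = sunday + DAY
    weekend = weekstart + 4 * DAY
    fmt = lambda t: time.strftime("%m/%d", time.gmtime(t))
    return fmt(weekstart) + "-" + fmt(weekend)

# A Wednesday in the week starting Monday 2010-09-27:
print(week_label(1285459200 + 3 * DAY))  # 09/27-10/01
```

In the real search this string is then prefixed with "Overall Average;" and split with makemv, so each event counts toward both its own week's column and the overall-average column.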
This search yielded results like these (only the first four columns are shown here):
hour_gmt   Overall Average   Week 1: 09/27-10/01   Week 2: 10/04-10/08
00         123               110                   131
01         144               125                   163
02         135               133                   173
Many thanks to those who posted answers and comments. With your examples as a guide, I was able to develop this report in a short time.
I used appendcols for this one. Comments appreciated:
(broken) SYSCODE=LGN-* earliest=-1h@m latest=-0s@h | chart count(_raw) as today over _time | appendcols [ search SYSCODE=LGN-* earliest=-169h@m latest=-168h@m | eval _time=_time+60*60*24*7 | chart count(_raw) as lastweek over _time | fields lastweek ] | timechart span=1m sum(today) as today,sum(lastweek) as lastweek
(fixed) SYSCODE=LGN-* earliest=-1h@m | timechart span=1m count as today | appendcols [ search SYSCODE=LGN-* earliest=-169h@m latest=-168h@m | timechart span=1m count as lastweek | fields lastweek ] | timechart span=1m sum(today),sum(lastweek)
This searches for all LGN events in the last hour and in the same hour range exactly one week ago. It worked as planned, but was a bit slow for only 42k matching events.
_time today lastweek
6/30/10 1:34:00.000 PM 1692 1529
6/30/10 1:35:00.000 PM 1565 1448
6/30/10 1:36:00.000 PM 1497 1409
... etc
Fixed search added. Still slow, but non-broken is an improvement.
Thanks Nick. I'll try that. Also realized this isn't doing exactly what I thought it was. 😕 Not all time slots are shown. Still tweaking/learning.
I definitely would not do "chart count(_raw) over _time". First, count(_raw) is at least ten times more expensive than plain 'count' for the same end result. Second, it's not going to bin anything, so you'll get one row per distinct _time value. "timechart count" will be a lot faster and simpler.
Simeon and Sorkin's addinfo trick is a good one: addinfo adds two new fields, 'info_min_time' and 'info_max_time', which represent the absolute bounds of the main search's time range as epoch-time integers.
But here's another way that uses the relative_time() functionality in eval.
First, for all of these, in Time Range Picker > 'Custom Time' > Advanced search language, put -1d@d as 'earliest' and +1d@d as 'latest'.
That will give you a proper time range of yesterday through today, midnight to midnight.
1) <some search for your events> | eval marker = if (_time < relative_time(now(), "@d"), "yesterday", "today") | eval _time = if (marker=="yesterday", _time+24*60*60, _time) | timechart sum(volume) by marker
The first eval clause makes a determination for each event: yesterday's events get a new field called 'marker' with a value of "yesterday", and today's events get marker="today".
The second eval clause then uses that to actually shift yesterday's timestamps forward by a day so that they're now timestamp values that occurred today. After that point, from timechart's perspective we actually only have 1 day's worth of events. Timechart then just has the simple task of graphing that data split by 'marker'.
See the very useful 'relative_time' function for eval, which I used here: http://www.splunk.com/base/Documentation/latest/SearchReference/CommonEvalFunctions
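The shift trick above can be sketched in Python. This is a toy illustration, assuming UTC midnights and hourly bins; the dictionary of hour buckets is a hypothetical stand-in for what `timechart span=1h sum(volume) by marker` would produce.

```python
DAY = 86400

def wrap_day_over_day(events, now):
    """events: list of (epoch_time, volume) spanning yesterday and today.
    Tag each event "yesterday" or "today", then shift yesterday's
    timestamps forward one day so both series share the same x-axis."""
    midnight = now - now % DAY  # stand-in for relative_time(now, "@d"), UTC
    out = {"yesterday": {}, "today": {}}
    for t, vol in events:
        marker = "yesterday" if t < midnight else "today"
        if marker == "yesterday":
            t += DAY  # eval _time = _time + 24*60*60
        hour = (t - midnight) // 3600  # 1-hour bins, like timechart span=1h
        out[marker][hour] = out[marker].get(hour, 0) + vol
    return out
```

After the shift, both series fall in today's hour range, so charting them split by 'marker' overlays yesterday on today.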
2) If, on the other hand, you want to end up with a categorical x-axis like host instead of time, you can use the same trick and just pipe to chart instead of timechart at the end. Again, in this output I want each row to be a distinct host value:
<some search for your events> | eval marker = if (_time < relative_time(now(), "@d"), "yesterday", "today") | eval _time = if (marker=="yesterday", _time+24*60*60, _time) | chart sum(volume) over host by marker
However, there are often easier and/or more efficient ways to do this. Note that all the finicky eval I'm doing above is a somewhat ham-fisted way of doing what timechart and bin do very easily:
<some search for your events> | bin _time span=1h | convert ctime(_time) timeformat="%m/%d" | chart sum(volume) by _time
Also this is somewhat related to http://answers.splunk.com/questions/1286/is-there-a-way-to-get-a-result-that-looks-like-timechart-bu...
3) If the times you want to compare are of different lengths, and/or they're not contiguous, like 'last 24 hours' vs 'preceding 72 hours' vs 'last month', there is yet another very powerful way given as an answer to a similar question: http://answers.splunk.com/questions/1288/can-i-get-a-table-of-statistics-where-different-columns-rep...
I typically use a macro like this:
[week_over_week(2)]
args = data, metric
definition = $data$ earliest=-1w@w latest=@h | timechart span=1h $metric$ as metric | addinfo | eval marker = if(_time < info_min_time + 7*86400, "last week", "this week") | eval _time = if(_time < info_min_time + 7*86400, _time + 7*86400, _time) | chart median(metric) by _time marker
iseval = 0
It gives week over week behavior and can be invoked in search like:
`week_over_week("index=_internal group=per_index_thruput", "per_second(kb)")`
It can be extended to be day over day like this:
[day_over_day(2)]
args = data, metric
definition = $data$ earliest=-1d@d latest=@h | timechart span=15m $metric$ as metric | addinfo | eval marker = if(_time < info_min_time + 86400, "yesterday", "today") | eval _time = if(_time < info_min_time + 86400, _time + 86400, _time) | chart median(metric) by _time marker
iseval = 0
Like the previous one, the first argument to the macro is where to get the data, the second is what timechart aggregator to use.
If you were looking at a website, you might run:
`day_over_day("eventtype=pageview", "count")` <== page views per hour
`day_over_day("eventtype=pageview", "dc(clientip)")` <== visitors per hour
`day_over_day("sourcetype=access_combined status=404", "count")` <== 404 errors per hour
It's mathemagical:
... | timechart span=1h per_second(kb) as KBps | addinfo | eval marker = if(_time < info_min_time + 7*86400, "last week", "this week") | eval _time = if(_time < info_min_time + 7*86400, _time + 7*86400, _time) | chart median(KBps) by _time marker
Well, really it's a combination of math and programmatic functions to replace values.