Solved: Timechart variation

iKate · ‎04-26-2012

Hello,

Lets say we have time period from Jan 2011 till Jan 2012. Is it possible to build a timechart that count events for each time span (e.g 1month) from the very beginning (Jan 2011)? Normaly timechart counts evants by a span. Also simple sum of previous months results doesn't cover the specific issues of dynamics needed.

At the moment I built up rather bulky search that uses append function for each span and different endtime in each appended search. I'm wondering is there more elegant decision of this issue?

iKate · ‎09-17-2012

This is a letter to myself in the past:

Dear Kate!
You should use Summary Indexing together with backfill mechanism! Here is a beginning of your road for convenient and fast work with historical data:
http://docs.splunk.com/Documentation/Splunk/4.3.3/Knowledge/Usesummaryindexing
http://docs.splunk.com/Documentation/Splunk/4.3.3/Knowledge/Managesummaryindexgapsandoverlaps
What a pity that you'll stumble upon this information that late. But still it's so good you'll got to know it.

View solution in original post

iKate · ‎09-17-2012

This is a letter to myself in the past:

Dear Kate!
You should use Summary Indexing together with backfill mechanism! Here is a beginning of your road for convenient and fast work with historical data:
http://docs.splunk.com/Documentation/Splunk/4.3.3/Knowledge/Usesummaryindexing
http://docs.splunk.com/Documentation/Splunk/4.3.3/Knowledge/Managesummaryindexgapsandoverlaps
What a pity that you'll stumble upon this information that late. But still it's so good you'll got to know it.

kristian_kolb · ‎04-26-2012

If this is too simple, please let us know more what you want to achieve.

 sourcetype=your_sourcetype earliest=-1y@y latest=@y|  timechart span=1mon count

UPDATE:

So what you want is a timechart with a one month granularity, over userIDs who registered but never bought anything (i.e. only occur once in your logs)? Assuming that there are 12 new inactive users registered per month, it would be something like ;

Month           Accumulated_Onetime_users
Jan 2011       12
Feb 2011       24
Mar 2011       36

sourcetype=your_sourcetype earliest=-1y@y latest=@y
| dedup 2 userID 
| transaction userID maxevents=2 
| search eventcount=1 
| bucket _time span=1mon 
| stats count AS Onetime_users by _time 
| accum Onetime_users AS Accumulated_onetime_users
| eval Month = strftime(_time, "%b %Y")
| table Month Accumulated_onetime_users

I'm thinking this should work 🙂 , but I'm doing it all in my head, since I don't have any relevant logs to play with. There probably more efficient ways of doing it, even if it actually does work.

UPDATE:

Ok now I have tested the search with some real data. The first three lines of the search above (after the sourcetype and time constraints) will find those events that only have a single unique event for the field value.

dedup 2 limits the initial set to two of each, since there is no point in knowing more than that.
transaction userID maxevents=2 groups events together based on the userID. Thus a transaction can have 1 or 2 events.
search eventcount=1 will keep the transactions that only have one event, i.e. the ones we want to find.

Then we treat all events from the same month as having the same timestamp. (bucket _time span=1mon), so we can make counts of events by month. (stats count)
accum keeps a running total of the events.
the eval Month is for presentation in a nicer way, and the table is just showing the results.

You could add the Onetime_users to the table if you wish, to show which were new for each month.

Let me know if this gives you anything meaningful.

Hope this helps,

Kristian

iKate · ‎09-17-2012

Kristian, some time ago finally I found the the core solving of the question that I felt should surely exist. The answer is below)

kristian_kolb · ‎04-27-2012

Sorry if I misunderstand what you want to achieve - but the search I provided does work the way I said. See update above.

/k

iKate · ‎04-26-2012

At the moment I explicitly state the time for each month that is not so well.

10 digits: converted to epochtime 'endtime':)

iKate · ‎04-26-2012

Kristian, this is a beautiful search! Thanks for it a lot, it also helped me to learn several new search functions:) I was sure it should work the way I need, but it shows just two or one first months while it can't be so. Moreover span=1mon option doesn't let us see filtered userIDs for each month from the needed time range: [earliest - ]. Sory if I explained not so detailed that the desired chart should show the dynamics of such users in time: how much of them were in each month, considering probable correlation with previous months and gathering accumulated result for each month.

iKate · ‎04-26-2012

Kristian, thanks for the reply! But your suggestion can not work for this issue as building of my timechart is preceeded with counting events by user id(for all the time) and selecting only entries with one occurance.
(In my case these are users that only registered and have purchased nothing)
So timechart that you suggest will take one month statistics and count events by users in this month then for another and so on till the 'earliest' time. And it is needed to count each month in the range [earliest - ].

Timechart variation

Announcing Scheduled Export GA for Dashboard Studio

Extending Observability Content to Splunk Cloud

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!