Hi!
I would like some help with summary indexing.
My situation is as follows:
I have events that come into Splunk every day, like the following:
day 1:
_time            StartTemperature  EndTemperature  SequenceNum  ID
2014/1/9 00:00   20                21              1            A
2014/1/10 00:00  23                24              2            A
2014/1/11 00:00  25                27              3            B
However, at some point some events arrive late.
day 2:
_time            StartTemperature  EndTemperature  SequenceNum  ID
2014/1/9 00:00   20                21              1            A
2014/1/10 00:00  23                24              2            A
2014/1/8 00:00   27                21              5            A   **** late event
2014/1/11 00:00  25                27              3            B
Every day I have to compute the total average of the difference between each record's EndTemperature and the next record's StartTemperature. Reordering the events is easy with sort and streamstats, but
there is not enough memory or disk space to handle this, since there are millions of events.
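To make the calculation concrete, here is a minimal Python sketch of it (not Splunk SPL). It assumes records are ordered by _time within each ID and that the difference is taken as the next record's StartTemperature minus the current record's EndTemperature; both are my reading of the description, and the function names are made up for illustration. It returns the per-ID sum of the differences; an average would divide by the number of pairs.

```python
from collections import defaultdict
from datetime import datetime

# (time, StartTemperature, EndTemperature, SequenceNum, ID) rows from the "day 2" sample
events = [
    ("2014/1/9 00:00", 20, 21, 1, "A"),
    ("2014/1/10 00:00", 23, 24, 2, "A"),
    ("2014/1/8 00:00", 27, 21, 5, "A"),   # late event
    ("2014/1/11 00:00", 25, 27, 3, "B"),
]

def parse_time(text):
    # Timestamp format taken from the sample rows.
    return datetime.strptime(text, "%Y/%m/%d %H:%M")

def sum_of_gaps(rows):
    """Per-ID sum of (next StartTemperature - current EndTemperature).

    Assumption: "next record" means the next record by _time within the same ID.
    """
    by_id = defaultdict(list)
    for time_text, start, end, _seq, key in rows:
        by_id[key].append((parse_time(time_text), start, end))

    totals = {}
    for key, recs in by_id.items():
        recs.sort()  # reorder by time, so the late event lands in the right place
        totals[key] = sum(nxt[1] - cur[2] for cur, nxt in zip(recs, recs[1:]))
    return totals

print(sum_of_gaps(events))
```

Without the late event, ID A has a single pair (23 - 21 = 2). With it, the 2014/1/8 record becomes the first one and the sum for A changes, which is why a result that was already stored has to be revisited.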
Since there are millions of events, I am considering organizing them into
a daily summary index like the following:
_time SumOfDifference(EndTemperature and NextRecord's Start Temperature) ID
xxxxx A
However, if there are late incoming events, I have to recreate records that
have already been created. I am not sure whether this is possible.
This might take a lot of scripting.
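As a rough illustration of what recreating a record involves, here is a minimal Python sketch (again not Splunk, and only an assumption about how one might approach it). A late event only changes the pairs it forms with its two neighbors in time, so a stored sum for an ID can be corrected without recomputing every pair. The class name GapSummary and its add method are hypothetical, and the difference is assumed to be the next StartTemperature minus the current EndTemperature.

```python
import bisect
from datetime import date

class GapSummary:
    """Keeps one ID's records ordered by time and maintains the running sum."""

    def __init__(self):
        self.records = []  # sorted list of (time, start_temp, end_temp)
        self.total = 0

    def add(self, time, start_temp, end_temp):
        rec = (time, start_temp, end_temp)
        i = bisect.bisect_left(self.records, rec)
        prev = self.records[i - 1] if i > 0 else None
        nxt = self.records[i] if i < len(self.records) else None

        # The old (prev -> nxt) pair no longer exists once rec sits between them.
        if prev and nxt:
            self.total -= nxt[1] - prev[2]
        # New pairs: (prev -> rec) and (rec -> nxt).
        if prev:
            self.total += start_temp - prev[2]
        if nxt:
            self.total += nxt[1] - end_temp

        self.records.insert(i, rec)
        return self.total

summary_a = GapSummary()
summary_a.add(date(2014, 1, 9), 20, 21)
summary_a.add(date(2014, 1, 10), 23, 24)
summary_a.add(date(2014, 1, 8), 27, 21)   # late event for ID A
print(summary_a.total)
```

The sketch still holds the records for one ID in memory; the point is only that the correction is local to the late event's neighbors, so only the affected ID and time range would need to be rebuilt.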
Has anyone tackled this kind of situation?
I would appreciate it if someone could share their solution.
Thanks,
Yu
Report acceleration might make more sense for you, as it can handle late-arriving events. See this doc:
http://docs.splunk.com/Documentation/Splunk/5.0.5/Knowledge/Aboutsummaryindexing
In addition, my version of Splunk is 5.0.5.