Deployment Architecture

Summary index cron schedule: populate first, then schedule?

sajeeshpn
New Member

Hi,

I am creating a new summary index and have scheduled the populating search to run at 6-hour intervals. In savedsearches.conf, I put:

cron_schedule = 0 */6 * * *
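
For context, a minimal sketch of how the full stanza might look; the search name, summary index name (my_summary), and base search below are hypothetical placeholders, and only the cron_schedule line is my actual setting:

[populate_my_summary]
enableSched = 1
cron_schedule = 0 */6 * * *
action.summary_index = 1
action.summary_index._name = my_summary
search = index=my_index sourcetype=my_sourcetype | sistats count by host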

With this change, I can only expect data to be populated in the summary index after 6 hours, right?

But I would like to know whether the summary search gets executed once now and then scheduled to run every 6 hours, so that there would be some data in the summary index immediately.

Thanks,
Sajeesh

1 Solution

gcusello
SplunkTrust

Hi sajeeshpn,
if you want to do this, you have to be careful with the time period of your searches, because summary indexing is not normal Splunk ingestion: there is no check on already-indexed events, so you risk duplicating events or losing events.
So if your next scheduled run is at 12:00 covering the period 06:00 to 12:00, then to have data in the summary index now you have to run the search over a period ending before 06:00. It is also better not to search right up to now, but only up to a safe margin before now (e.g. from -370m@m to -10m@m).
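As a sketch of that safe window, assuming the hypothetical stanza from the question above, the dispatch times in savedsearches.conf could be set so each scheduled run covers a lagged 6-hour window instead of running right up to now:

dispatch.earliest_time = -370m@m
dispatch.latest_time = -10m@m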
In addition, I suggest you verify the continuity of your logs: if there is a large indexing delay (e.g. 1 hour), you risk losing data, and you have to take this into account when choosing the safe period.
To verify the continuity of your logs, check the difference between _time and _indextime.
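For example, a search like the following (the index name my_index is a placeholder) computes the per-event indexing lag in seconds and summarizes it, so you can see how large a safety margin you need:

index=my_index earliest=-24h
| eval lag_seconds = _indextime - _time
| stats avg(lag_seconds) max(lag_seconds) perc95(lag_seconds) by sourcetype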
To be safer, you could take a larger time period (e.g. 12 hours) and add a check on _indextime to your search, excluding all events with an _indextime older than 6 hours.
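A sketch of that idea, again with placeholder names: the search time range covers 12 hours to catch late-arriving events, but only events actually indexed in the last 6 hours are summarized, so events already counted by the previous run are excluded:

index=my_index earliest=-12h@h latest=-10m@m
| eval indexed_at = _indextime
| where indexed_at >= relative_time(now(), "-6h@h")
| sistats count by host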
Bye.
Giuseppe


sajeeshpn
New Member

Thank you !
