Why am I getting a mismatch between Summary Index values and the original values?

nl65
Explorer

I have the following search which works fine:

sourcetype=my_sourcetype some_filter  |bucket _time span=1d |  timechart count by some_field

Verifying decomposition into 1h bins works fine as well and matches the above.

sourcetype=my_sourcetype some_filter  |bucket _time span=1h | sitimechart count by some_field | bucket _time span=1d|timechart count by some_field

Created saved search:

sourcetype=my_sourcetype some_filter |bucket _time span=1h | sitimechart count by calc_severity

Then back-filling summary index using:

./splunk cmd python fill_summary_index.py -app my_app -name my_search -et my_start_time -lt my_end_time  -dedup true -auth my_user:my_pass

populating the summary index.

However, trying to regenerate the chart using:

index=summary source=my_source | bucket _time span=1d  | timechart count by some_field

produces the wrong chart: the values are much greater than the originals.

It seems to me that the values in summary index are factored by permutations of the values of some_field.

Please advise

Thanks


halr9000
Motivator

A few observations:

  1. You don't need to call bucket explicitly on _time, because timechart does this as needed (docs). It may be confusing things here, so why don't you move such reporting commands out of your populating search? Or do you really want to be modifying _time twice?
  2. The search used to retrieve results from a summary index has to match the search that populated it, or the results will differ for sure. This answer explains it well.
  3. Make sure that summary is what you want! Report acceleration and data model acceleration are more flexible and often (but not always) replace what was previously done with summary indexing. This docs page talks about the differences.
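
As a rough sketch of points 1 and 2 (untested against the data in the question), the populating search could drop the explicit bucket and let sitimechart do the binning through its span option. The names sourcetype=my_sourcetype, some_filter and some_field are taken from the question; the span values are assumptions:

```
sourcetype=my_sourcetype some_filter | sitimechart span=1h count by some_field
```

The reporting search would then run the matching timechart directly against the summary index, again without a separate bucket:

```
index=summary source=my_source | timechart span=1d count by some_field
```

Both searches split by the same field here. If the saved search splits by a different field (such as calc_severity in the question), the reporting search would presumably need to use that field too.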

nl65
Explorer

Thanks much, switched to accelerated.
