Splunk Enterprise Security

Can I clean up the datamodel_summary directories that are growing by a couple dozen GB/day?

ericlarsen
Path Finder

We just implemented Splunk Enterprise Security about a month ago. We're new to data models, acceleration, and any implications they may have on our Splunk environment.

I noticed the datamodel_summary directory in our firewall logs index ($SPLUNK_HOME/var/lib/splunk/pan_logs/) is growing incredibly large (850GB and growing a couple dozen GB/day).

I need to understand why. We have the Palo Alto app installed as well and the Palo Alto Networks Firewall Logs datamodel (7 days acceleration) is 100 GB.

In ES, the Network Traffic datamodel (30 days acceleration), part of the Splunk_SA_CIM app, is 300+ GB!

There are approx. 350 dirs in the $SPLUNK_HOME/var/lib/splunk/pan_logs/datamodel_summary dir. Are all of these really necessary, or can I institute some kind of cleanup in this directory to recover space?

Any help in understanding how data models are stored/cleaned up would be greatly appreciated.
Thanks.


lguinn2
Legend

I don't think you should manually delete any of the data model acceleration files. The size of these files depends on two things: the number of events in the associated index and the number of days of acceleration. I am not surprised that your data model summaries are quite large.
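To see which data models account for the space before changing anything, a read-only size report can help. The sketch below is not a Splunk tool. It assumes that each accelerated data model's summary files sit under a directory whose name starts with `DM_` somewhere beneath `datamodel_summary` (verify that layout on your own indexer), and the function names `summary_sizes_by_model` and `format_report` are made up for illustration. It only reads file sizes and deletes nothing.

```python
import os
from collections import defaultdict


def summary_sizes_by_model(summary_root):
    """Return {DM_ directory name: total bytes} for files under summary_root.

    Assumption: a data model's summary files live below a path component
    whose name starts with "DM_". Files outside such a directory are ignored.
    """
    totals = defaultdict(int)
    for dirpath, _dirnames, filenames in os.walk(summary_root):
        rel = os.path.relpath(dirpath, summary_root)
        # Use the first path component that looks like a data model directory.
        model = next((p for p in rel.split(os.sep) if p.startswith("DM_")), None)
        if model is None:
            continue
        for name in filenames:
            try:
                totals[model] += os.path.getsize(os.path.join(dirpath, name))
            except OSError:
                # A file can disappear while summaries are rebuilt; skip it.
                pass
    return dict(totals)


def format_report(totals):
    """Format the totals as text lines, largest data model first."""
    rows = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [f"{size / 1024**3:.2f} GiB  {model}" for model, size in rows]
```

To use it, pass the path of the index's datamodel_summary directory to `summary_sizes_by_model` and print each line returned by `format_report`. The totals are summed across all bucket directories, so they should show which data model is driving the growth.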

To reduce the size, you may want to decrease the number of days of acceleration for some (or all) data models. For the same index, 30 days of acceleration will be roughly 4x as large as 7 days of acceleration (30/7 ≈ 4.3).

The usual estimate for the size of the data model summaries (for up to one year of retention) = daily inbound data volume (GB or MB) * 3.4
You might want to take a look at this page of the documentation: Accelerate data models
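As a quick worked example of that estimate, here is a minimal sketch. It assumes the inbound amount means daily data volume and that the result covers the summaries kept at the recommended retention settings, which is my reading of the formula. The function name `estimate_summary_storage` and the 100 GB/day figure are made up for illustration, not measurements from this environment.

```python
def estimate_summary_storage(daily_volume, multiplier=3.4):
    """Estimate data model summary storage, in the same unit as daily_volume.

    Assumption: daily_volume is the daily inbound data volume, and the 3.4
    multiplier only holds with the recommended acceleration retention.
    """
    if daily_volume < 0:
        raise ValueError("daily_volume must not be negative")
    return daily_volume * multiplier


# Hypothetical input: 100 GB of inbound data per day.
estimate = estimate_summary_storage(100)
print(f"Estimated summary storage: about {estimate:.0f} GB")
```

If the summaries you actually measure are far above this kind of estimate, that is a sign the retention settings or the set of accelerated data models differ from the defaults the formula assumes.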


ericlarsen
Path Finder

Thanks for the response.

How did you come up with the "data model summary = Inbound data amount (GB or MB) * 3.4" statement? Wouldn't it depend on the summary range of the data model?


lguinn2
Legend

That calculation is published in the Splunk® Enterprise Security Installation and Upgrade Manual, in the section "Data model acceleration storage and retention".

I just looked it up and it also says "This formula assumes that you are using the recommended retention rates for the accelerated data models." Here is the link:
http://docs.splunk.com/Documentation/ES/4.5.1/Install/Datamodels

If you are seeing something really different from what the documentation suggests, I think you should file a support ticket. If you just stop accelerating the data models, I am concerned that it might have a negative effect on your Enterprise Security correlation searches and alerts...
