How do I remove host from Data Summary screen but keep data?

gph12
Explorer

Hello,

I'm looking for advice on how to handle systems that are removed from the network.

We have several hundred Windows systems with the Universal Forwarder installed, sending log data to our Splunk server. As systems are decommissioned, I want to keep the log data from those retired systems in Splunk for compliance reasons. But I no longer want the retired systems' host names to appear in the Data Summary window in Splunk Search. I only want live production systems to appear on that screen.

Is it just a matter of deleting the client name from the Forwarder Management screen?

Thanks,

Greg

somesoni2
Revered Legend

When data is indexed into Splunk, Splunk creates index files (also known as tsidx files) and raw data files. The index files are what make the data searchable in Splunk, and they also feed the Data Summary. By default, the data is searchable for as long as the raw data is retained in Splunk (before it's frozen). So if you want to keep the data in Splunk and keep it searchable, it will appear on the Data Summary page as well.

If you're on Splunk 6.4+, there is an option to set up a separate retention period for tsidx files. That way you can keep the data in Splunk, but not searchable (at least not right away), and it won't appear on the Data Summary page (I haven't tested this; maybe someone else can confirm). See this link for more details.

https://docs.splunk.com/Documentation/Splunk/6.4.0/Indexer/Reducetsidxdiskusage
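
As a rough sketch of what that option could look like in indexes.conf: the index name wineventlog_retired and the 30-day threshold below are placeholders I made up, not values from this thread. The two settings are the ones the linked page covers. Keep in mind they apply per index and by bucket age, not per host, and whether reduced buckets actually drop out of the Data Summary is still unconfirmed.

```ini
# indexes.conf on the indexer (sketch; index name and values are assumptions)
# Add these lines to the existing stanza for the index in question.
[wineventlog_retired]
# Turn on tsidx reduction for this index (Splunk 6.4+)
enableTsidxReduction = true
# Reduce tsidx files in buckets older than 30 days (30 * 86400 seconds)
timePeriodInSecBeforeTsidxReduction = 2592000
```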

The other option would be to archive the data so it's not searchable at all and won't show up in the Data Summary. You can later restore the data if it's needed for an audit.
http://docs.splunk.com/Documentation/Splunk/6.5.1/Indexer/Automatearchiving
http://docs.splunk.com/Documentation/Splunk/6.5.1/Indexer/Setaretirementandarchivingpolicy
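
For the archiving route, a minimal indexes.conf sketch might look like the following. The index name, the archive path, and the 90-day period are all hypothetical; pick values that match your compliance requirements. Like tsidx reduction, this works per index and by bucket age, so it can't single out one retired host inside a shared index.

```ini
# indexes.conf on the indexer (sketch; index name, path, and values are assumptions)
[wineventlog_retired]
# Freeze buckets once their newest event is older than 90 days (90 * 86400 seconds)
frozenTimePeriodInSecs = 7776000
# Copy frozen buckets to this directory instead of deleting them (deletion is the default)
coldToFrozenDir = /opt/splunk_archive/wineventlog_retired
```

Restoring later generally means copying an archived bucket into the index's thaweddb directory and rebuilding it, as described in the docs linked above.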

gph12
Explorer

Thanks for the response. That's some good information. I'll have to consider leaving it as is or archiving the data of those systems.

The main reason I wanted to do this is because the Data Summary screen was very helpful in identifying which systems were offline. So I was using it in a way it was intended. But it may still work for me.

Thanks again.
