Splunk Search

How to determine if logs are not being used?

JimSchlaker
New Member

Is there a way to determine which logs are no longer being used and can therefore be deleted? For example, a team may have started logging something a year ago but no longer uses that log for any reports, dashboards, etc. Is there a way to find these unused logs?

kellewic
Path Finder

In addition to the links mentioned by adonio, this search might get you some of the way there. It is not 100% accurate, since things like macros can hide indexes and sourcetypes, but it does include the data models and nodenames being used.

The search filters out all "*" and "_*" references, as those aren't very useful. It prefixes data models with "DM-" and nodenames with "ND-" and treats each pair as an index/sourcetype combination. Macros are prefixed with "MC-" so they are easy to identify and inspect manually.

You could compare this against a REST call to the indexes or indexes-extended endpoint to get a starting point. However, you will want to confirm with the data owners that the indexes really aren't being used since, again, this search is not 100% accurate.

index=_internal sourcetype=splunkd_remote_searches
|dedup search

|eval
  search=replace(search, "(datamodel\s*=[\s\"]*)(.*?)([\|\s\"\)])", "\1DM-\2\3"),
  search=replace(search, "(eval\s+datamodel\s*=[\s\"]*)DM-", "\1"),
  search=replace(search, "(\|\s*pivot\s+)(.*?)(\s)", "\1DM-\2\3"),
  search=replace(search, "(nodename\s*=[\s\"]*)(.*?)([\|\s\"\)])", "\1ND-\2\3"),
  search=replace(search, "(eval\s+nodename\s*=[\s\"]*)ND-", "\1"),
  search=replace(search, "(search\s*`)(.*?)([`\(])", "\1MC-\2\3")

|rex field=search max_match=0 "index\s*=[\s\"]*(?<idx>.*?)[\|\s\"\)]"
|rex field=search max_match=0 "sourcetype\s*=[\s\"]*(?<st>.*?)[\|\s\"\)]"
|rex field=search max_match=0 "search\s*`(?<macro_index>MC-.*?)[`\(]"
|rex field=search max_match=0 "datamodel\s*=[\s\"]*(?<dm>DM-.*?)[\|\s\"\)]"
|rex field=search max_match=0 "nodename\s*=[\s\"]*(?<node>ND-.*?)[\|\s\"\)]"
|rex field=search max_match=0 "\|\s*pivot\s+(?<pv>.*?)\s"

|eval
  idx=mvdedup(mvappend(idx, macro_index, dm, pv)),
  idx=mvfilter(idx!="*" AND idx!="_*" AND NOT match(idx, "^_") AND NOT match(idx, "^\d+[\*_]")),
  st=mvdedup(mvappend(st, node))

|where isnotnull(idx) AND isnotnull(st)
|stats c by idx, st

|table idx, st
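
For the comparison against the indexes endpoint mentioned above, a minimal sketch could look like the following. It is not part of the original search: it assumes the data/indexes REST endpoint exposes title, totalEventCount and currentDBSizeMB on your version, and it uses a simplified index-only rex as a stand-in for the full extraction, so macros, data models and pivots are not resolved here.

```spl
| rest /services/data/indexes count=0
| search NOT title="_*"
| stats sum(totalEventCount) as events, sum(currentDBSizeMB) as size_mb by title
| rename title as idx
| search NOT
    [ search index=_internal sourcetype=splunkd_remote_searches
      | rex field=search max_match=0 "index\s*=[\s\"]*(?<idx>[^\|\s\"\)]+)"
      | stats count by idx
      | fields idx ]
```

Whatever remains is only a list of candidates: the subsearch is bound by the selected time range and by the usual subsearch limits, so run it over a long enough window and still confirm with the data owners.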

kellewic
Path Finder

One comment, which I missed when posting: in the mvfilter(), remove the last condition, AND NOT match(idx, "^\d+[\*_]"), as that was specific to my use case when I wrote this. We have indexes that start with a numeric ID for each customer, and I wanted to ignore those.
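
With that condition dropped, the eval step from the search above would read roughly as follows (everything else stays the same):

```spl
|eval
  idx=mvdedup(mvappend(idx, macro_index, dm, pv)),
  idx=mvfilter(idx!="*" AND idx!="_*" AND NOT match(idx, "^_")),
  st=mvdedup(mvappend(st, node))
```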

adonio
Ultra Champion

Hello JimSchlaker,
There are answers here about which indexes are used by reports, saved searches, dashboards, and more that you can rely on. For example:
https://answers.splunk.com/answers/273176/how-can-i-determine-how-much-an-index-is-being-sea.html
https://answers.splunk.com/answers/186268/how-to-search-for-and-remove-indexes-in-splunk-tha.html
Since you also mention a time span, meaning teams might still look at a particular index, source, or sourcetype without using its old data, I would suggest approaching the problem from the opposite direction.
Either check the time range of searches using | rest or the _audit index, or verify with the teams how far back they need their data and set a hard time limit in indexes.conf on the index that contains that data.
Hope it helps.
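
As a rough illustration of the _audit approach, the sketch below estimates how far back completed searches actually reach for each index. It is an assumption-laden starting point rather than a tested report: it assumes your audit events carry the search, search_et and api_et fields (check this against your version), and it uses a simple index-only rex that will miss indexes hidden behind macros or data models.

```spl
index=_audit action=search info=completed search=*
| eval earliest_epoch=tonumber(coalesce(search_et, api_et))
| where isnotnull(earliest_epoch)
| rex field=search max_match=0 "index\s*=[\s\"]*(?<idx>[^\|\s\"\)']+)"
| eval lookback_days=round((_time - earliest_epoch) / 86400, 1)
| stats max(lookback_days) as max_lookback_days, count as searches by idx
| sort - max_lookback_days
```

Searches with no numeric earliest time (for example all-time searches) are dropped by the where clause, so review those separately. If the largest lookback for an index is far shorter than its retention, that is a hint the older data may not be needed; the retention limit itself is normally set with frozenTimePeriodInSecs in the index's stanza in indexes.conf.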
