Getting Data In

Whitelisting/Blacklisting files inside tgz files

wdhathaway
Explorer

I have a bunch of .tgz files that are being regularly uploaded to a directory and I'd like to only index a subset of the files inside the archive files.

Example archive files:

   tar tzvf archive.1.2.tgz 
     -rw-r--r--  0 wdh    wdh       948 Jan 10 09:24 app1.log
     -rw-r--r--  0 wdh    wdh       414 Jan 10 09:24 foo.log
     -rw-r--r--  0 wdh    wdh       770 Jan 10 09:24 splat.log

   tar tzvf archive.5.8.tgz
     -rw-r--r--  0 wdh    wdh       148 Jan 10 09:24 app3.log
     -rw-r--r--  0 wdh    wdh       216 Jan 10 09:24 bad.log
     -rw-r--r--  0 wdh    wdh       789 Jan 10 09:24 splat.log

From the example above, I'd like only the "splat.log" file inside archive.*.tgz to be indexed. It appears to me that the whitelist/blacklist settings for an inputs.conf stanza only apply to the archive file name, not to files inside the archive.

While I know I can have some external batch process run and pull the 'splat.log' files out, is there any way I can use whitelist/blacklist, or some other Splunk configuration mechanism to filter based on the internal filenames inside the archive files?
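For reference, the external batch approach mentioned above could look roughly like the sketch below. It uses only Python's standard `tarfile` module. The function name `extract_matching`, the default pattern, and the `<archive>__<member>` output naming are illustrative assumptions, not anything Splunk provides.

```python
import re
import shutil
import tarfile
from pathlib import Path


def extract_matching(archive_path, dest_dir, pattern=r"(^|/)splat\.log$"):
    """Copy only the regular-file members whose name matches `pattern`.

    Output files are named <archive stem>__<member basename> (an assumed
    convention) so splat.log files from different archives do not collide.
    Returns the list of paths written.
    """
    archive_path = Path(archive_path)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    wanted = re.compile(pattern)
    written = []
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            # Skip directories, links, and files we do not want indexed.
            if not member.isfile() or not wanted.search(member.name):
                continue
            target = dest_dir / f"{archive_path.stem}__{Path(member.name).name}"
            # Stream the member out rather than calling extract(), so member
            # paths inside the archive never decide where the file lands.
            with tar.extractfile(member) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
            written.append(target)
    return written
```

A cron job could call this for each new archive.*.tgz and write into a separate directory that an ordinary monitor input watches, so Splunk never sees the unwanted files.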


gelica
Communicator

Hi,
Did you ever find a way to do this? 🙂


robsenk
Engager

Is this an issue with 4.3 as well? I've been beating my head on this one too.


southeringtonp
Motivator

Not quite what you're looking for, but if nothing else you could route the events from the unwanted files to nullQueue so that they are discarded at index time.
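A minimal sketch of that nullQueue approach, using the usual "discard everything, then keep what matches" pair of transforms. The sourcetype name `archive_logs` and the transform names are placeholders. It also assumes that the source field of an event read from an archive ends with the member file name (something like `/path/archive.1.2.tgz:./splat.log`), so check what source values your events actually carry before relying on the regex.

```ini
# props.conf (on the indexer or heavy forwarder that parses the data)
# "archive_logs" is a placeholder for whatever sourcetype your input assigns.
[archive_logs]
# Transforms run in order; the last one that matches decides the queue.
TRANSFORMS-filter_members = drop_all_members, keep_splat_log

# transforms.conf
[drop_all_members]
# Matches every event and sends it to the nullQueue (discarded).
REGEX = .
DEST_KEY = queue
FORMAT = nullQueue

[keep_splat_log]
# Match against the event's source instead of the raw event text.
SOURCE_KEY = MetaData:Source
# Assumes the source ends with the member name inside the archive.
REGEX = splat\.log$
DEST_KEY = queue
FORMAT = indexQueue
```

Note that this does not stop Splunk from unpacking and reading the other files. It only drops their events before they are written to the index.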

jstockamp
Communicator

I've just run into this issue myself and have been beating my head against the wall trying to figure it out. It's odd that Splunk supports using a regex on the name of a file inside a tgz to set the hostname, but can't look inside the tarball when applying the blacklist. Very frustrating!
