Why is Splunk not indexing .gz files?

jwalzerpitt
Influencer

I created a folder on our dev Splunk server, and then copied over 12 .gz files (from our radius server).

As a test, I created a data input, pointed it to the directory with the radius files, and created an index called radius. When I search the index, I'm not seeing any events. I checked S.O.S, and I'm not seeing any errors related to the data input or the index. When I check the data input, it sees the files, but the index shows '0' events.

Any help as to why Splunk is not indexing the .gz files would be greatly appreciated.

Thx

JayJohns
Engager

Hey, I'm having similar issues. The files were indexed fine when sent previously, but since the upgrade I've been hitting a few kinks. My config is pretty much the same.

Here is a sample of the log:

02-25-2015 10:34:29.283 -0500 INFO TailingProcessor - Archive file='/splunkapp/home/jam_it_apps_support_intec/Server_Uploads/INTL/DWF_20150225_073228_0_4_1.CSV.gz' updated less than 10000ms ago, will not read it until it stops changing.'

02-25-2015 10:34:39.287 -0500 INFO TailingProcessor - Archive file='/splunkapp/home/jam_it_apps_support_intec/Server_Uploads/INTL/DWF_20150225_073228_0_4_1.CSV.gz' has stopped changing, will read it now.

Files in a similar path with the same configuration were indexed without any problem.

Isaias_Garcia
Path Finder

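You can also restrict the input to .gz files with the whitelist setting. Note that whitelist takes a regular expression rather than a shell glob, so use something like \.gz$ instead of "*.gz".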
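
To illustrate the difference between a glob and a regular expression for a whitelist, here is a minimal Python sketch. It assumes the whitelist is matched as a regex against the full file path. Python's `re` module is only a stand-in for Splunk's own regex engine, and the sample paths and the helper name `matches_whitelist` are made up for the example.

```python
import re

# Hypothetical sample paths; the directory is the one from the monitor stanza in this thread.
paths = [
    "/data/syslog/nps/nps.15-02-16-1800.gz",
    "/data/syslog/nps/nps.15-02-16-1800.log",
    "/data/syslog/nps/notes.gz.bak",
]

def matches_whitelist(pattern: str, path: str) -> bool:
    """Return True if the regex pattern matches anywhere in the path."""
    return re.search(pattern, path) is not None

whitelist = r"\.gz$"  # regex form of "file name ends in .gz"
selected = [p for p in paths if matches_whitelist(whitelist, p)]
print(selected)

# A shell-style glob is not a valid regular expression in Python's engine:
# the leading "*" has nothing to repeat.
try:
    re.compile("*.gz")
    glob_is_valid_regex = True
except re.error:
    glob_is_valid_regex = False
print(glob_is_valid_regex)
```

Only the first path ends in .gz, so only that one is selected. The glob form fails to compile at all in this engine.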

aakwah
Builder

Can you provide the monitor stanza that you are using?

jwalzerpitt
Influencer

[monitor:///data/syslog/nps]
disabled = false
index = radius
sourcetype = iis
crcSalt =

aakwah
Builder

The stanza seems to be correct. The files you are trying to ingest may have been processed before, which could be why Splunk didn't index them again.
Splunk keeps track of processed files through the fishbucket index at /opt/splunk/var/lib/splunk/fishbucket
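
As a rough mental model of that duplicate check, here is a simplified Python sketch. It is not Splunk's actual implementation: it assumes a checksum over the first 256 bytes of content, optionally mixed with a salt string, and uses a plain set as a stand-in for the fishbucket. All names are invented for the illustration.

```python
import zlib

INIT_CRC_LENGTH = 256  # assumed number of leading bytes that get checksummed

def file_signature(data: bytes, salt: str = "") -> int:
    """Checksum of the salt plus the first INIT_CRC_LENGTH bytes of the content."""
    return zlib.crc32(salt.encode() + data[:INIT_CRC_LENGTH])

seen = set()  # stand-in for the fishbucket

def should_index(data: bytes, salt: str = "") -> bool:
    """Return True the first time a signature is seen, False afterwards."""
    sig = file_signature(data, salt)
    if sig in seen:
        return False
    seen.add(sig)
    return True

content = b"radius log line\n" * 100  # hypothetical file content

first = should_index(content)      # new content, so it is indexed
recopied = should_index(content)   # same bytes again, e.g. after a re-copy
with_salt = should_index(content, salt="/data/syslog/nps/file.gz")  # salt changes the signature
print(first, recopied, with_salt)
```

Under this model, re-copying a file or changing only its modification time leaves the signature unchanged, so the file is skipped. Changing the leading content or adding a salt produces a new signature.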

You have two workarounds to process the files again:

1. Decompress the files in a temp directory and make any change to them (add a new line at the end of the file, for example), then compress them again and add them to the monitored directory.

Or

2. Delete the contents of the fishbucket directory and restart Splunk: rm -rf /opt/splunk/var/lib/splunk/fishbucket/*

If this doesn't work, please provide a sample from splunkd.log (/opt/splunk/var/log/splunk/splunkd.log).

Regards,

jwalzerpitt
Influencer

Tried both suggestions you listed, and even unzipped the files, did a quick touch of them to change the date, deleted the radius index and then re-created it, restarted Splunk a few times, but the files still aren't being indexed. Sample from splunkd.log (grepping radius):

02-17-2015 15:25:58.670 -0500 WARN IndexAdminHandler - Events from the following 1 inputs will now be discarded, since they had targeted index=radius:
02-17-2015 15:25:58.670 -0500 INFO IndexProcessor - idx=radius Removing; IP::deleteIndex
02-17-2015 15:25:58.670 -0500 INFO IndexProcessor - idx=radius Removing; wait for in-flights
02-17-2015 15:25:58.670 -0500 INFO IndexProcessor - idx=radius Removing; erasing DPP from lookups
02-17-2015 15:25:58.671 -0500 INFO databasePartitionPolicy - idx=radius Handling shutdown or signal, reason=3
02-17-2015 15:25:58.671 -0500 INFO IndexProcessor - idx=radius Deletion approved, start dir removal
02-17-2015 15:25:58.671 -0500 INFO HotDBManager - closing hot mgr for idx=radius
02-17-2015 15:25:58.719 -0500 INFO IndexProcessor - idx=radius Removing; erased directory='/opt/splunk/var/lib/splunk/radius/db' (param=homePath)
02-17-2015 15:25:58.721 -0500 INFO IndexProcessor - idx=radius Removing; erased directory='/opt/splunk/var/lib/splunk/radius/colddb' (param=coldPath)
02-17-2015 15:25:58.721 -0500 INFO IndexProcessor - idx=radius Removing; parameter=bloomHomePath has no assigned value

jwalzerpitt
Influencer

Additional log info:
02-17-2015 15:25:58.721 -0500 WARN IndexProcessor - idx=radius Removing; directory='/opt/splunk/var/lib/splunk/radius/summary' (param=summaryHomePath) not found
02-17-2015 15:25:58.721 -0500 INFO IndexProcessor - idx=radius Removing; parameter=tstatsHomePath has no assigned value
02-17-2015 15:25:58.734 -0500 INFO IndexProcessor - idx=radius Removing; erased directory='/opt/splunk/var/lib/splunk/radius/thaweddb' (param=thawedPath)
02-17-2015 15:25:58.737 -0500 INFO IndexProcessor - idx=radius Removing; erased directory='/opt/splunk/var/lib/splunk/radius' (param=index proper)
02-17-2015 15:25:58.737 -0500 INFO IndexProcessor - removing index=radius stanza from indexes.conf app=search
02-17-2015 15:25:58.737 -0500 INFO IndexProcessor - idx=radius Finished removing
02-17-2015 15:26:08.340 -0500 INFO HotDBManager - idx=radius Setting hot mgr params: maxHotSpanSecs=7776000 snapBucketTimespans=false maxHotBuckets=3 maxDataSizeBytes=786432000 quarantinePastSecs=77760000 quarantineFutureSecs=2592000
02-17-2015 15:26:08.340 -0500 INFO HotDBManager - closing hot mgr for idx=radius

jwalzerpitt
Influencer

More:
02-17-2015 15:26:08.340 -0500 INFO databasePartitionPolicy - idx=radius, Initializing, params='[300,period=60,frozenTimePeriodInSecs=188697600,coldToFrozenScript=,coldToFrozenDir=,warmToColdScript=,maxHotBucketSize=786432000,optimizeEvery=5,syncMeta=true,maxTotalDataSizeMB=500000,maxMemoryAllocationPerHotSliceMB=5,addressCompressBits=5,isReadOnly=false,maxMergizzles=6,signatureBlockSize=0,signatureDatabase=_blocksignature,maxHotSpanSecs=7776000,maxMetadataEntries=1000000,maxHotIdleSecs=0,maxHotBuckets=3,quarantinePastSecs=77760000,quarantineFutureSecs=2592000,maxSliceSize=131072,serviceMetaPeriod=25,partialServiceMetaPeriod=0,throttleCheckPeriod=15,homePath_maxDataSizeBytes=0,coldPath_maxDataSizeBytes=0,compressionLevel=-1,fsyncInterval=18446744073709551615,maxBloomBackfillBucketAge_secs=2592000,enableOnlineBucketRepair=true,maxUnreplicatedMsecWithAcks=60000,maxUnreplacatedMsecNoAcks=300000,alwaysBloomBackfill=false,minStreamGroupQueueSize=2000,streamingTargetTsidxSyncPeriodMsec=5000,repFactor=0,hotBucketTimeRefreshInterval=10]' isSlave=false needApplyDeleteJournal=false
02-17-2015 15:26:08.340 -0500 INFO DatabaseDirectoryManager - rescanning buckets for homepath='/opt/splunk/var/lib/splunk/radius/db' [gotManifest=true thawedModtime=1424204768 manifestModTime=0]
02-17-2015 15:26:08.340 -0500 INFO DatabaseDirectoryManager - Writing a bucket manifest in hotWarmPath='/opt/splunk/var/lib/splunk/radius/db'. Reason='Refreshing manifest at start-up.'

aakwah
Builder

Hello,

The logs above show that the radius index was removed and created again.

Please do the following: in one terminal, run tail -f /opt/splunk/var/log/splunk/splunkd.log

In another terminal, copy the modified files to the monitored directory, then capture the log lines generated while the files were being copied and share them with us.

If no log lines are generated, please run the following and provide a sample of the output:

grep ArchiveProcessor /opt/splunk/var/log/splunk/splunkd.log

and

grep TailingProcessor /opt/splunk/var/log/splunk/splunkd.log
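
If the grep output is long, a small script can pull out just the warnings and errors. The following Python sketch assumes the splunkd.log line layout seen in this thread (timestamp, level, component, " - ", message). The function names are made up, and ArchiveContext is added to the component list on the assumption that it also logs archive-related problems.

```python
import re

# Layout assumed from the log lines in this thread:
# "MM-DD-YYYY HH:MM:SS.mmm +ZZZZ LEVEL Component - message"
LINE_RE = re.compile(
    r"^(?P<ts>\d{2}-\d{2}-\d{4} \d{2}:\d{2}:\d{2}\.\d{3} [+-]\d{4}) "
    r"(?P<level>[A-Z]+) (?P<component>\w+) - (?P<message>.*)$"
)
COMPONENTS = {"ArchiveProcessor", "TailingProcessor", "ArchiveContext"}

def parse_lines(lines):
    """Yield a dict for every line emitted by one of the components of interest."""
    for line in lines:
        m = LINE_RE.match(line.rstrip("\n"))
        if m and m.group("component") in COMPONENTS:
            yield m.groupdict()

def problems(lines):
    """Return only the WARN and ERROR records."""
    return [r for r in parse_lines(lines) if r["level"] in ("WARN", "ERROR")]

sample = [
    "02-17-2015 16:09:26.611 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1800.gz (seek=0 len=23107383)",
    "02-17-2015 16:09:26.770 -0500 ERROR ArchiveContext - From archive='/data/syslog/nps/nps.15-02-16-1800.gz': gzip: stdout: Broken pipe",
    "02-17-2015 15:25:58.670 -0500 INFO IndexProcessor - idx=radius Removing; IP::deleteIndex",
]
records = list(parse_lines(sample))
errors = problems(sample)
print(len(records), [r["component"] for r in errors])
```

To run it against the real log, pass an open file handle instead of the sample list, for example `problems(open("/opt/splunk/var/log/splunk/splunkd.log"))`.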

Regards

jwalzerpitt
Influencer

Tail end of - grep ArchiveProcessor /opt/splunk/var/log/splunk/splunkd.log

02-17-2015 16:09:26.611 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1800.gz (seek=0 len=23107383)
02-17-2015 16:09:28.522 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-1800.gz', removing from stats
02-17-2015 16:09:28.522 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-2300.gz
02-17-2015 16:09:28.523 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-2300.gz (seek=0 len=11834828)
02-17-2015 16:09:29.678 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-2300.gz', removing from stats
02-17-2015 16:09:29.678 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-1600.gz
02-17-2015 16:09:29.679 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1600.gz (seek=0 len=24124658)
02-17-2015 16:09:31.854 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-1600.gz', removing from stats

jwalzerpitt
Influencer

Deleted files in the nps directory and then re-copied them:
02-17-2015 16:09:11.571 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-2200.gz
02-17-2015 16:09:11.571 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-2200.gz (seek=0 len=13512575)
02-17-2015 16:09:12.784 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-2200.gz', removing from stats
02-17-2015 16:09:12.784 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-2100.gz
02-17-2015 16:09:12.785 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-2100.gz (seek=0 len=15795772)
02-17-2015 16:09:14.197 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-2100.gz', removing from stats
02-17-2015 16:09:14.197 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-17-0000.gz

jwalzerpitt
Influencer

More:
02-17-2015 16:09:14.197 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-17-0000.gz (seek=0 len=10056257)
02-17-2015 16:09:15.210 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-17-0000.gz', removing from stats
02-17-2015 16:09:15.210 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-1300.gz
02-17-2015 16:09:15.211 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1300.gz (seek=0 len=27207651)
02-17-2015 16:09:17.428 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-1300.gz', removing from stats
02-17-2015 16:09:17.428 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-1700.gz
02-17-2015 16:09:17.428 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1700.gz (seek=0 len=24707267)

jwalzerpitt
Influencer

More:
02-17-2015 16:09:19.403 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-1700.gz', removing from stats
02-17-2015 16:09:19.403 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-2000.gz
02-17-2015 16:09:19.404 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-2000.gz (seek=0 len=17426471)
02-17-2015 16:09:20.947 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-2000.gz', removing from stats
02-17-2015 16:09:20.947 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-1400.gz
02-17-2015 16:09:20.947 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1400.gz (seek=0 len=25706625)
02-17-2015 16:09:22.990 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-1400.gz', removing from stats
02-17-2015 16:09:22.990 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-1500.gz
02-17-2015 16:09:22.990 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1500.gz (seek=0 len=26212198)
02-17-2015 16:09:25.054 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-1500.gz', removing from stats
02-17-2015 16:09:25.054 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-1900.gz

jwalzerpitt
Influencer

02-17-2015 16:09:25.054 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1900.gz (seek=0 len=17926924)
02-17-2015 16:09:26.611 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-1900.gz', removing from stats
02-17-2015 16:09:26.611 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-1800.gz
02-17-2015 16:09:26.611 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1800.gz (seek=0 len=23107383)
02-17-2015 16:09:26.770 -0500 ERROR ArchiveContext - From archive='/data/syslog/nps/nps.15-02-16-1800.gz': gzip: stdout: Broken pipe
02-17-2015 16:09:28.522 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-1800.gz', removing from stats
02-17-2015 16:09:28.522 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-2300.gz
02-17-2015 16:09:28.523 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-2300.gz (seek=0 len=11834828)
02-17-2015 16:09:29.678 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-2300.gz', removing from stats
02-17-2015 16:09:29.678 -0500 INFO ArchiveProcessor - handling file=/data/syslog/nps/nps.15-02-16-1600.gz
02-17-2015 16:09:29.679 -0500 INFO ArchiveProcessor - reading path=/data/syslog/nps/nps.15-02-16-1600.gz (seek=0 len=24124658)
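
The "gzip: stdout: Broken pipe" ERROR line above does not by itself show that the archive is damaged, but one way to rule that out is to check that every .gz file decompresses cleanly from start to finish. Here is a minimal Python sketch of such a check. The function name and the temporary demo files are assumptions for illustration, not part of Splunk.

```python
import gzip
import os
import tempfile
import zlib

def gzip_is_readable(path: str, chunk_size: int = 1 << 20) -> bool:
    """Decompress the whole file in chunks; return False on any gzip error."""
    try:
        with gzip.open(path, "rb") as fh:
            while fh.read(chunk_size):
                pass
        return True
    except (OSError, EOFError, zlib.error):
        return False

# Self-contained demo with temporary files instead of the real archives.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.gz")
    bad = os.path.join(tmp, "truncated.gz")
    with gzip.open(good, "wb") as fh:
        fh.write(b"radius event\n" * 1000)
    with open(good, "rb") as src, open(bad, "wb") as dst:
        data = src.read()
        dst.write(data[: len(data) // 2])  # simulate an incomplete copy
    good_ok = gzip_is_readable(good)
    bad_ok = gzip_is_readable(bad)

print(good_ok, bad_ok)
```

For the files in this thread, the same function could be called on each path under /data/syslog/nps, for example /data/syslog/nps/nps.15-02-16-1800.gz.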

jwalzerpitt
Influencer

02-17-2015 16:09:31.854 -0500 INFO ArchiveProcessor - Finished processing file '/data/syslog/nps/nps.15-02-16-1600.gz', removing from stats
