Getting Data In

Why is a particular index-volume (per day) increasing?

dban2005
New Member

I recently added a file share for indexing, using a Universal Forwarder on a Windows server that sends to the receiver/deployment server (Linux). Yesterday the total volume of raw data on the file share was 20 GB; today it is 21 GB. The daily indexed volume for that file share was 7 GB yesterday and 7.123 GB today, which is far more than the 1 GB growth in the raw data. In inputs.conf, I have set the global parameter ignoreOlderThan = 7d. Is Splunk re-indexing the last 7 days every 24 hours, or is something else going on? How can I determine the cause and avoid it? Note: there are no zip files, and .xml files have been excluded from indexing.


lguinn2
Legend

How much of the data was Splunk able to index on the first day? Of the 20 GB, how much data should Splunk have indexed, and how much did it actually index? I wonder if Splunk is "catching up."

ignoreOlderThan=7d will not cause the data to be indexed twice.
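To illustrate why: ignoreOlderThan acts as a filter on a file's modification time, not as a window of data to re-read. The sketch below is a simplified model of that idea, not Splunk's actual implementation; the function name `is_skipped` and the example timestamps are made up for illustration.

```python
SECONDS_PER_DAY = 86400


def is_skipped(file_mtime, now, ignore_older_than_days=7):
    """Return True if the file's last-modified time is older than the cutoff.

    Simplified model (an assumption, not Splunk code): a file is skipped
    when it was last modified more than `ignore_older_than_days` ago.
    """
    cutoff = now - ignore_older_than_days * SECONDS_PER_DAY
    return file_mtime < cutoff


now = 1_700_000_000  # arbitrary example time, in epoch seconds

# A file last modified 10 days ago falls outside the 7-day window.
print(is_skipped(now - 10 * SECONDS_PER_DAY, now))

# A file last modified yesterday is still eligible for monitoring.
print(is_skipped(now - 1 * SECONDS_PER_DAY, now))
```

In this model the setting only decides which files are looked at; it says nothing about reading the same content again.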

I would turn on the Monitoring Console and look at the tabs on indexing for information, as a starting point.


dban2005
New Member

Day 1: 7 GB; day 2: 7.531 GB; day 3: 1.45 GB (as we disabled the index, changed the value to 2d, and redeployed); day 4: 6.421 GB.
How can I tell whether the index is duplicating data?
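One possible way to check, sketched here under assumptions: export a sample of events from the index as (source, raw text) pairs and count how many times each identical event appears for the same source. The function names and the input shape are hypothetical; how you export the events is up to you.

```python
import hashlib
from collections import Counter


def count_events(events):
    """Count occurrences of each (source, hash of raw text) pair."""
    counts = Counter()
    for source, raw in events:
        digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        counts[(source, digest)] += 1
    return counts


def find_duplicates(events):
    """Return only the (source, digest) pairs seen more than once."""
    return {key: n for key, n in count_events(events).items() if n > 1}


def duplicate_ratio(events):
    """Fraction of events that are extra copies of an earlier event."""
    events = list(events)
    if not events:
        return 0.0
    extra = sum(n - 1 for n in count_events(events).values())
    return extra / len(events)


# Hypothetical sample: one event from a.log appears twice.
sample = [
    ("a.log", "2024-01-01 login ok"),
    ("a.log", "2024-01-01 login ok"),
    ("b.log", "2024-01-01 login ok"),
    ("a.log", "2024-01-02 logout"),
]
print(len(find_duplicates(sample)))
print(duplicate_ratio(sample))
```

A ratio near zero suggests the volume is genuinely new data; a large ratio suggests the same files are being read more than once. Note that some logs legitimately repeat identical lines, so treat the result as a hint rather than proof.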
