Getting Data In

Why is a particular index-volume (per day) increasing?

dban2005
New Member

I recently added a file share for indexing, using a Universal Forwarder on a Windows server that sends to the receiver/deployment server (Linux). Yesterday the total volume of raw data on the file share was 20 GB, and today it is 21 GB. The indexed volume for that file share was 7 GB yesterday (per-day volume) and 7.123 GB today (again per-day volume), which is far more than the 1 GB growth in raw data. In inputs.conf I set the global parameter ignoreOlderThan = 7d. Is Splunk re-indexing the last 7 days of data every 24 hours, or is something else going on? How can I find the cause and prevent it? Note: there are no zip files, and .xml files are excluded from indexing.

0 Karma

lguinn2
Legend

How much of the data was Splunk able to index on the first day? Of the 20 GB, how much should Splunk have indexed, and how much did it actually index? I wonder if Splunk is "catching up."

ignoreOlderThan=7d will not cause the data to be indexed twice.
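
To make that concrete, here is a rough Python model of what ignoreOlderThan does as I understand it: it filters files by modification time, so older files are simply skipped, and it does not define a window that gets re-read every day. The function name and the sample file list are made up for illustration; this is a simplified sketch, not Splunk's actual code.

```python
from datetime import datetime, timedelta

def is_monitored(mtime, now, ignore_older_than=timedelta(days=7)):
    """Return True if a file with this modification time is still watched.

    Hypothetical helper: models ignoreOlderThan as a plain age cutoff.
    """
    return now - mtime <= ignore_older_than

# Made-up "current time" and file modification times, for illustration only.
now = datetime(2024, 1, 10, 12, 0)
files = {
    "report_old.log": datetime(2024, 1, 1, 9, 0),  # about 9 days old: skipped
    "report_new.log": datetime(2024, 1, 8, 9, 0),  # about 2 days old: watched
}

watched = [name for name, mtime in files.items() if is_monitored(mtime, now)]
print(watched)  # ['report_new.log']
```

Nothing in this cutoff causes an already-indexed file to be read again; it only decides which files are looked at in the first place.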

I would turn on the Monitoring Console and look at the tabs on indexing for information, as a starting point.

0 Karma

dban2005
New Member

Day 1: 7 GB; day 2: 7.531 GB; day 3: 1.45 GB (we disabled the index, changed ignoreOlderThan to 2d, and redeployed); day 4: 6.421 GB.
How can I tell whether the index is duplicating data?
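
One rough way to check for duplication is to group identical events and count how often each one appears. Inside Splunk that would be a search that counts events by raw text and source and keeps the counts above 1; the Python sketch below shows the same idea on events exported from a search. The (source, raw) tuple layout and the sample events are hypothetical.

```python
from collections import Counter

def find_duplicates(events):
    """Count identical (source, raw) pairs and return those seen more than once."""
    counts = Counter(events)
    return {event: n for event, n in counts.items() if n > 1}

# Hypothetical exported events: (source file, raw event text).
events = [
    ("share/a.log", "2024-01-08 09:00:01 user=alice action=open"),
    ("share/a.log", "2024-01-08 09:00:02 user=bob action=save"),
    ("share/a.log", "2024-01-08 09:00:01 user=alice action=open"),  # same event again
]

dupes = find_duplicates(events)
for (source, raw), n in dupes.items():
    print(f"{n}x {source}: {raw}")
```

If most events from a file come back with a count of 2 or more, the file is probably being read again from the start. Some logs legitimately repeat identical lines, so treat the result as a hint rather than proof.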

0 Karma