How to find the number of files that failed to ingest for a specific index

athorat
Communicator

How do I find the number of files that failed to ingest for a specific index?
I am trying to compare files ingested vs. files that failed to ingest for a specific index in Splunk.


woodcock
Esteemed Legend

If you have a list of files and how many events are in them, then you can do something like this and cross-reference:

| tstats count AS EventsInThisFile WHERE index=YourIndexNameHere BY source

richgalloway
SplunkTrust

Are you looking for files that were once successfully ingested and are no longer being read or files that were never ingested at all?

The former case is a matter of searching some long period (like 30 days) to build a list of expected files then searching a short period (like today) to build a list of current files and comparing the two.
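As a rough sketch of that comparison (not a definitive search), both lists can be folded into one query: ask tstats for the last time each source was seen over the long window, then keep only the sources with no events since midnight today. YourIndexNameHere is a placeholder, and the 30-day window and the "today" cutoff are assumptions to adjust.

```spl
| tstats max(_time) AS lastSeen WHERE index=YourIndexNameHere earliest=-30d latest=now BY source
| where lastSeen < relative_time(now(), "@d")
| sort lastSeen
| eval lastSeen=strftime(lastSeen, "%Y-%m-%d %H:%M:%S")
```

Each row is a file that was ingested at some point in the last 30 days but has sent nothing today. Files that are legitimately written once and never updated will also show up here, so the result still needs a sanity check.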

The latter case is more challenging. A source that was never read will not be in Splunk, but you may find an error message in _internal for files that could not be read, perhaps because of permissions. It's possible, of course, for a file to be silently skipped if it's not part of the monitor pattern, for instance.
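A hedged starting point for that _internal search might look like the following. The component names (TailReader, TailingProcessor, WatchedFile) are my assumption about which splunkd components usually log file-monitoring problems; check them against what your forwarders actually write before relying on the counts.

```spl
index=_internal sourcetype=splunkd (log_level=ERROR OR log_level=WARN)
    (component=TailReader OR component=TailingProcessor OR component=WatchedFile)
| stats count latest(_raw) AS example BY host, component, log_level
| sort - count
```

The example column shows one raw message per group, which is usually enough to tell a permissions problem from something else. Files skipped because they fall outside the monitor pattern will not appear here at all.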

Please clarify your requirements and we'll try to help.

---
If this reply helps you, Karma would be appreciated.

athorat
Communicator

Thanks for replying, @richgalloway, and apologies for the delayed reply. Yes, we are trying to find the files that never reached Splunk, either:
1. because of a permissions issue, or
2. because the files do not match the whitelist pattern, or for unknown reasons.

We have the count of files per index that are being read and indexed by the Splunk UF:
| tstats dc(source) WHERE (host=10a*pd-* OR host=ew1a-* OR host=dub01pd-* OR host=uw2*-* OR host=ue1-*) index=prod-online* by index

Now we want to list the number of files that errored out or were not read by the Splunk UF. The same hosts are used as the filter, but how do we get that count by index name and build a bar chart comparing it with the query above?
Our failed attempt is below:
index=_internal sourcetype=splunkd splunk_server=usw* host=10a*pd-* OR host=ew1a-* OR host=dub01pd-* OR host=uw2*-* OR host=ue1-* log_level=ERROR
| rex field=message "((?.*))" | stats dc(message) by host | sort - message


richgalloway
SplunkTrust

Sources that fail are not indexed so you can't get stats by index. I suggest generating stats by source or host.
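A minimal sketch of a per-host comparison that could feed a bar chart, assuming the index pattern from your tstats search and assuming TailReader is the component logging the read errors (that name is a guess on my part, so verify it first):

```spl
| tstats dc(source) AS files_ingested WHERE index=prod-online* BY host
| append
    [ search index=_internal sourcetype=splunkd log_level=ERROR component=TailReader
      | stats count AS read_errors BY host ]
| stats sum(files_ingested) AS files_ingested sum(read_errors) AS read_errors BY host
| fillnull value=0 files_ingested read_errors
```

Note that read_errors counts error messages, not distinct files, so one bad file can produce many errors. The join on host also only lines up if the host value on the indexed data matches the forwarder's host in _internal.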

---
If this reply helps you, Karma would be appreciated.