Getting Data In

Problem with monitoring files

benjiminhugh
Explorer

I chose "Continuously index data from a file or directory this Splunk instance can access" to monitor a file.
For example, the file contains three records: a, b, c. When I add a fourth record, d, the search result in Splunk is correct: a,b,c,d.
But when I delete one, going from a,b,c to a,b, the search result is a,b,a,b,c.
I want it to be a,b.
How can I fix this?


Ayn
Legend

I think you're misunderstanding how Splunk stores events. Splunk reads events from the sources it monitors and adds them to its own index. So if you remove data from a file that Splunk is monitoring, you will not remove that data from Splunk's index. Instead, what is likely happening in your case is that Splunk detects that the file has changed. Because you've made the file smaller, it can't seek to the last position it previously read in the file, so it reindexes the whole file instead.

So, first of all, Splunk will add events a, b, c to its index. After your modification, the source file will only contain a, b. Splunk looks at the file, sees that it has changed, and because it can't find a valid position to seek to (for the reason described above), it will reindex the whole file, which now means events a, b. This will result in the events a,b,c,a,b in Splunk's index.
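
To make the sequence above concrete, here is a toy model in Python. It is only a sketch: `ToyMonitor` and `poll` are hypothetical names made up for illustration, and Splunk's real file tracking is more involved than a single saved position. It just shows why an append-only index combined with a lost read position ends up holding a,b,c,a,b.

```python
class ToyMonitor:
    """Hypothetical, simplified model of a file monitor feeding an append-only index."""

    def __init__(self):
        self.index = []      # events already indexed; nothing is ever removed
        self.position = 0    # number of records read from the file so far

    def poll(self, records):
        """records: current contents of the monitored file, one event per item."""
        if len(records) < self.position:
            # The file shrank, so the saved position no longer exists.
            # Assumption: the monitor falls back to reading from the start.
            self.position = 0
        # Index everything found after the saved position.
        self.index.extend(records[self.position:])
        self.position = len(records)


monitor = ToyMonitor()
monitor.poll(["a", "b", "c"])   # initial read of the file
monitor.poll(["a", "b"])        # a record was deleted from the file
print(monitor.index)            # ['a', 'b', 'c', 'a', 'b']
```

Note that the earlier a, b, c stay in the index; deleting a line from the source file never touches what has already been indexed.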


Ayn
Legend

No, you can't "avoid" it - it's an integral part of how Splunk works.

As for the second question, if you add an event to the file, Splunk will just carry on from its last known position (c) and read until the new end of the file (after d), so it will not have to reindex the whole file.
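
The append case can be sketched with plain file seeking in Python. This is an illustration of the general "remember an offset, read from there" idea, not Splunk's code; `read_new_lines` is a hypothetical helper and the temporary file stands in for the monitored file.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return (new_lines, new_offset) for whatever follows byte `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)                 # jump to the last known position
        data = f.read()                # read only what comes after it
        return data.decode().splitlines(), f.tell()


fd, path = tempfile.mkstemp()
os.close(fd)

with open(path, "w") as f:
    f.write("a\nb\nc\n")
first, offset = read_new_lines(path, 0)        # initial read: a, b, c

with open(path, "a") as f:
    f.write("d\n")                             # append one record
second, offset = read_new_lines(path, offset)  # picks up only d

os.remove(path)
print(first, second)                           # ['a', 'b', 'c'] ['d']
```

Because the saved offset is still valid after an append, only d is read on the second pass, which is why the search shows a,b,c,d rather than a,b,c,a,b,c,d.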


benjiminhugh
Explorer

And why is the result a,b,c,d when I add d to the file? Why isn't it a,b,c,a,b,c,d?


benjiminhugh
Explorer

Is there any way to avoid this?
