Getting Data In

problem about Monitor Files

benjiminhugh
Explorer

I choose "Continuously index data from a file or directory this Splunk instance can access" to input a file.
Give a example,
there are a, b, c, three records in the file, when I add another d, the searching result in splunk is right, a,b,c,d
But when i delete one, from a,b,c to a,b then the searching result is a,b,a,b,c.
I want it to be a,b
How to fix it?

Tags (1)
0 Karma
1 Solution

Ayn
Legend

I think you're misunderstanding how Splunk stores events. Splunk reads events from the sources it monitors and adds them into its own index. So if you remove data from a file that Splunk is monitoring, you will not remove that data from Splunk's index. Instead, what is likely happening in your case is that Splunk detects that the file has changed, then because it can't search to the previous last position it read in the file (because you've made the file smaller) it will reindex the whole file instead.

So, first of all Splunk will add event a,b,c to its index. Then after your modification the source file will only have a,b. Splunk looks at the file, sees that it's changed, and because it can't find a valid position to search from (for the reason described above) it will reindex the whole file, which means event a,b. This will result in the events a,b,c,a,b in Splunk's index.

View solution in original post

Ayn
Legend

I think you're misunderstanding how Splunk stores events. Splunk reads events from the sources it monitors and adds them into its own index. So if you remove data from a file that Splunk is monitoring, you will not remove that data from Splunk's index. Instead, what is likely happening in your case is that Splunk detects that the file has changed, then because it can't search to the previous last position it read in the file (because you've made the file smaller) it will reindex the whole file instead.

So, first of all Splunk will add event a,b,c to its index. Then after your modification the source file will only have a,b. Splunk looks at the file, sees that it's changed, and because it can't find a valid position to search from (for the reason described above) it will reindex the whole file, which means event a,b. This will result in the events a,b,c,a,b in Splunk's index.

Ayn
Legend

No, you can't "avoid" it - it's an integral part of how Splunk works.

As for the second question, if you add an event to the file, Splunk will just carry on from its last known position (c) and read until the new end of the file (after d), so it will not have to reindex the whole file.

0 Karma

benjiminhugh
Explorer

And why when i add d to the file, it is a,b,c,d. why not be a,b,c,a,b,c,d

0 Karma

benjiminhugh
Explorer

Is there any way to avoid this?

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...