Getting Data In

Problem with Monitor Files

benjiminhugh
Explorer

I chose "Continuously index data from a file or directory this Splunk instance can access" to add a file input.
For example,
the file contains three records: a, b, c. When I add another record, d, the search result in Splunk is correct: a,b,c,d.
But when I delete one record, so the file goes from a,b,c to a,b, the search result is a,b,a,b,c.
I want it to be a,b.
How can I fix this?

1 Solution

Ayn
Legend

I think you're misunderstanding how Splunk stores events. Splunk reads events from the sources it monitors and adds them to its own index. So if you remove data from a file that Splunk is monitoring, that data is not removed from Splunk's index. Instead, what is likely happening in your case is that Splunk detects that the file has changed, and because it can't seek to the last position it read in the file (you've made the file smaller), it reindexes the whole file.

So, first of all, Splunk adds events a,b,c to its index. After your modification, the source file contains only a,b. Splunk looks at the file, sees that it has changed, and because it can't find a valid position to resume from (for the reason described above), it reindexes the whole file, which now means events a,b. This results in the events a,b,c,a,b in Splunk's index.
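
To make the sequence above concrete, here is a toy model in Python. It is only a sketch of the behaviour described, not Splunk's actual implementation: `ToyMonitor`, `poll()` and the list standing in for the index are all made-up names, and the model assumes that a saved position beyond the end of the file is what triggers a full re-read.

```python
# Toy model of a tailing file monitor. NOT Splunk's real code:
# ToyMonitor, poll() and the list-based "index" are hypothetical.

class ToyMonitor:
    def __init__(self):
        self.index = []   # append-only store, stands in for the index
        self.offset = 0   # character position where the last read stopped

    def poll(self, content: str) -> None:
        """Look at the file's current content and index whatever is new."""
        if len(content) < self.offset:
            # The file is now shorter than the saved position, so there
            # is nothing valid to resume from: start over from the top.
            self.offset = 0
        new_data = content[self.offset:]
        # Events are only ever added to the index, never removed.
        self.index.extend(new_data.splitlines())
        self.offset = len(content)


monitor = ToyMonitor()
monitor.poll("a\nb\nc\n")   # first read: a, b, c are indexed
monitor.poll("a\nb\n")      # record c was deleted from the file
print(monitor.index)        # ['a', 'b', 'c', 'a', 'b']
```

Nothing in `poll()` ever deletes from `index`, which is why shrinking the file can only add events to the result.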

Ayn
Legend

No, you can't "avoid" it - it's an integral part of how Splunk works.

As for the second question, if you add an event to the file, Splunk will just carry on from its last known position (c) and read until the new end of the file (after d), so it will not have to reindex the whole file.
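
The append case can be sketched the same way with a real file and `seek()`. Again, this is only an illustration under assumptions: `read_from()` is a hypothetical helper, and the temporary file stands in for the monitored file.

```python
import os
import tempfile


def read_from(path, offset):
    """Return (new_lines, new_offset) for everything after `offset`."""
    with open(path, "rb") as f:
        f.seek(offset)            # jump to where the last read stopped
        data = f.read()           # read up to the current end of file
        return data.decode().splitlines(), f.tell()


fd, path = tempfile.mkstemp()
os.close(fd)
try:
    with open(path, "w") as f:
        f.write("a\nb\nc\n")
    first, offset = read_from(path, 0)         # reads a, b, c

    with open(path, "a") as f:
        f.write("d\n")                         # append a new record
    second, offset = read_from(path, offset)   # reads only d

    indexed = first + second
    print(indexed)                             # ['a', 'b', 'c', 'd']
finally:
    os.remove(path)
```

Because the saved offset is still a valid position after an append, the second read starts right after c and picks up only d.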

benjiminhugh
Explorer

And why, when I add d to the file, is the result a,b,c,d? Why isn't it a,b,c,a,b,c,d?

benjiminhugh
Explorer

Is there any way to avoid this?
