Getting Data In

Monitor - Files & Directories - file is not getting updated on recent changes

alfiyashaikh
New Member

I added the file ABC.csv from my local directory and ingested it into Splunk using the "Monitor" option under Add Data.

source="C:\Alfiya\TASKS\MACRO_in_splunk\cases_and_feedback lookup\PICKUP_FOLDER\ABC.csv" host="P2B-H7TN882" index="test" sourcetype="csv"

When I run a search against this file, I get 2000 events as the result.

Then I deleted some data from the ABC.csv file on my local machine, but when I rerun the search on my Splunk instance, I still get 2000 events.
Ideally I should now get fewer events, since I have reduced the data.

I don't know where I am going wrong, or maybe I am not using the "Monitor" data input option properly.

Please guide me through this.


alfiyashaikh
New Member

Hi nickhillscpl,

In my case, some fields of earlier events may have changed values. Will Splunk pick up those changes too?


Elsurion
Communicator

Hi alfiyashaikh

If you re-read the whole file, yes; only then do the new values end up in your index.

But, as julio19 mentioned, it is better to delete the old data first by appending | delete to your search:

source="C:\Alfiya\TASKS\MACRO_in_splunk\cases_and_feedback lookup\PICKUP_FOLDER\ABC.csv" host="P2B-H7TN882" index="test" | delete

This also reduces duplicate data, as well as any confusion about differently named keys for the same values.

But you have to give yourself the can_delete role (which grants the delete_by_keyword capability), even if you are an admin.
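One way to assign that role is via the Splunk CLI (a hedged example; the user name and credentials here are placeholders, and note that -role replaces the user's existing role list, so include the roles they already have - you can also do this under Settings > Access controls in Splunk Web):

```
splunk edit user admin -role admin -role can_delete -auth admin:changeme
```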


nickhills
Ultra Champion

It seems to me you might be better off getting the data another way.

option 1:
Where does the data come from to start with? It sounds like the CSV is an export or a report of some kind. If this data lives in a database, maybe you could use DBX (Splunk DB Connect) to take the data straight from the source?

option 2:
Use an external tool to pre-process your CSV file and send the results to Splunk.
An example might be a Python (or even bash) script that monitors the file and reports any lines that have changed, which you can set up as a scripted input.

Option 1 is probably more robust; option 2 is probably easier/faster.
If my comment helps, please give it a thumbs up!

nickhills
Ultra Champion

Super quick and inelegant way, more to illustrate the concept than a working example!

#!/bin/bash

# Set the next line to the name of your input file
inputFile="ABC.csv"

# History file to compare against on the next run
historyFile="history"

# Compare the two files (old first, new second) and look for any lines
# that have changed. -N treats a missing history file as empty, so on
# the first run everything is reported.
d=$(diff -N "$historyFile" "$inputFile")

# Copy the new file over the history file for the next run
cp "$inputFile" "$historyFile"

# Write any changes to stdout (quoted, to preserve line breaks) so
# Splunk can read them
echo "$d"
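To have Splunk run a script like this on a schedule, it can be registered as a scripted input. A sketch of an inputs.conf stanza (the path, interval, index, and sourcetype are illustrative assumptions, adjust to your setup):

```
[script://$SPLUNK_HOME/etc/apps/search/bin/diff_monitor.sh]
interval = 60
index = test
sourcetype = csv_changes
disabled = false
```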
If my comment helps, please give it a thumbs up!

julio19
Explorer

If you want to index new data, try replacing the file with one containing the new data rather than deleting lines from it.

If you want to delete data from Splunk, use:

<your search for the data to delete> | delete

nickhills
Ultra Champion

That's not really how it works.

Monitoring a file means watching it for new data. Think of a log file: new lines are appended to the end, and that is what Splunk is monitoring for.

If you remove lines from the file (or even delete the file entirely), that data is not removed from Splunk.
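To illustrate why deletions go unnoticed, here is a toy sketch of the monitoring idea (my own simplified model, not Splunk's actual implementation, which tracks read offsets and CRCs in its "fishbucket"): it remembers how many lines it has already seen and only emits lines beyond that point.

```shell
#!/bin/sh
# Toy "monitor": emit only lines we have not seen before, tracked by a
# line-count offset stored in a state file.

file="events.log"
state=".offset"
rm -f "$file" "$state"

emit_new() {
  # How many lines were seen last time (0 if no state file yet)
  seen=$(cat "$state" 2>/dev/null || echo 0)
  total=$(wc -l < "$file")
  if [ "$total" -gt "$seen" ]; then
    # Emit only the lines past the old offset, then record the new offset
    tail -n +"$((seen + 1))" "$file"
    echo "$total" > "$state"
  fi
}

printf 'a\nb\n' > "$file"
first=$(emit_new)     # both lines are new, so both get "indexed"

printf 'c\n' >> "$file"
second=$(emit_new)    # only the appended line is picked up

# Now delete the first line of the file...
tail -n +2 "$file" > "$file.tmp" && mv "$file.tmp" "$file"
third=$(emit_new)     # ...nothing new is emitted, and nothing that was
                      # already "indexed" goes away
```

Like Splunk's monitor input, this only ever moves forward through the file; removing lines upstream never retracts what was already read.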

If my comment helps, please give it a thumbs up!
