Getting Data In

Monitor - Files & Directories - file is not getting updated on recent changes

alfiyashaikh
New Member

I added the file ABC.csv from my local directory and ingested it into Splunk using the "Monitor" option under Add Data.

source="C:\Alfiya\TASKS\MACRO_in_splunk\cases_and_feedback lookup\PICKUP_FOLDER\ABC.csv" host="P2B-H7TN882" index="test" sourcetype="csv"

When I run a search against this file, I get 2000 events as the result.

Then I deleted some data from the ABC.csv file on my local machine, but when I rerun the search on my Splunk instance, I still get 2000 events.
Ideally I should now get fewer events, since I have reduced the data.

I don't know where I am going wrong, or maybe I am not using the "Monitor" data input option properly.

Please guide me through this.


alfiyashaikh
New Member

Hi nickhillscpl,

In my case, some fields of earlier events may have changed values. Will Splunk pick up those changes too?


Elsurion
Communicator

Hi alfiyashaikh

If you re-read the whole file, yes; only then do the new values end up in your index.

But, as julio19 mentioned, it is better to delete the old data first by appending | delete to your search:

source="C:\Alfiya\TASKS\MACRO_in_splunk\cases_and_feedback lookup\PICKUP_FOLDER\ABC.csv" host="P2B-H7TN882" index="test" | delete

This also reduces duplicate data, as well as any confusion about differently named keys for the same values.

But you have to give yourself the can_delete role (which grants the delete_by_keyword capability), even if you are an admin.
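One way to assign that role is via the Splunk CLI (a hedged example; the user name and credentials here are placeholders, and note that -role replaces the user's existing role list, so include the roles they already have - you can also do this under Settings > Access controls in Splunk Web):

```
splunk edit user admin -role admin -role can_delete -auth admin:changeme
```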


nickhills
Ultra Champion

It seems to me you might be better off getting the data another way.

option 1:
Where does the data come from to start with? It sounds like the CSV is an export or a report of some kind. If this data lives in a database, maybe you could use DBX (Splunk DB Connect) to take the data straight from the source?

option 2:
Use an external tool to pre-process your CSV file and send the results to Splunk.
An example might be a Python (or even bash) script that monitors the file and reports any lines that have changed, which you can set up as a scripted input.

Option 1 is probably more robust; option 2 is probably easier/faster.
If my comment helps, please give it a thumbs up!

nickhills
Ultra Champion

Super quick and inelegant way, more to illustrate the concept than a working example!

#!/bin/bash

# Set the next line to the name of your input file
inputFile="ABC.csv"

# History file to compare against on the next run
historyFile="history"

# Compare the two files (old first, new second) and look for any lines
# that have changed. -N treats a missing history file as empty, so on
# the first run everything is reported.
d=$(diff -N "$historyFile" "$inputFile")

# Copy the new file over the history file for the next run
cp "$inputFile" "$historyFile"

# Write any changes to stdout (quoted, to preserve line breaks) so
# Splunk can read them
echo "$d"
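To have Splunk run a script like this on a schedule, it can be registered as a scripted input. A sketch of an inputs.conf stanza (the path, interval, index, and sourcetype are illustrative assumptions, adjust to your setup):

```
[script://$SPLUNK_HOME/etc/apps/search/bin/diff_monitor.sh]
interval = 60
index = test
sourcetype = csv_changes
disabled = false
```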
If my comment helps, please give it a thumbs up!

julio19
Explorer

If you want to index new data, try replacing the file with one containing the new data rather than deleting lines from it.

If you want to delete data from Splunk, use:

<your search for the data to delete> | delete

nickhills
Ultra Champion

That's not really how it works.

Monitoring a file means watching it for new data. Think of a log file: new lines are appended to the end, and that is what Splunk is monitoring for.

If you remove lines from the file (or even delete the file entirely), that data is not removed from Splunk.
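To illustrate why deletions go unnoticed, here is a toy sketch of the monitoring idea (my own simplified model, not Splunk's actual implementation, which tracks read offsets and CRCs in its "fishbucket"): it remembers how many lines it has already seen and only emits lines beyond that point.

```shell
#!/bin/sh
# Toy "monitor": emit only lines we have not seen before, tracked by a
# line-count offset stored in a state file.

file="events.log"
state=".offset"
rm -f "$file" "$state"

emit_new() {
  # How many lines were seen last time (0 if no state file yet)
  seen=$(cat "$state" 2>/dev/null || echo 0)
  total=$(wc -l < "$file")
  if [ "$total" -gt "$seen" ]; then
    # Emit only the lines past the old offset, then record the new offset
    tail -n +"$((seen + 1))" "$file"
    echo "$total" > "$state"
  fi
}

printf 'a\nb\n' > "$file"
first=$(emit_new)     # both lines are new, so both get "indexed"

printf 'c\n' >> "$file"
second=$(emit_new)    # only the appended line is picked up

# Now delete the first line of the file...
tail -n +2 "$file" > "$file.tmp" && mv "$file.tmp" "$file"
third=$(emit_new)     # ...nothing new is emitted, and nothing that was
                      # already "indexed" goes away
```

Like Splunk's monitor input, this only ever moves forward through the file; removing lines upstream never retracts what was already read.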

If my comment helps, please give it a thumbs up!
