I have a Python script that runs daily and saves its output to a CSV file.
For example, if I run the script today, it gets the data from April 1st to today's date (04/21/2021); if I run it tomorrow, it gets the data from April 1st to tomorrow's date (04/22/2021). Each run writes a file with a different name.
I want to onboard this data into Splunk without duplicate data.
How can we do that?
We have a field called start_time, which we use as the time field.
For example: start_time = 04/21/2021 10.30
Another example: start_time = 04/22/2021 10.30
Thanks in advance
Hi,
Splunk has built-in handling to avoid re-indexing data it has already read. Have you configured the monitor inputs yet? If so, please share your inputs.conf and some sample data files.
Hi @vikram1583
What does the data look like in both files? Does it change every time the script runs?
Alternatively, index both files and remove duplicates at search time using Splunk commands such as dedup or dc, depending on your use case.
Hi @venkatasri, thanks for your response. It's not just about two files: I will run the script every day, and if I ingest those full files every day, license usage will increase, so I only want to ingest the new data.
The data for previous dates stays the same; each run just adds new data for the current date.
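One possible approach, since each run only adds rows for the newest date, is to have the Python script itself write out only the rows it has not exported before, so Splunk never sees the older rows again. Below is a minimal sketch of that idea, not a tested solution for this setup. It assumes the rows are dictionaries with a start_time column in the format shown above (04/21/2021 10.30); the checkpoint file and all function names are hypothetical. It also assumes no late rows arrive with a start_time at or before the saved checkpoint, since those would be skipped.

```python
import csv
import os
from datetime import datetime

# Assumed format of start_time, based on the example "04/21/2021 10.30".
TIME_FORMAT = "%m/%d/%Y %H.%M"


def parse_time(value):
    """Parse a start_time string into a datetime."""
    return datetime.strptime(value, TIME_FORMAT)


def load_checkpoint(path):
    """Return the last exported start_time, or None on the first run."""
    if not os.path.exists(path):
        return None
    with open(path) as f:
        text = f.read().strip()
    return parse_time(text) if text else None


def save_checkpoint(path, when):
    """Store the newest exported start_time for the next run."""
    with open(path, "w") as f:
        f.write(when.strftime(TIME_FORMAT))


def filter_new_rows(rows, last_seen):
    """Keep only rows whose start_time is strictly later than last_seen."""
    if last_seen is None:
        return list(rows)
    return [r for r in rows if parse_time(r["start_time"]) > last_seen]


def latest_start_time(rows):
    """Return the newest start_time among rows, or None if rows is empty."""
    times = [parse_time(r["start_time"]) for r in rows]
    return max(times) if times else None


def write_rows(rows, out_path, fieldnames):
    """Write the given rows to a CSV file with a header line."""
    with open(out_path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
```

In the daily script, the flow would be: call load_checkpoint, pass the full result set through filter_new_rows, write the filtered rows to the monitored directory with write_rows, and then call save_checkpoint with latest_start_time of the filtered rows if any were written. The file Splunk monitors then contains only the new data, which keeps license usage down.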