Knowledge Management

Import data without duplicates

alucarddjin
Path Finder

I have a missing set of data. I've been given a new set of data to fill the gaps but there are some duplicates in the raw file to what is already in splunk and I need a way to import the non duplicate data.

So far I've managed to import the new data into a separate index and used a query to remove the items that are already in the main index then tired the collect command to put the values into another index (I've used a dummy one to start with so I don't mess up my main index). However when the data is copied it messes up some of the date formats (turns them to epoch) and doesn't pick up the _time field correctly.

Current code:

    (index=main sourcetype="sourcetype1") OR (index=sourceindex" ) 
    | eventstats count by deviceCustomDate1 fileName 
    | search count=1
    | collect index=sourceindex sourcetype=sourcetype1

Then if I look at the results for one of the record in both indexs I get this

  _time               deviceCustomDate1           index
  2019-05-16 23:47:21   2019/05/16 22:47:21 UTC     sourceindex
  2019-05-17 00:03:29   1558046841000                 destinationindex

Am I missing something? Is Collect the right tool to use?

Thanks in advance.

0 Karma
Get Updates on the Splunk Community!

Built-in Service Level Objectives Management to Bridge the Gap Between Service & ...

Wednesday, May 29, 2024  |  11AM PST / 2PM ESTRegister now and join us to learn more about how you can ...

Get Your Exclusive Splunk Certified Cybersecurity Defense Engineer at Splunk .conf24 ...

We’re excited to announce a new Splunk certification exam being released at .conf24! If you’re headed to Vegas ...

Share Your Ideas & Meet the Lantern team at .Conf! Plus All of This Month’s New ...

Splunk Lantern is Splunk’s customer success center that provides advice from Splunk experts on valuable data ...