Problem with csv file import: how to prevent event doubling?

BastianSchlaak · 01-14-2013

Hello,

I am importing a csv file (database dump) with the following format:

Header:

FirstName; LastName; EntryDate; ExitDate; InternalName; Remarks; Description; Phone; PhoneMobile; City; Building; Floor; Room; CentralAccount; DefaultEmailAddress; IsInActive; IsTerminalServerAllowed; IsExternal; PersonalTitle; PersonnelNumber; XDateInserted; XDateUpdated

Example Event:

XXXX;XXXX;XX.XX.XXXX XX:XX:XX;XX.XX.XXXX XX:XX:XX;XXXX, XXXX;;TER, 2012-02-27;;;;;;;XXXXX;XXXX@XXXX.de;True;True;False;;00430160;XX.XX.XXXX;XX.XX.XXXX

Splunkd does the import and also indexes the data, but it only appends the data; it does not compare it with the old dump data or delete that data. So after n import/index cycles I have every event n times in Splunk.

I configured the import through the GUI but found no way to have my data updated instead of added again. What do I have to do?

Lucas_K · 01-14-2013

You need to craft your database query so that you're only exporting newer events.

You could do an initial dump to get historical data populated but after that use a more refined query.
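If the query itself cannot be changed, a similar effect can be approximated by filtering the dump before Splunk picks it up. The following is only a minimal sketch of that idea, not something from this thread: the function name `new_rows` is hypothetical, and it assumes the previous dump is still available and that both dumps use the semicolon-delimited layout shown in the question. It compares whole rows, so it does not depend on the (masked) date format.

```python
import csv
import io


def new_rows(previous_dump: str, current_dump: str, delimiter: str = ";"):
    """Return (header, rows) where rows are the lines of current_dump
    that do not appear verbatim in previous_dump.

    Hypothetical helper: both arguments are the full text of a dump file.
    """
    def read(text):
        reader = csv.reader(io.StringIO(text), delimiter=delimiter)
        header = next(reader, None)                       # first line is the header
        rows = [tuple(row) for row in reader if row]      # skip blank lines
        return header, rows

    _, old_rows = read(previous_dump)
    header, current_rows = read(current_dump)
    seen = set(old_rows)
    # Keep only rows that were not part of the previous export.
    return header, [row for row in current_rows if row not in seen]
```

Only the returned rows would then be written to the file that Splunk monitors. Note that a row whose content changed counts as new here, so the older version of that record would still remain in the index.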

Looking at your example data, it doesn't seem to be a time series, so it is probably better to follow Ayn's suggestion and just use it as a lookup.

BastianSchlaak · 01-14-2013

The dump file has about 120k events in it. I do not want to edit the data with Splunk. But I think it must be possible for Splunk either to compare already imported events with new events and import only the new ones, or to delete old events before importing new ones.

Ayn · 01-14-2013

You're free to think whatever you want, but there are no mechanisms within Splunk to compare new data with existing data during indexing.

Ayn · 01-14-2013

You can't modify existing data in the index. Splunk isn't a general-purpose database where you can do something like that.

If you're just working with CSV data and it's not large volumes of data, you could use the CSVs as lookups and work with them directly that way using inputlookup, without going via a Splunk index.
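One practical detail for the lookup route: as far as I know, Splunk CSV lookups expect a comma-separated file with a header row, while the dump in the question is semicolon-separated and has spaces after the separators in the header. A minimal sketch of a conversion step, assuming that layout (the function name `to_lookup_csv` is hypothetical):

```python
import csv
import io


def to_lookup_csv(dump_text: str) -> str:
    """Convert a semicolon-delimited dump into comma-delimited CSV text.

    Header names are stripped of surrounding spaces; values that contain
    a comma are quoted by the csv module so they stay in one field.
    """
    reader = csv.reader(io.StringIO(dump_text), delimiter=";")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header_written = False
    for row in reader:
        if not row:                 # skip blank lines
            continue
        if not header_written:
            row = [name.strip() for name in row]
            header_written = True
        writer.writerow(row)
    return out.getvalue()
```

The result could be saved as a lookup file, say `persons.csv` (a made-up name), and read with `| inputlookup persons.csv`. Each new dump then simply replaces the file, so nothing accumulates.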

