Hello,
I am importing a csv file (database dump) with the following format:
Header:
FirstName; LastName; EntryDate; ExitDate; InternalName; Remarks; Description; Phone; PhoneMobile; City; Building; Floor; Room; CentralAccount; DefaultEmailAddress; IsInActive; IsTerminalServerAllowed; IsExternal; PersonalTitle; PersonnelNumber; XDateInserted; XDateUpdated
Example Event:
XXXX;XXXX;XX.XX.XXXX XX:XX:XX;XX.XX.XXXX XX:XX:XX;XXXX, XXXX;;TER, 2012-02-27;;;;;;;XXXXX;XXXX@XXXX.de;True;True;False;;00430160;XX.XX.XXXX;XX.XX.XXXX
Splunkd does the import and also indexes the data, but it only appends the data; it does not compare it with, or delete, the old dump data. So after n import/index cycles I have every event n times in Splunk.
I configured the import with the GUI but found no way to have my data updated instead of appended. What do I have to do?
You need to craft your database query so that you're only exporting newer events.
You could do an initial dump to get historical data populated but after that use a more refined query.
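If the export query itself can't be changed, the same "only newer events" idea can be approximated by filtering the dump before Splunk picks it up. The following is only a rough sketch in Python, not anything Splunk provides: the function name is made up, and it assumes XDateUpdated is a day.month.year date (the example event only shows XX.XX.XXXX) and that you keep track of the last export date yourself.

```python
import csv
import io
from datetime import datetime

# Assumption: XDateUpdated looks like 27.02.2012 (day.month.year).
DATE_FORMAT = "%d.%m.%Y"


def rows_updated_since(dump_text, last_export):
    """Return the header plus only the rows updated on or after last_export."""
    # The header in the dump has a space after each semicolon,
    # so skipinitialspace strips it from the column names.
    reader = csv.reader(io.StringIO(dump_text), delimiter=";", skipinitialspace=True)
    header = next(reader)
    updated_idx = header.index("XDateUpdated")

    kept = [header]
    for row in reader:
        updated = datetime.strptime(row[updated_idx].strip(), DATE_FORMAT)
        if updated >= last_export:
            kept.append(row)
    return kept
```

Because the date has no time part, rows updated on the same day as the last export get exported again; switch to `>` if you'd rather risk missing same-day updates than see a few duplicates.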
Looking at your example data it doesn't seem to be a time series so it is probably better if you used Ayn's suggestion of just using it as a lookup.
The dump file has about 120k events in it. I do not want to edit the data with Splunk. But I think it must be possible for Splunk either to compare already-imported events with new events and import only the new ones, or to delete old events before importing new ones.
You're free to think whatever you want, but there are no mechanisms within Splunk to compare new data with existing data during indexing.
You can't modify existing data in the index. Splunk isn't a general-purpose database where you can do something like that.
If you're just working with CSV data and it's not large volumes of data, you could use the CSVs as lookups and work with them directly that way using inputlookup, without going via a Splunk index.
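One practical detail with the lookup approach: the dump is semicolon-separated, while lookup files are, as far as I know, expected to be comma-separated CSV. A minimal Python sketch of the conversion could look like this (the function name is hypothetical, and where you write the result depends on your own setup):

```python
import csv
import io


def dump_to_lookup_csv(dump_text):
    """Convert the semicolon-separated dump into comma-separated CSV text."""
    # skipinitialspace drops the space after each semicolon in the header row.
    reader = csv.reader(io.StringIO(dump_text), delimiter=";", skipinitialspace=True)
    out = io.StringIO()
    # The csv writer quotes fields that contain commas (e.g. "Doe, Jane"),
    # so values like InternalName survive the delimiter change.
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        writer.writerow(row)
    return out.getvalue()
```

If each new dump is converted and written over the previous lookup file, the lookup always reflects the latest dump only, which gives the "replace old data" behaviour asked for above without touching an index.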