Getting Data In

Problem with csv file import: how to prevent event doubling?

BastianSchlaak
New Member

Hello,

I am importing a csv file (database dump) with the following format:

Header:

FirstName; LastName; EntryDate; ExitDate; InternalName; Remarks; Description; Phone; PhoneMobile; City; Building; Floor; Room; CentralAccount; DefaultEmailAddress; IsInActive; IsTerminalServerAllowed; IsExternal; PersonalTitle; PersonnelNumber; XDateInserted; XDateUpdated

Example Event:

XXXX;XXXX;XX.XX.XXXX XX:XX:XX;XX.XX.XXXX XX:XX:XX;XXXX, XXXX;;TER, 2012-02-27;;;;;;;XXXXX;XXXX@XXXX.de;True;True;False;;00430160;XX.XX.XXXX;XX.XX.XXXX

Splunkd imports and indexes the data, but it only appends it; it does not compare the new data with the old dump data or delete the old data. So after four import/index cycles I have every event four times in Splunk (after n cycles, n times).

I configured the import with the GUI but found no way to have my data updated instead of appended. What do I have to do?


Lucas_K
Motivator

You need to craft your database query so that you're only exporting newer events.

You could do an initial dump to get historical data populated but after that use a more refined query.
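If the query itself can't be changed, the same filtering could be done on the dump file before Splunk sees it. Here is a minimal sketch, assuming the XDateUpdated column uses a day.month.year format (the example data is masked, so that is a guess) and that you record the time of the last export yourself. The function name `rows_updated_since` is made up for illustration.

```python
import csv
import io
from datetime import datetime

# Assumption: XDateUpdated looks like 27.02.2012 (the sample event is masked).
DATE_FORMAT = "%d.%m.%Y"


def rows_updated_since(dump_text, last_export):
    """Return the header and only the rows whose XDateUpdated is after last_export."""
    reader = csv.reader(io.StringIO(dump_text), delimiter=";")
    # The header in the dump has a space after each semicolon, so strip the names.
    header = [name.strip() for name in next(reader)]
    updated_col = header.index("XDateUpdated")
    kept = []
    for row in reader:
        if len(row) <= updated_col:
            continue  # skip blank or truncated lines
        updated = datetime.strptime(row[updated_col].strip(), DATE_FORMAT)
        if updated > last_export:
            kept.append(row)
    return header, kept
```

Only the rows returned here would be written to the file Splunk monitors, so each cycle adds new or changed records instead of the whole dump again.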

Looking at your example data, it doesn't seem to be a time series, so it is probably better to follow Ayn's suggestion and just use it as a lookup.


BastianSchlaak
New Member

The dump file has about 120k events in it. I do not want to edit the data with Splunk. But I think it must be possible for Splunk either to compare already imported events with new events and import only the new ones, or to delete the old events before importing new ones.


Ayn
Legend

You're free to think whatever you want, but there are no mechanisms within Splunk to compare new data with existing data during indexing.


Ayn
Legend

You can't modify existing data in the index. Splunk isn't a general-purpose database where you can do something like that.

If you're just working with CSV data and it's not large volumes of data, you could use the CSVs as lookups and work with them directly that way using inputlookup, without going via a Splunk index.
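One practical wrinkle with the lookup route: the dump is semicolon-delimited, while Splunk lookup files are, as far as I know, expected to be comma-separated. A rough sketch of a conversion step you could run on each new dump follows; the function name `dump_to_lookup_csv` is hypothetical, and where you save the result is up to your setup.

```python
import csv
import io


def dump_to_lookup_csv(dump_text):
    """Convert a semicolon-delimited dump into comma-separated CSV text."""
    reader = csv.reader(io.StringIO(dump_text), delimiter=";")
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    header_written = False
    for row in reader:
        if not row:
            continue  # skip blank lines
        if not header_written:
            # Header names in the dump carry a leading space after each semicolon.
            row = [name.strip() for name in row]
            header_written = True
        # csv.writer quotes fields that contain commas, e.g. "XXXX, XXXX".
        writer.writerow(row)
    return out.getvalue()
```

Overwriting the same lookup file on every cycle means the lookup always holds exactly one copy of the latest dump, so nothing is doubled.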
