Event logs getting duplicated every collection

clintla
Contributor

I have a duplication problem with Splunk. I know why it happens, but not how to fix it.

I have an event log that I have to extract every hour (the entire log every time).

So say I get an error I'm searching for:

10:00am target event.

An hour later my script runs and I get a few more events, but my timechart counts the previous
events again.

10:00am target event

11:05am target event.

Now my chart is incorrect: the count for the previous error should be 1 but is now 2, because that event has been re-indexed (3 events total). These logs go back a year (I can't clear them on the
device every time, which would make it easy).
It should read 1 target event per hour, but every hour the previous events get doubled, making
the chart incredibly inaccurate.

10:00am (2 events)

11:05am (1 event)

Seems like this would be something easy to fix, but things like followTail don't seem to apply.

If I could dedup on a chart, that would be good, but I don't think that works in a chart. Does anyone have another solution?


kristian_kolb
Ultra Champion

I think you should instead try to fix your script so that it does not read the entire file each time it runs (if possible). By getting the input data correct, you don't have to worry about 'fixing' the output of your searches. Also, re-indexing the entire file each time consumes part of your license. That may not be a big issue if you read a small log file once an hour, but someday you might be asked to run the script once a minute...
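One way to do that, if the device will only ever hand over the full log, is to have the script remember what it has already passed on. The following is only a minimal sketch, not a description of the actual collection script: the function name `new_events` and the state file are hypothetical, and it assumes each event is a single line carrying its own timestamp, so two identical lines can be treated as the same event.

```python
import hashlib
import json
from pathlib import Path


def new_events(lines, state_path):
    """Return only the lines not seen on a previous run and update the state file.

    Assumption: each event is one line and includes its own timestamp, so two
    identical lines are the same event (the same assumption `dedup _raw` makes).
    """
    state_file = Path(state_path)
    # Load the digests recorded by earlier runs; start empty on the first run.
    seen = set(json.loads(state_file.read_text())) if state_file.exists() else set()

    fresh = []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        digest = hashlib.sha1(line.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            fresh.append(line)

    # Persist the digests so the next hourly run skips these events.
    state_file.write_text(json.dumps(sorted(seen)))
    return fresh
```

The hourly job would pass the full extracted log through `new_events` and write only the returned lines to the file that Splunk monitors, so each event is indexed once and counted against the license once.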

/kristian

Takajian
Builder

Could you try the following dedup command? I think it removes duplicate events. Please let me know if this helps in your environment.

... | dedup _raw

Takajian
Builder

How does it work in your case? The dedup needs to come before the timechart command.

sourcetype="getlog" | dedup _raw | timechart count by host useother="f"

But the dedup command is a kind of workaround. The ideal is to fix your script.
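To see why the order matters, here is a small Python model of the two pipelines. It is not Splunk itself, only a sketch: `dedup` and `count_by_hour` are hypothetical stand-ins, and the model assumes dedup's default behaviour of dropping rows that lack the field being deduplicated (keepempty=false).

```python
from collections import Counter


def dedup(rows, field):
    """Keep the first row per value of `field`; rows lacking the field are dropped.

    Assumption: this mirrors dedup's default behaviour (keepempty=false).
    """
    seen, kept = set(), []
    for row in rows:
        if field not in row or row[field] in seen:
            continue
        seen.add(row[field])
        kept.append(row)
    return kept


def count_by_hour(rows):
    """Rough stand-in for an hourly timechart count: one summary row per hour."""
    counts = Counter(row["hour"] for row in rows)
    return [{"hour": hour, "count": n} for hour, n in sorted(counts.items())]


# The 10:00 event was indexed twice, the 11:05 event once.
events = [
    {"hour": "10:00", "_raw": "10:00am target event"},
    {"hour": "10:00", "_raw": "10:00am target event"},
    {"hour": "11:00", "_raw": "11:05am target event"},
]

# dedup first: duplicates are removed while _raw still exists, then counted.
dedup_first = count_by_hour(dedup(events, "_raw"))

# dedup last: the summary rows have no _raw field, so every row is dropped.
dedup_last = dedup(count_by_hour(events), "_raw")
```

In this model `dedup_first` holds one event per hour, while `dedup_last` is empty, because the counting step outputs summary rows that no longer carry `_raw`.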


clintla
Contributor

sourcetype="getlog" | timechart count by host useother="f" | dedup _raw

When I add "| dedup _raw", it goes from that sloped output to "No results". It seems like it should have worked, but I guess I don't fully understand the command.
