Splunk Search

Adding dedup _raw before timechart returns 0 results

tommy_o
Explorer

I apologize if this has been asked already, but I searched to no avail.

I'm writing a Splunk query that will eventually be used for summary indexing with sitimechart. This is the current query:

index=app sourcetype=<removed> host=<removed> earliest=-10d
    | eval Success_Count=if(scs=="True",1,0)
    | eval Failure_Count=if(scs=="False",0,1)
    | timechart span=1d sum(Success_Count) as SuccessCount sum(Failure_Count) as FailureCount count as TotalCount by host

Results are as expected. However, some data was accidentally indexed twice, so I need to remove duplicates. If I'm doing a regular search, I just use | dedup _raw to remove the identical events. However, if I run the following query, I get zero results returned (no matter where I put | dedup _raw):

index=app sourcetype=<removed> host=<removed> earliest=-10d
    | dedup _raw
    | eval Success_Count=if(scs=="True",1,0)
    | eval Failure_Count=if(scs=="False",0,1)
    | timechart span=1d sum(Success_Count) as SuccessCount count(Failure_Count) as FailureCount count as TotalCount by host

What am I doing wrong? I'm using Splunk 4.3.2.

0 Karma

tommy_o
Explorer

The duplicate events have the same timestamp.

0 Karma

somesoni2
Revered Legend

Try the following:

index=app sourcetype=<removed> host=<removed> earliest=-10d
    | fields _time, scs, host
    | dedup _time, scs, host
    | timechart span=1d count(eval(scs="True")) as SuccessCount count(eval(scs="False")) as FailureCount count as TotalCount by host

somesoni2
Revered Legend

Is your duplicate records issue resolved?

0 Karma

tommy_o
Explorer

Okay, thank you for the confirmation. That was written in an internal corporate document and I wasn't getting any summary data in my index -- I was thinking my use of eval on the same line as sitimechart may have been causing that problem (but glad to hear that it shouldn't be). Thanks again.

0 Karma

somesoni2
Revered Legend

I am able to use eval() on the same line as the sitimechart command (and I don't see any restriction on this in the documentation).
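
For illustration, here is a sketch of what that could look like with the dedup query from this thread, swapping timechart for sitimechart. It is untested here, and the summary index name and the search_name value in the second query are hypothetical placeholders:

```
index=app sourcetype=<removed> host=<removed> earliest=-10d
    | fields _time, scs, host
    | dedup _time, scs, host
    | sitimechart span=1d count(eval(scs="True")) as SuccessCount count(eval(scs="False")) as FailureCount count as TotalCount by host
```

To report from the summary data, the usual pattern is to run the matching non-si command with the same arguments against the summary index:

```
index=summary search_name="<your saved search name>"
    | timechart span=1d count(eval(scs="True")) as SuccessCount count(eval(scs="False")) as FailureCount count as TotalCount by host
```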

tommy_o
Explorer

I was under the impression that I couldn't use eval() on the same line as sitimechart (which I will be switching over to once I've ironed out this duplicate problem). Is that not correct? This is essentially what my query looked like originally.

0 Karma

somesoni2
Revered Legend

True, the best approach would be to include all the fields that make an event unique in the "fields" and "dedup" clauses, so that legitimate events are not filtered out.
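
One possible way to make the dedup key cover the whole event, sketched here as an assumption rather than something confirmed in this thread: hash the raw event text with the eval function md5() and dedup on that hash together with the timestamp and host. The field name raw_hash is made up for illustration, and whether this avoids the zero-result behaviour reported with dedup _raw on 4.3.2 has not been verified.

```
index=app sourcetype=<removed> host=<removed> earliest=-10d
    | eval raw_hash=md5(_raw)
    | dedup _time, host, raw_hash
    | timechart span=1d count(eval(scs="True")) as SuccessCount count(eval(scs="False")) as FailureCount count as TotalCount by host
```

With this key, two events are collapsed only if they share the same timestamp, host, and raw text, so distinct events that merely share a timestamp are kept.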

0 Karma

lukejadamec
Super Champion

Careful with that. Depending on the volume and timestamp extraction, you can have many legitimate non-duplicate events with the same timestamp that will be hidden by deduping on _time.

0 Karma

somesoni2
Revered Legend

When you said the data was duplicated, do the duplicate events have the same timestamp or different ones?

0 Karma

tommy_o
Explorer

There's a typo on the eval Failure_Count line, but the reCAPTCHA blocked me from editing 😞 Edit: the timechart in the second query should have used sum(), sum(), count, but again, the captcha is keeping me from fixing that.
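
With both corrections applied, the second query would presumably read as follows. The exact fix to the eval line is not stated above; this sketch assumes the 1 and 0 were swapped, so that Failure_Count is 1 when scs is "False":

```
index=app sourcetype=<removed> host=<removed> earliest=-10d
    | dedup _raw
    | eval Success_Count=if(scs=="True",1,0)
    | eval Failure_Count=if(scs=="False",1,0)
    | timechart span=1d sum(Success_Count) as SuccessCount sum(Failure_Count) as FailureCount count as TotalCount by host
```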

0 Karma