Splunk Search

Stats DC With Table

IRHM73
Motivator

Hi, I wonder whether someone may be able to help me please.

I'm using the the search below to return values in a table.

auditSource="cato-filing" auditType="FilingStarted" | dedup "detail.Filing ID" | table "detail.*"

What I'd like to do is make this more efficient by changing the dedup to stats dc. Now I have used stats dc before, so I tried auditSource="cato-filing" auditType="FilingStarted" | stats dc(detail.Filing ID) By "detail.*" , but for the life of me, I can't work out how to return the table of results, because my effort returns no results.

I just wondered whether someone may be able to look at this please and let me know where I've gone wrong.

Many thanks and kind regards

Chris

0 Karma

alacercogitatus
SplunkTrust
SplunkTrust

So just to throw in my two cents:

auditSource="cato-filing" auditType="FilingStarted"  | rename detail.* as * | stats dc("Filing ID") as "Filing ID" by CompanyName Department Location
0 Karma

IRHM73
Motivator

HI @alacercogitatus, thank you for taking the time to reply to my post.

I tried the query you kindly provided, but unfortunately this create a table.

Many thanks and kind regards

Chris

0 Karma

somesoni2
SplunkTrust
SplunkTrust

Try something like this

auditSource="cato-filing" auditType="FilingStarted"  | table "detail.*" | stats latest(*) as * by "detail.Filing ID" 

This will do same function as dedup

IRHM73
Motivator

Hi @somesoni2, thank you for taking the time to my post.

I've tried the query you kindly provided but it doesn't work. Rather than displaying the "detail.*" fields it just displays the "detail.Filing ID".

Many thanks and kind regards

Chris

0 Karma

woodcock
Esteemed Legend

Like this:

 auditSource="cato-filing" auditType="FilingStarted" | stats values(detail.Filing ID)

woodcock
Esteemed Legend

To make it more flexible to cover more fields, you can do this:

auditSource="cato-filing" auditType="FilingStarted" | stats values(detail.*)
0 Karma

IRHM73
Motivator

Hi @woodcock, thank you for taking the time to reply to my post.

I've tried the query you kindly provided, and although it does extract all the fields which is great, rather than each record being on it's own row, they are all contained within one.

Many thanks and kind regards

Chris

0 Karma

woodcock
Esteemed Legend

OK, then just add this:

... | foreach values* [ mvexpand <<FIELD>> ]
0 Karma

IRHM73
Motivator

Hi @woodcock, thank you for coming back to me with this, but unfortunately this doesn't change the layout of the results.

I think I'll have to say with the 'dedup' because I think what I want to achieve isn't possible.

Many thanks and kind regards

Chris

0 Karma

woodcock
Esteemed Legend

There is no doubt that it is possible but if you'd like to keep your original solution, that is up to you. The problem is that your original search implied some things about your dataset that clearly aren't true (else our solutions would have worked). Given this, it is likely that your original search is not doing exactly what you think it is doing either, so beware! It might be best if you back all the way up and show us some sample events including fields:

... | eval raw=_raw | table *
0 Karma

woodcock
Esteemed Legend

What makes you think that dedup is inefficient? What makes you think that dedup is any different from stats latest(_raw) AS _raw (which will be pretty much the same work as stats dc)?

0 Karma

alacercogitatus
SplunkTrust
SplunkTrust

dedup compared to a stats dc by is indeed less efficient. When using dedup most (if not all) of the _raw events must be returned to the search head. When using stats, only the dc and the by clause field are returned to the search head - which gives a better performance when re-assembling on the search head to do the final stats.

0 Karma

immortalraghava
Path Finder

As per my understanding from your question, you could rename before stats something like this.

| rename "detail.Filing ID" as FilingID | stats dc(FilingID)

Hope this helps !

0 Karma

IRHM73
Motivator

Hi @immortalraghaven,

Thank you for taking the time to reply to my post.

Unfortunately the rename won't work because I'm trying to use a 'stats dc' for multiple fields as per my last post to @MuS.

Many thanks and kind regards

Chris

0 Karma

MuS
SplunkTrust
SplunkTrust

Hi IRHM73,

two things that I can think of:

  1. auditSource="cato-filing" auditType="FilingStarted" | stats dc(detail.Filing ID) By "detail.*" will not work because the dc(detail.Filing ID) contains a space and should be dc("detail.Filing ID")
  2. auditSource="cato-filing" auditType="FilingStarted" | stats dc(detail.Filing ID) By "detail.*" will not work because by "detail.*" contains a wild card; take a field which is unique like detail.foo or detail.baz

You could try auditSource="cato-filing" auditType="FilingStarted" | stats dc("detail.Filing ID") BY "detail.Filing ID" | table "detail.Filing ID" or just use dedup .

Hope this helps ... and this is un-tested 😉

cheers, MuS

IRHM73
Motivator

Hi @MuS thank you for coming back to me with this. I'm trying to get away from dedup because of it's over use of resources.

I have tried the query you kindly provided and I appreciate that this was untested, but unfortunately this doesn't work.

However I've found that rather than using "detail.*" I can use the fields:

detail.CompanyName
detail.Department
detail.Location

I'm not sure whether that helps.

Many thanks and kind regards

Chris

0 Karma

MuS
SplunkTrust
SplunkTrust

Just for the fun of it, I ran all searches as run everywhere examples with fixed time range and leave you to choose the fastest/best for your use case:

index=_internal earliest=-2d@d latest=-1d@d | dedup sourcetype | table sourcetype
This search has completed and has returned 8 results by scanning 2,907,591 events in 249.892 seconds.
index=_internal earliest=-2d@d latest=-1d@d | stats dc(sourcetype) by sourcetype
This search has completed and has returned 8 results by scanning 2,907,591 events in 229.72 seconds.
index=_internal earliest=-2d@d latest=-1d@d | stats latest(sourcetype) by sourcetype
This search has completed and has returned 8 results by scanning 2,907,591 events in 228.705 seconds.
index=_internal earliest=-2d@d latest=-1d@d | table sourcetype | stats latest(*) as * by sourcetype
This search has completed and has returned 8 results by scanning 2,907,591 events in 382.519 seconds.
index=_internal earliest=-2d@d latest=-1d@d | table sourcetype | stats values(sourcetype)
This search has completed and has returned 1 result by scanning 2,907,591 events in 388.672 seconds.

cheers, MuS

0 Karma

IRHM73
Motivator

Hi @Mus, thank you very much for this.

It provide some interesting reading.

Kind Regards

Chris

0 Karma

MuS
SplunkTrust
SplunkTrust

just to be complete, here is the latest command ran on the same server as yesterday:

index=_internal earliest=-3d@d latest=-2d@d | stats values(sourcetype) | foreach values* [ mvexpand <<FIELD>> ]
This search has completed and has returned 8 results by scanning 2,907,591 events in 248.982 seconds.
0 Karma

IRHM73
Motivator

Hi, thank you very much for this.

I think I'll have to stay with the dedup because what I want to achieve doesn't seem possible.

Kind Regards

Chris

0 Karma
Get Updates on the Splunk Community!

What's new in Splunk Cloud Platform 9.1.2312?

Hi Splunky people! We are excited to share the newest updates in Splunk Cloud Platform 9.1.2312! Analysts can ...

What’s New in Splunk Security Essentials 3.8.0?

Splunk Security Essentials (SSE) is an app that can amplify the power of your existing Splunk Cloud Platform, ...

Let’s Get You Certified – Vegas-Style at .conf24

Are you ready to level up your Splunk game? Then, let’s get you certified live at .conf24 – our annual user ...