Splunk Search

dedup within timechart

sloshburch
Splunk Employee
Splunk Employee

I have several searches that I am trying to optimize now that our platform is on splunk 5+. My preference is to leverage report acceleration because of its ability to dynamically back-fill the way it efficiently runs in the background.

Unfortunately, several of my searches use a dedup on multiple fields (ie: dedup field1 field2 field3) and then runs timechart against one of those fields (ie: timechart span=1d field1). The use of dedup before timechart prevents report acceleration from being used as its not a streamable command.

I'm trying to find a way to eliminate to enable this search for report acceleration.

I've tried removing the dedup and playing with distinctcount(field1 field2 field3) but that failed. I also tried timechart span=1d dc(field1) by field2 field3 but that also is not allowed. I'm suspicious that I'm overlooking a trivial way to do this. Perhaps the community can enlighten me?

Tags (2)
0 Karma
1 Solution

sloshburch
Splunk Employee
Splunk Employee

I believe I found a solution: do a stats count by field1 field2 field3 where field3 is the timepan (in this case, just the day of the _time). If I'm thinking clearly, that will dedup by those three fields. Then, if I want a total count, I can do another stats count. This results in a distinct count. I believe this should be more efficient than the eval approach.

View solution in original post

0 Karma

sloshburch
Splunk Employee
Splunk Employee

I believe I found a solution: do a stats count by field1 field2 field3 where field3 is the timepan (in this case, just the day of the _time). If I'm thinking clearly, that will dedup by those three fields. Then, if I want a total count, I can do another stats count. This results in a distinct count. I believe this should be more efficient than the eval approach.

0 Karma

sloshburch
Splunk Employee
Splunk Employee

The goal is to enable report acceleration on a pre-existing saved search - but the saved search was designed with dedup on several fields before the timechart command. So the folks that use the saved search want to timechart some distinct values. Is that more clear? Thanks for the clarifying questions.

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

And what are you trying to achieve here, in natural language rather than SPL?

0 Karma

sloshburch
Splunk Employee
Splunk Employee

Notice the use of dc(eval(fieldD.";".fieldE.";".Date1)). My thought was to create a unique string from the fields I would dedup against, then get a distinct count of those. Again, I assume I'm hacking this and there's probably a more trivial approach I should be using.

0 Karma

sloshburch
Splunk Employee
Splunk Employee

Fair request. I tried to abstract the company stuff so hopefully this still is clear for the community:

index=a ( sourcetype="b" OR sourcetype="c" ) ( source="/path/file1*" OR source="/path/file2*" ) fieldA=* ( fieldB=val1 OR fieldC=val2 )
| convert timeformat="%m/%d/%y" ctime(_time) as Date1
| timechart span=1day dc(eval(fieldD.";".fieldE.";".Date1)) as countFieldName
| convert timeformat="%m/%d/%y" ctime(_time) as Date
| table Date countFieldName

0 Karma

martin_mueller
SplunkTrust
SplunkTrust

Post your search and what you want to achieve with it - maybe there is an entirely different approach.

0 Karma
Get Updates on the Splunk Community!

Stay Connected: Your Guide to May Tech Talks, Office Hours, and Webinars!

Take a look below to explore our upcoming Community Office Hours, Tech Talks, and Webinars this month. This ...

They're back! Join the SplunkTrust and MVP at .conf24

With our highly anticipated annual conference, .conf, comes the fez-wearers you can trust! The SplunkTrust, as ...

Enterprise Security Content Update (ESCU) | New Releases

Last month, the Splunk Threat Research Team had two releases of new security content via the Enterprise ...