Splunk Search

Subsearch with stats first not working as expected

robertlabrie
Path Finder

My dataset has multiple events for a single _time. Batches get loaded whenever it's sent by a 3rd party. I have 25 unique sites that gets data sent. I have a query which finds the most recent _time for the data by site_code:

index=tgem sourcetype=tgem | stats first(_time) as _time by site_code

It works fine, and returns the expected value, which is 2014-01-12 for the site_code=BR. So far so good, so now I want all matching records for that _time, so I glue it into a subquery like this:

index=tgem sourcetype=tgem [search index=tgem sourcetype=tgem | stats first(_time) as _time by site_code | return 25 _time] | search site_code=BR

It returns all records for 2014-01-04, which is the earliest _time in my period (month to date). Any ideas? I've been pulling my hair out all day.

My requirement is to do some stats (min, max, avg) of a field in the dataset, by site_code, for wahtever the most recent set of data is for that site_code

Tags (3)
0 Karma

robertlabrie
Path Finder

So in the end two things happened.

  1. _time never worked for me as expected. I used another field in the data which uniquely identified the batch.
  2. I used format to customize the subsearch expansion. The job inspector was king to sorting that out. Seeing how the subsearch expanded.

The format sting I used was

format "(" "(" "AND" ")" "OR" ")"

Thanks to everyone for your replies.

lguinn2
Legend

The first and last functions are relative to the order of events - when you are dealing with time, earliest and latest are usually better. Also, you need to return the site code as well as the time; you don't need the return comand.

index=tgem sourcetype=tgem [search  index=tgem sourcetype=tgem | stats latest(_time) as _time by site_code ]

Use the search job inspector to see how the subsearch values are returned.

martin_mueller
SplunkTrust
SplunkTrust

For debugging subsearches you can take a look at the job inspector. That will show you what the subsearch evaluated to, giving you a chance to narrow the problem to either the subsearch or the outer search.

A different thought, merge the site_code filter into the first search command for potentially greater performance - unrelated to the results though.

0 Karma

somesoni2
Revered Legend

Use max(_time) instead of first(_time)

Get Updates on the Splunk Community!

Webinar Recap | Revolutionizing IT Operations: The Transformative Power of AI and ML ...

The Transformative Power of AI and ML in Enhancing Observability   In the realm of IT operations, the ...

.conf24 | Registration Open!

Hello, hello! I come bearing good news: Registration for .conf24 is now open!   conf is Splunk’s rad annual ...

ICYMI - Check out the latest releases of Splunk Edge Processor

Splunk is pleased to announce the latest enhancements to Splunk Edge Processor.  HEC Receiver authorization ...