Is this the fastest way? (sub search?)

howyagoin · ‎02-21-2012

Hi,

I get the feeling that there's a better/faster way for me to do what I'm doing. I have a query such as this:

index=bigger [ search index=smaller source="stuff.txt" | fields data | rename data as query ] | table EventTime,info

What I'm looking for is any value of the "data" field (from the "stuff.txt" source in the "smaller" index) appearing anywhere in the "bigger" index. Those values happen to occur in the "info" field of events in the "bigger" index...

Is this the best way to do such a query, or is there a better option that I'm overlooking? I thought about a "join", but the fields are not named consistently. That's probably fixable, but I wasn't sure which approach is fastest.

Thanks.

gkanapathy · ‎02-21-2012

Almost.

Provided there are no duplicate values of data in stuff.txt and no more than 10,000 distinct values, this is the fastest way:

index=bigger [ search index=smaller source="stuff.txt" | dedup data | fields data | rename data as info ] | table EventTime,info

Note the dedup and the rename of the field to info, which you said was the field you were looking for. In version 4.3+, you can do the following with the return command, which is slightly easier to read:

index=bigger [ search index=smaller source=stuff.txt | return 10000 info=data ] | table EventTime,info
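
To check what the subsearch actually hands to the outer search, one option is to run the inner search by itself and append the format command, which shows the generated search string. A sketch, assuming the same index, source, and field names as above:

```
index=smaller source="stuff.txt" | dedup data | fields data | rename data as info | format
```

The result should be a single "search" field holding an OR'd expression along the lines of ( ( info="value1" ) OR ( info="value2" ) ), where value1 and value2 are placeholder values. Swapping the rename to "rename data as query" should instead yield the bare values as search terms with no field name, so they match anywhere in the event rather than only in the info field.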

howyagoin · ‎02-22-2012

Thanks - the data was already unique in the "stuff.txt" file, so the dedup didn't add much. For some reason "return" seems to take significantly longer than my existing approach - as does the renaming of "data" as "info"...huh.
