Hi,
I've got a sourcetype in which one field takes around 100,000 distinct values across 225,000,000 events per day, and another sourcetype which has a total of around 5000 values/events and is static (very little change over the course of a year).
What is the most efficient way to find out IF the second sourcetype has any occurrence in the first, possibly going back 30+ days? I was leaning towards a summary-index based query conducted every few hours, to extract the unique values of the large sourcetype, then check the smaller against that - but even that would take a while.
Looking at the various options, such as "return" and "join" (or others), I'm not sure which is the most efficient.
I don't want all of the values from the larger source that contain the smaller. I just want a list of the smaller sourcetype values that also occur in the much larger sourcetype.
Thanks!
Subsearch should be most efficient here:
sourcetype=big [ search sourcetype=small | return 6000 Value ]
| dedup Value
assuming that Value is the field name containing the value and is the same in both sourcetypes. If not, there are little tweaks to the return command to handle it.
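As a rough sketch of one such tweak: the return command accepts an alias=field form, so the subsearch can emit the small sourcetype's values under the field name used in the big sourcetype. The field names big_value and small_value below are hypothetical placeholders, not names from the original question, and swapping the final dedup for stats is my own variation to get back just the list of matching values rather than whole events.

```
sourcetype=big
    [ search sourcetype=small
      | dedup small_value
      | return 6000 big_value=small_value ]
| stats count by big_value
```

The subsearch expands to something like (big_value="a") OR (big_value="b") OR ..., so the outer search only retrieves events that match one of the small sourcetype's values. Subsearches are subject to result-count and runtime limits set in limits.conf, so it is worth checking that all 5000 or so values actually make it through.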
Thought so - was doing that, but it's still going to take many hours (days?) to run. Likely I'll have to build a better mousetrap here, as the data is just too vast to do the full 30 days' worth of querying I need.