I am looking to identify the earliest event for each field-value pair. For example, given a list of usernames from AD, I want to locate the very first email each user sent without having to search all emails from all users across all time. My thinking is to search AD for the username, email, and account creation date (whenCreated), and then feed the emails and whenCreated into a map command that searches index=msexchange, like the following:
... my AD search for username, email and whenCreated | eval time_b4=relative_time(time_anchor, "-1h") | eval time_l8r=relative_time(time_anchor, "+1h") | map search="search index=index1 sourcetype=sourcetype1 field1=$field$ earliest=$time_b4$ latest=$time_l8r$"
Is this the most efficient approach? Will map join the events it finds to the original events that populated $field$ and $time_b4$/$time_l8r$? Or will it simply add the found events to the results? Or will it filter the existing events based on the events found?
map is rarely the most efficient approach. Consider something like this:
index=msexchange [search that produces the user ids or email addresses of the users you want to investigate | fields user] | stats earliest(_time) as _time by user
That will only go through the events for the users you're interested in, going through every bucket once. With map, a separate search runs for each user, so you will inevitably go through the data once per user, which is much less efficient.
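To make the subsearch approach more concrete, here is a minimal sketch of how it might look with an AD lookup feeding the Exchange search. The index name ad_index, the sourcetype ad_users, and the field names mail and sender are assumptions for illustration only; substitute whatever your AD and Exchange data actually use. The key point is that the subsearch must return a field whose name matches the field in the msexchange events.

```
index=msexchange
    [ search index=ad_index sourcetype=ad_users
      | dedup mail
      | rename mail AS sender
      | fields sender ]
| stats earliest(_time) AS first_email_time BY sender
| convert ctime(first_email_time)
```

The subsearch expands into a sender=... OR sender=... filter, so the outer search reads each bucket once and only keeps events for the listed users. Keep in mind that subsearches are subject to result-count and runtime limits (configured in limits.conf), so a very large user list may need to be split up or handled with a lookup instead.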