I have two indexes:
index 1 contains a list of domains and event_timestamp, index 2 contains a description for every domain.
I want to join the two and enrich all domains in index 1 with their description in index 2.
Now when I enrich I want to look at all of the information on index 2 and enrich all domains in the time frame I choose in search
I tried to do that with join:
source="events" | join query.domain [search index="events_enrich_with_desc" | rename event_domain AS query.domain ] earliest=-10y@y | search ...
But this join doesn't work properly (only 25% are enriched).
What am I missing here? if it can be done in a different way than join than I'm OK with it.
1) That earliest=
is in a strange place. What are you trying to do there?
Try this...
source="events"
| join type=left query.domain [ search index="events_enrich_with_desc" earliest=-10y@y
| dedup event_domain
| rename event_domain AS query.domain ]
| eval enrichfield=coalesce(enrichfield,"((not found))")
| search ...
That assumes that you want up to 10 years back from your enriched index, and that you only want the first answer for each domain.
1) That earliest=
is in a strange place. What are you trying to do there?
Try this...
source="events"
| join type=left query.domain [ search index="events_enrich_with_desc" earliest=-10y@y
| dedup event_domain
| rename event_domain AS query.domain ]
| eval enrichfield=coalesce(enrichfield,"((not found))")
| search ...
That assumes that you want up to 10 years back from your enriched index, and that you only want the first answer for each domain.
also, after that join i don't see the fields from Index 2
why is that? and how can i add those fields?
That join
should contain every field that is on either record. If not, please post your exact search language. It might help you to explicitly list the fields you want from each record, then take an example value for query.domain and verify that the fields are present.
source="events"
| fields query.domain ... and list all the fields from this one you want ...
| join type=left query.domain [ search index="events_enrich_with_desc" earliest=-10y@y
| dedup event_domain
| rename event_domain AS query.domain
| fields query.domain enrichfield ... and list all the fields from this one you want ...
]
| eval enrichfield=coalesce(enrichfield,"((not found))")
| search ...
OK ill be more specific, may index 2 contain 4 fields:
after the join i notice something very interesting, if i search all of domain description than after the:
eval description=coalesce(description,"((not found))")
i get only 3K domains description out of 27K (24L or "not found")
but if i search for a specific domain description wild card: like description="dga"
i get all domains with that description, about 17K domains
why is that?
it worked, thanks a lot!!!