The two csv files I have are customers (fields= customerName,customerID,region,IsActive) with one row per customer and the second is Purchases (fields=customerName, customerID, productName, purchaseDate) with one row per purchase. I am trying to create a list of all the active customers in Northwest region and when they first purchased 5 specific products (the purchases csv include 100+ different products but I only care about these 5). Customers could have bought 0, 1, 2, 3, 4 or all 5 of the target products but after the join I want the customer record to show up five times (once for each product) regardless of it was purchased.
currently this is what I have:
| from inputlookup:"customer.csv"
| where (Region="NW") AND (IsActive="True")
|join customerID type=left max=0
[|inputlookup Purchases.csv
|where (productName="Train" OR productName="Car" OR productName="Truck" OR productName="Bike" OR productName="Scooter")]
This gets me (Tom bought 2 products, Jill bought all 5 and Ken bought none):
customerName,customerID,region,IsActive,customerName, customerID, productName, purchaseDate
Tom,1,NW,True,Tom,1,Train,7/25/12
Tom,1,NW,True,Tom,1,Car,7/2/12
Jill,2,NW,True,Jill,2,Train,5/3/13
Jill,2,NW,True,Jill,2,Car,7/12/15
Jill,2,NW,True,Jill,2,Truck,8/9/10
Jill,2,NW,True,Jill,2,Bike,1/2/13
Jill,2,NW,True,Jill,2,Scooter,4/5/7
What I want is (all customers to show up 5 times so Tom would have 3 rows without a purchase date, Jill would stay the same and Ken would have 5 rows without dates):
Tom,1,NW,True,Tom,1,Train,7/25/12
Tom,1,NW,True,Tom,1,Car,7/2/12
Tom,1,NW,True,Tom,1,Truck,
Tom,1,NW,True,Tom,1,Bike,
Tom,1,NW,True,Tom,1,Scooter,
Jill,2,NW,True,Jill,2,Train,5/3/13
Jill,2,NW,True,Jill,2,Car,7/12/15
Jill,2,NW,True,Jill,2,Truck,8/9/10
Jill,2,NW,True,Jill,2,Bike,1/2/13
Jill,2,NW,True,Jill,2,Scooter,4/5/7
Ken,3,NW,True,Ken,3,Train,
Ken,3,NW,True,Ken,3,Car,
Ken,3,NW,True,Ken,3,Truck,
Ken,3,NW,True,Ken,3,Bike,
Ken,3,NW,True,Ken,3,Scooter,
Give this a try
| from inputlookup:"customer.csv"
| where (Region="NW") AND (IsActive="True")
| eval productName=split("Train##Car##Truck##Bike##Scooter","##")
| mvexpand productName
| lookup Purchases.csv customerID productName OUTPUT purchaseDate
Give this a try
| from inputlookup:"customer.csv"
| where (Region="NW") AND (IsActive="True")
| eval productName=split("Train##Car##Truck##Bike##Scooter","##")
| mvexpand productName
| lookup Purchases.csv customerID productName OUTPUT purchaseDate
Thanks!! This is almost perfect. There are some customers who bought two different versions of the same product and I only want most recent version. I tried to remove it with:
|eval CustProd= customerName+productName
|dedup UserCert sortby -purchaseDate
but it didn't work. Do you have any ideas of how to get rid of the extra values?