Hi!
I am trying to run a search where it counts the number of new users who have made purchases in the previous day, and the number of old users who have purchased in the previous day. A sample log is:
"2015-07-27 05:15:34" splunk_index_order=20150727052534, id=1182317, number="R539008334", item_total=1000.00, total=1084.25, state="complete", adjustment_total=14.25, user_id=845817, completed_at=1437974134993, bill_address_id=12502, ship_address_id=12501, payment_total=84.25, shipping_method_id=0, shipment_state="ready", payment_state="paid", email="insertemailhere@yahoo.com", special_instructions="NULL", created_at=1437973787491, updated_at=1437974145529, currency="USD", last_ip_address="NULL", created_by_id=1178357, oms_id="NULL", campaign_source="NULL", has_gift_box=## NOT SUPPORTED TYPE ##, abandoned_email_sent_at=NULL
Basically, the issue is, the only way to check if a user is distinct is by their user_id tag. How can I find the unique user_id fields in the past day?
Before anyone writes
... | dc(user_id) | ...
that doesn't work, as it doesn't do a true distinct count because the user could have ordered two days previously or three years previously, and would still show up as a unique user as the time range isn't constricted.
Is this search possible in Splunk? I can't seem to figure it out.
Thanks for any and all answers. 🙂
Try something like this (probably the fastest)
Update- Following search may give incorrect result as I forgot to add constraint to consider only yesterday's data for. Try the updated query*
Your base search earliest=-30d@d latest=@d | eval PurchaseDate=strftime(_time,"%Y-%m-%d") | stats first(PurchaseDate) as First last(PurchaseDate) as LastPurchaseDate by user_id | where LastPurchasedDate=strftime(now()-86400,"%Y-%m-%d")| eval UserType=if(First=LastPurchaseDate ,"New User","Old User") | table user_id, UserType, LastPurchaseDate
Try something like this (probably the fastest)
Update- Following search may give incorrect result as I forgot to add constraint to consider only yesterday's data for. Try the updated query*
Your base search earliest=-30d@d latest=@d | eval PurchaseDate=strftime(_time,"%Y-%m-%d") | stats first(PurchaseDate) as First last(PurchaseDate) as LastPurchaseDate by user_id | where LastPurchasedDate=strftime(now()-86400,"%Y-%m-%d")| eval UserType=if(First=LastPurchaseDate ,"New User","Old User") | table user_id, UserType, LastPurchaseDate
Oh wow... that was amazingly simple. I should have thought of it that way. How are you so damn good at splunk...
You've literally answered every single one of the questions I have asked lol.
I see you already you have accept an answer but here is another way.
... | eval rt=relative_time(now(),"-1d@d")| eval ts=_time| eval purchasedToday=if(rt=<ts, userID, null()) | purchasedYesterday=(rt>ts, userID, null()) | table userID, purchasedToday, purchasedYesterday | selfjoin userID | stats dc(userID) as TotalDC, dc(purchasedToday) as purchasedTodayDC, dc(purchasedYesterday) as purchasedYesterdayDC
Doesn't work. purchasedYesterday isn't a command is the error thrown. What command did you want before purchasedYesterday?
typo you need a eval. *eval purchasedYesterday=(rt>ts, userID, null()) *
it didn't work still! I tried that initially. No worries though, the answer that's accepted does it beautifully!
This should give you what you need to start; let's say you run this for "Last 30 days":
latest=0d@d YourBaseSearchHere | stats values(user_id) AS user_id | eval type=Last30Days | append [search earliest=0@d latest = now YourBaseSearchHere | stats values(user_id) AS user_id | eval type=Yesterday] | stats dc(type) AS numTypes values(*) AS * BY user_id | eval userType=if((numTypes==1), "New User", "Old User") | where type=Yesterday
In this search if a user is in the logs for the last 29 days, he is an "old user" but if he is only today, he is a "new user". This is probably not exactly what you need but you can modify the mechanics easily enough now that you have a starting point.
Amazing. Perfect Answer. One issue. It throws an error at the "earliest" part.
success! But if you look above at the accepted answer, he does it really nicely and simply instead of in a separate nested search.
Yes, sometimes I answer the question as asked and sometimes I go smarty-pants and answer the question that you should have asked. In this case, since I was already making several broad assumptions, I figured I would stick closer to what you asked. I do thank you for the up-vote, though.
Also do note that you should be able to "Accept" more than one answer if more than one of them works. This would be totally appropriate in this case.
ACK! I had a typo. Try it now (edited and resaved).