Splunk Search

Question regarding join

therockhead
Path Finder

Beginner here, I'm trying to do the following in one query
1) Get all unique users and the count of users
2) Using all the unique users from query 1, search for who has done action "shop"
3) Show % of people who did the action "shop"

This is what i have tried so far ..

sourceTag=test index=test | stats dc(user_id) AS "Total Users" | JOIN user_id [search sourceTag=test index=test ACTION="shop" | stats dc(user_id) AS "Shoppers Users" ] | eval PercentageOfUsers=round(tonumber('Shopper Users') * 100 / tonumber('Total Users'))

Any ideas whats wrong?
Thanks,
Bill

Tags (2)
0 Karma
1 Solution

masonmorales
Influencer

Okay, sorry about that. Try this instead:

 sourceTag=test index=test 
 | table user_id 
 | dedup user_id
 | JOIN user_id type=outer [search sourceTag=test index=test ACTION="shop" | table user_id ACTION] 
 | fillnull value=NULL ACTION 
 | stats count(eval(ACTION="shop")) as ShopperCount, count(eval(ACTION!="shop")) as NonShopperCount
 | eval TotalUsers=ShopperCount+NonShopperCount
 | eval PercentageOfUsers=round(ShopperCount/TotalUsers*100,2)

View solution in original post

masonmorales
Influencer

Okay, sorry about that. Try this instead:

 sourceTag=test index=test 
 | table user_id 
 | dedup user_id
 | JOIN user_id type=outer [search sourceTag=test index=test ACTION="shop" | table user_id ACTION] 
 | fillnull value=NULL ACTION 
 | stats count(eval(ACTION="shop")) as ShopperCount, count(eval(ACTION!="shop")) as NonShopperCount
 | eval TotalUsers=ShopperCount+NonShopperCount
 | eval PercentageOfUsers=round(ShopperCount/TotalUsers*100,2)

masonmorales
Influencer

If this worked for you, please click "Accept Answer"

0 Karma

therockhead
Path Finder

Just trying it out now .. its looking better, the join is returning data that looks correct but the stats is not working.

0 Karma

masonmorales
Influencer

Could you provide more detail on the behavior you are seeing with stats please?

0 Karma

therockhead
Path Finder

So if it dont include stats the data format looks correct. If it add the following for example - stats count(eval(ACTION="shop")) as ShopperCount the result will be 0

0 Karma

masonmorales
Influencer

I extracted ACTION in my test environment with your data and it worked fine. I am guessing you have a different field name for ACTION than what was posted, or that it's not all uppercase. Field names are case sensitive. Could you please verify that the case/spelling of your extracted field for "ACTION" matches that of the example? If it does not, replace all occurrences of "ACTION" in my example with exactly how the "action" field name appears in your environment.

0 Karma

therockhead
Path Finder

Ok the issue was my Action was nested and the eval didnt like that .... so
stats count(eval(body.ACTION="shop")) would not work
I renamed body.ACTION to ACTION
and changed my query to
count(eval(ACTION="shop"))
and it now works great. Thanks for all your help!!

0 Karma

masonmorales
Influencer

Alternatively, you could do:

 sourceTag=test index=test 
  | table user_id 
  | dedup user_id
  | JOIN user_id type=outer [search sourceTag=test index=test | rex "(?i)action: (?P<ACTION>[^ ]+)" | search ACTION="shop" | table user_id ACTION] 
  | fillnull value=NULL ACTION 
  | stats count(eval(ACTION="shop")) as ShopperCount, count(eval(ACTION!="shop")) as NonShopperCount
  | eval TotalUsers=ShopperCount+NonShopperCount
  | eval PercentageOfUsers=round(ShopperCount/TotalUsers*100,2)
0 Karma

therockhead
Path Finder

Also tried with extra equals - stats count(eval(ACTION=="shop"))

0 Karma

therockhead
Path Finder

Also count(eval(ACTION!="shop")) as NonShopperCount returns 0 as well.

0 Karma

masonmorales
Influencer

Yes, your first command is only going to return a number of distinct user_ids. This means that you no longer have user_id to join on, since the output of your stats command is just a number. Same for the output of your JOIN command. The output is a number, so there are no user_id field values to join on.

Does this work for you?

sourceTag=test index=test 
| table user_id 
| dedup user_id 
| JOIN user_id [search sourceTag=test index=test ACTION="shop" | table user_id] 
| fillnull 
| stats count(eval(ACTION="shop")) as ShopperCount, count(eval(ACTION!="shop")) as NonShoppers
| eval TotalUsers=ShopperCount+NonShoppers
| eval PercentageOfUsers=round(ShopperCount/TotalUsers*100,2)

You can also add renames at the end to give the fields your desired labels. i.e.
| rename PercentageOfUsers as "% of Users that Shopped"

0 Karma

therockhead
Path Finder

Thanks for response but it did not work for me. The join is only returning the user_id and no action data so that stats always return 0.

0 Karma

masonmorales
Influencer

Any chance you could post a few lines of sample data from each source? (Feel free to anonymize field data)

0 Karma

therockhead
Path Finder

Sure .. for example here are two separate events

message: {
context: {
user_id: 443d5c6c-32ad-460e-a387-b077e9c47f3f
....
}
time: 2015-03-24T14:51:58.721Z
action: shop
}

message: {
context: {
user_id: 443d5c6c-32ad-460e-a387-b077e9c47f3f
.....
}
time: 2015-03-24T14:51:58.721Z
action: login
}

0 Karma
Get Updates on the Splunk Community!

Announcing Scheduled Export GA for Dashboard Studio

We're excited to announce the general availability of Scheduled Export for Dashboard Studio. Starting in ...

Extending Observability Content to Splunk Cloud

Watch Now!   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to leverage ...

More Control Over Your Monitoring Costs with Archived Metrics GA in US-AWS!

What if there was a way you could keep all the metrics data you need while saving on storage costs?This is now ...