Splunk Search

Joining two queries using Append and results are different

gopiven
Explorer

Hello Experts

Actually I am trying to join the results of two queries and show in dashboard.

There are 3 indexes 1a,2b and 3c with many source types.
In index=1a the field ( say "ClientId" which I required is directly there I am doing the lookup against the file. ( since in the index 1a, both userid and clientId fields are there I Can evaluate the Userid and then join the ClientId through the lookup.
But in index=2b, index= 3c I have to evaluate the field "Userid" from different sourcetypes and do input lookup and join the "ClientId" from the same input lookup.
But when I am charting the results with index 2b and 2c , the values for index 1a is not showing however it is showing the huge volume.

Inputlookup Filename: UserId.csv
Inputlookup file format:
Userid Clientid
User1 Client1
User2 Client2

index= "1a" OR index="2b" OR index="3c"
| eval
Platform = case(
index="1a", "Online",
index="2b", "Mobile",
index="3c", "OtherPlatforms")

[ search sourcetype="onlineindex" AND CATEGORY="{signin}" [inputlookup UserId.csv] ] | append [
| eval Userid=case(sourcetype=="type1",user,sourcetype=="type2",userids,sourcetype=="type3",useridvalue)
| lookup Userid.csv Userid AS Userid | join ClientId [inputlookup UserId.csv] ]

| Stats dc(clientId) as total_clients by date_hour,date_wday,Platform | chart avg(ClientId) over date_hour by platform

ANOTHER METHOD:

As mentioned earlier, for the index-="1a" both userid and clientId fields are there I Can evaluate the Userid and then join the ClientId through the lookup ( instead I am looking for direct Clientid field in the events)
In the below query I am evaluating the clienId like other sourcetypes and joining "ClientId" thru input lookup.

index= "1a" OR index="2b" OR index="3c"

| eval
Platform = case(
index="1a", "Online",
index="2b", "Mobile",
index="3c", "OtherPlatforms")

| eval Userid=case(sourcetype="onlineindex" AND CATEGORY="{signin}",Userid, sourcetype=="type1",user,sourcetype=="type2",userids,sourcetype=="type3",useridvalue)

| lookup Userid.csv Userid AS Userid | join ClientId [inputlookup UserId.csv] ]

| Stats dc(clientId) as total_clients by date_hour,date_wday,Platform | chart avg(ClientId) over date_hour by platform

In the above queries I am suspecting only the sourcetype="onlineindex" AND CATEGORY="{signin}" , the AND operator is giving issue. Please assist me.
Are there any better way to redefine the query.
Multisearch also I tried but it is giving some sublimit search error because I am pulling millions of records.

0 Karma

ololdach
Builder

Hi gopiven,

I agree to Rich: you don't need the join. The "error" you see most likely comes from the missing quotes around the values in the CASE command: case( expression ,"value" ) <<< values need to be quoted and the case sensitivity of variable names: ClientId != CLIENTID != Clientid; Platform != platform

Here's my shot at the answer:

index= "1a" OR index="2b" OR index="3c"
| eval Platform = case(index="1a", "Online",index="2b", "Mobile",index="3c", "OtherPlatforms")
| lookup Userid.csv Userid AS Userid output Clientid  
| stats dc(Clientid) as total_clients by date_hour,date_wday,Platform | chart avg(Clientid) over date_hour by Platform

The trick is the use of "Field Alias" names. Going to Settings...Fields... Field aliases lets you define an alias name for a field by sourcetype. Add Userid as an alias for the respective fields to the sourcetypes 1...n

https://docs.splunk.com/Splexicon:Alias

Hope it helps
Oliver

0 Karma

Richfez
SplunkTrust
SplunkTrust

OK, this may end up being the answer, but I'll start out with a comment:

I took a look at the searches, and they're very complicated given what they apparently need to do.

I see no reason for 'join' at all, and you have like 4 subsearches and I don't think any of those are needed.

(index= "1a" OR index="2b" OR index="3c") sourcetype="onlineindex" CATEGORY="{signin}"
| eval Platform = case(index="1a", "Online", index="2b", "Mobile", index="3c", "OtherPlatforms") 
| eval Userid=case(sourcetype=="type1", user, sourcetype=="type2", userids, sourcetype=="type3", useridvalue) 
| lookup Userid.csv Userid AS Userid 

What does that give you? NOTE I did NOT include the stats or chart - the important thing to see right now is if the lookup of Userid works and returns what you want. Because if that part works, all the rest is just copying/pasting your old stats (with typos fixed?) and chart onto the end.

Some general notes - join is probably never necessary. It's the worst command in Splunk, and unfortunately named perfectly to draw people - especially but not only SQL folks - to it like flies on honey. Subsearches are very useful, but ... in this case I'm not sure why they're needed. The usual Splunk way to do things like this is to toss all the events into a big pile, then eval your new fields like you did, then just stats group them by the thing you need to group them by.

Let me know what you find! It is highly probably it'll need some tweaking and changes, but if you reply back with some of the events, and an accurate description of what's not working right we can keep helping!

Happy Splunking!
-Rich

0 Karma
Get Updates on the Splunk Community!

Index This | I am a number, but when you add ‘G’ to me, I go away. What number am I?

March 2024 Edition Hayyy Splunk Education Enthusiasts and the Eternally Curious!  We’re back with another ...

What’s New in Splunk App for PCI Compliance 5.3.1?

The Splunk App for PCI Compliance allows customers to extend the power of their existing Splunk solution with ...

Extending Observability Content to Splunk Cloud

Register to join us !   In this Extending Observability Content to Splunk Cloud Tech Talk, you'll see how to ...