Splunk Search

Emulating FULL OUTER JOIN

andreafebbo
Communicator

Hi,
I have to tables:

ID name
1..A
2..B

ID Error
1..bla1
1..bla2

so Id like a table which is like the following

ID name error
1..A........bla1
1..A........bla2
2..B........(empty)

In SQL this is called FULL OUTER JOIN but i cannot find a way to replicate it since the parameter

type=outer

in the command join means LEFT JOIN in the reality.

I found some questions that try to resolve the issue with the search command but in my case it does not work.

Thank you

sandrosov_splun
Splunk Employee
Splunk Employee

The author needed a left join as mentioned before, but I see demand for this task is still high.

I spent a little bit of time on this. So, try the following.

| inputlookup tableA | eval key=localfield1
| append [|inputlookup tableB | eval key=localfield2]
| selfjoin max=0 keepsingle=yes key

 

0 Karma

tdiestel
Path Finder

This seems like you just want to do a Left join. so

index=A|table ID Name | join type=left max=0 ID [search index=B |table ID Error]

arlington
Explorer

Thanks, man, that's indeed the great answer, which helped me a lot. But what is the max=0 indicates here? Event If there is not any data in the index B, still it shows 1, without max=0 it calculates all of them as 1.

0 Karma

cmerriman
Super Champion

try something like this, maybe:

UPDATED

 index=MYINDEX source=MYSOURCE 
                   | eval Day = strftime(_time,"%F")
                   | eval DateTime = strftime(_time,"%Y-%m-%d %H:%M:%S")
                   | stats Latest(LogType) as status Latest(Result) as Result Latest(DateTime) as When by PackageName  ,ExecutionInstanceGUID, Day
                   | sort When
                   | streamstats  count as "Execution Nr" by PackageName, Day
                   | sort - When
                   | table When, PackageName, "Execution Nr",  status, Result, ExecutionInstanceGUID
                   | eval AlertLevel = case(Result=="OK",1,Result=="WARNING",2,Result=="KO",3)
                   | rangemap field=AlertLevel low=1-1 elevated=2-2 severe=3-3 default=guarded
                   | fields - AlertLevel
|appendpipe [stats count by ExecutionInstanceGUID | join type=outer ExecutionInstanceGUID [
                   search index=MYINDEX source=MYSOURCE
                   | rename ExplodedPackages{}.Error AS Error, ExplodedPackages{}.Package AS Package, ExplodedPackages{}.TimeStamp AS TimeStamp
                   | eval x=mvzip(TimeStamp,mvzip(Package,Error))
                   | mvexpand x
                   | eval y=mvzip(ExecutionInstanceGUID,x)
                   | mvexpand y
                   | eval z=split(y,",") 
                   | eval ExecutionInstanceGUID=mvindex(z,0)
                   | eval TimeStamp=mvindex(z,1)
                   | eval Package=mvindex(z,2)
                   | eval Error=mvindex(z,3)
                   | table ExecutionInstanceGUID,TimeStamp, Package, Error]]
|stats values(PackageName) as PackageName by ExecutionInstanceGUID Error
0 Karma

andreafebbo
Communicator

THe result of this query is the following:

1..A
2..B
1.......bla1
1.......bla2

and not:

1..A..bla1
1..A..bla2
2..B........

0 Karma

cmerriman
Super Champion

try my updated syntax. I used a stats command at the bottom to bring back the values of the PackageName by the ID and Error, so it should join the 1s and bla1/bla2s.

0 Karma

andreafebbo
Communicator

Nothing: now it shows just 3 columns:
ExecutionInstanceGUID
Error
PackageName (empty column).

In the example, what i call the column name(A, B) is one column but in the reality are many columns(as you can see in the query). I need all of them

0 Karma

cmerriman
Super Champion

you might need to play with the stats command at the bottom, join by any of the column names that might be in common, or do a latest instead of values.

|stats values(*) as * by ExecutionInstanceGUID 
0 Karma

cmerriman
Super Champion

can you provide syntax with how you get current tables? You might be able to use appendpipe.

0 Karma

davebrooking
Contributor

I understand your requirement is more involved, but given the basic datasets from above the following search produces similar results to that shown:

index=MYINDEX source=MYSOURCE | stats values(error) as error values(name) as name  by id | eval error=if(isnull(error),"NULL",error)  | mvexpand error | table id name error

Dave

0 Karma

andreafebbo
Communicator
index=MYINDEX source=MYSOURCE 
                  | eval Day = strftime(_time,"%F")
                  | eval DateTime = strftime(_time,"%Y-%m-%d %H:%M:%S")
                  | stats Latest(LogType) as status Latest(Result) as Result Latest(DateTime) as When by PackageName  ,ExecutionInstanceGUID, Day
                  | sort When
                  | streamstats  count as "Execution Nr" by PackageName, Day
                  | sort - When
                  | table When, PackageName, "Execution Nr",  status, Result, ExecutionInstanceGUID
                  | eval AlertLevel = case(Result=="OK",1,Result=="WARNING",2,Result=="KO",3)
                  | rangemap field=AlertLevel low=1-1 elevated=2-2 severe=3-3 default=guarded
                  | fields - AlertLevel
                  | join type=outer ExecutionInstanceGUID[
                         search index=MYINDEX source=MYSOURCE
                         | rename ExplodedPackages{}.Error AS Error, ExplodedPackages{}.Package AS Package, ExplodedPackages{}.TimeStamp AS TimeStamp
                         | eval x=mvzip(TimeStamp,mvzip(Package,Error))
                         | mvexpand x
                         | eval y=mvzip(ExecutionInstanceGUID,x)
                         | mvexpand y
                         | eval z=split(y,",") 
                         | eval ExecutionInstanceGUID=mvindex(z,0)
                         | eval TimeStamp=mvindex(z,1)
                         | eval Package=mvindex(z,2)
                         | eval Error=mvindex(z,3)
                         | table ExecutionInstanceGUID,TimeStamp, Package, Error
                  ]

With this syntax I get just one error for each row in the first table

0 Karma
Get Updates on the Splunk Community!

Introducing the 2024 SplunkTrust!

Hello, Splunk Community! We are beyond thrilled to announce our newest group of SplunkTrust members!  The ...

Introducing the 2024 Splunk MVPs!

We are excited to announce the 2024 cohort of the Splunk MVP program. Splunk MVPs are passionate members of ...

Splunk Custom Visualizations App End of Life

The Splunk Custom Visualizations apps End of Life for SimpleXML will reach end of support on Dec 21, 2024, ...