Splunk Search

join table outer search

kailun92
Communicator

Hi all, I need to join two table up and do a count of rain. Below is my search query is there anything wrong ? I can't seems to display my count of rain over places.

sourcetype="ltaTraffic" OR sourcetype="CurrentWeatherSGMap" Type="Accident" Location="AYE" OR Location="BKE" OR Location="CTE" OR Location="KJE" OR Location="PIE" OR Location="SLE" OR Location="TPE" OR Location="ECP" earliest=-7d latest=now |
transaction locationaccident maxspan=5s |
bucket span=1h _time |
dedup locationaccident |
fields onexpressway, locationaccident, current_area |
join current_area [search sourcetype="CurrentWeatherSGMap" |
rename Message as current_area | fields current_area current_summary] |
stats list(locationaccident) as Accidents, count(eval(current_summary="Rain" OR current_summary="Thundery Showers" OR current_summary="Showers" OR current_summary="Passing Showers" OR current_summary="Heavy Thundery Showers")) as Rain by onexpressway _time

alt text

I have updated the query to this (filter in the subsearch), but it return the exact same count values as the accident count. What is wrong with the query ? I changed it to a timechart.

sourcetype="ltaTraffic" OR sourcetype="CurrentWeatherSGMap" Type="Accident" Location="AYE" OR Location="BKE" OR Location="CTE" OR Location="ECP" OR Location="KJE" OR Location="PIE" OR Location="SLE" OR Location="TPE" earliest=-7d latest=now | transaction locationaccident maxspan=20s | bucket span=1m _time | dedup _time | fields onexpressway, locationaccident, current_area | join current_area [search sourcetype="CurrentWeatherSGMap" current_summary="Rain" OR current_summary="Thundery Showers" OR current_summary="Showers" OR current_summary="Passing Showers" OR current_summary="Heavy Thundery Showers" | rename Message as current_area | fields current_area current_summary] | timechart count(locationaccident) as Accidents, count(current_summary) as Rain by onexpressway

alt text

I have tried this outer search but it wont work:

sourcetype="ltaTraffic" Type="Accident" Location="AYE" OR Location="BKE" OR Location="CTE" OR Location="ECP" OR Location="KJE" OR Location="PIE" OR Location="SLE" OR Location="TPE" earliest=-7d latest=now | transaction locationaccident maxspan=20s | bucket span=1h _time | dedup locationaccident | fields onexpressway, locationaccident, current_area | join type=outer current_area [search sourcetype="CurrentWeatherSGMap" current_summary="Rain" OR current_summary="Thundery Showers" OR current_summary="Showers" OR current_summary="Passing Showers" OR current_summary="Heavy Thundery Showers" | rename Message as current_area | fields current_area current_summary] | timechart span=1d count(locationaccident) as Accidents, list(current_summary) as Rain by onexpressway

alt text

Tags (3)
0 Karma
1 Solution

cphair
Builder

The second search isn't working because you're not joining any events with non-rain current_summary values; non-rain locations are discarded. Every event in the subsearch has a current_summary field, and if every event in the main search has a locationaccident field (no matter what the value is), then the two counts will be the same. One way to fix this is to modify the join command with the parameter type=outer--that will preserve the main search results that do not have a rain value in current_summary and leave the current_summary field null, so you should then see the correct counts, assuming there's not already a current_summary field in your base search.

Also, I'm not sure what your source data looks like, but it looks odd to search for the CurrentWeatherSGMap sourcetype in the base search and then seemingly not using it until the subsearch. Either you don't need it in the base search, or it's likely you can get away without using a subsearch, which is a big performance hit. I'd have to see a sample event from both sourcetypes to say for sure.

View solution in original post

cphair
Builder

The second search isn't working because you're not joining any events with non-rain current_summary values; non-rain locations are discarded. Every event in the subsearch has a current_summary field, and if every event in the main search has a locationaccident field (no matter what the value is), then the two counts will be the same. One way to fix this is to modify the join command with the parameter type=outer--that will preserve the main search results that do not have a rain value in current_summary and leave the current_summary field null, so you should then see the correct counts, assuming there's not already a current_summary field in your base search.

Also, I'm not sure what your source data looks like, but it looks odd to search for the CurrentWeatherSGMap sourcetype in the base search and then seemingly not using it until the subsearch. Either you don't need it in the base search, or it's likely you can get away without using a subsearch, which is a big performance hit. I'd have to see a sample event from both sourcetypes to say for sure.

kailun92
Communicator

Didn't know subsearch search for time as well as join. Thank you so much !!!

0 Karma

cphair
Builder

Does the current_summary rain value change by time? If so, you'll need to preserve that in your subsearch as well--bucket _time by the same range you use in the outer search and join on that as well as current_area.

0 Karma

kailun92
Communicator

I have tried join type=outer, it didn't change anything. Am I using it the wrong way ? Do I have to put a where clause to define the null value ?

0 Karma

kailun92
Communicator

I filter the results in the subsearch, the data came out but it is exactly the same as the accidents count. I cannot find other error in the search query. Will continue doing that.

0 Karma

linu1988
Champion

I think it's because of the assignment in count for the STATS. Filter the values before you count rather than in the count function, you should have the result. Thanks

0 Karma
Get Updates on the Splunk Community!

Introducing Splunk Enterprise 9.2

WATCH HERE! Watch this Tech Talk to learn about the latest features and enhancements shipped in the new Splunk ...

Adoption of RUM and APM at Splunk

    Unleash the power of Splunk Observability   Watch Now In this can't miss Tech Talk! The Splunk Growth ...

Routing logs with Splunk OTel Collector for Kubernetes

The Splunk Distribution of the OpenTelemetry (OTel) Collector is a product that provides a way to ingest ...