How to create a multistage Sankey diagram with a single search without needing to "append"?

doweaver
Path Finder

I have a dataset where each event summarizes a workflow, using the fields Foo->Bar->Baz, and I'm looking to create a Sankey diagram to visualize the flow. The only way I've come up with to get the output I want is to run one search, do a stats call, and then append the same query with a different stats call, like:

index=myIndex | stats count BY Foo, Bar | rename Foo AS source, Bar AS target | append [search index=myIndex | stats count BY Bar, Baz | rename Bar AS source, Baz AS target]

This works, but it's incredibly inefficient, and MUCH slower than it needs to be. Is there a way to get the output I'm looking for with a single search that I'm missing?

The output table would look something like:

source | target | count
foo1   | bar1   | 3
foo1   | bar2   | 12
bar1   | baz1   | 1
bar1   | baz2   | 2
bar2   | baz1   | 12

1 Solution

aljohnson_splun
Splunk Employee

If you can count by all three fields, appendpipe might be less resource-intensive than append, since it runs its subpipeline against the results already in the pipeline instead of launching a second search:

sourcetype="access_combined" 
| stats count by host categoryId product_name
| appendpipe [stats count by host categoryId | rename host as source, categoryId as target]
| appendpipe [stats count by categoryId product_name | rename categoryId as source, product_name as target]
| search source=*
| fields source target count

gives me

(screenshot of the result, not preserved)


jmurata
Loves-to-Learn Everything

Hi @doweaver, @aljohnson_splun, @fulldanad. A newbie question: I posted a thread at https://community.splunk.com/t5/Dashboards-Visualizations/Modified-Sankey-visualization-for-path-ana... about what is (IMHO) the same issue as described above. I would like to replicate the final solution to check whether I can apply it to my task, but I can't create the dataset (external or inline) required for this search:

sourcetype="access_combined"
| table host categoryId product_name
| appendpipe [stats count by host categoryId | rename host as source, categoryId as target]
| appendpipe [stats count by categoryId product_name | rename categoryId as source, product_name as target]
| search source=*
| fields source target count

Could you help me rebuild it in a minimum number of lines so that I can replicate the solution? BTW, does it work with the Sankey 1.6.0 app (the latest version)?

Thanks a lot
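
One way to try the search without the access_combined data is to generate a few inline events with makeresults. This is only a sketch: the host, categoryId and product_name values below are made up for illustration, and whether the output renders in a particular version of the Sankey app is not something this thread confirms.

```spl
| makeresults count=6
| streamstats count AS n
``` made-up sample values, one row per simulated event ```
| eval host=if(n<=4, "web1", "web2")
| eval categoryId=case(n<=2, "STRATEGY", n<=4, "ARCADE", true(), "STRATEGY")
| eval product_name=if(n%2==0, "Game A", "Game B")
| table host categoryId product_name
``` first stage: host -> categoryId ```
| appendpipe [stats count by host categoryId | rename host as source, categoryId as target]
``` second stage: categoryId -> product_name ```
| appendpipe [stats count by categoryId product_name | rename categoryId as source, product_name as target]
``` keep only the appended link rows ```
| search source=*
| fields source target count
```

The leading pipe matters, because makeresults is a generating command. Everything from the table command onward is the search quoted above, unchanged.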

 


spisiakmi
Communicator

Hi aljohnson. I want to thank you very much for this solution. I applied it to my problem and it worked very well. Well done.


doweaver
Path Finder

Hmm - I tried to post your comment as the answer, but Splunk is saying I can't make more than 2 posts per day until I hit 40 points. Pretty sure I've only made one post today, but...

/shrug

If you paste that same thing as the answer, I'll mark it solved 🙂

fulldanad
Path Finder

Hi aljohnson,

Thanks for your answer; it would help a lot to have it integrated into the documentation.

Below is a small amendment that sizes the lines correctly by replacing the initial stats with table:

sourcetype="access_combined"
| table host categoryId product_name
| appendpipe [stats count by host categoryId | rename host as source, categoryId as target]
| appendpipe [stats count by categoryId product_name | rename categoryId as source, product_name as target]
| search source=*
| fields source target count
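
The amendment changes the line sizes because, in the original answer, each stats count inside appendpipe counts the rows of the already-aggregated table rather than the underlying events. An alternative sketch (not from the thread) keeps the initial stats and sums its counts instead, assuming the same sourcetype and field names:

```spl
sourcetype="access_combined"
| stats count by host categoryId product_name
``` sum the pre-aggregated counts so each link reflects the number of events ```
| appendpipe [stats sum(count) as count by host categoryId | rename host as source, categoryId as target]
| appendpipe [stats sum(count) as count by categoryId product_name | rename categoryId as source, product_name as target]
| search source=*
| fields source target count
```

The second appendpipe ignores the rows added by the first one, because those rows no longer have a categoryId or product_name field and stats drops rows that lack a by-field. Keeping the initial stats may also be lighter on large datasets than passing every event through table, though that has not been measured here.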


aljohnson_splun
Splunk Employee

Glad it worked. Converted 🙂


doweaver
Path Finder

Yes! Perfect!

Didn't realize appendpipe was a thing. Thanks for your help!

doweaver
Path Finder

...I have no idea why a random "5." is showing up in the middle of the table...


aljohnson_splun
Splunk Employee

Cool question @doweaver. How many distinct values are there of foo bar and baz? As a solution for dc(foo) = 2 might be a lot simpler than all of those distinct values being an unknown variable.

doweaver
Path Finder

There are probably ~5 distinct values for each.

I'm not sure I understand what you're getting at here:

As a solution for dc(foo) = 2 might be a lot simpler than all of those distinct values being an unknown variable.


aljohnson_splun
Splunk Employee

Sorry, that wasn't well worded. I just meant that if there is a smaller number of distinct values, you might be able to get a simpler answer (I'm more thinking out loud haha, sorry).

So obviously foo and bar occur together, and bar and baz occur together, but do foo and baz NOT occur together? That is, is there a reason you can't search:

index=myIndex | stats count by foo bar baz

doweaver
Path Finder

No worries 🙂

Unfortunately, they all three occur in a single event 😞 Technically, it's a transaction that links A -> B, with A containing Foo, and B containing Bar and Baz. I don't THINK there's a way to split things up in a way that will make that work... but I'll keep thinking about that.
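
Having all three fields on the same event is what lets the appendpipe pattern from the accepted solution work on Foo, Bar and Baz directly. A minimal sketch, assuming the index and field names from the question and that each result row carries the fields as-is:

```spl
index=myIndex
| table Foo Bar Baz
``` first stage: Foo -> Bar ```
| appendpipe [stats count by Foo Bar | rename Foo as source, Bar as target]
``` second stage: Bar -> Baz ```
| appendpipe [stats count by Bar Baz | rename Bar as source, Baz as target]
| search source=*
| fields source target count
```

Because stats drops rows in which a by-field is missing, an event that has Foo and Bar but no Baz still contributes to the Foo -> Bar stage in this form, whereas an initial stats count by Foo Bar Baz would discard it.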


ppablo
Retired

Hi @doweaver

That's just the automatic line numbering applied to anything in a code block, so that people helping out can point to the exact line where they've spotted a syntax error when someone shares multiple lines of sample data or code.

doweaver
Path Finder

Oh, that makes sense 🙂 That was the best way I could figure out to put in a table (HTML table markup didn't seem to work).


ppablo
Retired

heh yeah, that's the best way to display a table format on here. you're doin it right 🙂
