Big Sankey diagram (100 sources and 100 targets)

I’d like to plot a Sankey diagram with many nodes (100 sources and 100 targets).

I ran into 2 problems:

  1. The Plotly visualisation crashes when I try to plot more than 300 connections. Is there a limit on the number of nodes and connections I can plot, or am I doing something wrong?

My code:

```python
import plotly.offline

# `tr` and `rate_t2_t1` are pandas DataFrames prepared earlier (not shown here).
data = dict(
    type="sankey",  # needed so plotly.js draws a Sankey trace
    domain=dict(
        x=[0, 1],
        y=[0, 1],
    ),
    orientation="h",
    node=dict(
        pad=10,
        thickness=30,
        line=dict(
            color="black",
            width=0.5,
        ),
        label=list(tr["trend"]),
    ),
    link=dict(
        # source/target hold integer indices into node.label
        source=list(rate_t2_t1["number2"]),
        target=list(rate_t2_t1["number1"]),
        value=list(rate_t2_t1["p(2|1)"]),
    ),
)

layout = dict(
    title="",
    height=1200,
    width=1200,
    font=dict(
        size=10,
    ),
)

fig = dict(data=[data], layout=layout)
plotly.offline.plot(fig, validate=False)
```

  2. The most connections I’ve been able to plot is 300, and the result looked messy.
    Any ideas or suggestions for how to make it look better?

I realised that the mess of lines was mostly due to cycles in the data, so I decided to prepare the data to eliminate cycles.

What I’ve done:
I selected the nodes with the biggest outflow and labelled them as sources. These source nodes have no incoming flow.
I duplicated these nodes, labelled the copies as targets, and added them to the other target nodes, which receive the incoming flow.
This helped to improve the picture.
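
For anyone trying the same thing, here is a minimal sketch of that source/target split. It is not the code used above: the function name `split_sources_and_targets`, the `(source, target, value)` tuple format and the `n_sources` cutoff are assumptions made for illustration. Every kept link runs from a "source" copy to a "target" copy, so the result cannot contain a cycle.

```python
from collections import defaultdict

def split_sources_and_targets(links, n_sources):
    """Hypothetical helper: links is a list of (source, target, value) tuples."""
    # Total outflow per node.
    outflow = defaultdict(float)
    for src, _tgt, value in links:
        outflow[src] += value

    # Keep the n_sources nodes with the largest outflow (ties broken by name).
    top = sorted(outflow, key=lambda n: (-outflow[n], str(n)))[:n_sources]
    top_set = set(top)
    kept = [(s, t, v) for s, t, v in links if s in top_set]

    # Left column: the selected sources. Right column: every node they flow
    # into, including duplicates of the sources themselves.
    targets = sorted({t for _s, t, _v in kept}, key=str)
    labels = [f"{n} (source)" for n in top] + [f"{n} (target)" for n in targets]
    src_index = {n: i for i, n in enumerate(top)}
    tgt_index = {n: len(top) + i for i, n in enumerate(targets)}

    link = dict(
        source=[src_index[s] for s, _t, _v in kept],
        target=[tgt_index[t] for _s, t, _v in kept],
        value=[v for _s, _t, v in kept],
    )
    return labels, link

# Toy data with cycles: A -> B -> A and A -> B -> C -> A.
links = [("A", "B", 5), ("B", "A", 3), ("B", "C", 4), ("C", "A", 1)]
labels, link = split_sources_and_targets(links, n_sources=2)
```

The returned `labels` and `link` can go straight into `node.label` and `link` of the Sankey trace. Links whose source is not among the selected nodes are dropped, which is a simplification in this sketch.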

But I have a problem with arranging the nodes. I would like to arrange them in a specific order and have no idea how to do this.

Hi @completebasis,

See the plotly.js project for some discussion of this.


I could plot 350 sources and targets, but the result looks unusably messy. Fewer than 100 is readable.

Do you have a picture of it? I had to aggregate nodes to make the graph readable by a human :)
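
A rough sketch of what aggregating nodes could look like, assuming the links are `(source, target, value)` tuples and that you supply your own node-to-group mapping. The names `aggregate_links` and `group_of` are made up for this example.

```python
from collections import defaultdict

def aggregate_links(links, group_of):
    """Hypothetical helper: merge nodes into groups and sum the link values."""
    totals = defaultdict(float)
    for src, tgt, value in links:
        # Nodes without a group keep their own name.
        g_src = group_of.get(src, src)
        g_tgt = group_of.get(tgt, tgt)
        # Links that end up inside a single group are dropped.
        if g_src != g_tgt:
            totals[(g_src, g_tgt)] += value
    return [(s, t, v) for (s, t), v in sorted(totals.items())]

links = [("a1", "b1", 2), ("a2", "b1", 3), ("a1", "a2", 1), ("a2", "c", 4)]
group_of = {"a1": "A", "a2": "A", "b1": "B"}
merged = aggregate_links(links, group_of)
```

How to choose the groups depends entirely on the data, so the mapping here is only a placeholder.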

Too much mess :)
I filtered for the most important connections, for example by keeping only connections with a weight above 500. The rendering looks better with fewer nodes.
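
A minimal sketch of that kind of filtering, assuming the links are stored as the three parallel lists a Sankey trace expects. The helper name `filter_links` and the sample numbers are invented for illustration. It also renumbers the nodes so that nodes left without any link disappear from the diagram.

```python
def filter_links(source, target, value, min_value):
    """Hypothetical helper: keep links with value > min_value, reindex nodes."""
    kept = [(s, t, v) for s, t, v in zip(source, target, value) if v > min_value]

    # Old node indices that still appear in at least one kept link.
    used = sorted({n for s, t, _v in kept for n in (s, t)})
    remap = {old: new for new, old in enumerate(used)}

    link = dict(
        source=[remap[s] for s, _t, _v in kept],
        target=[remap[t] for _s, t, _v in kept],
        value=[v for _s, _t, v in kept],
    )
    return used, link

labels = ["a", "b", "c", "d", "e"]
used, link = filter_links(
    source=[0, 1, 2, 2],
    target=[3, 3, 3, 4],
    value=[900, 120, 650, 40],
    min_value=500,
)
# Labels for the nodes that survived, in their new order.
new_labels = [labels[i] for i in used]
```

The threshold of 500 is just the value from my data; a percentile of the link values may be a better starting point for other datasets.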
The best approach would be to segment the data, though I wish I could get one huge diagram.

See our tutorial for an idea of how to define the position of nodes in a Sankey diagram. Please follow this issue for the sorting feature.
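
As a rough illustration of positioning nodes, the sketch below builds a figure dict in the same style as the original post and sets `node.x`, `node.y` and `arrangement` on the Sankey trace. The helper `column_positions`, the labels and the link values are made up for the example, and the coordinates are kept strictly between 0 and 1 as a precaution.

```python
def column_positions(order, x):
    """Hypothetical helper: spread the nodes in `order` evenly down one column."""
    n = len(order)
    # (i + 1) / (n + 1) keeps every y strictly inside (0, 1).
    return {node: (x, (i + 1) / (n + 1)) for i, node in enumerate(order)}

labels = ["A", "B", "C", "D"]

# Desired order within each column, listed from smaller y to larger y.
positions = {}
positions.update(column_positions(["B", "A"], x=0.1))
positions.update(column_positions(["D", "C"], x=0.9))

trace = dict(
    type="sankey",
    arrangement="snap",
    node=dict(
        label=labels,
        # x and y must follow the same order as `labels`.
        x=[positions[name][0] for name in labels],
        y=[positions[name][1] for name in labels],
    ),
    link=dict(source=[0, 0, 1], target=[2, 3, 3], value=[3, 1, 2]),
)
fig = dict(data=[trace], layout=dict(title=""))
```

The resulting `fig` can be passed to `plotly.offline.plot(fig, validate=False)` as in the original post. Plotly may still nudge nodes to avoid overlaps, so treat the coordinates as a request rather than a guarantee.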