Sankey Diagram Data Preprocessing - Python

Andrewash46 · February 20, 2018, 1:46pm

I am having a hard time understanding how to preprocess my Python Pandas dataframe to pull into a Sankey diagram script. I have a df of online consumer data with multiple rows pertaining to one consumer. The dataframe has over 50,000 consumer sample paths. Does anyone know how to approach this?

Thanks!

Andrew

mihalw28 · November 14, 2018, 10:22am

I hope you found the right approach. I’m dealing with a similar challenge now (a 5000 rows x 10 cols data frame) and it’s definitely not a piece of cake.
Best!

jmmease · November 15, 2018, 10:35am

Hi @Andrewash46 and @mihalw28,

The pre-processing approach depends on the form of the data you’re starting with. A Sankey diagram is designed to represent a weighted graph (nodes, edges, and weights).
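As a rough sketch of what that means for path-style data: assuming a long-format dataframe with one row per consumer per step, the edges are the transitions between consecutive steps and the weights are how many consumers made each transition. The column names (`consumer_id`, `step`, `page`) and the sample rows below are made up for illustration; they are not from the original question.

```python
import pandas as pd

# Hypothetical long-format data: one row per consumer per step.
# Column names and values are assumptions for illustration only.
df = pd.DataFrame({
    "consumer_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "step":        [1, 2, 3, 1, 2, 1, 2, 3],
    "page":        ["home", "search", "cart", "home", "cart",
                    "home", "search", "cart"],
})

# Order each consumer's rows, then pair every row with the next page
# visited by the same consumer.
df = df.sort_values(["consumer_id", "step"])
df["next_page"] = df.groupby("consumer_id")["page"].shift(-1)

# A consumer's last step has no outgoing edge, so drop it,
# then count how often each (page -> next_page) transition occurs.
links = (
    df.dropna(subset=["next_page"])
      .groupby(["page", "next_page"])
      .size()
      .reset_index(name="value")
)

# Sankey links refer to nodes by integer position, so map labels to indices.
labels = sorted(set(links["page"]) | set(links["next_page"]))
index = {label: i for i, label in enumerate(labels)}

node = {"label": labels}
link = {
    "source": links["page"].map(index).tolist(),
    "target": links["next_page"].map(index).tolist(),
    "value": links["value"].tolist(),
}
```

The resulting `node` and `link` dicts have the shape that `plotly.graph_objects.Sankey(node=node, link=link)` expects. If consumers can revisit a page, the graph will contain cycles; one option in that case is to make the node label include the step number (for example `"home (step 1)"`), so every link points forward.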

You may also want to take a look at the new Parallel Categories diagram (see https://plot.ly/python/parallel-categories-diagram/). It looks superficially similar to a Sankey diagram, but it’s designed to represent multi-dimensional categorical datasets.
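For that kind of diagram the data usually needs to be wide rather than long: one row per consumer and one categorical column per dimension. A minimal sketch, again assuming hypothetical `consumer_id`, `step`, and `page` columns, and assuming that consumers with shorter paths should get a placeholder category (`"exit"` here is an arbitrary choice):

```python
import pandas as pd

# Hypothetical long-format data: one row per consumer per step.
# Column names and values are assumptions for illustration only.
df = pd.DataFrame({
    "consumer_id": [1, 1, 1, 2, 2, 3, 3, 3],
    "step":        [1, 2, 3, 1, 2, 1, 2, 3],
    "page":        ["home", "search", "cart", "home", "cart",
                    "home", "search", "cart"],
})

# Pivot to one row per consumer, one column per step.
wide = (
    df.pivot(index="consumer_id", columns="step", values="page")
      .add_prefix("step_")   # columns become step_1, step_2, step_3
      .fillna("exit")        # placeholder for paths that ended early
      .reset_index()
)

# The step columns are the categorical dimensions of the diagram.
dimensions = [col for col in wide.columns if col.startswith("step_")]
```

The `wide` frame could then be passed to `plotly.express.parallel_categories(wide, dimensions=dimensions)`. Note that `pivot` requires each (consumer, step) pair to appear only once.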

Feel free to share more details on your dataset if you want to talk through it more,
-Jon

mihalw28 · November 19, 2018, 9:58am

Thanks for your response @jmmease.

The Parallel Categories diagram is new to me, and it looks very interesting in the examples. I’ll try it with my data. Once again, thanks.
