
Sunburst chart: repeated labels + missing values

Hello,

I am experimenting with sunburst plots in both JavaScript and Python.

I ran some tests on data produced by a tool,
but I am running into issues caused by repeated labels and missing values.

import plotly.express as px
import pandas as pd

# Build a matrix of 3 lists (parent, names, value), each with 19 elements
n_ls, n_el = 3, 19
Matrix = [["" for x in range(n_el)] for y in range(n_ls)]

Matrix[0][0]  = "";         Matrix[1][0]  = "TOP";     Matrix[2][0]  = 1.1452;
Matrix[0][1]  = "TOP";      Matrix[1][1]  = "TOP_1";   Matrix[2][1]  = 1.1047;
Matrix[0][2]  = "TOP_1";    Matrix[1][2]  = "ST_1";    Matrix[2][2]  = 0.2686;
Matrix[0][3]  = "TOP_1";    Matrix[1][3]  = "ST_2";    Matrix[2][3]  = 0.1719;
Matrix[0][4]  = "ST_2";     Matrix[1][4]  = "ST_A";    Matrix[2][4]  = 0.1719;
Matrix[0][5]  = "ST_A";     Matrix[1][5]  = "A1";      Matrix[2][5]  = 0.0473;
Matrix[0][6]  = "ST_A";     Matrix[1][6]  = "A2";      Matrix[2][6]  = 0.0471;
Matrix[0][7]  = "TOP_1";    Matrix[1][7]  = "ST_3";    Matrix[2][7]  = 0.1709;
Matrix[0][8]  = "ST_3";     Matrix[1][8]  = "ST_A";    Matrix[2][8]  = 0.1708;
Matrix[0][9]  = "ST_A";     Matrix[1][9]  = "A1";      Matrix[2][9]  = 0.0470;
Matrix[0][10] = "ST_A";     Matrix[1][10] = "A2";      Matrix[2][10] = 0.0469;
Matrix[0][11] = "TOP_1";    Matrix[1][11] = "ST_4";    Matrix[2][11] = 0.1129;
Matrix[0][12] = "ST_4";     Matrix[1][12] = "ST_4_1";  Matrix[2][12] = 0.1129;
Matrix[0][13] = "ST_4_1";   Matrix[1][13] = "C1";      Matrix[2][13] = 0.0326;
Matrix[0][14] = "ST_4_1";   Matrix[1][14] = "C2";      Matrix[2][14] = 0.0325;
Matrix[0][15] = "TOP_1";    Matrix[1][15] = "ST_5";    Matrix[2][15] = 0.0977;
Matrix[0][16] = "TOP_1";    Matrix[1][16] = "ST_6";    Matrix[2][16] = 0.0955;
Matrix[0][17] = "TOP";      Matrix[1][17] = "TOP_2";   Matrix[2][17] = 0.0400;
Matrix[0][18] = "";         Matrix[1][18] = "<other>"; Matrix[2][18] = 0.0134;

df = pd.DataFrame(dict(parent=Matrix[0],names=Matrix[1],value=Matrix[2]))
print(df)

fig = px.sunburst(df, names='names', parents='parent', values='value', branchvalues="total")

fig.update_layout(margin = dict(t=0, l=0, r=0, b=0))

fig.write_html("./file.html")

test1: the starting point; because of the repeated labels, nothing is displayed.
test2: test1 with manual fixes, just as a check; it works.
This is not usable in general, because the real tool output is more complicated than this example.
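
One possible way around the repeated labels, as a minimal sketch: give every node a unique id built from its full path, and keep the short label only for display. The helper `assign_unique_ids` below is hypothetical (it is not part of plotly), and it assumes the rows arrive in depth-first order, as in the data above, so that the nearest open ancestor with a matching label is the intended parent.

```python
def assign_unique_ids(rows, sep="/"):
    """Turn (parent_label, label, value) rows into nodes with unique path ids.

    Assumption: rows are listed in depth-first order, so the parent of a row
    is the closest ancestor on the current path that carries the parent label.
    """
    stack = []   # (label, id) pairs for the ancestors of the current row
    nodes = []
    for parent, label, value in rows:
        if parent == "":
            stack = []  # a root node starts a new tree
        else:
            # Climb back up until the top of the stack is the named parent
            while stack and stack[-1][0] != parent:
                stack.pop()
            if not stack:
                raise ValueError(f"parent {parent!r} of {label!r} not found")
        parent_id = stack[-1][1] if stack else ""
        node_id = parent_id + sep + label if parent_id else label
        nodes.append({"id": node_id, "label": label,
                      "parent_id": parent_id, "value": value})
        stack.append((label, node_id))
    return nodes


# Same (parent, name, value) rows as in the matrix above
rows = [
    ("", "TOP", 1.1452), ("TOP", "TOP_1", 1.1047),
    ("TOP_1", "ST_1", 0.2686), ("TOP_1", "ST_2", 0.1719),
    ("ST_2", "ST_A", 0.1719), ("ST_A", "A1", 0.0473), ("ST_A", "A2", 0.0471),
    ("TOP_1", "ST_3", 0.1709), ("ST_3", "ST_A", 0.1708),
    ("ST_A", "A1", 0.0470), ("ST_A", "A2", 0.0469),
    ("TOP_1", "ST_4", 0.1129), ("ST_4", "ST_4_1", 0.1129),
    ("ST_4_1", "C1", 0.0326), ("ST_4_1", "C2", 0.0325),
    ("TOP_1", "ST_5", 0.0977), ("TOP_1", "ST_6", 0.0955),
    ("TOP", "TOP_2", 0.0400), ("", "<other>", 0.0134),
]

nodes = assign_unique_ids(rows)
for node in nodes:
    print(node["id"], node["value"])
```

The resulting list could then be wrapped with `pd.DataFrame(nodes)` and plotted with `px.sunburst(df, ids='id', names='label', parents='parent_id', values='value', branchvalues="total")`, so that the two `ST_A` branches no longer collide.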

I then tried another approach using “missing values”:

import plotly.express as px
import pandas as pd

# Build a matrix of 6 lists (Level_1..Level_5, value), each with 19 elements
n_ls, n_el = 6, 19
Matrix = [["" for x in range(n_el)] for y in range(n_ls)]

Matrix[0][0]  = "TOP";     Matrix[1][0]  = "None";  Matrix[2][0]  = "None"; Matrix[3][0]  = "None";   Matrix[4][0]  = "None"; Matrix[5][0]  = 1.1452;
Matrix[0][1]  = "TOP";     Matrix[1][1]  = "TOP_1"; Matrix[2][1]  = "None"; Matrix[3][1]  = "None";   Matrix[4][1]  = "None"; Matrix[5][1]  = 1.1047;
Matrix[0][2]  = "TOP";     Matrix[1][2]  = "TOP_1"; Matrix[2][2]  = "ST_1"; Matrix[3][2]  = "None";   Matrix[4][2]  = "None"; Matrix[5][2]  = 0.2686;
Matrix[0][3]  = "TOP";     Matrix[1][3]  = "TOP_1"; Matrix[2][3]  = "ST_2"; Matrix[3][3]  = "None";   Matrix[4][3]  = "None"; Matrix[5][3]  = 0.1719;
Matrix[0][4]  = "TOP";     Matrix[1][4]  = "TOP_1"; Matrix[2][4]  = "ST_2"; Matrix[3][4]  = "ST_A";   Matrix[4][4]  = "None"; Matrix[5][4]  = 0.1719;
Matrix[0][5]  = "TOP";     Matrix[1][5]  = "TOP_1"; Matrix[2][5]  = "ST_2"; Matrix[3][5]  = "ST_A";   Matrix[4][5]  = "A1";   Matrix[5][5]  = 0.0473;
Matrix[0][6]  = "TOP";     Matrix[1][6]  = "TOP_1"; Matrix[2][6]  = "ST_2"; Matrix[3][6]  = "ST_A";   Matrix[4][6]  = "A2";   Matrix[5][6]  = 0.0471;
Matrix[0][7]  = "TOP";     Matrix[1][7]  = "TOP_1"; Matrix[2][7]  = "ST_3"; Matrix[3][7]  = "None";   Matrix[4][7]  = "None"; Matrix[5][7]  = 0.1709;
Matrix[0][8]  = "TOP";     Matrix[1][8]  = "TOP_1"; Matrix[2][8]  = "ST_3"; Matrix[3][8]  = "ST_B";   Matrix[4][8]  = "None"; Matrix[5][8]  = 0.1708;
Matrix[0][9]  = "TOP";     Matrix[1][9]  = "TOP_1"; Matrix[2][9]  = "ST_3"; Matrix[3][9]  = "ST_B";   Matrix[4][9]  = "A1";   Matrix[5][9]  = 0.0470;
Matrix[0][10] = "TOP";     Matrix[1][10] = "TOP_1"; Matrix[2][10] = "ST_3"; Matrix[3][10] = "ST_B";   Matrix[4][10] = "A2";   Matrix[5][10] = 0.0469;
Matrix[0][11] = "TOP";     Matrix[1][11] = "TOP_1"; Matrix[2][11] = "ST_4"; Matrix[3][11] = "None";   Matrix[4][11] = "None"; Matrix[5][11] = 0.1129;
Matrix[0][12] = "TOP";     Matrix[1][12] = "TOP_1"; Matrix[2][12] = "ST_4"; Matrix[3][12] = "ST_4_1"; Matrix[4][12] = "None"; Matrix[5][12] = 0.1129;
Matrix[0][13] = "TOP";     Matrix[1][13] = "TOP_1"; Matrix[2][13] = "ST_4"; Matrix[3][13] = "ST_4_1"; Matrix[4][13] = "C1";   Matrix[5][13] = 0.0326;
Matrix[0][14] = "TOP";     Matrix[1][14] = "TOP_1"; Matrix[2][14] = "ST_4"; Matrix[3][14] = "ST_4_1"; Matrix[4][14] = "C2";   Matrix[5][14] = 0.0325;
Matrix[0][15] = "TOP";     Matrix[1][15] = "TOP_1"; Matrix[2][15] = "ST_5"; Matrix[3][15] = "None";   Matrix[4][15] = "None"; Matrix[5][15] = 0.0977;
Matrix[0][16] = "TOP";     Matrix[1][16] = "TOP_1"; Matrix[2][16] = "ST_6"; Matrix[3][16] = "None";   Matrix[4][16] = "None"; Matrix[5][16] = 0.0955;
Matrix[0][17] = "TOP";     Matrix[1][17] = "TOP_2"; Matrix[2][17] = "None"; Matrix[3][17] = "None";   Matrix[4][17] = "None"; Matrix[5][17] = 0.0400;
Matrix[0][18] = "<other>"; Matrix[1][18] = "None";  Matrix[2][18] = "None"; Matrix[3][18] = "None";   Matrix[4][18] = "None"; Matrix[5][18] = 0.0134;

df = pd.DataFrame(dict(Level_1=Matrix[0], Level_2=Matrix[1], Level_3=Matrix[2], Level_4=Matrix[3], Level_5=Matrix[4], value=Matrix[5]))
print(df)

fig = px.sunburst(df, path=['Level_1', 'Level_2', 'Level_3', 'Level_4', 'Level_5'], values='value', branchvalues="total")

fig.update_layout(margin = dict(t=0, l=0, r=0, b=0))

fig.write_html("./file3.html")

test3: using ‘path’ and ‘None’, but the result is far from what I’m looking for
test4: tried removing some of the ‘None’ entries, no success
test5: removed all ‘None’ entries, no success
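
A possible explanation, offered as a sketch rather than a confirmed diagnosis: as far as I can tell from the Plotly Express documentation on rectangular data with missing values, `path` expects real `None` values (not the string "None") and only leaf rows, since a row that ends in `None` may not also have non-`None` children. The hypothetical helper `keep_leaf_rows` below (not a plotly function) drops the intermediate rows and swaps the "None" placeholders for `None`; it assumes the placeholders only ever appear at the end of a path.

```python
def keep_leaf_rows(rows, missing="None"):
    """Keep only leaf rows and replace trailing placeholders with real None.

    rows: list of (levels, value), where levels is a list of labels padded
    with the placeholder string `missing`.
    """
    depth = max(len(levels) for levels, _ in rows)
    # Real path of each row, without the placeholder padding
    paths = [tuple(lv for lv in levels if lv != missing) for levels, _ in rows]
    # Every proper prefix of some path is an inner node, not a leaf
    inner = {p[:i] for p in paths for i in range(1, len(p))}
    leaves = []
    for path, (_, value) in zip(paths, rows):
        if path in inner:
            continue  # this row describes a branch that has children
        leaves.append(list(path) + [None] * (depth - len(path)) + [value])
    return leaves


# A shortened version of the Level_1..Level_5 data from the post
rows = [
    (["TOP", "None", "None", "None", "None"], 1.1452),
    (["TOP", "TOP_1", "None", "None", "None"], 1.1047),
    (["TOP", "TOP_1", "ST_1", "None", "None"], 0.2686),
    (["TOP", "TOP_1", "ST_2", "None", "None"], 0.1719),
    (["TOP", "TOP_1", "ST_2", "ST_A", "None"], 0.1719),
    (["TOP", "TOP_1", "ST_2", "ST_A", "A1"], 0.0473),
    (["TOP", "TOP_1", "ST_2", "ST_A", "A2"], 0.0471),
    (["TOP", "TOP_2", "None", "None", "None"], 0.0400),
    (["<other>", "None", "None", "None", "None"], 0.0134),
]

leaves = keep_leaf_rows(rows)
for row in leaves:
    print(row)
```

The rows returned here could be loaded into a DataFrame with the columns `Level_1` to `Level_5` and `value` and passed to `px.sunburst(..., path=[...])`. One caveat with this route: parent sectors are then sized from the sum of their leaves, so totals reported by the tool for intermediate nodes (for example 0.1719 for `ST_2`, against 0.0473 + 0.0471 for its leaves) are not preserved.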

Could you help me solve this?

Thanks!

Should I have posted this in “plotly.python” instead of “Graphing Library”?
If yes, could an admin move it?

Thanks!