Data frame and codes needed to get a plotly sunburst chart

I got this EXCEL data frame, intended to build a sunburst chart using plotly. EXCEL data frame intended to get a sunburst chart

I am just new with plotly, and I never tried before to get a sunburst chart.

There’s no way I can get the expected sunburst, but plotly online is giving me no error message, neither anything strange seems to happen: the chart just and simply doesn’t show up, and I don’t know if maybe the data frame wasn’t properly built… I really don’t know, but it would be very helpful if anybody could give me a hand on this matter. Thanks all.

Hi @bernatsn and welcome to the forum! Your dataframe should include one line per sector of the sunburst, not only for the outer ring but also for the different levels of the hierarchy (ie you should add a line for “LOGISTICS - Ingridients” etc.). All the names of the “parents” should have their own line, except for the top level corresponding to an id of "" (see for example https://plot.ly/python/sunburst-charts/).

Since we realize that it can be cumbersome to build the arrays for ids and parents we are currently working on a different API which would accept tables such as the one you linked to. Please stay tuned, and in the meantime I hope the doc examples and what I wrote will be enough so that you can plot your sunburst charts!

Hello Emmanuelle, and first of all thanks so much for your kindness and availability!!!

I think I understood what you mean, so I tried to manipulate my data frame this way:

https://www.dropbox.com/s/7sllatipu75slt6/cr_%20changed.xlsx?dl=0

I tried to get the chart both ways so from Plotly site, and from rstudio, running this command:

library(plotly)

plot_ly(cr, ids=cr$ids, labels=cr$labels,

parents=cr$parents, values=cr$values, type=‘sunburst’,

branchvalues = ‘total’)

:frowning: no way, I am sorry, but again, no error message and nothing out of order, but simply no chart…

Apologies if I am disturbing and pushing too much, but it would be awesome if I could get the chart for tomorrow.

Again, MANY THANKS, and kind regards.

Have a nice day.

Hi @bernatsn, the problem is that you have trailing spaces in the ids or parents column (I checked this by writing a loop checking that every parent was also in the id columns). When you remove the trailing spaces it works as shown as the attached picture. I also copy here the Python code I used for this, I hope you can adapt it for R or remove the spaces directly in Excel.

import plotly.graph_objects as go
import pandas as pd
df = pd.read_excel('/home/emma/Downloads/cr_ changed.xlsx')
df.loc[df['parents'].isna(), 'parents'] = ''
df.parents = df.parents.str.strip()
df.ids = df.ids.str.strip()
fig = go.Figure(go.Sunburst(ids=df.ids, labels=df.labels, parents=df.parents))
fig.show()

Ufffff… thanks again, that’s maddening…

Again, MANY THANKS.

I applied the equivalent function in rstudio to remove the spaces, end that is the best I got, but only after deleting the order branchvalues=’total’

From plotly site, there’s no way, even changing the “parent value mode”… what a pity, because having that the way it look from your Python script would be great for a presentation that I have this next week…

So at the point I am here, my two questions would be:

  • How to get the sunburst chart the way you got it, but in Plotly site, with branchvalues=’total’.

Or:

  • How to get the chart with rstudio but with branchvalues=’total’.

Sorry for bothering you too much, but I really appreciate you help, kindness and availability.

Kind regards!!!

Hey @bernatsn, could you share the link to your chart studio plot with the corrected spreadsheet? In the Jupyter notebook I could get it working using the values from your data and branchvalues='total', see below

import plotly.graph_objects as go
import pandas as pd
df = pd.read_excel('/home/emma/Downloads/cr_ changed.xlsx')
df.loc[df['parents'].isna(), 'parents'] = ''
df.parents = df.parents.str.strip()
df.ids = df.ids.str.strip()
fig = go.Figure(go.Sunburst(ids=df['ids'], labels=df['labels'], parents=df['parents'], 
                            values=df['values'], 
                            branchvalues='total'
                           ))
fig.show()

https://www.dropbox.com/s/60yv2brgbv8s2s6/cr_csv.xlsx?dl=0

Sure!!!

There you go…

Even if it is called cr_csv, it is not a .csv but an EXCEL file.

Once imported to rstudio, just running this command, I am getting the following chart, as told before:

plot_ly(cr_csv, ids=cr_csv$ids, labels=cr_csv$labels,

parents=cr_csv$parents, values=cr_csv$values, type=‘sunburst’)

image003.jpg

But just adding branchvalues=’total’, it is returning nothing to me.

plot_ly(cr_csv, ids=cr_csv$ids, labels=cr_csv$labels,

parents=cr_csv$parents, values=cr_csv$values, type=‘sunburst’,

branchvalues=‘total’)

Exactly the same file uploaded to Plotly site, it is returning nothing to me, it doesn’t matter the Parent Value Mode I select:

Thanks one more time :blush:

Regards.

Hi Emmanuelle,

I am working through the same problem. I tried your code and I get a blank screen as well.

#df.loc[df[‘parents’].isna(), ‘parents’] = ‘’
df.loc[df[‘topic’].isna(), ‘topic’] = ‘’
df[‘parents’] = df.topic.str.strip() + ’ - ’ + df.subtopic1.str.strip()
df[‘ids’] = df.parents + ’ - ’ + df.code.str.strip()
df.code = df.code.str.strip()
df.desc = df.desc.str.strip()

#fig = go.Figure(go.Sunburst(ids=df.ids, labels=df.labels, parents=df.parents))
fig = go.Figure(go.Sunburst(ids=df.ids, labels=df.code, parents=df.parents))
fig.show()

Here is the data that I am using …

Here is the same data using plotly.express and “path” - it works fine …

fig = px.sunburst(df2, path=[‘topic’,‘subtopic1’,‘desc’], values=‘score’, maxdepth=2, hover_data = [‘code’], hover_name = “http://csq1.org/info/” + df2[‘code’]+".htm")

fig.show()

PS. I also want to click on a label and open an html url or other external onclick - if that is possible…

click-leaf

Is there a quick, generic for loop that can create the extra blank parent rows as needed? That function would transpose Go. to PATH compliant tables for sunbursts - essentially.

I spent some time automating loops to build sunburst data similar to the data that @bernatsn used for his chart.

My data is here… https://chart-studio.plot.ly/~etilley/44

but still no chart is being rendered … the only thing missing is values for parents but that should be unimportant I think…

Here is the code that turns the “path” spreadsheet into this go. Sunburst formatting …

import plotly.graph_objects as go
import pandas as pd

df=pd.DataFrame()

df = pd.read_csv(’…/CSV/Indicators50.csv’)
df = df[[“code”, “desc”, “score”, “topic”, “subtopic1”, “subtopic2”, “subtopic3”]].drop_duplicates()
#df.loc[df[‘parents’].isna(), ‘parents’] = ‘’

df[‘parents’] = df.topic.str.strip() + ’ - ’ + df.subtopic1.str.strip()
df[‘ids’] = df.parents + ’ - ’ + df.code.str.strip()
df[‘labels’] = df.code.str.strip()
#df[‘labels’] = df.desc.str.strip()

create the dedicated IDs rows

df_ids = df.sort_values(‘subtopic1’).drop_duplicates([‘subtopic1’], keep=‘last’)
#print(“Length of df_inds:”, len(df_ids))
for i in range(len(df_ids)):
df = df.append({‘ids’:df_ids.iloc[i,3] + ’ - ’ + df_ids.iloc[i,4], ‘parents’:df_ids.iloc[i,3], ‘labels’:df_ids.iloc[i,4], ‘score’:df_ids.iloc[i,2]}, ignore_index=True)

#create the dedicated Parents rows in dataframe
df_parents = df.topic.drop_duplicates().reset_index(drop=True)

don’t count the newly added Parents records

#df_parents = df_parents.dropna((subset=[‘code’]) )
df_parents = df_parents.dropna()
for i in df_parents:
df = df.append({‘parents’:"", ‘ids’:i, ‘labels’:i}, ignore_index=True)
#print(“Shape of df_parents:”, df_parents.shape)
#print(“Shape of df:”, df.shape)

df.to_csv(’…/CSV/Sunburst50.csv’)

fig = go.Figure()

fig.add_trace(go.Sunburst(
ids=df.ids,
labels=df.labels,
parents=df.parents,
domain=dict(column=0)
))

fig.update_layout(
grid= dict(columns=2, rows=1),
margin = dict(t=0, l=0, r=0, b=0)
)

fig.show()