Ignore "Non-leaves rows" for sunburst diagram?

edent · February 7, 2022, 3:43pm

I have a DataFrame of hierarchical data:

       0    1      2      3     4
0  alice  bob  chuck  david  ella
0  alice  bob  chuck  david  fred
0  alice  bob  chuck    NaN   NaN

If I try to create a Sunburst plot, I get told

Non-leaves rows are not permitted in the dataframe

The same thing occurs if I replace the NaN with None.

I’m aware that I could replace the NaNs with some dummy text, but that will distort the diagram I’m drawing.

Is there a way to skip these non-leaf rows?

Thanks!

edent · February 7, 2022, 4:17pm

I commented out the check in plotly/express/_core.py and it worked.

See:

def _check_dataframe_all_leaves(df):
    df_sorted = df.sort_values(by=list(df.columns))
    null_mask = df_sorted.isnull()
    df_sorted = df_sorted.astype(str)
    null_indices = np.nonzero(null_mask.any(axis=1).values)[0]
    for null_row_index in null_indices:
        row = null_mask.iloc[null_row_index]
        i = np.nonzero(row.values)[0][0]
        if not row[i:].all():
            raise ValueError(
                "None entries cannot have not-None children",
                df_sorted.iloc[null_row_index],
            )
    df_sorted[null_mask] = ""
    row_strings = list(df_sorted.apply(lambda x: "".join(x), axis=1))
    #for i, row in enumerate(row_strings[:-1]):
        #if row_strings[i + 1] in row and (i + 1) in null_indices:
            #raise ValueError(
            #    "Non-leaves rows are not permitted in the dataframe \n",
            #    df_sorted.iloc[i + 1],
            #    "is not a leaf.",
            #)

My diagram renders perfectly. It would be great if there was an ignore_non_leaves=True option, rather than my horrible hack!

ddavo · February 28, 2022, 10:59am

Plotly creates the intermediate (non-leaf) nodes automatically from the leaf rows, so you can safely delete these non-leaf rows.

Instead of commenting code from the library (which won’t work if you update or go to another machine), you can modify the dataframe:

df = df.dropna()

yhs · April 13, 2022, 7:13am

Thanks @ddavo , you are the man!

ameyakambli · April 21, 2022, 1:26am

But won’t this cause an issue, since both columns 3 and 4 would be dropped if we use df.dropna()?

ddavo · April 21, 2022, 8:21am

By default, dropna() drops rows that contain null values, not columns.
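
To make the difference concrete, here is a small sketch using the example frame from the first post (the frame is rebuilt by hand here, so treat the exact construction as an assumption):

```python
import pandas as pd

# Rebuild the example from the first post; None marks the missing levels.
df = pd.DataFrame(
    [
        ["alice", "bob", "chuck", "david", "ella"],
        ["alice", "bob", "chuck", "david", "fred"],
        ["alice", "bob", "chuck", None, None],
    ]
)

rows_dropped = df.dropna()        # default axis=0: drop rows containing a missing value
cols_dropped = df.dropna(axis=1)  # axis=1: drop columns containing a missing value

print(rows_dropped.shape)  # (2, 5): only the non-leaf row is gone
print(cols_dropped.shape)  # (3, 3): columns 3 and 4 are gone
```

So columns 3 and 4 only disappear if you explicitly ask for axis=1.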

ameyakambli · April 22, 2022, 7:54pm

       0    1      2      3     4
0  alice  bob  smith    NaN   NaN
0  alice  bob  smith  rocky   NaN
0  alice  bob  chuck  david  ella
0  alice  bob  chuck  david  fred
0  alice  bob  chuck    NaN   NaN

This is more specific to my current scenario: with dropna() I will lose the rows branching out from bob to smith. Smith and Chuck are both children of Bob. How do I visualize a treemap/sunburst in this scenario without filling the NaNs with dummy text?

ddavo · April 23, 2022, 9:10am

This is because the expected DataFrame needs to be rectangular, with each column holding a value to group by. In the docs, there is an example with a DataFrame grouped by day, then by time, and then by sex.
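
A note on the check itself: judging by the _check_dataframe_all_leaves code quoted earlier in this thread, the error is raised only for a row whose missing values start at a node that has children in some other row. Rows that simply end early at a real leaf (smith, then rocky) seem to be accepted. So instead of a blanket dropna(), it may be enough to drop just the rows that are a prefix of a deeper row. A minimal sketch in plain Python; is_missing and drop_non_leaf_rows are hypothetical helper names, not part of Plotly or pandas:

```python
def is_missing(value):
    # None, or a float NaN (NaN is the only value that is not equal to itself).
    return value is None or value != value


def drop_non_leaf_rows(rows):
    """Keep only rows whose path is not a strict prefix of another row's path."""
    rows = list(rows)
    # Trim trailing missing values so each row becomes a plain root-to-node path.
    # Missing values in the middle of a row are left alone (Plotly rejects those
    # with a different error).
    paths = []
    for row in rows:
        path = list(row)
        while path and is_missing(path[-1]):
            path.pop()
        paths.append(tuple(path))
    # Every strict prefix of a path is an internal (non-leaf) node.
    prefixes = {p[:i] for p in paths for i in range(1, len(p))}
    return [row for row, path in zip(rows, paths) if path and path not in prefixes]


rows = [
    ("alice", "bob", "smith", None, None),
    ("alice", "bob", "smith", "rocky", None),
    ("alice", "bob", "chuck", "david", "ella"),
    ("alice", "bob", "chuck", "david", "fred"),
    ("alice", "bob", "chuck", float("nan"), float("nan")),
]
kept = drop_non_leaf_rows(rows)
for row in kept:
    print(row)
```

This keeps the smith/rocky row and the two david rows, and drops the two rows that stop at smith and chuck. With a DataFrame you could presumably feed it list(df.itertuples(index=False, name=None)) and rebuild the frame from the result before passing it to px.sunburst with path=..., but I have not tested that end to end.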

In your case, I think it would be easier to just use names and parents instead of passing a DataFrame.

import plotly.express as px

# parents[i] is the parent of names[i]; the root ("Alice") has an empty parent.
fig = px.treemap(
    names=["Alice", "Bob", "Smith", "Rocky", "Chuck", "David", "Ella", "Fred"],
    parents=["", "Alice", "Bob", "Smith", "Bob", "Chuck", "David", "David"],
)
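
If you have more than a handful of rows, writing these lists by hand gets tedious. Here is a rough sketch of building them from root-to-leaf paths. paths_to_nodes is a hypothetical helper, and the "/"-joined ids are my own assumption to keep nodes unique when the same name appears under different parents (it also assumes no name contains "/"):

```python
def paths_to_nodes(paths):
    """Turn root-to-leaf paths into parallel ids / names / parents lists."""
    ids, names, parents = [], [], []
    seen = set()
    for path in paths:
        for depth in range(1, len(path) + 1):
            node_id = "/".join(path[:depth])
            if node_id in seen:
                continue  # node already emitted via an earlier path
            seen.add(node_id)
            ids.append(node_id)
            names.append(path[depth - 1])
            # The parent id is the id of the path one level up ("" for the root).
            parents.append("/".join(path[: depth - 1]))
    return ids, names, parents


# Paths with the missing trailing levels already stripped.
paths = [
    ("alice", "bob", "smith", "rocky"),
    ("alice", "bob", "chuck", "david", "ella"),
    ("alice", "bob", "chuck", "david", "fred"),
]
ids, names, parents = paths_to_nodes(paths)
print(names)
print(parents)
```

The three lists should then be usable as px.treemap(ids=ids, names=names, parents=parents), or the same with px.sunburst. When ids is given, the entries in parents refer to ids rather than names.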

noLogoInTheFoam · December 1, 2022, 5:39pm

Thanks @edent! I’ve also had to use this hack.
