df.plot(kind='box') with plotly returns a wide-form data error

I was trying to use px to create a boxplot for all the columns in a dataset (890k x 85), since df.plot(kind='box') doesn’t give me interactivity.

The dataset has not been cleaned, so it has empty columns as well as ordinal and categorical columns, but it looks similar to this:

     A    B  C    D  E  F  G  H  I  J    K    L    M    N    O    P    Q    R    S    T
0  NaN  2.0  1  2.0  3  4  3  5  5  3  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN  NaN
1  NaN  1.0  2  5.0  1  5  2  5  4  5  2.0  3.0  2.0  1.0  1.0  5.0  4.0  3.0  5.0  4.0
2  NaN  3.0  2  3.0  1  4  1  2  3  5  3.0  3.0  1.0  0.0  1.0  4.0  4.0  3.0  5.0  2.0

I was using px.box(df), and it returns “ValueError: Plotly Express cannot process wide-form data with columns of different type.” But I think my data is in tidy form, so I suppose this is an error? I could fall back to traces, but I wanted to know whether this is a mistake on my side, because the data needs a transformation, or a bug in plotly.

This is intentional: as the error message says, wide-form data is only accepted as is so long as all the columns are the same type, so you must select, say, only the numerical columns, as would make sense for a box plot.
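For anyone unsure about the terminology, here is a rough sketch of the two input shapes. When px.box receives a data frame without x or y, it treats it as wide-form, meaning every column becomes its own box. The names `wide`, `long_df` and `box_plots` are made up for this illustration, and the values are just columns B and C from the sample above, both written as floats.

```python
import pandas as pd

# Wide form: one column per variable, one row per observation.
wide = pd.DataFrame({"B": [2.0, 1.0, 3.0], "C": [1.0, 2.0, 2.0]})

# Long (tidy) form: one row per (variable, value) pair.
long_df = wide.melt(var_name="variable", value_name="value")


def box_plots():
    """Build the same box plot from each form (requires plotly)."""
    import plotly.express as px  # imported here so the pandas part runs without plotly

    from_wide = px.box(wide)  # no x/y given, so every column becomes a box
    from_long = px.box(long_df, x="variable", y="value")
    return from_wide, from_long
```

Calling `box_plots()` should give two equivalent figures. The long form has a single value column, so the "columns of different type" check on wide-form input does not come into play there.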

Follow-up: I do see that Pandas’ built-in backend does this filtering automatically for kind="box", and this is something we can consider baking into our backend. At the moment, though, you’ll have to explicitly provide either a dataframe with only numerical columns, i.e. df[["A","B","C"]].plot(kind="box"), or a mixed dataframe with a specified list of columns to plot, i.e. df.plot(kind="box", y=["A","B","C"]).
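With 85 columns, listing the numeric ones by hand gets tedious, so the filtering can be done with pandas before plotting. This is only a sketch: `numeric_columns` and `box_plot` are hypothetical helper names, the `label` column is invented to stand in for a categorical column, and it assumes a Plotly version in which all numeric dtypes count as the same type (4.8.2 or later, going by the last reply in this thread).

```python
import numpy as np
import pandas as pd


def numeric_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Return only the numeric columns that contain at least one value."""
    numeric = df.select_dtypes(include="number")
    # Drop columns that are entirely NaN (like column A in the sample).
    return numeric.dropna(axis="columns", how="all")


def box_plot(df: pd.DataFrame):
    """Build an interactive box plot with one box per numeric column."""
    import plotly.express as px  # imported here so the helper above runs without plotly

    return px.box(numeric_columns(df))


# Small frame shaped like the sample in the question.
sample = pd.DataFrame(
    {
        "A": [np.nan, np.nan, np.nan],
        "B": [2.0, 1.0, 3.0],
        "C": [1, 2, 2],
        "label": ["x", "y", "z"],  # hypothetical categorical column
    }
)
```

`box_plot(sample).show()` would then plot B and C only. Dropping the all-NaN columns is optional; they would otherwise show up as empty boxes.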

I thought it could be a bug because the data is not in wide form, no? I thought wide form was only when it had two axes, like the wide_df here:


But like they say, you learn something new every day :smiley:. So thanks for taking the time to answer and investigate, and for all the hard work you do answering all of us, mostly on basic questions.

I think this is a bug. In pandas I saw what you described: it drops the categorical columns but uses the rest of the columns, mainly float64 and int64.

With plotly, by contrast, a boxplot of A and B works, but px.box(aux.iloc[:,:3]) again returns “ValueError: Plotly Express cannot process wide-form data with columns of different type.” And okay, it is true that there are different types, but it works in pandas, and I suppose it should work here too, since both are numeric types.

I used the data from the first post, where A and B are float64 and C is int64.
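A possible workaround on versions that reject the float64/int64 mix is to cast everything to a single dtype before plotting. The sketch below rebuilds the first three columns of the sample as `aux`; `same_type` and `box_plot` are names made up for the example, and casting to float64 is just one choice that happens to fit this data.

```python
import numpy as np
import pandas as pd

# First three columns of the sample from the first post:
# A and B hold floats, C holds integers.
aux = pd.DataFrame(
    {
        "A": [np.nan, np.nan, np.nan],
        "B": [2.0, 1.0, 3.0],
        "C": [1, 2, 2],
    }
)

# Casting to one dtype makes every column "the same type" for the wide-form check.
same_type = aux.iloc[:, :3].astype("float64")


def box_plot(frame: pd.DataFrame):
    """Build the wide-form box plot (requires plotly)."""
    import plotly.express as px  # imported here so the cast above runs without plotly

    return px.box(frame)
```

`box_plot(same_type)` should then go through the wide-form path without the dtype complaint, at the cost of integer columns being displayed as floats.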

@set92 thanks for the feedback! As of version 4.8.2, all numeric types are considered “the same type” for the purposes of wide-form processing.