Wide format CSV with Plotly Express

Hi everyone
I have a wide dataset with COVID-19 data.
I have the following code in Python:

# Import Libraries

import pandas as pd 
import plotly.express as px 

# Read CSV file 

df = pd.read_csv("covid19pt_data.csv")

# Plot

fig = px.bar(df)

fig.show()

After running it I get the following error message:

ValueError: Plotly Express cannot process wide-form data with columns of different type

If I change my code to

fig = px.bar(df, x='data',y='novos_casos_t')

fig.show()

the code works and shows me a bar chart for that column. The same is true for any other column.

However, after reading this post, I was under the impression that Plotly now supports wide-format data frames, so I don’t understand what I’m doing wrong.

My experience level is close to zero and any help in making me understand what I’m doing wrong is much appreciated.

The error message is pointing you to the problem: your df has multiple columns, and they’re not all of the same type (most likely some are strings and some are numbers). Plotly Express doesn’t know what to do with that, because you can’t make a bar chart from a mix of strings and numbers :slight_smile:
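
One way to see which columns are causing the mismatch is to inspect the dtypes before plotting. This is only a sketch: the small data frame below is a made-up stand-in for covid19pt_data.csv, and the column name other_count and all the values are invented for illustration.

```python
import pandas as pd

# Hypothetical stand-in for covid19pt_data.csv; the values are invented
# and "other_count" is an illustrative column name, not one from the file.
df = pd.DataFrame({
    "data": ["01-03-2020", "02-03-2020", "03-03-2020"],
    "novos_casos_t": [2, 4, 6],
    "other_count": [0, 0, 1],
})

# Show the dtype pandas inferred for every column.
print(df.dtypes)

# Collect the columns that are not numeric; these are the ones that
# prevent Plotly Express from treating the frame as wide-form data.
non_numeric = [
    col for col in df.columns
    if not pd.api.types.is_numeric_dtype(df[col])
]
print(non_numeric)
```

With a frame shaped like this, only the date column shows up as non-numeric, which matches the diagnosis above.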

Hi Nicolas, but all my columns, as you can see in the sample dataset, are numbers. Is this happening because of the headers?

Your “data” column doesn’t contain numbers, no :slight_smile:

If you add a line like df = df.set_index("data") above your PX call, things should work better, because in that case all your remaining columns will be numerical and PX will automatically use your index for the X axis.

True, thank you for your time, @nicolaskruchten

Sorry the error message wasn’t clearer! Maybe it should list the first two differing-type columns.

That would help, particularly for those who, like me, are new to this.
My assumption was that Plotly would treat the first column as the x axis by default, the other columns as the different y-axis values, and the first row as the labels for those values.
By using df = df.set_index('data') I was able to make it work, even though I now have a totally new problem that I’m trying to figure out. If I can’t, I will probably open a new thread.
Once more, thank you for your time and patience.