Plotly Express: error when using px.bar() to plot an unstacked data frame

My goal is to create ratios from two filtered columns in a Pandas data frame, then use Plotly Express to create a bar chart using px.bar(). I’m able to do so using the base plot() function in Pandas, but not the px.bar() function in Plotly Express.

One problem that I ran into was that some of the columns contain duplicate values. This resulted in my having to do some Pandas gymnastics.

Here is my data:

import pandas as pd

test_df = pd.DataFrame({'Manufacturer': ['Ford', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW'],
                        'Metric': ['Orders', 'Orders', 'Orders', 'Orders', 'Orders', 'Orders', 'Orders', 'Sales', 'Sales', 'Sales', 'Sales', 'Sales', 'Sales', 'Warranty', 'Warranty', 'Warranty', 'Warranty', 'Warranty', 'Warranty'],
                        'Sector': ['Germany', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA'],
                        'Value': [45000, 70000, 90000, 65000, 40000, 65000, 63000, 2700, 4400, 3400, 3000, 4700, 5700, 1500, 2000, 2500, 1300, 2000, 2450],
                        'City': ['Frankfurt', 'Bremen', 'Berlin', 'Hamburg', 'New York', 'Chicago', 'Los Angeles', 'Dresden', 'Munich', 'Cologne', 'Miami', 'Atlanta', 'Phoenix', 'Nuremberg', 'Dusseldorf', 'Leipzig', 'Houston', 'San Diego', 'San Francisco']})

Due to some duplicate values, I create a temporary table:

temp_table = test_df.reset_index().pivot_table(values = 'Value', index = ['Manufacturer', 'Metric', 'Sector'], aggfunc='sum')

Then, reset the index:

df_new = temp_table.reset_index()


s1 = df_new.set_index(['Manufacturer','Sector']).query("Metric=='Orders'").Value
s2 = df_new.set_index(['Manufacturer','Sector']).query("Metric=='Sales'").Value

Then, unstack and plot:

temp_frame = s1.div(s2).unstack()

This works perfectly and produces the following bar plot using the standard Pandas plot() function:


Now, I attempt to plot using the px.bar() function in Plotly Express:

px.bar(temp_frame, x='Sector', y='Value', color='Exchange',
       title='Order to Sales Ratio')

This code results in the following error message:

ValueError: Value of 'x' is not the name of a column in 'data_frame'. Expected one of ['Germany', 'USA'] but received: Sector

This error looks related to the issue reported in "Use Pandas index in Plotly Express". But I don't think my data frame is configured in such a way that I can implement the “ugly fix” solution suggested by @Laurens Koppenol and validated by @nicolaskruchten.

Can anyone help me resolve this error so that I can create the bar plot above using Plotly Express?

Thanks in advance!

Hi @ryana, if you are used to the df.plot syntax, did you know that it is now available for plotly as well? You just need to change the pandas plotting backend as below:

pd.options.plotting.backend = "plotly"
fig = temp_frame.plot(kind='bar')

As for the plotly express command, you must use column names which exist in the transformed dataframe.
So the correct code is

px.bar(temp_frame, x=temp_frame.index, y=temp_frame.columns, barmode='group')

Also note that using the index for x and all columns for y is the default behaviour, so you can just do

px.bar(temp_frame, barmode='group')

You can read the tutorial on plotly express and wide mode for more examples.

Thank you, @Emmanuelle! I was not aware of the Plotly plotting backend for Pandas.

This is very useful, indeed!