✊🏿 Black Lives Matter. Please consider donating to Black Girls Code today.
🐇 Announcing Dash VTK for 3d simulation graphics. Check out the March webinar.

Plotly Express: error when using px.bar() to plot an unstacked data frame

My goal is to create ratios from two filtered columns in a Pandas data frame, then use Plotly Express to create a bar chart using px.bar() . I’m able to do so using the base plot() function in Pandas, but not the px.bar() function in Plotly Express .

One problem that I ran into was that some of the columns contain duplicate values. This resulted in my having to do some Pandas gymnastics.

Here is my data:

test_df = pd.DataFrame({'Manufacturer':['Ford', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW', 'Ford', 'Mercedes', 'BMW'],
                          'Metric':['Orders', 'Orders', 'Orders', 'Orders', 'Orders', 'Orders', 'Orders', 'Sales', 'Sales', 'Sales', 'Sales', 'Sales', 'Sales', 'Warranty', 'Warranty', 'Warranty', 'Warranty', 'Warranty', 'Warranty'],
                          'Sector':['Germany', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA', 'Germany', 'Germany', 'Germany', 'USA', 'USA', 'USA'],
                          'Value':[45000, 70000, 90000, 65000, 40000, 65000, 63000, 2700, 4400, 3400, 3000, 4700, 5700, 1500, 2000, 2500, 1300, 2000, 2450],
                          'City': ['Frankfurt', 'Bremen', 'Berlin', 'Hamburg', 'New York', 'Chicago', 'Los Angeles', 'Dresden', 'Munich', 'Cologne', 'Miami', 'Atlanta', 'Phoenix', 'Nuremberg', 'Dusseldorf', 'Leipzig', 'Houston', 'San Diego', 'San Francisco']
                       })

Due to some duplicate values, I create a temporary table:

temp_table = test_df.reset_index().pivot_table(values = 'Value', index = ['Manufacturer', 'Metric', 'Sector'], aggfunc='sum')

Then, reset the index:

df_new = temp_table.reset_index()

Then,

s1 = df_new.set_index(['Manufacturer','Sector']).query("Metric=='Orders'").Value
s2 = df_new.set_index(['Manufacturer','Sector']).query("Metric=='Sales'").Value

Then, unstack and plot:

temp_frame = s1.div(s2).unstack()
temp_frame.plot(kind='bar')

This works perfectly and produces the following bar plot using the standard Pandas plot() function:

image

Now, I attempt to plot using the px.bar() function in Plotly Express :

px.bar(temp_frame, x='Sector', y='Value', color='Exchange',
                            barmode='group',
                            text='Value',
                            title='Order to Sales Ratio)

This code results in the following error message:

ValueError: Value of 'x' is not the name of a column in 'data_frame'. Expected one of ['Germany', 'USA'] but received: Sector

This error looks related to the issue reported in Use Pandas index in Plotly Express. But, I think my data frame is not configured in such a way that I can implement the “ugly fix” solution suggested by @Laurens Koppenol and validated by @nicolaskruchten .

Can anyone help me resolve this error so that I can create the bar plot above using Plotly Express ?

Thanks in advance!

Hi @ryana if you are used to the df.plot syntax did you know that it is now available for plotly as well? You just need to change the pandas plotting backend as below

pd.options.plotting.backend='plotly'
fig = temp_frame.plot(kind='bar')
fig.update_layout(barmode='group')
fig.show()


As for the plotly express command, you must use column names which exist in the transformed dataframe.
image
So the correct code is

px.bar(temp_frame, x=temp_frame.index, y=temp_frame.columns, barmode='group')

Also note that using the index for x and all columns for y is the default behaviour so you can just do

px.bar(temp_frame, barmode='group')

You can read the tutorial on plotly express and wide mode for more examples.

Thank you, @Emmanuelle! I was not aware of

pd.options.plotting.backend=‘plotly’

This is very useful, indeed!