Setting multiple error bars with new plotly express 'Wide data' feature

RoryL · June 1, 2020, 12:52pm

Hello,

I’ve been enjoying the Plotly update, especially the wide-form support for plotly express. My question relates to the new functionality. If I define a plot with:

px.scatter(dataframe, x='xaxis_column_name', y=['yaxis_col_1', 'y_axis_col_2'])

It doesn’t appear to be possible to add error bars in a similar, columnar fashion, i.e.

px.scatter(dataframe, x='xaxis_column_name', y=['yaxis_col_1', 'y_axis_col_2'], error_y=['yaxis_col_1_error', 'y_axis_col_2_error'])

This is because error_y must have the same length as the x axis.
If I set error_y to a single column of errors, it is applied to all traces simultaneously, which is not always appropriate.

Are there any plans to add the ability to define columns of errors in the wide-form input? I think this would be very convenient. What are the suggested workarounds at present? My workflow is such that I’d prefer to call the figure first, inspect it, and then add error bars to the traces post declaration.

I understand that traces could be added one by one with .add_trace(), but I like the px API as it seems much more elegant and less verbose.

Thanks,
Rory

nicolaskruchten · June 1, 2020, 3:18pm

Unfortunately there’s no way to do this kind of “correlated wide-form” at the moment… Your best bet would be to try to use a long-form input here.

nicolaskruchten · June 1, 2020, 3:20pm

It’s a good idea though! I’ve logged an issue here for further thought and discussion: https://github.com/plotly/plotly.py/issues/2522

RoryL · June 1, 2020, 7:42pm

Hi Nicolas! Thanks for taking a look. I’ve tried a bunch of different data transformations to get this working, but the most elegant solution I could find for now was to simply create the go.Scatter objects with the correct link between data and error bars.

I’m glad to hear you like the suggestion however. I’d love to see this feature added as it would simplify the whole process greatly. Anyway, the new plotly express wide data format has been hugely helpful, great update!

nicolaskruchten · June 1, 2020, 8:46pm

I’ll see if I can come up with a little recipe for you… Just to confirm, your data is in a Pandas data frame and has something like x, y1, error_y1, y2, error_y2, ... ?

RoryL · June 1, 2020, 9:03pm

That’s very kind. Yes, that’s exactly right

nicolaskruchten · June 1, 2020, 9:22pm

It feels like something like this would work: https://stackoverflow.com/questions/55403008/pandas-partial-melt-or-group-melt

RoryL · June 2, 2020, 7:49am

Maybe I am missing something, but I don’t understand how the long form data format gets me closer to plotting the data. Here’s an MWE I’ve been playing with to understand your SO link:

import numpy as np
import pandas as pd
import plotly.express as px

def f(x, m, c):
    return m*x + c

x = np.arange(10)
y = f(x, 3, 1)
y2 = f(x, 5, 2)


y_err, y2_err = [np.random.random(len(x)) for _ in [y, y2]]


df = pd.DataFrame(data=[x, y, y2, y_err, y2_err]).T
df.columns= ['x','y','y2','y_err','y2_err']


# this is basically the graph I want, just with the correct error bars
# setting error_y applies the same errors to both traces
f1 = px.scatter(df, x='x', y=['y', 'y2'], error_y='y_err')

df_long = df.stack().reset_index().rename(columns = {'level_1': 'Variable'})

#how best to achieve the plot from here is still unclear

nicolaskruchten · June 2, 2020, 12:28pm

Here’s how I would approach this: first you melt() just the y data, the usual way. Then you unstack the error values into a single column, which should still be well-aligned with the main dataset. Then you add the unstacked error values to the main dataset, and you plot the data in long-form, so you don’t pass any lists to y.

import pandas as pd
import numpy as np
import plotly.express as px

df = pd.DataFrame(dict(
    x=range(20),
    y1=np.random.rand(20).cumsum(),
    y_error1=np.random.rand(20),
    y2=np.random.rand(20).cumsum(),
    y_error2=np.random.rand(20)
))

print(df.head())

long_df = df.melt(id_vars="x", value_vars=["y1", "y2"], value_name="y", var_name="y_name")
long_df["y_error"] = df[["y_error1", "y_error2"]].unstack().values

print(long_df.head())

px.scatter(long_df, x="x", y="y", error_y="y_error", color="y_name")

RoryL · June 4, 2020, 7:51am

Thanks for this solution. I’m sure this will work perfectly for my data

Appreciate you taking the time to have a look at this. I was struggling to reformat the data correctly, but this is a great example!

Cheers!
