Here is a script that draws scatter plots from two data frames. df_1 has n_row rows with random x and y values, while df_2 has 20 rows with random x and y values. I use Plotly Express (version 5.20.0) to first create a scatter plot for df_1, and then use add_trace() to add a scatter plot for df_2.
import numpy as np
import pandas as pd
import plotly.express as px
n_row = 1000
df_1 = pd.DataFrame(dict(x=np.random.rand(n_row), y=np.random.rand(n_row)))
df_2 = pd.DataFrame(dict(x=np.random.rand(20), y=np.random.rand(20)))
# Small red markers for df_1, then large blue markers for df_2 added as a second trace
fig = px.scatter(df_1, 'x', 'y').update_traces(marker_color='red', marker_size=4)
fig.add_trace(px.scatter(df_2, 'x', 'y').update_traces(marker_color='blue', marker_size=20).data[0])
First, I set n_row equal to 1000 and created the plot. As expected, the second scatter plot (from df_2) is drawn on top of the first scatter plot (from df_1).
Next, I set n_row to 1001 and ran the same code.
n_row = 1001
df_1 = pd.DataFrame(dict(x=np.random.rand(n_row), y=np.random.rand(n_row)))
df_2 = pd.DataFrame(dict(x=np.random.rand(20), y=np.random.rand(20)))
fig = px.scatter(df_1, 'x', 'y').update_traces(marker_color='red', marker_size=4)
fig.add_trace(px.scatter(df_2, 'x', 'y').update_traces(marker_color='blue', marker_size=20).data[0])
This time, however, the first scatter plot (from df_1) is drawn on top of the second scatter plot (from df_2).
I’ve tried a bunch of values for n_row. When n_row is less than or equal to 1000, the plot order is as expected (the second scatter plot is drawn on top of the first). When n_row is greater than 1000, the first scatter plot (from df_1) is drawn on top of the second (from df_2).
Is this a bug or am I missing something?