Scatter color does not match its legend

Here’s my aggregated dataframe for the visualization
[image: aggregated DataFrame]

Here’s the code I use to visualize

import plotly.express as px
# mint and azure are assumed to be color strings defined elsewhere
px.scatter(df_scatter_agg, x="single_model", y="multi_model", color="deal_status",
           color_discrete_sequence=[mint, azure])

Here’s the resulting scatter plot

Yesterday I ran the above code snippet and the scatter plot worked just fine. Yet when I reran it just now, the scatter colors no longer match the legend colors (as shown above). E.g. lost should be green, but it displays as blue on the plot. Is this a UI bug? Note that won still displays the correct blue color, matching the legend.

Thanks in advance for the help

Hi @soapycat99, welcome to the forums.

Could you post an extract of your DataFrame as code, df = pd.DataFrame(your_data), instead of an image so people can copy and paste it?
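One way to produce such an extract is sketched below. The column names are taken from the question's code, but the rows are made up for illustration: dump the first few rows to a dict and paste the printed line into the post.

```python
import pandas as pd

# Hypothetical stand-in for df_scatter_agg; the real values are not shown in the thread.
df_scatter_agg = pd.DataFrame({
    "single_model": [0, 0, 1, 2],
    "multi_model": [4, 4, 3, 1],
    "deal_status": ["won", "lost", "won", "lost"],
})

# Turn the first rows into a plain dict of lists that can be pasted as text.
extract = df_scatter_agg.head(10).to_dict(orient="list")
print(f"df = pd.DataFrame({extract})")

# Round-trip check: rebuilding from the pasted dict gives the same frame.
rebuilt = pd.DataFrame(extract)
```

Anyone reading the post can then run the printed line and get the same rows without retyping them from a screenshot.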

If everything worked fine yesterday, make sure the DataFrame has not changed unintentionally. This happens quite easily when you are working in a Jupyter notebook.

Sorry for the late response. Please refer to the .csv file below.

https://raw.githubusercontent.com/triet-lq-holistics/dummy_repo/main/deal.csv

I didn’t change the DataFrame; I just turned my PC off and on again and ran the same code. The result is still the same, so I’m pretty sure the incorrect color display is the problem. In the scatter plot above, when you zoom in more closely, you can see that the hovered node with value lost is displayed in blue, whereas it should be green, as in the legend on the right side.

Hi @soapycat99, I do not think this is a Plotly issue.

The thing is that you plot several points on top of each other. You are using single_model and multi_model as axis coordinates, so rows that share both values land on exactly the same spot.
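To see which coordinates are affected across a whole frame, a rough sketch like the following can help. The data here is hypothetical (only the column names come from the question), and it simply counts how many distinct deal_status values share each coordinate pair.

```python
import pandas as pd

# Hypothetical data: two rows share the coordinates (0, 4) but differ in status.
df = pd.DataFrame({
    "single_model": [0, 0, 1, 2, 2],
    "multi_model": [4, 4, 3, 1, 1],
    "deal_status": ["won", "lost", "won", "lost", "lost"],
})

# Count how many distinct statuses sit on each (x, y) coordinate.
statuses_per_point = df.groupby(["single_model", "multi_model"])["deal_status"].nunique()

# Coordinates where markers of different colors are drawn on top of each other.
mixed = statuses_per_point[statuses_per_point > 1]
print(mixed)
```

Any coordinate listed in mixed holds markers of more than one color, so only the one drawn on top is visible there.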

Check your data with

df_scatter_agg[(df_scatter_agg.single_model==0) & (df_scatter_agg.multi_model==4)]

and you will see this:

Okay, I understand what you mean; the hover description of the node confused me. Could you suggest a way or workaround to display all the overlapping nodes, if there is one?

*Update:
I just realized that a strip plot makes more sense than a scatter plot in this situation, with a custom opacity.

from

px.scatter(df_scatter_agg, x="single_model", y="multi_model", color="deal_status",
           color_discrete_sequence=[mint, azure])

to

px.strip(df_scatter_agg, x='multi_model', y='single_model', color='deal_status',
         color_discrete_sequence=[mint, azure]).update_traces(jitter=1, opacity=0.5, marker_size=20)

Output:

One more thing I would like to tune. How do I change the opacity of one specific value but not the other? E.g. opacity={'won': 0.5, 'lost': 1}

Hi,

you can do so using the selector argument of figure.update_traces()

fig.update_traces({'opacity': 0.5}, selector={'name': 'won'})
fig.update_traces({'opacity': 1.0}, selector={'name': 'lost'})

You could even change the size or marker color the same way:

fig.update_traces({'opacity': 0.5}, selector={'name': 'won'})
fig.update_traces({'opacity': 1.0, 'marker':{'color':'blue','size':10}}, selector={'name': 'lost'})


This is super helpful. Thanks a lot!