Varying opacity in scatter_3d

Morning

I am trying to create a 3d scatter plot with the opacity of each point drawn from a column in my dataframe.

I have a dataframe with columns for x,y,z, sample_id, patient_id, diagnosis and sample_purity.

I can create a plot in plotly express like this:

fig = px.scatter_3d(df, x="x", y="y", z="z",
                 color=df["diagnosis"], 
                 labels={'color': 'Diagnosis'}, 
                 hover_data=df[["sample_id", "patient_id", "diagnosis"]],
                 width = 1600,
                 height = 800,
                 title = plotTitle,
                 hover_name="patient_id"
                 )

but then when I iterate over the traces in the figure like this:

for trace in fig.data:
    marker = trace.marker
    hover_name = trace.hovertext
    print(hover_name)

the traces each contain multiple samples, and I’d like to change the opacity on a per sample basis. Doing it this way, I could set an opacity for the markers of each diagnosis, but I don’t think I could do it for each sample.

The other way that I tried was to construct the figure from graph objects like this:

fig = go.Figure()

categories = df['diagnosis'].unique()
colors = px.colors.qualitative.Plotly[:len(categories)]
color_mapping = dict(zip(categories, colors))
df['color'] = df['diagnosis'].map(color_mapping)
print(df.shape)

for index, row in df.iterrows():
    #print(row[["sample_id", "patient_id", "diagnosis", "cancer_type", "purity"]])
    sampleId, patientId, diagnosis, cancerType = row[["sample_id", "patient_id", "diagnosis", "cancer_type"]]
    fig.add_trace(go.Scatter3d(mode='markers',
        x=[row["x"]],
        y=[row["y"]],
        z=[row["z"]],
        marker=dict(
            color=row["color"],
            size=4,
            opacity=row["purity"]
        ),
        name=row["diagnosis"]
    )
)

And this sort of works. It gives me variable transparency on a per sample basis, but it also results in multiple traces, one for each sample, which means they are then not color coded or grouped in the plot and legend, which is less than ideal.

The way that I initially tried, which makes the most sense to me, would have been to pass a column from the dataframe to the opacity argument when I first made the plot, but opacity accepts only a single value. I can see that being useful in some scenarios, I guess, but not this one.

Any ideas on how I would go about doing this would be much appreciated.

Thanks
Ben.

Hi @tirohia1, welcome to the forums.

There is a way to do so, but it might work only in special cases. An example:

import plotly.express as px
import pandas as pd
import numpy as np

# create DataFrame
a = np.repeat(np.arange(1, 10), 4, axis=-1)
a = a.reshape((9,4))
df = pd.DataFrame(a, columns=['x', 'y', 'z', 'opacity'])

# convert opacity values into string
df.opacity = df.opacity.apply(str)

# create figure
fig = px.scatter_3d(
    df, 
    x="x", 
    y="y", 
    z="z",
    color=df.opacity, 
    color_discrete_map={val: f'rgba(255, 0, 255, {int(val) / 10})' for val in df.opacity},
    width = 600,
    height = 600,
)
fig.show()

The catch is that the opacity column can’t be numeric for this approach. Basically, you are mapping a color to each value of your color column. In this case the color is the same for every value, and only the alpha component of the rgba string changes.

mrep

That almost works, but you still end up with a different category label for each point.

I think I’m leaning towards either Plotly can’t do the thing I want it to do, or I’m thinking entirely the wrong way about it. I am effectively after another axis of information within the plot.

As I’m reading things, you can’t have points in the plot that are part of the same trace having different opacity - you can only set marker attributes for an entire trace.

Or, you can set the marker attributes for each point, but they then become their own trace, and you can’t group them in the legend.

To expand slightly on the example you gave, what I am effectively after is the following, with the color being set by the category (in my case, diagnosis) and each category having different levels of opacity within it, i.e. the yellow points having different levels of opacity, and the blue points having different levels of opacity.

import plotly.express as px
import pandas as pd
import numpy as np

# create DataFrame
a = np.repeat(np.arange(1, 10), 4, axis=-1)
a = a.reshape((9,4))
df = pd.DataFrame(a, columns=['x', 'y', 'z', 'opacity'])
df['category'] = ["A", "A","A", "A","A", "A","B", "B","B"]
# convert opacity values into string
df.opacity = df.opacity.apply(str)

# create figure
fig = px.scatter_3d(
    df, 
    x="x", 
    y="y", 
    z="z",
    color="category", 
    color_discrete_map={val: f'rgba(255, 0, 255, {int(val) / 10})' for val in df.opacity},
    width = 600,
    height = 600,
)
fig.show()

I suspect it just can’t be done.

Hi, I think I still don’t understand what you are after. You get two categories because you have two categories, namely A and B.

Maybe you could share your data (or equivalent data)? In general, the legend in plotly is somewhat limited IMHO. A possible solution could be the use of graph_objects and legendgroups.

Take a look at this, maybe it helps:

I want to be able to show variation within each category, by having variable opacity. While still grouping at the category level.

I want to have a plot that has a category for High Grade Glioma, another for Low Grade Glioma, and so on. Then within each category we would be able to identify samples with low tumor purity (tending towards transparent) and high tumor purity (tending towards solid). Tumor purity is on a scale of 0-1, which would make it ideal for opacity.

I could bin the opacity and split the categories up, giving an HGG low, medium and high, an LGG low, medium and high, and so on, but I already have ~30 different diagnoses, and binning would lose resolution on sample purity.

Construction via plotly express won’t let me do anything useful (for my purposes) with the opacity. Construction using individual traces via graph objects lets me set opacity but gives me a separate category for each sample.

I’ll have a look at the legend group parameter, thanks. At a quick glance, looks like it might be useful.

Hello,

I’ve had a similar problem and eventually found the solution, so I thought I’d share it with the community. The working code is below. The trick is to add the opacity column as custom_data when creating the figure, and then use fig.for_each_trace(set_opacity) to update the opacity for each trace of the figure based on the previously defined set_opacity function.

Hope this helps!

# example code with opacity depending on dataframe column

import plotly.express as px
import pandas as pd
import numpy as np
import plotly

# create DataFrame
a = np.repeat(np.arange(1, 10), 4, axis=-1)
a = a.reshape((9,4))
df = pd.DataFrame(a, columns=['x', 'y', 'z', 'opacity'])
df['category'] = ["A", "A","A", "A","A", "A","B", "B","B"]
df['opacity'] = df['opacity'].apply(lambda x: round(x/10, 2))

# create figure
fig = px.scatter_3d(
    df, 
    x="x", 
    y="y", 
    z="z",
    color="category", 
    width = 600,
    height = 600,
    custom_data=['opacity']
)

def set_opacity(trace):
    opacities = trace.customdata
    r, g, b = plotly.colors.hex_to_rgb(trace.marker.color)
    trace.marker.color = [
        f'rgba({r}, {g}, {b}, {a})'
        for a in map(lambda x: x[0], opacities)]

fig.for_each_trace(set_opacity)


fig.show()
