selectedpoints highlights data points within each category instead of from the dataset as a whole

Hi, I’m fairly new to plotly/dash. My apologies if this is a trivial question, but I’ve been looking over the documentation and I can’t find an answer.

I have a data frame where each row falls into one of three categories (setosa, versicolor, virginica). I use a dropdown to create a list of numbers (points) corresponding to the indices in the underlying data frame. I pass this list to fig.update_traces(selectedpoints=points). This selects the points with the corresponding indices within each category rather than from the dataset as a whole. E.g.
fig.update_traces(selectedpoints=[0,1,2])
highlights 9 points (3 for each species) rather than only the first 3 points of the data frame, regardless of species.

Is there some way to achieve the second behaviour?

Current behaviour in example below:

import dash # 2.0.0
from dash import dcc
from dash import html
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output, State

import plotly # 5.3.1
import plotly.express as px
import plotly.graph_objects as go

# read in iris dataset and plot the initial graph, colored by species
df = px.data.iris()
fig = px.scatter(df, x='sepal_length', y='sepal_width', color='species')

# define the app layout
app = dash.Dash(__name__)
app.layout = dbc.Container([
    dbc.Row([
        dbc.Col([
            html.H1("Update Selected Points"),
            html.H4("Minimal working example--Iris"),
            html.H6("""Choosing points from the dropdown selects those points
for each species. Intended behaviour is to select only those points irrespective of species. 
e.g. point 0 should select only the first flower, but currently selects the first of each species"""),
            html.Hr()
            ])
        ]),
    dbc.Row([
        dbc.Col([
            # simplified dropdown to select df indices
            dcc.Dropdown(
                id='iris-dropdown',
                options = [{'label': x, 'value': x} for x in range(10)],
                multi=True,
                placeholder="Select point indices"                
                ),
            # the graph to be updated
            dcc.Graph(
                id='iris-scatter',
                figure=fig
                )
            ])
        ])
    ])

# simplified callback
@app.callback(
    Output('iris-scatter', 'figure'),
    Input('iris-scatter', 'figure'),
    Input('iris-dropdown', 'value')
    )
def update_plot(fig_old, selection):
    """
    Take in the point(s) to be selected from the dropdown and highlight them on the graph
    If no points are selected, then highlight all points
      ARGS:
            fig_old -- the current figure, to be updated
            selection -- the points to be highlighted
    RETURN:
            the plot with updated selection
    
    """

    # read in the old figure for modification--"Don't Change" the zoom/pan on update
    fig = go.Figure(fig_old)
    fig.update_layout(uirevision="Don't Change")

    if (selection is None) or (len(selection) == 0):
        # no points are selected: highlight all
        points = [i for i in range(len(df))]

    else:
        # choose the selected points--more complicated in real app
        points = selection

    # update `selectedpoints`
    fig.update_traces(selectedpoints=points)

    return fig
    

if __name__ == "__main__":
    app.run_server(debug=True)

UPDATE:
I’ve worked out a way to do it, but I’d be interested to know if there’s a better (maybe built-in) way.

Solution: update the figure dictionary directly, mapping each group’s local indices to the absolute data frame indices, as in the following updated callback. Note that this solution requires the DataFrame to be sorted by its grouping, i.e. all setosa together, then all versicolor, then all virginica (although the ordering of the groups shouldn’t matter).


@app.callback(
    Output('iris-scatter', 'figure'),
    Input('iris-scatter', 'figure'),
    Input('iris-dropdown', 'value')
    )
def update_plot(fig_old, selection):
    """
    Take in the point(s) to be selected from the dropdown and highlight them on the graph
    If no points are selected, then highlight all points
      ARGS:
            fig_old -- the current figure (dict) to be updated
            selection -- the absolute indices of the points to be highlighted

    RETURN:
            the fig with updated selection
    
    """

    # select all points if selection not specified/empty
    if (selection is None) or (len(selection) == 0):
        points = [i for i in range(len(df))]        # all points
        fig = go.Figure(fig_old)
        fig.update_traces(selectedpoints=points)

    # select *only* the intended points
    else:        
        # get the dict for each group for the `fig_old` dict
        groups = [i for i in fig_old['data']]
        k = None
        
        # loop through the groups (species)
        for gi, group in enumerate(groups):
            xi = [i for i in range(len(group['x']))]                    # indices of points within the group (use x-coords)
            k = xi if k is None else [i + k[-1] + 1 for i in xi]        # add indices from last group to current xi

            # choose the group indices corresponding to the absolute indices
            group_points = [xi[i] for i, p in enumerate(k) if p in selection]   

            # update `selectedpoints` in the group dict
            fig_old['data'][gi]['selectedpoints'] = group_points

        # read as figure for further changes
        fig = go.Figure(fig_old)

    fig.update_layout(uirevision="Don't Change")

    return fig
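
The index mapping in the callback above can also be pulled out into a small pure function, which makes it easier to check on its own. This is only a sketch: `split_selection` and `trace_lengths` are names made up for illustration, and it keeps the same assumption as the callback, namely that the rows are sorted by group in the same order as the traces.

```python
from bisect import bisect_right
from itertools import accumulate


def split_selection(trace_lengths, selection):
    """Map absolute row indices to per-trace local indices.

    Hypothetical helper. Assumes the rows are sorted by group, in the
    same order as the traces appear in the figure.
    """
    # starts[i] is the absolute index of the first point in trace i
    starts = [0] + list(accumulate(trace_lengths))[:-1]
    total = sum(trace_lengths)
    per_trace = [[] for _ in trace_lengths]

    for idx in selection:
        if not 0 <= idx < total:
            continue  # ignore indices that fall outside the data
        t = bisect_right(starts, idx) - 1  # trace that contains idx
        per_trace[t].append(idx - starts[t])

    return per_trace


# Iris has 50 rows per species, so 0, 1, 2 all land in the first trace
print(split_selection([50, 50, 50], [0, 1, 2, 50, 149]))
# -> [[0, 1, 2], [0], [49]]
```

In the callback, `trace_lengths` would be `[len(group['x']) for group in fig_old['data']]`, and each returned list would be assigned to the matching trace's `selectedpoints`.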

Cheers!

Your approach is very nice, and to my knowledge there isn’t any built-in method to do it. Purely for performance reasons, I would suggest sticking with the dictionary updates instead of recreating an object with go.Figure at each callback.
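
A rough sketch of what a dictionary-only update could look like, so the callback returns the dict it received without ever building a go.Figure. The helper name `select_points` is made up for illustration, and it assumes each trace's `x` arrives as a plain list (as in the example above) and that the data frame is sorted by group in trace order.

```python
def select_points(fig_dict, selection):
    """Set `selectedpoints` on every trace of a figure dict, in place.

    Hypothetical helper. Assumes trace['x'] is a plain list and the
    underlying rows are sorted by group in trace order.
    """
    wanted = set(selection or [])
    offset = 0  # absolute index of the first point in the current trace

    for trace in fig_dict["data"]:
        n = len(trace["x"])
        if wanted:
            # keep only the local indices whose absolute index was chosen
            trace["selectedpoints"] = [i for i in range(n) if i + offset in wanted]
        else:
            # nothing chosen: highlight every point, as in the original callback
            trace["selectedpoints"] = list(range(n))
        offset += n

    # preserve zoom/pan between updates
    fig_dict.setdefault("layout", {})["uirevision"] = "Don't Change"
    return fig_dict


# tiny stand-in for the figure dict that Dash passes to the callback
fig = {"data": [{"x": [5.1, 4.9, 4.7]}, {"x": [7.0, 6.4]}], "layout": {}}
select_points(fig, [0, 3])
print([t["selectedpoints"] for t in fig["data"]])
# -> [[0], [0]]
```

The callback body then reduces to `return select_points(fig_old, selection)`, since the `figure` property of dcc.Graph accepts a plain dict.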

You make a good point about keeping the dictionary rather than recreating the go.Figure() on each callback. I’ll definitely bear this in mind in future applications :)