Matching Lasso Selected Points to Figure Data

I’m working on a Dash app with multiple graphs. Each graph is a scatter plot whose points are colored as either ‘Included’ or ‘Excluded’.

We show the excluded points so that outliers stay visible, but we don’t want to use them in any calculations.

A user can lasso-select a group of points and recalculate a new average. However, if they accidentally select an ‘Excluded’ point, we don’t want it to be included in the average.

Does anyone have any idea how to do this? The lasso selectedData doesn’t include the color classification, and I’ve tried matching it up with the figure data to no avail. As my app is set up, the callback doesn’t have direct access to the dataframe, but I could change that if I needed to.

Any thoughts?

Here is some sample code if this helps.

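import dash
import pandas as pd
import plotly.express as px
from dash import Input, Output

data = pd.DataFrame({'x': [1, 2, 3, 4, 5],
                     'y': [5, 4, 3, 2, 1],
                     'color': ['Included', 'Excluded', 'Included', 'Included', 'Excluded']})

# color='color' makes px.scatter draw one trace per category
fig_warm = px.scatter(
    data,
    x='x',
    y='y',
    color='color',
    color_discrete_map={
        'Excluded': 'lightgrey',
        'Included': '#FF7F0F'
    }
)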


@app.callback(
    Output('avg-start-up-duration', 'rowData'),
    Input('warm-plot', 'selectedData'),
    Input('warm-plot', 'figure')
)
def update_avg_table_from_lasso_select(warm_selected_data, warm_plot):

    if not warm_selected_data:
        return dash.no_update

    if warm_selected_data:
        selected_indices = [point['pointIndex'] for point in warm_selected_data['points']]
        current_data = warm_plot['data'][0]

        selected_x = [current_data['x'][i] for i in selected_indices]
        selected_y = [current_data['y'][i] for i in selected_indices]

It’ll eventually update an AG Grid; I just haven’t incorporated that yet since I can’t even get the data I want right now.
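One likely reason the index matching fails: with color='color', px.scatter draws one trace per category, and each point in selectedData reports a pointIndex that counts within its own trace, which is identified by curveNumber. Reading every index against warm_plot['data'][0] therefore mixes the two traces. Below is a minimal sketch of filtering by trace instead, assuming the traces keep the names 'Included' and 'Excluded'. The helper included_points is hypothetical, and the dictionaries are hand-written stand-ins for the callback arguments, not output captured from a real app.

```python
def included_points(selected_data, figure, keep_name="Included"):
    """Return (x, y) pairs of the selected points whose trace is named keep_name."""
    if not selected_data or not selected_data.get("points"):
        return []
    traces = figure["data"]
    kept = []
    for point in selected_data["points"]:
        # curveNumber says which trace the point belongs to
        trace = traces[point["curveNumber"]]
        if trace.get("name") == keep_name:
            kept.append((point["x"], point["y"]))
    return kept


# Stand-in for the 'figure' property: one trace per color category (assumed shape)
figure = {"data": [
    {"name": "Included", "x": [1, 3, 4], "y": [5, 3, 2]},
    {"name": "Excluded", "x": [2, 5], "y": [4, 1]},
]}

# Stand-in for 'selectedData': the lasso caught two Included points and one Excluded point
selected = {"points": [
    {"curveNumber": 0, "pointIndex": 0, "x": 1, "y": 5},
    {"curveNumber": 1, "pointIndex": 0, "x": 2, "y": 4},
    {"curveNumber": 0, "pointIndex": 1, "x": 3, "y": 3},
]}

kept = included_points(selected, figure)
average_y = sum(y for _, y in kept) / len(kept) if kept else None
print(kept, average_y)
```

Inside the callback, the same helper could be called as included_points(warm_selected_data, warm_plot). Because the selected points already carry their own x and y values, no lookup by index into the figure data is needed.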

I figured out a way to do it if it helps anyone. Not sure if it’s the best/cleanest way so open to tweaks and adjustments if anyone has done this before.

My real data includes dates, which is why the date-formatting step is in there.

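    if warm_selected_data:
        # (x, y) pairs of every lasso-selected point
        selected_x_y_pairs = [(item['x'], item['y']) for item in warm_selected_data['points']]
        # Reformat the selected dates so they match the date strings stored in the figure
        # (needs: from datetime import datetime)
        formatted_x_y_pairs = [(datetime.strptime(x, '%Y-%m-%d %H:%M').strftime('%Y-%m-%dT%H:%M:%S'), y)
                               for x, y in selected_x_y_pairs]

        # Collect the (x, y) pairs of the 'Excluded' trace, wherever it sits in the figure;
        # this also works when the figure has no 'Excluded' trace at all
        excluded_x_y_pairs = set()
        for trace in warm_plot['data']:
            if trace.get('legendgroup') == 'Excluded':
                excluded_x_y_pairs.update(zip(trace['x'], trace['y']))

        # Keep only the selected points that do not appear in the 'Excluded' trace
        values_to_include_in_avg_calc = [pair for pair in formatted_x_y_pairs if pair not in excluded_x_y_pairs]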
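Another option that avoids comparing (x, y) pairs, offered as a sketch rather than a tested drop-in: px.scatter accepts a custom_data argument, and each point in selectedData then carries a customdata list with those column values. Assuming the figure is built with custom_data=['color'], the classification can be read straight from the selection. The helper name included_by_customdata and the sample selection below are hypothetical.

```python
# Assumed figure construction (not executed here, to keep the sketch dependency-free):
#   fig_warm = px.scatter(data, x='x', y='y', color='color', custom_data=['color'])

def included_by_customdata(selected_data, keep_label="Included"):
    """Return (x, y) pairs of selected points whose first customdata entry is keep_label."""
    points = (selected_data or {}).get("points", [])
    kept = []
    for point in points:
        # Points from traces without customdata are skipped
        custom = point.get("customdata") or [None]
        if custom[0] == keep_label:
            kept.append((point["x"], point["y"]))
    return kept


# Hand-written stand-in for 'selectedData' with customdata attached to each point
selection = {"points": [
    {"curveNumber": 0, "x": "2024-01-01 08:00", "y": 12.5, "customdata": ["Included"]},
    {"curveNumber": 1, "x": "2024-01-02 08:00", "y": 40.0, "customdata": ["Excluded"]},
    {"curveNumber": 0, "x": "2024-01-03 08:00", "y": 13.5, "customdata": ["Included"]},
]}

included = included_by_customdata(selection)
avg = sum(y for _, y in included) / len(included) if included else None
print(included, avg)
```

This way the callback no longer needs the figure as an input, and two points that happen to share the same x and y but differ in classification are still told apart.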