
How to Filter Scatter Plot Data by 3rd (or more) Columns in Pandas Dataframe Offline?

Hello. I am trying to plot a scatter graph offline that will show up in my browser. I would like to have a drop-down menu that will allow me to filter my data by a third column from my data frame that is associated with my two columns chosen for my x and y values in the scatter plot.

I am having 2 problems.

  1. First, I am having trouble coming up with a way to build the list of values in the dropdown from the unique values in the 3rd column that I would like to filter my data by. For example, I am trying to build the names in the drop-down from a list assigned to a variable like this:
    labelnames = list(df.name.unique())

  2. Second, I am having trouble figuring out how to update my scatter plot based on the value selected from the drop-down menu.

Any help would be greatly appreciated!

Here is an example of the code I am working with so far:

import plotly.plotly as py
from plotly.graph_objs import *
import plotly.offline
import pandas as pd


my_dict = {
    'name': ["a", "a", "c", "c", "c", "f", "g"],
    'age': [20, 27, 35, 55, 18, 21, 35],
    'score': [33, 11, 9, 12, 44, 15, 25]
}

df = pd.DataFrame(my_dict)


trace1 = Scatter(
    y=df['age'],
    x=df['score'],
    name='Data Set 1',
    mode='markers',
)

labelnames = list(df.name.unique())

data = Data([trace1])
layout = Layout(
    updatemenus=list([
        dict(
            x=-0.05,
            y=1,
            buttons=list([
                dict(
                    args=[],
                    options=[{'label': i, 'value': i} for i in labelnames],
                    value='',
                    label='Data Set 4',
                    method='restyle'
                )
            ]),
            yanchor='top'
        )
    ]),
)
fig = Figure(data=data, layout=layout)
plotly.offline.plot(fig, filename='name.html')

Hi @philalethes,

I’d recommend tackling this by creating a separate trace for each unique name, and then changing the trace visibility using the dropdown.

Here’s a full example (note that this requires plotly.py version 3):

import plotly.graph_objs as go
from plotly.offline import init_notebook_mode, iplot
import pandas as pd

init_notebook_mode()


my_dict = {
    'name': ["a", "a", "c", "c", "c", "f", "g"],
    'age': [20, 27, 35, 55, 18, 21, 35],
    'score': [33, 11, 9, 12, 44, 15, 25]
}

df = pd.DataFrame(my_dict)

fig = go.Figure()
names = []
for name, name_df in df.groupby('name'):
    names.append(name)
    fig.add_scatter(x=name_df.score,
                    y=name_df.age,
                    mode='markers',
                    name=name,
                    showlegend=False,
                    visible=name == 'a')
    
    
buttons = []
for i, name in enumerate(names):
    visible = [False]*len(names)
    visible[i] = True
    buttons.append(
        dict(
            method='restyle',
            args=[{'visible': visible}],
            label=name
        ))


fig.layout = go.Layout(
    xaxis=dict(
        range=[df.score.min() - 1, df.score.max() + 1]
    ),
    yaxis=dict(
        range=[df.age.min() - 1, df.age.max() + 1]
    ),
    updatemenus=list([
        dict(
            x=-0.05,
            y=1,
            buttons=buttons,
            yanchor='top'
        )
    ]),
)
iplot(fig)
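
As a side note, the button-building loop above can be pulled into a small helper. This is only a sketch: `build_visibility_buttons` and its `include_all` flag are hypothetical names, not part of plotly. It returns plain dicts in the same shape as the `buttons` list above, with an optional extra "All" entry that shows every trace.

```python
def build_visibility_buttons(names, include_all=True):
    """Build one 'restyle' button per trace name (hypothetical helper).

    Each button sets the 'visible' attribute of every trace, so the list
    of booleans must follow the order in which the traces were added.
    """
    buttons = []
    if include_all:
        # Optional extra entry that turns every trace back on.
        buttons.append(dict(
            method='restyle',
            args=[{'visible': [True] * len(names)}],
            label='All',
        ))
    for i, name in enumerate(names):
        # True only at position i, so only that trace stays visible.
        visible = [j == i for j in range(len(names))]
        buttons.append(dict(
            method='restyle',
            args=[{'visible': visible}],
            label=name,
        ))
    return buttons
```

The result can be passed as `buttons=build_visibility_buttons(names)` in the `updatemenus` entry. If the "All" entry comes first, it is probably best to leave every trace visible initially (drop the `visible=` argument in `add_scatter`) so the plot matches the label the menu shows on load.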

Hope that helps!
-Jon

Thank you so much for your help with this, Jon! This worked brilliantly.

A follow-up question: how could I add more dropdown menus to filter this data by other columns in my dataframe? I have tried a few things without success. As an example, if I added one more column to my dataframe, how would I add a second dropdown menu that would filter the data further?

my_dict = {
    'name': ["a", "a", "c", "c", "c", "f", "g"],
    'class': ["first", "second", "third", "fourth", "first", "second", "third"],
    'age': [20, 27, 35, 55, 18, 21, 35],
    'score': [33, 11, 9, 12, 44, 15, 25]
}

Hi @philalethes,

Unfortunately, I don’t think you’ll be able to filter on multiple dropdowns using the built-in widgets. The trouble is that the command executed by each dropdown doesn’t have a way of referencing the state of the other widgets in the figure. To add more sophisticated interactions, you would need to head down the path of using a FigureWidget in the Jupyter Notebook or using Dash in a standalone web app.
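
To give a rough idea of the Dash route, here is a minimal sketch, assuming Dash 2.x and a dataframe with the `name`, `class`, `age` and `score` columns from your example. The names `filter_rows`, `build_app`, and the component ids are made up for illustration. The key difference from the built-in dropdowns is that the callback receives the current value of both dropdowns at once.

```python
import pandas as pd


def filter_rows(df, name=None, cls=None):
    """Return the rows matching every filter that is not None."""
    mask = pd.Series(True, index=df.index)
    if name is not None:
        mask &= df['name'] == name
    if cls is not None:
        mask &= df['class'] == cls
    return df[mask]


def build_app(df):
    """Build a Dash app with two dropdowns that filter one scatter plot."""
    # Imported lazily so filter_rows can be used without Dash installed.
    from dash import Dash, dcc, html, Input, Output
    import plotly.graph_objs as go

    def options(column):
        # One dropdown entry per unique value in the column.
        return [{'label': v, 'value': v} for v in sorted(df[column].unique())]

    app = Dash(__name__)
    app.layout = html.Div([
        # A cleared dropdown reports None, which means "no filter" here.
        dcc.Dropdown(id='name-filter', options=options('name'),
                     placeholder='name'),
        dcc.Dropdown(id='class-filter', options=options('class'),
                     placeholder='class'),
        dcc.Graph(id='scatter'),
    ])

    @app.callback(Output('scatter', 'figure'),
                  Input('name-filter', 'value'),
                  Input('class-filter', 'value'))
    def update_scatter(name, cls):
        # Both dropdown values arrive together, so the filters combine.
        subset = filter_rows(df, name, cls)
        return go.Figure(
            data=[go.Scatter(x=subset['score'], y=subset['age'],
                             mode='markers')],
            layout=go.Layout(xaxis=dict(title='score'),
                             yaxis=dict(title='age')),
        )

    return app
```

You would then start the server with `build_app(df).run_server(debug=True)` (or `.run(debug=True)` in recent Dash releases) and open the printed address in your browser.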

-Jon

I was just beginning to look at Dash as an option as well, so I will likely continue to pursue that avenue. Thank you again for your help above. Your answer worked great with the single dropdown filter and will likely be a viable option for me in future applications.
