Change Colors of Selected Nodes in Visdcc Network Graph

Hello,

I am working on a project analyzing the words spoken in The Office. I am currently stuck on one part: building a network graph that visualizes who speaks to whom in a given episode of the show. The user selects a season, then an episode, then two characters for the network graph.

Here is my code so far:


import pandas as pd
import numpy as np
import dash
import os
from dash import dcc
from dash import html
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output
import visdcc
import itertools as it
from sklearn.feature_extraction.text import CountVectorizer

#Load data
sheet_url = 'https://docs.google.com/spreadsheets/d/18wS5AAwOh8QO95RwHLS95POmSNKA2jjzdt0phrxeAE0/edit#gid=747974534'
url = sheet_url.replace('/edit#gid=', '/export?format=csv&gid=')
office_data = pd.read_csv(url)


office_data['season'] = 'Season ' + office_data['season'].astype(str)
office_data['episode'] = 'Episode ' + office_data['episode'].astype(str)
office_data['scene'] = 'Scene ' + office_data['scene'].astype(str)

#-----------Network Graph Prep----------#

#1.) Filter down to just data with main characters
main_characters = [
    'Pam', 'Jan', 'Kelly', 'Phyllis', 'Angela', 'Erin', 'Holly', 'Meredith',
    'Michael', 'Jim', 'Kevin', 'Oscar', 'Stanley', 'Toby', 'Roy', 'Ryan',
    'Andy', 'Creed', 'Darryl', 'Dwight'
]
# 1 if the line is spoken by a main character, 0 otherwise
office_data['main_ind'] = office_data['speaker'].isin(main_characters).astype(int)

#2.) Filter down to only scenes containing these people
size = office_data.groupby(['season','episode','scene']).size().reset_index()
sums = office_data.groupby(['season','episode','scene']).agg({'main_ind':'sum'}).reset_index()


main_metrics = pd.merge(size,sums,how='left',on=['season','episode','scene'])
main_metrics.rename(columns={0:'count'}, inplace=True )

office_data = pd.merge(office_data,main_metrics,how='left',on=['season','episode','scene'])
office_data['diff'] = office_data['count'] - office_data['main_ind_y']

data_for_ng = office_data[office_data['diff']==0]


#Create a season-character dictionary
season_character_dict = {'Season 1': ['Angela', 'Darryl', 'Dwight', 'Jan', 'Jim','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Roy','Ryan','Stanley','Toby','Todd Packer'],
                         'Season 2': ['Angela','Creed', 'Darryl', 'David Wallace', 'Dwight', 'Jan', 'Jim','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Roy','Ryan','Stanley','Toby','Todd Packer'],
                         'Season 3': ['Andy', 'Angela','Creed', 'Darryl', 'David Wallace', 'Dwight', 'Jan', 'Jim','Karen','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Roy','Ryan','Stanley','Toby','Todd Packer'],
                         'Season 4': ['Andy', 'Angela','Creed', 'Darryl', 'David Wallace', 'Dwight','Holly', 'Jan', 'Jim','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Roy','Ryan','Stanley','Toby'],
                         'Season 5': ['Andy', 'Angela','Creed', 'Darryl', 'David Wallace', 'Dwight','Erin','Holly', 'Jan', 'Jim','Karen','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Roy','Ryan','Stanley','Toby'],
                         'Season 6': ['Andy', 'Angela','Creed', 'Darryl', 'David Wallace', 'Dwight','Erin','Gabe','Holly','Jan', 'Jim','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Ryan','Stanley','Toby','Todd Packer'],
                         'Season 7': ['Andy', 'Angela','Creed', 'Darryl', 'David Wallace', 'Dwight','Erin','Gabe','Holly','Jan', 'Jim','Karen','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Ryan','Stanley','Toby','Todd Packer'],
                         'Season 8': ['Andy', 'Angela','Creed', 'Darryl', 'David Wallace', 'Dwight','Erin','Gabe', 'Jim','Kelly','Kevin','Meredith','Oscar','Pam','Phyllis','Ryan','Stanley','Toby','Todd Packer'],
                         'Season 9': ['Andy', 'Angela','Creed', 'Darryl', 'David Wallace', 'Dwight','Erin','Gabe','Jan','Jim','Kelly','Kevin','Meredith','Michael','Oscar','Pam','Phyllis','Roy','Ryan','Stanley','Toby','Todd Packer']
}



season_episode_dict = {'Season 1': ['Episode 1', 'Episode 2', 'Episode 3', 'Episode 4', 'Episode 5','Episode 6'],
                         'Season 2': ['Episode 1', 'Episode 2', 'Episode 3', 'Episode 4', 'Episode 5','Episode 6','Episode 7', 'Episode 8', 'Episode 9', 'Episode 10', 'Episode 11','Episode 12','Episode 13', 'Episode 14', 'Episode 15', 'Episode 16', 'Episode 17','Episode 18','Episode 19', 'Episode 20', 'Episode 21', 'Episode 22'],
                         'Season 3': ['Episode 1', 'Episode 2', 'Episode 3', 'Episode 4', 'Episode 5','Episode 6','Episode 7', 'Episode 8', 'Episode 9', 'Episode 10', 'Episode 11','Episode 12','Episode 13', 'Episode 14', 'Episode 15', 'Episode 16', 'Episode 17','Episode 18','Episode 19', 'Episode 20', 'Episode 21', 'Episode 22', 'Episode 23'],
                         'Season 4': ['Episode 1', 'Episode 2', 'Episode 3', 'Episode 4', 'Episode 5','Episode 6','Episode 7', 'Episode 8', 'Episode 9', 'Episode 10', 'Episode 11','Episode 12','Episode 13', 'Episode 14'],
                         'Season 5': ['Episode 1', 'Episode 2', 'Episode 3', 'Episode 4', 'Episode 5','Episode 6','Episode 7', 'Episode 8', 'Episode 9', 'Episode 10', 'Episode 11','Episode 12','Episode 13', 'Episode 14', 'Episode 15', 'Episode 16', 'Episode 17','Episode 18','Episode 19', 'Episode 20', 'Episode 21', 'Episode 22', 'Episode 23','Episode 24','Episode 25','Episode 26'],
                         'Season 6': ['Episode 1', 'Episode 2', 'Episode 3', 'Episode 4', 'Episode 5','Episode 6','Episode 7', 'Episode 8', 'Episode 9', 'Episode 10', 'Episode 11','Episode 12','Episode 13', 'Episode 14', 'Episode 15', 'Episode 16', 'Episode 17','Episode 18','Episode 19', 'Episode 20', 'Episode 21', 'Episode 22', 'Episode 23','Episode 24'],
                         'Season 7': ['Episode 1', 'Episode 2', 'Episode 3', 'Episode 4', 'Episode 5','Episode 6','Episode 7', 'Episode 8', 'Episode 9', 'Episode 10', 'Episode 11','Episode 12','Episode 13', 'Episode 14', 'Episode 15', 'Episode 16', 'Episode 17','Episode 18','Episode 19', 'Episode 20', 'Episode 21', 'Episode 22', 'Episode 23','Episode 24'],
                         'Season 8': ['Episode 1', 'Episode 2', 'Episode 3', 'Episode 4', 'Episode 5','Episode 6','Episode 7', 'Episode 8', 'Episode 9', 'Episode 10', 'Episode 11','Episode 12','Episode 13', 'Episode 14', 'Episode 15', 'Episode 16', 'Episode 17','Episode 18','Episode 19', 'Episode 20', 'Episode 21', 'Episode 22', 'Episode 23','Episode 24'],
                         'Season 9': ['Episode 1', 'Episode 2', 'Episode 3', 'Episode 4', 'Episode 5','Episode 6','Episode 7', 'Episode 8', 'Episode 9', 'Episode 10', 'Episode 11','Episode 12','Episode 13', 'Episode 14', 'Episode 15', 'Episode 16', 'Episode 17','Episode 18','Episode 19', 'Episode 20', 'Episode 21', 'Episode 22', 'Episode 23']
}

character_choices = office_data['speaker'].sort_values().unique()
season_choices = office_data['season'].sort_values().unique()
episode_choices = office_data['episode'].sort_values().unique()


app = dash.Dash(__name__,assets_folder=os.path.join(os.curdir,"assets"))
server = app.server
app.layout = html.Div([
            dbc.Row([
                dbc.Col(
                    dcc.Dropdown(
                        id='dropdown4',
                        options=[{'label': i, 'value': i} for i in season_choices],
                        value=season_choices[0]
                    ), width=3
                ),
                dbc.Col(
                    dcc.Dropdown(
                        id='dropdown7',
                        options=[{'label': i, 'value': i} for i in episode_choices],
                        value=episode_choices[0]
                    ), width=3
                ),
                dbc.Col(
                    dcc.Dropdown(
                        id='dropdown5',
                        options=[{'label': i, 'value': i} for i in character_choices],
                        value=character_choices[0]
                    ), width=3
                ),
                dbc.Col(
                    dcc.Dropdown(
                        id='dropdown6',
                        options=[{'label': i, 'value': i} for i in character_choices],
                        value=character_choices[1]
                    ), width=3
                )

            ]),
            dbc.Row([
                dbc.Col(
                    visdcc.Network(
                        id='net',
                        options = dict(
                            height='600px', 
                            width='100%',
                            physics={'barnesHut': {'avoidOverlap': 0.5}},
                            maxVelocity=0,
                            stabilization={
                                'enabled': True,
                                'iterations': 15,
                                'updateInterval': 50,
                                'onlyDynamicEdges': False,
                                'fit': True
                            },
                        )
                    )
                )
            ])
])


@app.callback(
    Output('dropdown5', 'options'),
    Output('dropdown5', 'value'),
    Input('dropdown4', 'value') #--> choose season
)
def set_character_options1(selected_season):
    return [{'label': i, 'value': i} for i in season_character_dict[selected_season]], season_character_dict[selected_season][0],

@app.callback(
    Output('dropdown6', 'options'),
    Output('dropdown6', 'value'),
    Input('dropdown4', 'value') #--> choose season
)
def set_character_options2(selected_season):
    return [{'label': i, 'value': i} for i in season_character_dict[selected_season]], season_character_dict[selected_season][1],

@app.callback(
    Output('dropdown7', 'options'), #--> filter episodes
    Output('dropdown7', 'value'),
    Input('dropdown4', 'value') #--> choose season
)
def set_episode_options(selected_season):
    return [{'label': i, 'value': i} for i in season_episode_dict[selected_season]], season_episode_dict[selected_season][0],


@app.callback(
    Output('net','data'),
    Input('dropdown4','value'),
    Input('dropdown7','value'),
    Input('dropdown5','value'),
    Input('dropdown6','value'),
)
def network(season_select, episode_select, character_select1, character_select2):
    filtered = data_for_ng[['season','episode','scene','speaker']]
    filtered = filtered[filtered['season']==season_select]
    filtered = filtered[filtered['episode']==episode_select]

    def assets_pairs(speakers):
        # Build the (Source, Target) pairs for the speakers in a single scene
        unique_speakers = set(speakers)
        if len(unique_speakers) == 1:
            x = speakers.iat[0]  # the only speaker in the scene
            pairs = [[x, x]]     # self-pair so that a lone speaker still appears
        else:
            # all ordered pairs of distinct speakers, i.e. both (A, B) and (B, A)
            pairs = it.permutations(unique_speakers, r=2)
        return pd.DataFrame(pairs, columns=['Source', 'Target'])
   
    df_pairs = (
        filtered.groupby(['season', 'episode', 'scene'])['speaker']
        .apply(assets_pairs)   # build the speaker pairs for each scene
        .groupby(['Source', 'Target'], as_index=False)  # compute the weights by counting
        .agg(Weights = ('Source', 'size'))              # the scenes in which each ('Source', 'Target') pair occurs
    )

    new_df = df_pairs[(df_pairs['Source']==character_select1)|(df_pairs['Source']==character_select2)]

    node_list = list(
        set(new_df['Source'].unique().tolist()+new_df['Target'].unique().tolist())
    )

    nodes = [{
        'id': node_name, 
        'label': node_name,
        #'color':#i_dont_know_what_to_put_here,
        'shape':'dot',
        'size':15
        }
        for node_name in node_list]

    #Create edges from df
    edges=[]
    for row in new_df.to_dict(orient='records'):
        source, target = row['Source'], row['Target']
        edges.append({
            'id':source + "__" + target,
            'from': source,
            'to': target,
            'width': 2
        })

    data = {'nodes':nodes, 'edges': edges}
    return data


if __name__ == '__main__':
    app.run_server(host='0.0.0.0', port=8051)

The issue I’m having is that when there are lots of connections, it’s hard to see where the source nodes (the 2 selected characters) are. I want to change the color of those nodes to make the diagrams easier to interpret. However, I haven’t found a way to change the color of specific nodes; so far it only seems possible to change the color of all nodes at once.
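
One idea I have not been able to confirm inside the app: vis.js documents a per-node color option, so giving each node dict its own 'color' key might be enough. Below is a minimal sketch of that idea, pulled out of the callback so it runs on its own. It assumes visdcc forwards the node dicts to vis.js unchanged; the helper name build_nodes and the two hex colors are made up for illustration.

```python
# Sketch only: assumes visdcc passes node dicts to vis.js as-is and that
# vis.js reads an optional per-node 'color' key. build_nodes and the hex
# colors are hypothetical, chosen just for this example.

def build_nodes(node_list, selected, highlight='#e74c3c', default='#97c2fc'):
    """Return node dicts, giving the selected characters a different color."""
    selected = set(selected)
    return [
        {
            'id': name,
            'label': name,
            # selected characters get the highlight color, everyone else the default
            'color': highlight if name in selected else default,
            'shape': 'dot',
            'size': 15,
        }
        for name in node_list
    ]


# Example with two selected characters among four nodes
nodes = build_nodes(['Jim', 'Pam', 'Dwight', 'Michael'], selected=['Jim', 'Pam'])
for node in nodes:
    print(node['label'], node['color'])
```

In the callback this would amount to calling build_nodes(node_list, [character_select1, character_select2]) in place of the current list comprehension.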

Can someone help me figure out how to get this little part of the diagram working properly? Any help would be appreciated!

Thank you!