
Parallel Coordinate Plot bug?: Some tick values disappearing, axis lines don't stretch to the bottom or top of certain columns

Hi guys,

I’ve generated a parallel coordinate plot to display categorical data by first building dictionaries that map numerical keys to category names.

When I generate my plot, I get the following problems:

  1. Some of the categorical ticktext values are not showing up.
  2. The axis lines don’t stretch to cover all of the tickvals.


I wonder if these are bugs in Plotly or if I am doing something wrong?

Code for generating plot:


import os
import numpy as np
import pandas as pd

import plotly
import plotly.graph_objects as go

#generate datasets 

np.random.seed(42)

drinks = np.random.randint(1,8,100)
disney = np.random.randint(1,9,100)
games = np.random.randint(1,9,100)
social = np.random.randint(1,7,100)

#make a dictionary to assign categorical values to a number

drinksdict ={1:'Beer',
             2:'Wine',
             3:'Champagne',
             4:'Gin',
             5:'Whiskey',
             6:'Water',
             7:'Lemon Lime Bitters'}

disneydict = {1:'Elsa (Frozen)',
             2:'Jasmine (Aladdin)',
             3:'Belle (Beauty and the Beast)',
             4:'Ariel (The Little Mermaid)',
             5:'Cinderella (Cinderella)',
             6:'Merida (Brave)',
             7:'Moana (Moana)',
             8:'Mulan (Mulan)'}


socialdict = {1:'Facebook',
              2:'Twitter',
              3:'Instagram',
              4:'Email',
              5:'Phone call',
              6:'Post'}


#generate a dataframe
df = pd.DataFrame({
                   'drinks':drinks,
                   'disney':disney,
                   'social':social
                   })

#generate a plot

plot = plotly.offline.plot

df.dropna(inplace = True)

#Make Parallel Coordinate Plot

fig = go.Figure(data = 
    go.Parcoords(
       name = 'Party Graph',
       tickfont = dict(color = 'white',size = 15),
       labelfont = dict(color = 'white',size = 15), 
       line = dict(color = 'red'),
       
        dimensions = list([
            
            dict(
                 label = 'Disney Princess'
                ,range = (0,df['disney'].max() +1)
                ,tickvals = df['disney']
                ,ticktext = [v for k,v in disneydict.items()]
                ,values = df['disney']
            )
            ,
            dict(
                 label = 'Social Media'
                ,range = (0,df['social'].max() +1)
                ,tickvals = df['social']
                ,ticktext = [v for k,v in socialdict.items()]
                 ,values = df['social']
            )
            ,
            dict(
                 label = 'Drink of choice'
                ,range = (0,df['drinks'].max() +1)
                ,tickvals = df['drinks']
                ,ticktext = [v for k,v in drinksdict.items()]
                ,values = df['drinks']
            )
            
        ])
             
    )             
)

fig.update_layout(
    plot_bgcolor = '#333333',
    paper_bgcolor = '#333333'
    
)

fig.update_yaxes(tickfont = dict(color = 'white')
   
)
fig.show()

plot(fig,auto_open = True)


Sorry guys, I was too quick to use the word ‘bug’ in my title. I found out what the problem was with my code.

For the tickvals, I had passed the whole column of values for each dimension, e.g. df['drinks'], so the same tick values were repeated many times.

I should have been using the unique values in each column, i.e. df['drinks'].unique(). Using this sorts out the problem.

Better yet, use a list of all possible values instead of just the ones picked up from sampling,

i.e. ,tickvals = list(socialdict.keys())
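
For anyone landing here later, here is a minimal sketch of that last suggestion, under my own assumptions: make_dimension is a hypothetical helper (not part of Plotly) and the sample values are made up. It takes tickvals and ticktext from the same dictionary, in sorted key order, so every category gets a tick and each label stays paired with its number. That pairing matters because unique() returns values in order of first appearance, which will not generally match the order of the dictionary's values.

```python
def make_dimension(label, values, mapping):
    """Build one Parcoords dimension dict from a {number: category} mapping.

    Hypothetical helper for illustration; it only assembles a plain dict.
    """
    keys = sorted(mapping)  # every possible value, not just the sampled ones
    return dict(
        label=label,
        # Pad the range by one on each side so the axis line extends
        # past the first and last tick.
        range=(min(keys) - 1, max(keys) + 1),
        tickvals=keys,
        ticktext=[mapping[k] for k in keys],  # same order as tickvals
        values=list(values),
    )


socialdict = {1: 'Facebook',
              2: 'Twitter',
              3: 'Instagram',
              4: 'Email',
              5: 'Phone call',
              6: 'Post'}

# Made-up sample; note that 4 and 5 never occur, yet they still get ticks.
social = [3, 1, 6, 3, 2]

dim = make_dimension('Social Media', social, socialdict)
```

Each dict built this way can be placed in the dimensions list of go.Parcoords in place of the hand-written dict(...) entries above.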