Figure Friday 2024 - week 33

Fantastic data analysis as always, and the analysis of what could be improved in the data visualization is equally impressive! Your suggestions for enhancing color contrast by adjusting the scale and unit are spot on. I was about to mention that I would have opted for a different chart for the yearly comparison since it required me to shift my focus between maps to notice differences, but I saw you had already noted that! :rocket:

Additionally, it’s great to see the bump chart in action! I agree with Adam; I typically limit legends to 5-6 categories, with a maximum of 10. For anything beyond that, I prefer using filters. :blush::+1:

2 Likes

Thank you! I know that pie charts are not recommended in data viz. I’m not a beginner. :grinning:
A panel representing the US map as small multiples is a subplot grid with 8 rows and 11 columns. To map the data for each US state to its corresponding cell, I created a JSON file that contains a dict with the following structure:
{"Alabama": {"abbrev": "AL", "axis": 73}, ....}, i.e. each US state is given its abbreviated name and the integer associated with its xaxis and yaxis. These integers run from 1 to 88 = nrows*ncols, but some of them (those corresponding to gaps) are not associated with any state.
If ax is the "axis" value associated with a state, then the trace is added in the row and column derived as follows:

row = (ax-1)//ncols+1
col = (ax-1)%ncols +1
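
As a quick check of the formula above, here is a minimal sketch that wraps it in a function and applies it to the Alabama entry. The function name grid_position and the dict name state_axes are made up for illustration; only the Alabama values come from my JSON file.

```python
def grid_position(ax, ncols=11):
    """Map a 1-based axis index to a 1-based (row, col) in a row-major grid."""
    row = (ax - 1) // ncols + 1
    col = (ax - 1) % ncols + 1
    return row, col


# Only the Alabama entry is taken from the post; the rest of the file is omitted.
state_axes = {"Alabama": {"abbrev": "AL", "axis": 73}}

for state, info in state_axes.items():
    row, col = grid_position(info["axis"])
    print(f"{info['abbrev']}: row {row}, col {col}")  # AL: row 7, col 7
```

The resulting row and col can then be passed to fig.add_trace(trace, row=row, col=col) on a figure created with make_subplots(rows=8, cols=11).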
3 Likes

Your calculation is really clever! :bulb: I hadn’t thought of that approach before. It seems to offer a lot of potential for other types of charts too. As long as you can map it onto an x/y coordinate and create the shape, it should be feasible :thinking:

1 Like

I’ll work on packaging this up into a dashboard, but my approach was to tabulate each state’s results as a time series and see how they’ve changed. Then I wondered whether these states could be grouped, so I ran DBSCAN over them. It’s been fun; sharing intermediate screenshots:


image
I may freeze one clustering and manually assign labels with some meaning.
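
For anyone curious about the clustering step, here is a minimal pure-Python sketch of DBSCAN over per-state time series. It is not the code behind the screenshots: the state labels, the margins, eps and min_samples are all made up, and in practice a library implementation such as scikit-learn's DBSCAN would be the usual choice.

```python
from math import dist

NOISE = -1


def dbscan(points, eps, min_samples):
    """Label each point with a cluster id (0, 1, ...) or NOISE (-1)."""
    labels = [None] * len(points)  # None means "not visited yet"

    def neighbors(i):
        # Indices of all points within eps of point i (including i itself)
        return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        nb = neighbors(i)
        if len(nb) < min_samples:
            labels[i] = NOISE  # may later become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nb if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: joins but does not expand
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb_j = neighbors(j)
            if len(nb_j) >= min_samples:
                queue.extend(nb_j)  # core point: keep growing the cluster
    return labels


# Hypothetical vote margins per election for five made-up states
series = {
    "A": [10, 12, 11],
    "B": [11, 12, 10],
    "C": [-20, -22, -21],
    "D": [-21, -22, -20],
    "E": [40, -40, 40],
}
labels = dbscan(list(series.values()), eps=3, min_samples=2)
print(dict(zip(series, labels)))  # {'A': 0, 'B': 0, 'C': 1, 'D': 1, 'E': -1}
```

States whose series stay close together end up in the same cluster, and a state that follows nobody else is labelled as noise.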

4 Likes

Here’s the app:

2 Likes

The figure and Graph components are not compatible with my map. Can anyone suggest how I can handle this?

import vizro.models as vm
import vizro.plotly.express as px
from vizro import Vizro
from vizro.actions import filter_interaction
from vizro.tables import dash_data_table
import pandas as pd
import polars as pl
import pyarrow


df = pl.read_csv(
    'C:\\Users\\Moritus Peters\\Downloads\\1976-2020-president.csv',
    infer_schema_length=10000,
    schema_overrides={'writein': pl.Utf8},  # Assuming 'writein' is better handled as a string
    null_values="NA",
    ignore_errors=True
)


# Get the winner by year and state
winner_by_year_state = (
    df.group_by(['year', 'state_po'])
    .agg([
        pl.col('party_detailed').filter(pl.col('candidatevotes') == pl.col('candidatevotes').max()).first().alias('Winning Party'),
        pl.col('candidatevotes').max().alias('candidatevotes'),
        pl.col('totalvotes').first()
    ])
)

# Format party column to be title case
winner_by_year_state = winner_by_year_state.with_columns(
    pl.col('Winning Party').str.to_titlecase()
)

# Calculate the vote percentage for each winner
winner_by_year_state = winner_by_year_state.with_columns(
    (pl.col('candidatevotes') / pl.col('totalvotes')).round(4).alias('Votes')
)

# Select relevant columns
winner_by_year_state = winner_by_year_state.select([
    'year', 'state_po', 'Winning Party', 'Votes'
])

# Filter data for a specific year (e.g., 2020)
#selected_year = 2020
#filtered_df = winner_by_year_state.filter(pl.col('year') == selected_year)

# Convert to pandas for Plotly compatibility
winner_by_year_state =  winner_by_year_state.to_pandas()

# Define color mapping for the parties
party_colors = {
    'Democrat': 'blue',
    'Republican': 'red',
    'Other': 'orange'  # Add more parties if necessary
}

# Apply colors based on the winning party
winner_by_year_state['color'] =  winner_by_year_state['Winning Party'].map(party_colors)
# Group by state and aggregate necessary columns
result = (
    df
    .group_by(['state', 'state_po'])
    .agg([
        pl.col('totalvotes').sum().alias('Total Votes'),
        pl.col('candidatevotes').sum().alias('Total Candidate Votes'),
        (pl.col('candidatevotes').sum() / pl.col('totalvotes').sum() * 100).alias('Winning Vote Percentage'),
        pl.col('candidate').first().alias('Winning Candidate'),
        pl.col('party_simplified').first().alias('Winning Party'),
    ])
    .sort('state_po')
)



page2 = vm.Page(
    title="Page 2",
    path="my-custom-url",
    components=[
        vm.Graph(
            id='location',
            fig=px.choropleth(
                 winner_by_year_state,
                locations='state_po',
                locationmode ='USA-states',
                color='Winning Party',
                hover_name = 'Winning Party',
                hover_data=['Winning Party', 'Votes'],
                scope="usa",
                labels={'Votes': 'Vote Percentage'},
                color_discrete_map=party_colors,
                custom_data=['state_po']
            ),
            actions=[vm.Action(function=filter_interaction(targets=['result_table']))]
            

            
        ),
        vm.Table(
            id='result_table',
            title='Election Result',
            figure=dash_data_table(data_frame=result)
                
            

        )
    ],
    controls=[
        vm.Filter(column="year", targets=["location"]),
     
    ],
)



dashboard = vm.Dashboard(pages=[page2])

Vizro().build(dashboard).run()

1 Like

Hey @Moritus,

the vm.Graph is compatible with any px chart, but from your code I can see that you have a couple of typos.

  1. Take a closer look at the API reference for vm.Graph here. It’s vm.Graph(figure=...) and not vm.Graph(fig=..). The validation errors should have helped here :slight_smile:

  2. Another caveat is that Vizro models currently only accept pandas dataframes, so you need to convert your Polars dataframes first:

    winner_by_year_state = winner_by_year_state.to_pandas()
    result = result.to_pandas()
    
  3. There were also some other issues as the column is called “Winning_Party” in your dataframe, but you have sometimes referenced it as “Winning Party”.

I’ll send you the working code in a DM so I don’t clutter the forum post here, but I hope the above helps to show what the errors were :+1:

2 Likes

Thank you for sharing your app, @cal337

A couple of recommendations:

  1. The checklist is tied to the line chart interactively, but it sits a lot closer to the map. This might confuse a user who clicks the checkboxes and expects the map to change (that’s what happened to me). Maybe you can move the checkboxes so they sit horizontally, right above the line chart.
  2. It’s not very clear what the y-axis means. Some states are above zero, some are below, and this changes over time. Ideally, someone new to your app should understand what each graph is telling them without having to do more research.
  3. Where do the map’s legend names come from?
1 Like

Nice, I’m trying to incorporate these recs before the session call

I’ve updated my app with bootstrap components but I’m still struggling with the layout and would appreciate any help fixing it! I’m trying to get the checklist to appear to the left of the lines plot on the same row.

1 Like

Thanks @cal337
For those people not familiar with py.cafe, how do we access your code after clicking the link?

Are you able to share a link to the code page directly?

1 Like

You forgot to include a DBC theme in your app.

Try:
app = Dash(external_stylesheets=[dbc.themes.BOOTSTRAP])

3 Likes

I am also pretty new to py.cafe and not sure…there’s an Editor link in the top right but maybe that’s just for me. Pasting the code here:

# check out https://dash.plotly.com/ for documentation
# And check out https://py.cafe/maartenbreddels for more examples
import pandas as pd
import numpy as np
from dash import dcc, html, Dash
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output, State
import plotly.graph_objs as go
import plotly.colors as pc
import plotly.express as px

app = Dash()


palette = ['rgb(255, 182, 193)', '#1F46E0', '#A51300',  #123
'rgba(204, 204, 204, .6)', '#FF4821', '#FECB52', #456
'#3283FE', '#87D1FF', '#AB63FA'] #789



def cluster_map(df):
    fig = px.choropleth(df,
                    geojson="https://raw.githubusercontent.com/plotly/datasets/master/geojson-counties-fips.json",
                    locationmode='USA-states',
                    locations='state_po',
                    color='cluster_name',
                    scope="usa",
                    color_discrete_sequence=palette,
                    category_orders={"cluster_name": list(cluster_name_mapping.values())},
                    hover_name="state",
                    hover_data={"state_po": False},
                    title= "United Clusters"
                    )
    fig.update_layout(margin={"r":0, "l":0, "b":0}, title_x=0.5,)
    return fig


final_df = pd.read_csv('clusters.csv')
clusters = final_df[['cluster', 'cluster_name']].drop_duplicates().sort_values('cluster')
cluster_dict = final_df.set_index("state")['cluster'].to_dict()
df = pd.read_csv('clusters_long.csv')
candidates = pd.read_csv('candidates.csv')

cluster_name_mapping = {
    1: "Clinton Reds",
    2: "Blue Urban Northeast",
    3: "Deep Red South",
    4: "Noise",
    5: "Red Mountain Plains",
    6: "Big Sky Dakotas",
    7: "Blue Converts",
    8: "Trending Blue",
    9: "Swing States",
}


@app.callback(
    Output("Lines", "figure"),
    Input("cluster_selector", "value")
)
def cluster_line_plot(clusters=None):
    fig = go.Figure()
    # Return an empty figure when no cluster is ticked
    if not clusters:
        return fig
        
    df_cluster = df.loc[df['cluster_name'].isin(clusters)]
    
    for state in df_cluster['state'].unique():
        df_state = df.loc[df['state'] == state]
        state_color = palette[cluster_dict[state]-1]
        fig.add_trace(
        go.Scatter(
            x=df_state['year'],
            y=df_state['difference'],
            mode='lines+markers',
            line=dict(color=state_color),
            name=state,
            showlegend=True,
            )
        )
    fig.add_trace(
        go.Scatter(
        x=[1976, 2020],
        y=[0,0],
        mode='lines',
        line=dict(color='black', dash='dot'),
        showlegend=False
        )
    )
    fig.add_trace(
        go.Scatter(
            x=candidates['year'],
            y=[-45]*len(candidates),
            text=candidates['REPUBLICAN'],
            textfont=dict(
                size=8,  # Font size
                color='rgba(255, 0, 0, 0.8)'  # Text color (e.g., red with opacity)
            ),
            mode='text',
            showlegend=False,
        )
    )
    fig.add_trace(
        go.Scatter(
            x=candidates['year'],
            y=[45]*len(candidates),
            text=candidates['DEMOCRAT'],
            textfont=dict(
                size=8,  # Font size
                color='rgba(0, 0, 255, 0.8)'  # Text color (e.g., blue with opacity)
            ),
            mode='text',
            showlegend=False,
        )
    )
    fig.update_yaxes(
        range=[-50, 50],
        title_text='Voting Difference',
        tickvals=[-50, -25, 0, 25, 50],  # Optional: Set specific tick values if needed
        ticktext=['Republicans + 50', 'Republicans + 25', '0', 'Democrats + 25', 'Democrats + 50'],  # Optional: Custom tick labels
    )
    fig.update_layout(height=600, paper_bgcolor="rgba(0,0,0,0)",
        plot_bgcolor="rgba(0,0,0,0)")
    return fig


# Callback to update the checklist based on the clicked region on the map
@app.callback(
    Output("cluster_selector", "value"),
    Input("choropleth_map", "clickData"),
    Input("cluster_selector", "value"),
)
def update_checklist_on_click(clickData, current_selection):
    if clickData:
        clicked_state = clickData['points'][0]['location']  # Get the clicked state code
        # Find the cluster associated with the clicked state
        selected_cluster = final_df.loc[final_df['state_po'] == clicked_state, 'cluster_name'].iloc[0]
        
        # Add or remove the cluster from the current selection
        if selected_cluster not in current_selection:
            current_selection.append(selected_cluster)
        else:
            current_selection.remove(selected_cluster)

    return current_selection


app.layout = html.Div(
    className="checklistContainer",
    children=[
        dbc.Container(
            [
                dbc.Row(
                    dbc.Col(
                        dcc.Graph(id="choropleth_map", figure=cluster_map(final_df)),
                        width=12,
                    )
                ),
                dbc.Row(
                    [
                        dbc.Col(
                            dcc.Checklist(
                                id="cluster_selector",
                                options=clusters["cluster_name"],
                                value=["Swing States"],  # default value
                                style={"width": "100%"},
                            ),
                            width=3,
                        ),
                        dbc.Col(dcc.Graph(id="Lines"), width=9),
                    ],
                    align="center",  # Optional: center-align the row's content vertically
                ),
            ],
            fluid=True,
        ),
    ],
)



if __name__ == "__main__":

    app.run_server(debug=True)
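
The click callback above adds or removes the clicked state's cluster from the checklist selection, modifying the incoming list in place. A minimal sketch of the same toggle as a standalone function that returns a new list is below; the name toggle_cluster is made up for illustration and is not part of the app.

```python
def toggle_cluster(current_selection, cluster_name):
    """Return a new selection with cluster_name added if absent, removed if present."""
    # Copy first so the list passed in by the caller is left untouched;
    # an empty or missing selection is treated as an empty list.
    selection = list(current_selection or [])
    if cluster_name in selection:
        selection.remove(cluster_name)
    else:
        selection.append(cluster_name)
    return selection


print(toggle_cluster(["Swing States"], "Noise"))           # ['Swing States', 'Noise']
print(toggle_cluster(["Swing States", "Noise"], "Noise"))  # ['Swing States']
```

Keeping the toggle separate from the callback makes it easy to check on its own, without starting the app.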


Hi @cal337 - Adam’s suggestion is correct. That’s all that was needed to align the checklist :slight_smile: And yes, anyone can edit the code posted on PyCafe; they just can’t save it unless it’s to their own account.

3 Likes

Thank you both! Sometimes it’s the simplest things

1 Like

There are always two links:

However, as @cal337 said, if you first open the app in full-screen mode, there is an “EDITOR” button on the top right which goes to the second link where the code/app is side-by-side :slight_smile:

2 Likes