Figure Friday 2024 - week 33

li.nguyen · August 21, 2024, 4:56pm

Fantastic data analysis as always, and the analysis of what could be improved in the data visualization is equally impressive! Your suggestions for enhancing color contrast by adjusting the scale and unit are spot on. I was about to mention that I would have opted for a different chart for the yearly comparison since it required me to shift my focus between maps to notice differences, but I saw you had already noted that!

Additionally, it’s great to see the bump chart in action! I agree with Adam; I typically limit legends to 5-6 categories, with a maximum of 10. For anything beyond that, I prefer using filters.

empet · August 21, 2024, 6:35pm

Thank you! I know that pie charts are not recommended in data viz. I’m not a beginner.
Each panel that shows the US map as small multiples is a subplot grid with 8 rows and 11 columns. To map the data for each US state to its corresponding cell, I created a JSON file containing a dict with the following structure:
{"Alabama": {"abbrev": "AL", "axis": 73}, ....}, i.e. each US state is given its abbreviated name and the integer associated with its corresponding xaxis and yaxis. These integers run from 1 to 88 = nrows*ncols, but some of them (those corresponding to gaps) are not associated with any state.
If ax is the value stored under a state's "axis" key, then the trace is added in the row and column derived as follows:

row = (ax-1)//ncols+1
col = (ax-1)%ncols +1
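
As a quick check of the formula above, here is a minimal sketch in plain Python. The helper name `axis_to_row_col` is made up for illustration; only the arithmetic and the value 73 for Alabama come from the description above.

```python
def axis_to_row_col(ax: int, ncols: int = 11) -> tuple[int, int]:
    """Convert a 1-based axis index into 1-based (row, col) grid coordinates."""
    row = (ax - 1) // ncols + 1  # how many full rows come before this cell
    col = (ax - 1) % ncols + 1   # position inside the current row
    return row, col


# Alabama has "axis": 73 in the JSON excerpt, with ncols = 11
print(axis_to_row_col(73))  # (7, 7)
```

The first and last cells behave as expected too: axis 1 lands in row 1, column 1, and axis 88 lands in row 8, column 11.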

li.nguyen · August 21, 2024, 7:32pm

Your calculation is really clever! I hadn’t thought of that approach before. It seems to offer a lot of potential for other types of charts too. As long as you can map the data onto an x/y coordinate and create the shape, it should be feasible.

cal337 · August 22, 2024, 5:25am

I’ll work on packaging this up into a dashboard, but my approach was to tabulate each state’s results as a time series and see how they’ve changed. Then I thought about whether it was possible to group these states, and I ran a dbscan over them. It’s been fun, sharing intermediate screenshots:

I may freeze one clustering and manually assign labels with some meaning.
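
Roughly, the idea looks like the sketch below. This is not the exact code behind the app: the numbers are made up, the state names are placeholders, and `eps` / `min_samples` are illustrative guesses rather than tuned values. It assumes pandas and scikit-learn are installed.

```python
import pandas as pd
from sklearn.cluster import DBSCAN

# Synthetic long-format data (made-up numbers, placeholder state names):
# one row per state and election year, "difference" = margin in points.
long_df = pd.DataFrame({
    "state": ["A"] * 3 + ["B"] * 3 + ["C"] * 3 + ["D"] * 3 + ["E"] * 3,
    "year": [2012, 2016, 2020] * 5,
    "difference": [10, 12, 11, 11, 13, 12, -10, -12, -11, -11, -13, -12, 40, -40, 40],
})

# Tabulate each state's results as a time series:
# one row per state, one column per election year.
wide = long_df.pivot(index="state", columns="year", values="difference")

# Cluster the rows; DBSCAN labels points that fit no cluster as -1 (noise).
labels = DBSCAN(eps=5, min_samples=2).fit_predict(wide.to_numpy())
clusters = pd.Series(labels, index=wide.index, name="cluster")
print(clusters)
```

In this toy data A and B end up together, C and D end up together, and E is flagged as noise because its series looks like nothing else.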

cal337 · August 23, 2024, 6:14am

Here’s the app:

Moritus · August 23, 2024, 7:23am

The figure and Graph components are not compatible with my map. Can anyone suggest how I can handle this?

import vizro.models as vm
import vizro.plotly.express as px
from vizro import Vizro
from vizro.actions import filter_interaction
from vizro.tables import dash_data_table
import pandas as pd
import polars as pl
import pyarrow


df = pl.read_csv(
    'C:\\Users\\Moritus Peters\\Downloads\\1976-2020-president.csv',
    infer_schema_length=10000,
    schema_overrides={'writein': pl.Utf8},  # Assuming 'writein' is better handled as a string
    null_values="NA",
    ignore_errors=True
)


# Get the winner by year and state
winner_by_year_state = (
    df.group_by(['year', 'state_po'])
    .agg([
        pl.col('party_detailed').filter(pl.col('candidatevotes') == pl.col('candidatevotes').max()).first().alias('Winning Party'),
        pl.col('candidatevotes').max().alias('candidatevotes'),
        pl.col('totalvotes').first()
    ])
)

# Format party column to be title case
winner_by_year_state = winner_by_year_state.with_columns(
    pl.col('Winning Party')
)

# Calculate the vote percentage for each winner
winner_by_year_state = winner_by_year_state.with_columns(
    (pl.col('candidatevotes') / pl.col('totalvotes')).round(4).alias('Votes')
)

# Select relevant columns
winner_by_year_state = winner_by_year_state.select([
    'year', 'state_po', 'Winning Party', 'Votes'
])

# Filter data for a specific year (e.g., 2020)
#selected_year = 2020
#filtered_df = winner_by_year_state.filter(pl.col('year') == selected_year)

# Convert to pandas for Plotly compatibility
winner_by_year_state =  winner_by_year_state.to_pandas()

# Define color mapping for the parties
party_colors = {
    'Democrat': 'blue',
    'Republican': 'red',
    'Other': 'orange'  # Add more parties if necessary
}

# Apply colors based on the winning party
winner_by_year_state['color'] =  winner_by_year_state['Winning Party'].map(party_colors)
# Group by state and aggregate necessary columns
result = (
    df
    .group_by(['state', 'state_po'])
    .agg([
        pl.col('totalvotes').sum().alias('Total Votes'),
        pl.col('candidatevotes').sum().alias('Total Candidate Votes'),
        (pl.col('candidatevotes').sum() / pl.col('totalvotes').sum() * 100).alias('Winning Vote Percentage'),
        pl.col('candidate').first().alias('Winning Candidate'),
        pl.col('party_simplified').first().alias('Winning Party'),
    ])
    .sort('state_po')
)



page2 = vm.Page(
    title="Page 2",
    path="my-custom-url",
    components=[
        vm.Graph(
            id='location',
            fig=px.choropleth(
                 winner_by_year_state,
                locations='state_po',
                locationmode ='USA-states',
                color='Winning Party',
                hover_name = 'Winning Party',
                hover_data=['Winning Party', 'Votes'],
                scope="usa",
                labels={'Votes': 'Vote Percentage'},
                color_discrete_map=party_colors,
                custom_data=['state_po']
            ),
            actions=[vm.Action(function=filter_interaction(targets=['result_table']))]
            

            
        ),
        vm.Table(
            id='result_table',
            title='Election Result',
            figure=dash_data_table(data_frame=result)
                
            

        )
    ],
    controls=[
        vm.Filter(column="year", targets=["location"]),
     
    ],
)



dashboard = vm.Dashboard(pages=[page2])

Vizro().build(dashboard).run()

li.nguyen · August 23, 2024, 12:06pm

Hey @Moritus,

vm.Graph is compatible with any px chart, but I can see a couple of typos in your code.

Take a closer look at the API reference for vm.Graph here. It’s vm.Graph(figure=...) and not vm.Graph(fig=...). The validation errors should have pointed to this.
Another caveat is that Vizro models currently only accept pandas DataFrames, so you need to convert your Polars DataFrames first:
```
winner_by_year_state = winner_by_year_state.to_pandas()
result = result.to_pandas()
```
There were also some other issues: the column is called “Winning_Party” in your dataframe, but you sometimes reference it as “Winning Party”.

I’ll send you the working code in a DM so I don’t clutter the forum post here, but I hope the above helps to show what the errors were.

adamschroeder · August 23, 2024, 1:40pm

Thank you for sharing your app, @cal337

A couple of recommendations:

The checklist is tied to the line chart, but it sits much closer to the map. This might confuse a user who clicks the check boxes expecting the map to change (that’s what happened to me). Maybe you can move the check boxes so they sit right above the line chart, laid out horizontally.
It’s not very clear what the y-axis means. Some states are above zero, some are below, and that changes over time. Ideally, someone new to your app should understand what each graph is telling them without having to do more research.
Where do the map’s legend names come from?

cal337 · August 23, 2024, 3:42pm

Nice, I’m trying to incorporate these recs in before the session call

cal337 · August 23, 2024, 4:46pm

I’ve updated my app with bootstrap components but I’m still struggling with the layout. I would appreciate any help fixing it! I’m trying to get the checklist to show up to the left of the line plot on the same row.

adamschroeder · August 23, 2024, 4:52pm

Thanks @cal337
For those people not familiar with py.cafe, how do we access your code after clicking the link?

Are you able to share a link to the code page directly?

adamschroeder · August 23, 2024, 4:53pm

You forgot to include a DBC theme in your app.

Try:
app = Dash(external_stylesheets=[dbc.themes.BOOTSTRAP])

cal337 · August 23, 2024, 4:55pm

I am also pretty new to py.cafe and not sure…there’s an Editor link in the top right but maybe that’s just for me. Pasting the code here:

# check out https://dash.plotly.com/ for documentation
# And check out https://py.cafe/maartenbreddels for more examples
import pandas as pd
import numpy as np
from dash import dcc, html, Dash
import dash_bootstrap_components as dbc
from dash.dependencies import Input, Output, State
import plotly.graph_objs as go
import plotly.colors as pc
import plotly.express as px

app = Dash()


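# One color per cluster; the index in this list is the cluster number minus 1
palette = ['rgb(255, 182, 193)', '#1F46E0', '#A51300',  # clusters 1-3
'rgba(204, 204, 204, .6)', '#FF4821', '#FECB52',  # clusters 4-6
'#3283FE', '#87D1FF', '#AB63FA']  # clusters 7-9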



def cluster_map(df):
    fig = px.choropleth(df,
                    geojson="https://raw.githubusercontent.com/plotly/datasets/master/geojson-counties-fips.json",
                    locationmode='USA-states',
                    locations='state_po',
                    color='cluster_name',
                    scope="usa",
                    color_discrete_sequence=palette,
                    category_orders={"cluster_name": list(cluster_name_mapping.values())},
                    hover_name="state",
                    hover_data={"state_po": False},
                    title= "United Clusters"
                    )
    fig.update_layout(margin={"r":0, "l":0, "b":0}, title_x=0.5,)
    return fig


final_df = pd.read_csv('clusters.csv')
clusters = final_df[['cluster', 'cluster_name']].drop_duplicates().sort_values('cluster')
cluster_dict = final_df.set_index("state")['cluster'].to_dict()
df = pd.read_csv('clusters_long.csv')
candidates = pd.read_csv('candidates.csv')

cluster_name_mapping = {
    1: "Clinton Reds",
    2: "Blue Urban Northeast",
    3: "Deep Red South",
    4: "Noise",
    5: "Red Mountain Plains",
    6: "Big Sky Dakotas",
    7: "Blue Converts",
    8: "Trending Blue",
    9: "Swing States",
}


@app.callback(
    Output("Lines", "figure"),
    Input("cluster_selector", "value")
)
def cluster_line_plot(clusters=None):
    fig = go.Figure()
    # Return an empty figure when nothing is selected (None or empty list)
    if not clusters:
        return fig
        
    df_cluster = df.loc[df['cluster_name'].isin(clusters)]
    
    for state in df_cluster['state'].unique():
        df_state = df.loc[df['state'] == state]
        state_color = palette[cluster_dict[state]-1]
        fig.add_trace(
        go.Scatter(
            x=df_state['year'],
            y=df_state['difference'],
            mode='lines+markers',
            line=dict(color=state_color),
            name=state,
            showlegend=True,
            )
        )
    fig.add_trace(
        go.Scatter(
        x=[1976, 2020],
        y=[0,0],
        mode='lines',
        line=dict(color='black', dash='dot'),
        showlegend=False
        )
    )
    fig.add_trace(
        go.Scatter(
            x=candidates['year'],
            y=[-45]*len(candidates),
            text=candidates['REPUBLICAN'],
            textfont=dict(
                size=8,  # Font size
                color='rgba(255, 0, 0, 0.8)'  # Text color (e.g., red with opacity)
            ),
            mode='text',
            showlegend=False,
        )
    )
    fig.add_trace(
        go.Scatter(
            x=candidates['year'],
            y=[45]*len(candidates),
            text=candidates['DEMOCRAT'],
            textfont=dict(
                size=8,  # Font size
                color='rgba(0, 0, 255, 0.8)'  # Text color (blue with opacity)
            ),
            mode='text',
            showlegend=False,
        )
    )
    fig.update_yaxes(
        range=[-50, 50],
        title_text='Voting Difference',
        tickvals=[-50, -25, 0, 25, 50],  # Optional: Set specific tick values if needed
        ticktext=['Republicans + 50', 'Republicans + 25', '0', 'Democrats + 25', 'Democrats + 50'],  # Optional: Custom tick labels
    )
    fig.update_layout(height=600, paper_bgcolor="rgba(0,0,0,0)",
        plot_bgcolor="rgba(0,0,0,0)")
    return fig


# Callback to update the checklist based on the clicked region on the map.
# The checklist value is read as State so that ticking a box does not re-run
# this callback with the previous click.
@app.callback(
    Output("cluster_selector", "value"),
    Input("choropleth_map", "clickData"),
    State("cluster_selector", "value"),
    prevent_initial_call=True,
)
def update_checklist_on_click(clickData, current_selection):
    # Work on a copy so the incoming list is not mutated in place
    selection = list(current_selection or [])
    if clickData:
        clicked_state = clickData['points'][0]['location']  # Get the clicked state code
        # Find the cluster associated with the clicked state
        selected_cluster = final_df.loc[final_df['state_po'] == clicked_state, 'cluster_name'].iloc[0]

        # Add or remove the cluster from the current selection
        if selected_cluster not in selection:
            selection.append(selected_cluster)
        else:
            selection.remove(selected_cluster)

    return selection


app.layout = html.Div(
    className="checklistContainer",
    children=[
        dbc.Container(
            [
                dbc.Row(
                    dbc.Col(
                        dcc.Graph(id="choropleth_map", figure=cluster_map(final_df)),
                        width=12,
                    )
                ),
                dbc.Row(
                    [
                        dbc.Col(
                            dcc.Checklist(
                                id="cluster_selector",
                                options=clusters["cluster_name"],
                                value=["Swing States"],  # default value
                                style={"width": "100%"},
                            ),
                            width=3,
                        ),
                        dbc.Col(dcc.Graph(id="Lines"), width=9),
                    ],
                    align="center",  # Optional: center-align the row's content vertically
                ),
            ],
            fluid=True,
        ),
    ],
)



if __name__ == "__main__":

    app.run_server(debug=True)

Run and edit this code snippet at PyCafe

AnnMarieW · August 23, 2024, 5:22pm

Hi @cal337 - Adam’s suggestion is correct. That’s all that was needed to align the checklist. And yes, anyone can edit the code posted on PyCafe. You just can’t save it unless it’s to your own account.

cal337 · August 23, 2024, 5:35pm

Thank you both! Sometimes it’s the simplest things

li.nguyen · August 24, 2024, 9:10am

There are always two links:

This opens the app in full-screen: PyCafe - Dash - Electoral Clustering Analysis
This opens the app/code side-by-side: PyCafe - Dash - Electoral Clustering Analysis

However, as @cal337 said, if you first open the app in full-screen mode, there is an “EDITOR” button on the top right which goes to the second link where the code/app is side-by-side
