Figure Friday 2025 - week 18

adamschroeder · May 2, 2025, 2:24pm

Join the Figure Friday session on May 9 at noon Eastern Time to showcase your creation and receive feedback from the community.

What percentage of the OECD countries have access to green space? What percentage of the OECD countries experience job strain?

Answer these and many other questions by using Plotly and Dash on the OECD dataset.

Things to consider:

What can you improve in the app or sample figure below (dumbbell charts)?
Would you like to tell a different data story using a different graph?
Can you create a different Dash app?

Sample figure:

Code for sample figure:

import plotly.express as px
import pandas as pd
import plotly.graph_objects as go
from dash import Dash, dcc
import dash_ag_grid as dag

# Download CSV sheet at: https://drive.google.com/file/d/1CXxOQA2uBso64VEyvQ3L76AZLxyUlQDB/view?usp=sharing
df = pd.read_csv("OECD-wellbeing.csv")

# Group the dataset
df_grouped = df.groupby(['Measure', 'Education', 'Country', 'Year'])['OBS_VALUE'].sum()
df_grouped = df_grouped.reset_index()

# focus on a specific measure, year, and education level
df_grouped = df_grouped[df_grouped['Measure'] == 'Feeling lonely']
df_grouped = df_grouped[df_grouped['Year'] == 2022]
df_grouped = df_grouped[df_grouped['Education'].isin(['Primary education', 'Secondary education'])]

# Pivot the table to get Primary and Secondary values side-by-side
df_pivot = df_grouped.pivot(index='Country', columns='Education', values='OBS_VALUE').reset_index()

df_pivot.columns.name = None # Remove the columns-axis name ('Education') left over from the pivot

df_pivot = df_pivot.rename(columns={
    'Primary education': 'Primary',
    'Secondary education': 'Secondary'
})


# Sort countries
df_pivot = df_pivot.sort_values(by='Secondary', ascending=True)

# Get the sorted list of countries for the y-axis
countries_sorted = df_pivot['Country'].tolist()

# Prepare Data for Plotly Traces
line_x = []
line_y = []
primary_vals = []
secondary_vals = []

for country in countries_sorted:
    row = df_pivot[df_pivot['Country'] == country]
    primary_val = row['Primary'].iloc[0]
    secondary_val = row['Secondary'].iloc[0]

    primary_vals.append(primary_val)
    secondary_vals.append(secondary_val)

    # For the connecting line segment
    line_x.extend([primary_val, secondary_val, None]) # Add None to break the line
    line_y.extend([country, country, None])


fig = go.Figure()
fig.add_trace(go.Scatter(
    x=line_x,
    y=line_y,
    mode='lines',
    showlegend=False,
    line=dict(color='grey', width=1),
))

# Add markers for Primary Education
fig.add_trace(go.Scatter(
    x=primary_vals,
    y=countries_sorted,
    mode='markers',
    name='Primary Education', # Legend entry
    marker=dict(color='skyblue', size=10),
    hovertemplate =
        '<b>%{y}</b><br>' +
        'Primary Education: %{x:.2f}%' +
        '<extra></extra>'
))

# Add markers for Secondary Education
fig.add_trace(go.Scatter(
    x=secondary_vals,
    y=countries_sorted,
    mode='markers',
    name='Secondary Education', # Legend entry
    marker=dict(color='royalblue', size=10),
    hovertemplate =
        '<b>%{y}</b><br>' +
        'Secondary Education: %{x:.2f}%' +
        '<extra></extra>'
))

fig.update_layout(
    title=dict(text="Feeling Lonely by Education Level (2022)", x=0.5),
    xaxis_title="Percentage Feeling Lonely (%)",
    yaxis_title="Country",
    height=900,
    yaxis=dict(tickmode='array', tickvals=countries_sorted, ticktext=countries_sorted), # Ensure all country labels are shown
    legend_title_text='Education Level',
    legend=dict(
        orientation="h", # Horizontal legend
        yanchor="bottom",
        y=1.02, # Position above plot
        xanchor="right",
        x=1
    ),
    margin=dict(l=100) # Add left margin for country names
)


grid = dag.AgGrid(
    rowData=df.to_dict("records"),
    columnDefs=[{"field": i, 'filter': True, 'sortable': True} for i in df.columns],
    dashGridOptions={"pagination": True},
    columnSize="sizeToFit"
)

app = Dash()
app.layout = [
    grid,
    dcc.Graph(figure=fig)
]


if __name__ == "__main__":
    app.run(debug=False)

Participation Instructions:

Create - use the weekly data set to build your own Plotly visualization or Dash app. Or, enhance the sample figure provided in this post, using Plotly or Dash.
Submit - post your creation to LinkedIn or Twitter with the hashtags #FigureFriday and #plotly by midnight Thursday, your time zone. Please also submit your visualization as a new post in this thread.
Celebrate - join the Figure Friday sessions to showcase your creation and receive feedback from the community.

If you prefer to collaborate with others on Discord, join the Plotly Discord channel.

Data Source:

Thank you to OECD Data Explorer for the data.

thedatahawk · May 5, 2025, 12:31am

Looks very cool! I just finished my first two apps using OECD data. I tried to get multipage to work on Render, but I was only able to get it to work on my own computer.

(sorry, they will take up to a minute to load)

adamschroeder · May 5, 2025, 2:31pm

What an interesting way to explore the data, @thedatahawk. Did you use a different dataset from the OECD website?

thedatahawk · May 5, 2025, 3:04pm

Yes! I should have been clearer. I am using a completely different OECD dataset. It is the Trade in Value Added dataset, available here: https://www.oecd.org/en/topics/sub-issues/trade-in-value-added.html

Thanks @adamschroeder

Avacsiglo21 · May 7, 2025, 2:55pm

Hello Everyone,

For this week's Figure Friday, my proposal is a dashboard/app for exploring and understanding well-being data from the OECD (Organisation for Economic Co-operation and Development). It allows users to select various dimensions of well-being, compare countries, analyze trends over time, and even explore potential future scenarios through basic predictive modeling and country clustering.

A summary of the main functionalities:

Interactive Data Exploration:
- Users can select a specific well-being domain (e.g., Health, Safety, Work-Life Balance) using a dropdown menu.
- For each domain, users can choose a specific measure or indicator (e.g., Life expectancy, Employment rate).
- The data can be filtered by demographic groups such as Age, Sex, and Education.
- Users can select one or multiple countries for comparison.
- The dashboard offers two main time perspectives: a “Snapshot” for a single year and a “Journey” for analyzing trends over a period.
Data Visualization:
- Comparison Snapshot: Displays a bar chart comparing the selected measure across the chosen countries for a specific year.
- Historical Journey (Time Series): Presents a line chart showing the trend of the selected measure over a chosen range of years for the selected countries.
- Country Clusters: Utilizes K-means clustering to group countries with similar performance on a selected measure for a given year, providing insights into patterns of well-being across nations.
- Future Trends: Employs linear regression to project potential future values of the selected measure based on historical data for the chosen countries.
Insight Generation and Storytelling:
- Insight Summary: A dynamic text element that summarizes the currently selected data parameters.
- Chart Titles and Annotations: Informative titles and annotations on the charts highlight key findings and encourage interpretation.
- Interpretation Text: Provides narrative explanations and context related to the visualized data, changing based on the active tab and selected parameters. It aims to guide the user in understanding the “story” behind the data.
- Tooltips: Enhanced tooltips with storytelling descriptions for different elements of the dashboard.
- Story Guide (Glossary): A collapsible section explaining key terms used in the dashboard.
Advanced Analytics (Basic):
- Cluster Analysis: Allows users to see how countries group together based on their performance on a specific well-being indicator.
- Future Trend Prediction: Offers a basic forecast of how well-being indicators might evolve in the future.
User Interface and Styling:
- The dashboard uses the Dash Bootstrap Components (dbc) library with the FLATLY theme for a clean and responsive layout.
- Custom CSS and a defined color palette (COLORS) contribute to a consistent and visually appealing user experience.
- Font Awesome icons are used to enhance the visual appeal and intuitiveness of the interface.
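
The two analytics features listed above can be illustrated in a few lines. This is only a sketch under stated assumptions: the post does not say which library or settings the app uses, so the example below relies on NumPy alone, with a plain 1-D K-means (Lloyd's algorithm) and a straight-line fit. The function names kmeans_1d and linear_projection and all numbers are made up for illustration.

```python
import numpy as np


def kmeans_1d(values, k, n_iter=100):
    """Plain Lloyd's algorithm on a 1-D array; returns (labels, centres)."""
    values = np.asarray(values, dtype=float)
    # Deterministic start: centres at evenly spaced quantiles of the data.
    centres = np.quantile(values, np.linspace(0, 1, k))
    for _ in range(n_iter):
        # Assign every value to its nearest centre.
        labels = np.argmin(np.abs(values[:, None] - centres[None, :]), axis=1)
        # Move each centre to the mean of its members (keep it if it has none).
        updated = np.array([
            values[labels == j].mean() if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        if np.allclose(updated, centres):
            break
        centres = updated
    return labels, centres


def linear_projection(years, values, future_years):
    """Fit a straight line to (years, values) and evaluate it at future_years."""
    years = np.asarray(years, dtype=float)
    origin = years.min()  # centre the years to keep the fit well conditioned
    slope, intercept = np.polyfit(years - origin, values, deg=1)
    return slope * (np.asarray(future_years, dtype=float) - origin) + intercept


# Made-up indicator values for seven hypothetical countries in one year.
indicator = [65.0, 66.0, 67.0, 75.0, 76.0, 82.0, 83.0]
labels, centres = kmeans_1d(indicator, k=3)

# Made-up yearly series for one country, projected three years ahead.
projected = linear_projection(
    [2015, 2016, 2017, 2018, 2019],
    [70.0, 71.0, 72.0, 73.0, 74.0],
    [2020, 2021, 2022],
)
```

A straight-line projection like this only extends the past trend, which matches the "basic forecast" wording in the post.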

It is important to mention that the dashboard is far from perfect and you may find some errors; this is a kind of “MVP”.

Some images:

the web app:

As usual, I highly appreciate your comments and suggestions.

Have a nice day

marieanne · May 7, 2025, 4:38pm

Hi @Avacsiglo21, you left no stone unturned. I was surprised by the average gross salary in the Netherlands because it differs so much from what we hear every year. In the dataset it’s far higher, maybe too high? Same for Belgium.

A quick Google search to validate my thoughts:
What is the average salary in the Netherlands? As of February 2024, the gross average salary in the Netherlands is €44,000 per year or €3,666 per month. This is around $49,000 or ₤37,000. In the Netherlands working hours are, for the most part, kept to 40 hours per week or under.

Avacsiglo21 · May 7, 2025, 5:03pm

Hi Marianne, hahaha, you know you’re right. In fact, if you look at the chart and apply any filter for gender, age, or education, “Total” always appears. It’s one of the many concessions I had to make to be able to create these charts, including a “Total” button. The Unit of Measure column reports the values in US dollars, PPP converted.

adamschroeder · May 7, 2025, 7:59pm

What an interesting app, @Avacsiglo21.

I liked looking at life expectancy at birth:

Is it possible to modify the legend to be discrete instead of continuous? Blue 80-89; yellow 72-79; purple 65-72?

That green text of Cluster Insight can probably be changed into a pop-up, because I really just need to read it once, and it kind of blocks the x axis labels.

Avacsiglo21 · May 7, 2025, 8:15pm

Let me see if I understood you correctly: you want to change the color scale so that each cluster has a single color, is that it? Eliminate the color scale?

Regarding the label, I can change it, move it down a bit, but yes, I agree, it could even be a tooltip. However, I’m not sure if I’ll have time for Friday.

Avacsiglo21 · May 7, 2025, 8:37pm

This one is done; I just moved the label to where the x-axis label was. Just one left, hahaha.

adamschroeder · May 8, 2025, 5:52pm

Nice update with the label.
I’m not sure exactly how this could be done. I need to read the docs more.
Your graph really just uses three colors: pink, yellow, purple. But the legend suggests a continuous color mode where one would expect to see a variety of colors and shades. Plus, the color represents the group number each country is in. But I wonder if there is a way to have the legend represent the range of average annual gross earnings instead.

Avacsiglo21 · May 8, 2025, 9:47pm

Ah OK, I understand your point. The reason is that the color parameter in the scatter plot is “Cluster”, and there are 3 clusters defined. That is why it does not change. What I need to do here is hide the color scale to avoid the confusion.

Mike_Purtell · May 8, 2025, 10:47pm

Nice work @thedatahawk. I really like the second dashboard with the Treemap. Wondering why it has a dropdown menu to select the year when the only choice is 2020?

Mike_Purtell · May 8, 2025, 10:50pm

Great work @Avacsiglo21, so impressed

thedatahawk · May 9, 2025, 2:14am

Thanks @Mike_Purtell. Good question. I have the data going back to 1997, but the files are quite large. I could only fit one year under GitHub's file size limits.

On my personal computer I have all the data back to 1997.

Ester · May 9, 2025, 7:50am

Hi everyone,
I haven’t used a dumbbell chart in Plotly yet, so this was a good opportunity to try it out.

This Dash application visualizes temporal changes in the “Access to green space” metric for urban populations across OECD countries using a dumbbell chart.

Core Features:

Data Source: The app loads a CSV file (OECD-wellbeing.csv), filters for the measure “Access to green space”, and includes only the years 2012 and 2018.
Data Transformation: A pivot table is created using Country as the index and Year as columns to get comparable OBS_VALUEs side-by-side for each year.
UI Components:
- Dropdown filter for selecting countries (including a “Select All” option).
- KPIs (Key Performance Indicators) display:
  - Country with the highest 2018 value.
  - Country with greatest positive change from 2012 to 2018.
  - Country with the lowest 2018 value.
  - Country with the largest decline.
- Dumbbell Chart using plotly.graph_objects:
  - Two scatter traces (2012 and 2018).
  - Connecting lines between years per country.
  - Annotations showing percentage change.
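
The data transformation and KPI steps described above can be sketched as follows. This is not Ester's code: the rows are made-up stand-ins for OECD-wellbeing.csv, the column names follow the sample code earlier in the thread, and the kpis dictionary keys are hypothetical.

```python
import pandas as pd

# Made-up rows standing in for OECD-wellbeing.csv.
df = pd.DataFrame({
    "Country": ["A", "A", "B", "B", "C", "C"],
    "Year": [2012, 2018, 2012, 2018, 2012, 2018],
    "Measure": ["Access to green space"] * 6,
    "OBS_VALUE": [80.0, 85.0, 90.0, 88.0, 60.0, 72.0],
})

# Keep one measure and the two years being compared.
subset = df[(df["Measure"] == "Access to green space") & df["Year"].isin([2012, 2018])]

# One row per country, one column per year.
pivot = subset.pivot(index="Country", columns="Year", values="OBS_VALUE")
pivot["change"] = pivot[2018] - pivot[2012]

# idxmax / idxmin return the index label, here the country name.
kpis = {
    "highest_2018": pivot[2018].idxmax(),
    "lowest_2018": pivot[2018].idxmin(),
    "largest_gain": pivot["change"].idxmax(),
    "largest_decline": pivot["change"].idxmin(),
}
```

The pivot columns 2012 and 2018 are also the x values of the two marker traces in the dumbbell chart.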

Callback Logic:

Triggered by the country dropdown (Input).
Dynamically filters data and recalculates:
- The dumbbell chart.
- All KPI values.
Handles the edge cases of no selection and “Select All”.

Styling & Layout:

Clean, modern CSS-like styling for KPIs and layout blocks.
The chart and layout are responsive and designed for clarity, using Plotly’s white theme and high-contrast markers.
Uses margin and spacing tweaks to accommodate annotations and long country names.

Avacsiglo21 · May 9, 2025, 1:19pm

Hi Adam as requested

Here is the code snippet I modified. For the label positioning, change the y value in add_annotation to -0.4; for the color scale, add coloraxis_showscale=False in update_layout.

    fig = px.scatter(
        cluster_df,
        x='Country',
        y='OBS_VALUE',
        color='Cluster',
        color_discrete_map={i: cluster_colors[i] for i in range(len(cluster_labels))},
        title=f"Country Clusters for {measure} ({year})",
        labels={'OBS_VALUE': measure, 'Cluster': 'Group', 'Country':''},
        category_orders={"Cluster": sorted(cluster_df['Cluster'].unique())},
        size=[12] * len(cluster_df)
    )
    
    # Add horizontal lines for cluster centers
    for cluster, label in cluster_labels.items():
        cluster_mean = cluster_df[cluster_df['Cluster'] == cluster]['OBS_VALUE'].mean()
        fig.add_shape(
            type="line",
            x0=-0.5,
            y0=cluster_mean,
            x1=len(cluster_df['Country'].unique()) - 0.5,
            y1=cluster_mean,
            line=dict(
                color=cluster_colors[cluster],
                width=2,
                dash="dash",
            )
        )
        
        # Add annotation for cluster label
        fig.add_annotation(
            x=len(cluster_df['Country'].unique()) - 0.5,
            y=cluster_mean,
            text=label,
            showarrow=False,
            xshift=50,
            font=dict(
                color=cluster_colors[cluster],
                size=12
            ),
            bgcolor="rgba(255, 255, 255, 0.8)",
            bordercolor=cluster_colors[cluster],
            borderwidth=1,
            borderpad=4
        )
    
    # Improve styling
    fig.update_layout(
        plot_bgcolor=COLORS["card-bg"],
        paper_bgcolor=COLORS["card-bg"],
        font_family="'Segoe UI', Roboto, 'Helvetica Neue', Arial, sans-serif",
        font_color=COLORS["text-primary"],
        showlegend=False,  # hide the legend
        coloraxis_showscale=False,  # hide the continuous color bar
        margin=dict(l=40, r=120, t=60, b=120),
        title={
            'text': f"<b>Country Performance Groups</b> for {measure} ({year})",
            'y':0.95,
            'x':0.5,
            'xanchor': 'center',
            'yanchor': 'top',
            'font': dict(size=18)
        }
    )
    
    # Better x-axis with rotated labels
    fig.update_xaxes(
        tickangle=45,
        tickmode='array',
        tickvals=list(range(len(cluster_df['Country'].unique()))),
        ticktext=cluster_df['Country'].unique()
    )
    
    # Add a storytelling annotation
    fig.add_annotation(
        x=0.5,
        y=-0.4,
        xref="paper",
        yref="paper",
        text=f"<i>Cluster Insight:</i> Countries are grouped by similar performance patterns",
        showarrow=False,
        font=dict(
            family="'Segoe UI', italic",
            size=12,
            color=COLORS["secondary"]
        ),
        align="center",
        bgcolor=COLORS["secondary-light"],
        bordercolor=COLORS["secondary"],
        borderwidth=1,
        borderpad=4
    )

Avacsiglo21 · May 9, 2025, 1:39pm

Hi Ester,

I like your application. You chose an excellent topic (Environmental Quality). If you add the other indicators (‘Exposed to air pollution’, ‘Exposure to extreme temperature’), it would make for a very complete story.

Ester · May 9, 2025, 1:49pm

Thank you, unfortunately I don’t have time for it today. See you next time. Your work is better with more tabs!

Avacsiglo21 · May 9, 2025, 1:59pm

Ester, it works perfectly as it is and achieves its objective. More content does not make an app better; they are two different approaches, and that’s the point. Good job.
