Figure Friday 2025 - week 17

Avacsiglo21 · April 30, 2025, 12:47pm

Wow! First of all, a truly BIG THANK YOU for so many compliments, especially coming from people who do such magnificent work and from whom I’ve learned so incredibly much. Let me tell you, I believe the growth you’re seeing is thanks to:

The focus on working week after week, creating a dashboard or web app with the datasets that Adam provides, and receiving feedback from such an excellent community. That nurtures and enriches; practicing is very important.
The tools we have available today, which help a lot in doing a better job as long as you’re clear about what you want to achieve.

All of those recommendations will definitely be incorporated into the dashboard. I’m not sure if it will be this week, but I’m 100% in agreement with everything you’ve suggested.

Thank you again.

Best Wishes

marieanne · April 30, 2025, 1:05pm

Sometimes when switching views, even when I explicitly say that all markers should be red, all the data are correct except the marker color. And when I switch again, it resets. There is a callback that explicitly clears everything when the view changes, but time and again it “just does something” with the colors. After debugging with paid AI for 2 hours, I’m done. Somehow the word “caching” comes up, but I have no idea why.

The idea was:

See if the net population could increase due to migration.
See where migrants come from. For the Netherlands, the 11% share of low-income immigrants is interesting, because some people seem to forget that the housing problem could very well be driven largely by the 90% who are not low income.
Maybe see interesting patterns, e.g. emigration from India goes mostly to “high and upper-middle income countries”, while immigration comes more from “low income” ones. Stuff like that.

If the colours are correct you can recognize something happening (or not) better, otherwise it’s a mess.

adamschroeder · April 30, 2025, 1:36pm

nice work, @marieanne . Is it common for the Netherlands to see close to 2 million net migration every year?

marieanne · April 30, 2025, 2:06pm

No, I was so obsessed with colors… Anyway, I forgot about this number, even though it’s very, very large. I compared some numbers from the dataset for the NL with this information from our statistical office: Lagere bevolkingsgroei in 2024 (Lower population growth in 2024) | CBS. I also used @Avacsiglo21’s dashboard to check that we more or less agree on the maths and on which rows to select (I think we do). After that comparison, I am not sure what I’m looking at in our dataset.

It could be cumulative since someone started counting, our ministry of Health:

Of the 17.9 million inhabitants on 1 January 2024, 2.9 million were born abroad. They came to the Netherlands as migrants.

It’s cumulative. A typical case of “read the background information”.
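
The quoted figures can be checked with one line of arithmetic. This is a minimal sketch that uses only the two numbers from the quote above; the variable names are illustrative.

```python
# Figures quoted above for 1 January 2024, in millions of people.
inhabitants = 17.9
born_abroad = 2.9

# A migrant stock counts everyone living in the country who was born abroad,
# regardless of the year they arrived, so it accumulates over decades.
share_born_abroad = born_abroad / inhabitants
print(f"{share_born_abroad:.1%} of inhabitants were born abroad")
```

A share of roughly one in six is plausible as a stock built up over many years, not as a single year of net migration.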

Avacsiglo21 · April 30, 2025, 2:24pm

I think this is the total migration registered up to 2024. For example, here in Brazil the number of Venezuelans is 570,000, and this is the total up to 2024, not per year.

Avacsiglo21 · April 30, 2025, 2:30pm

One of your recommendations is done. This one is the easiest; I just modified this code snippet:

dbc.Row([
    dbc.Col([
        html.Div([
            html.H1("🌎 Global Migration Patterns 2024",
                    className="my-4",
                    style={"color": PRIMARY_COLOR, "fontWeight": "bold"}),
            html.P([
                "Exploring origins of migrants towards ",
                html.Span(id="selected-country-subtitle",
                          style={"color": PRIMARY_COLOR, "fontWeight": "bold"}),
            ], className="lead", style={"color": SECONDARY_COLOR}),
            dbc.Button("Select Migrants Destination",
                       id="open-modal-btn",
                       color="primary",
                       className="mt-3 mb-4")
        ])
    ], width=12)
], className="mb-2")

and added this callback function:

@app.callback(
    Output("selected-country-subtitle", "children"),
    [Input("selected-country", "data")]
)
def update_subtitle(selected_country):
    if not selected_country:
        return "selected destination countries"
    return selected_country

Ester · April 30, 2025, 2:37pm

This dataset is very exciting, thank you @adamschroeder.

Avacsiglo21 · April 30, 2025, 2:49pm

Marieanne,

Take the whole error message and give it to the paid AI; this may work if you give it the entire picture.

A “caching issue” refers to a problem caused by the way information is temporarily stored in the cache and reused from it.

Hopefully you can solve this issue; I would like to play around with your app.

Ester · April 30, 2025, 5:06pm

I tried to change the contrast a little with this code:

vmin = np.percentile(vals, 5)
vmax = np.percentile(vals, 95)
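
To make the effect of the percentile limits concrete, here is a minimal sketch. The `vals` array is made-up illustration data, not values from the migration dataset, and the post does not show which plotting call receives `vmin` and `vmax`.

```python
import numpy as np

# Hypothetical values with one extreme outlier; not taken from the dataset.
vals = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 1000], dtype=float)

# Limit the color range to the 5th-95th percentiles so the outlier
# does not compress every other value into one end of the color scale.
vmin = np.percentile(vals, 5)
vmax = np.percentile(vals, 95)

# Values outside [vmin, vmax] saturate at the ends of the scale.
clipped = np.clip(vals, vmin, vmax)
```

In Plotly, limits like these would typically be passed as `range_color=(vmin, vmax)` to a Plotly Express choropleth, or as `zmin`/`zmax` on a graph-objects trace.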

marieanne · April 30, 2025, 7:58pm

I’ve put the version online which is a combination of the basic one, the adjustment @li.nguyen proposed as an alternative to the dash_daq (problems getting it to work on py.cafe) and the adjustments I made later. See original post.
I already spent quite some time with AI on this problem; it completely restructured the callbacks (which I should do again, but slightly differently) and cleaned up the part that filled the dots & lines, specifically on the client side.
Plus debug info, and still I was looking at wrongly coloured dots while the debug info was correct, including the color to use.
So… I’m going to work on this one, but not tomorrow or Friday.

ranknovice · April 30, 2025, 9:04pm

I’m feeling a bit sheepish looking at everyone’s Dash apps and then submitting a simple Plotly figure. I took this week’s challenge as an opportunity to learn how to construct Sankey diagrams in Plotly. After crunching through some data clean-up, I decided that looking at continent-to-continent flows would be the simplest. So, here it is with the flows normalized by the continent’s population in 2025. I’ve used the hovertemplate feature to give information about the links.

I might bash together a Dash app that allows some options in how the plot is made and that burps out some descriptive statistics.

Nice work everyone. If I’m being honest with myself, I’ve put about 6 hours into this. About half of that in cleaning up the data and another half learning to make a Sankey diagram. I’m curious to know if my slowness at this is because I haven’t been using an LLM to help. How much do you rely on AI and what do you use it for? Framing out the dash app? Developing figures? All of the above?

Avacsiglo21 · April 30, 2025, 10:55pm

Hi ranknovice, just 6 hours is a good start. I’ve spent three times that amount of time, or even more. In my case, I do data cleaning and exploration without the help of AI, using JupyterLab. I rely on the free version of Claude AI for the more complex Dash tasks and tricky coding, and for learning many topics, for instance machine learning. The way you are doing it is the right one in my opinion, because you have to read and understand how the chart parameters work.

ranknovice · May 1, 2025, 1:44am

Okay, I put the Sankey into a Dash app and added some controls for whether the flows are shown as raw counts, per 100k people in the continent of origin, or per 100k people in the continent of destination. There is also a check box to add or remove the intra-continental migration links. Finally, there is a little explanation below the plot about what these choices mean.

#!/usr/bin/env python
# coding: utf-8
import pandas as pd
import numpy as np
import plotly.express as px
import plotly.graph_objects as go
from dash import Dash, html, dcc, callback, Output, Input
import dash_bootstrap_components as dbc

def hex_to_rgba(hex_color,alpha):
    hex_color = hex_color.lstrip('#')
    r = int(hex_color[0:2], 16)
    g = int(hex_color[2:4], 16)
    b = int(hex_color[4:6], 16)
    return f'rgba({r}, {g}, {b}, {alpha})'

df = pd.read_csv('un-migration-2024.csv')

#Have asterisks in some of the country data
origins = list(df['Origin'].values)
destinations = list(df['Destination'].values)
df['Origin'] = [x.strip('*') for x in origins]
df['Destination'] = [x.strip('*') for x in destinations]

#Get the countries data
countriesdf = pd.read_csv('https://raw.githubusercontent.com/gavinr/world-countries-centroids/refs/heads/master/dist/countries.csv')
#Append data for Hong Kong
new_row = pd.DataFrame(
    {
        "longitude": [114.16993426713121], 
        "latitude": [22.318764728315433],
        "COUNTRY": ['Hong Kong'],
        'ISO': ['HK'],
        'COUNTRYAFF': ['Hong Kong'],
        'AFF_ISO': ['HK'],
    }
)
countriesdf = pd.concat([countriesdf, new_row], ignore_index=True)

continents = [
    'AFRICA',
    'ASIA',
    'EUROPE',
    'LATIN AMERICA AND THE CARIBBEAN',
    'NORTHERN AMERICA',
    'OCEANIA',
]

noncountries = [
    'Low-and-Lower-middle-income countries',
    'Low-and-middle-income countries',
    'Low-income countries',
    'Lower-middle-income countries',
    'Eastern Africa',
    'Eastern Asia',
    'Eastern Europe',
    'Eastern and South-Eastern Asia',
    'Europe and Northern',
    'Northern Africa',
    'Northern Africa and Western Asia',
    'Northern Europe',
    'Northern Mariana Islands',
    'Oceania (excluding Australia and New Zealand)',
    'Upper-middle-income countries',
    'Western Africa',
    'Western Asia',
    'Western Europe',
    'Western Sahara',
    'Land-locked Developing Countries (LLDC)',
    'Latin America and the Caribbean',
    'Least developed countries',
    'Less developed regions',
    'Less developed regions, excluding China',
    'Less developed regions, excluding least developed countries',
    'Eastern Africa',
    'Eastern Asia',
    'Eastern Europe',
    'Eastern and South-Eastern Asia',
    'Small Island Developing States (SIDS)',
    'Solomon Islands',
    'Somalia',
    'South Africa',
    'South America',
    'South-Eastern Asia',
    'Southern Africa',
    'Southern Asia',
    'Southern Europe',
]

contiseries_from = pd.Series(continents)
contiseries_to = pd.Series(continents)
contiseries_to.index = (6 + contiseries_from.index).to_list()

#We have duplicate entries in the table. Drop these, keeping the first row per Destination/Origin pair.
df = df.drop_duplicates(subset=['Destination','Origin'])

#Lets extract out the inter-continent data and the non-country data as separate tables
continentdf = df.loc[(df['Origin'].isin(continents)),:]
continentdf = continentdf.loc[continentdf['Destination'].isin(continents),:]
continentdf = continentdf.sort_values(['Destination','Origin'])
continentdf = continentdf.reset_index(drop=True)
#continentdf
#contiseries[contiseries == continentdf['Destination']].index.to_list()

#Put in the correct location index and destination index for each origin/destination combo
#Can't figure out how to do this without a for loop LAME!
continentdf['Destidx'] = None
continentdf['Origidx'] = None
for continent in contiseries_to:
    contidx_to = contiseries_to[contiseries_to == continent].index.to_list()[0]
    contidx_from = contiseries_from[contiseries_from == continent].index.to_list()[0]
    
    continentdf.loc[(continentdf['Destination']==continent),'Destidx'] = contidx_to
    continentdf.loc[(continentdf['Origin']==continent),'Origidx'] = contidx_from

#Let's make a column normalized by the population of each continent. We will normalize by the Origin and Destination
#in separate columns so the data can be selected in Dash
continentdf['origin_pop_normalized'] = 0.
continentdf['destination_pop_normalized'] = 0.

#Population from Wikipedia and Worldometer
popdict = {
    
    'ASIA': 4835320061,
    'AFRICA': 1549867585,
    'EUROPE': 742556239,
    'LATIN AMERICA AND THE CARIBBEAN': 667888552,
    'NORTHERN AMERICA': 387528403,
    'OCEANIA': 46609602, 
}

for continent in popdict.keys():
    selector = continentdf.loc[:,'Origin'] == continent
    continentdf.loc[selector,'origin_pop_normalized'] = (continentdf.loc[selector,'2024']/popdict[continent])*100000
    selector = continentdf.loc[:,'Destination'] == continent
    continentdf.loc[selector,'destination_pop_normalized'] = (continentdf.loc[selector,'2024']/popdict[continent])*100000

# Define color scheme - using a cohesive color palette
COLORS = {
    'primary': '#3E92CC',      # Blue
    'secondary': '#2A628F',    # Darker Blue
    'success': '#13A76C',      # Green
    'warning': '#FF934F',      # Orange
    'danger': '#DB5461',       # Red
    'info': '#5BC0BE',         # Teal
    'light': '#F2F4F8',        # Light Gray
    'dark': '#292F36',         # Dark Gray
    'bg': '#F2F4F8',           # Light background
    'text': '#292F36',         # Text color
}


# Initialize the Dash app with Bootstrap theme
app = Dash(__name__, 
           external_stylesheets=[dbc.themes.FLATLY],
           meta_tags=[{'name': 'viewport', 'content': 'width=device-width, initial-scale=1'}])

#Make the radio items
radios = dbc.RadioItems(
    id='data-type',
    options=[
        {'label': 'Raw', 'value': 'raw'},
        {'label': 'Per 100k Destination', 'value': 'dest'},
        {'label': 'Per 100k Origin', 'value': 'orig'},
    ],
    value='orig',
    className="mb-4",
    inputClassName="me-2",
    labelClassName="ms-1",
    inline=True
)

checklist = dbc.Checklist(
    id='intracont-filter',
    options=[
        {'label': 'Yes', 'value': 'yes'},
    ],
    value=[],
    inline=True,
    className="mb-4",
    inputClassName="me-2",
    labelClassName="ms-1 me-3"
)

# App layout with Bootstrap components
app.layout = dbc.Container([
    dbc.Row([
        dbc.Col([
            dbc.Card([
                dbc.CardHeader("Control Panel", 
                              style={'background-color': COLORS['primary'], 
                                     'color': 'white', 
                                     'font-weight': 'bold'}),
                dbc.CardBody([
                    html.H5("Analyze population by:", className="card-title"),
                    radios,
                    html.H5("Show Intracontinental?", className="card-title"),
                    checklist,
                ]),
            ], className="shadow-sm mb-4"),
        ], width=12, lg=4, className="mb-4"),    
        dbc.Col(id='plot-card', width=12, lg=8, className="mb-4"),
    ]),
])

@app.callback(
    Output('plot-card', 'children'),
    Input('data-type', 'value'),
    Input('intracont-filter', 'value'),
)
def make_continent_sankey(data_type,intracont):
    #Work on a copy so the callback never mutates the module-level dataframe
    continentdfuse = continentdf.copy()
    #Have to reorganize the colors to ensure that the links have the same color in either direction. 
    colorindexes = [1,2,3,4,5,6,2,7,8,9,10,11,3,8,12,13,14,15,4,9,13,16,17,18,5,10,14,17,19,20,6,11,15,18,20,21]
    continentdfuse['colors'] = [hex_to_rgba(px.colors.qualitative.Light24[0:21][(idx-1)],0.6) for idx in colorindexes]
    
    if not intracont:
        #Drop the rows where Origin and Destination match (intra-continental migration).
        continentdfuse = continentdfuse.loc[continentdfuse['Origin'] != continentdfuse['Destination'],:]
    
    #Setup the basic dictionaries for the nodes and links
    nodeuse = dict(
        pad = 15,
        thickness = 10,
        line = dict(color = "black", width = 0.6),
        label = pd.concat([contiseries_from,contiseries_to]),#continentdfuse['Origin'].unique().append(continentdfuse['Origin'].unique()),
        #One entry per node: 6 departure nodes followed by 6 arrival nodes
        customdata = ['departures']*6 + ['arrivals']*6,
        hovertemplate = '', #'%{label} has %{value} %{customdata}<br>per 100k people.<extra></extra>',
        color = "blue"
    )
    linkuse = dict(
        source = continentdfuse['Origidx'], # indices correspond to labels, eg A1, A2, A1, B1, ...
        target = continentdfuse['Destidx'],
        value = continentdfuse['2024'],
        customdata = continentdfuse['2024']/1000000,
        hovertemplate = '',#'%{customdata:.2f} Million people<br>'+
                        #'migrated from %{source.label}<br>to %{target.label}<br>'+
                        #'<extra></extra>',
        color = continentdfuse['colors'],
    )

    if data_type == 'raw':
        nodeuse['hovertemplate'] = '%{label} has %{value} %{customdata}<extra></extra>'
        linkuse['value'] = continentdfuse['2024']
        linkuse['hovertemplate'] = '%{customdata:.2f} Million people<br>migrated from %{source.label}<br>to %{target.label}<extra></extra>'
        cardtitle = "Population migration among the continents"
        markdown_note = dcc.Markdown(
            '''
            This plot shows the population flows between continents using raw population numbers. 
            This scales the nodes by the total number of departing people and arriving people on the
            respective sides of the chart. Asia has the largest number of emigrants and immigrants. 
            
            Continents of departure are on the left. Continent of arrival is on the right. 
            '''
        )

    elif data_type == 'dest':
        nodeuse['hovertemplate'] = '%{label} has %{value} %{customdata} per 100k people<br>in the destination continent.<extra></extra>'
        linkuse['value'] = (continentdfuse['destination_pop_normalized'])
        linkuse['hovertemplate'] = '%{customdata:.2f} Million people<br>migrated from %{source.label}<br>to %{target.label}<br>This is %{value} per 100k people<br>in the destination continent<extra></extra>'
        cardtitle = "Population migration among the continents relative to continent of arrival"
        markdown_note = dcc.Markdown(
            '''
            This plot shows the population flows between continents scaled as number 
            per 100k people in the continent of arrival. This shows the impact that a given continent
            has on the continents that its people migrate to relative to their population. 
            
            Continents of departure are on the left. Continent of arrival is on the right. 
            '''
        )
    else:
        nodeuse['hovertemplate'] = '%{label} has %{value} %{customdata} per 100k people<br>in the continent of origin.<extra></extra>'
        linkuse['value'] = (continentdfuse['origin_pop_normalized'])
        linkuse['hovertemplate'] = '%{customdata:.2f} Million people<br>migrated from %{source.label}<br>to %{target.label}<br>This is %{value} per 100k people<br>in the continent of origin<extra></extra>'
        cardtitle = "Population migration among the continents relative to continent of origin"
        markdown_note = dcc.Markdown(
            '''
            This plot shows the population flows between continents scaled as number 
            per 100k people in the continent of origin. This shows the impact that emigrants
            have on the continent they are departing from.
            
            Continents of departure are on the left. Continent of arrival is on the right. 
            '''
        )
    fig = go.Figure(data=[go.Sankey(
        node = nodeuse,
        link = linkuse
    )])

    fig.update_layout(
        hovermode = 'x',
        font=dict(size = 14, color = 'white'),
        plot_bgcolor='white',
        paper_bgcolor='white',
        margin={"r":0,"t":0,"l":0,"b":5},
    )

    return(
        dbc.Card([
            dbc.CardHeader(cardtitle, 
                          style={'background-color': COLORS['primary'], 
                                 'color': 'white', 
                                 'font-weight': 'bold'}),
            dbc.CardBody([
                dcc.Graph(figure=fig),
                markdown_note
            ]),
            dbc.CardFooter([
                'Source: ',dbc.CardLink("UN Population Division", href='https://www.un.org/development/desa/pd/content/international-migrant-stock'),
            ], className="text-center m-0")
        ], className="shadow-sm mb-4")
    )

if __name__ == '__main__':
    app.run(debug=True,port=8055)
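
Two hard-coded parts of the script above, the loop that assigns node indices and the `colorindexes` list, can also be derived from the `continents` list. This is a minimal sketch, assuming the same six continents and rows sorted by Destination and then Origin. `node_index`, `demo` and `symmetric_pair_indexes` are hypothetical names introduced here for illustration.

```python
import pandas as pd

continents = [
    'AFRICA', 'ASIA', 'EUROPE',
    'LATIN AMERICA AND THE CARIBBEAN', 'NORTHERN AMERICA', 'OCEANIA',
]

# Position of each continent in the node list (0-5 on the departure side).
node_index = {name: i for i, name in enumerate(continents)}

# Tiny stand-in for continentdf; the real table has one row per continent pair.
demo = pd.DataFrame({
    'Origin': ['ASIA', 'EUROPE'],
    'Destination': ['EUROPE', 'ASIA'],
})

# Series.map replaces the for loop: look every name up in the dictionary.
demo['Origidx'] = demo['Origin'].map(node_index)
# Arrival nodes come after the departure nodes, so shift them by len(continents).
demo['Destidx'] = demo['Destination'].map(node_index) + len(continents)


def symmetric_pair_indexes(n):
    """Number each unordered pair (i, j) once, then list the numbers row by row.

    Both directions of a flow get the same number, so A->B and B->A
    end up with the same link color.
    """
    pair_id = {}
    for i in range(n):
        for j in range(i, n):
            pair_id[(i, j)] = len(pair_id) + 1
    return [pair_id[(min(i, j), max(i, j))] for i in range(n) for j in range(n)]


colorindexes = symmetric_pair_indexes(len(continents))
```

For the six continents this reproduces the hand-written `colorindexes` list, provided all 36 origin/destination pairs are present and sorted in continent order.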

marieanne · May 1, 2025, 7:16am

Hi @ranknovice, you are not doing yourself justice by talking about your slowness. The time you (can or need to) spend depends on a lot of factors, experience being only one of them. I spent approx. 25 hours, probably more than half of them “debugging” the marker color problem. If my knowledge of geography were very bad and I never read newspapers, I would never have seen the problem and would have saved a lot of time.

The rest:

first idea, the circular Sankey diagram: too difficult and/or the lib did not want to install, plus a “then what” notion.
second, “your” Sankey, but idea no. 3 had already come up.
third, the economic characteristics of origin/destination, which is what it became this week.

AI Free usage:

finding world bank source and downloadlink
finding countries & coordinates and downloadlink
pimping up the bar chart, I create the basic one, see what I want different and ask ChatGPT for the updated code.
generating basic code which I could do myself but always have to google/ask for something “syntax”
generating the basic code for inserting the markers & lines (I know, I know )
first round of debugging markers & lines, after I had tried it myself and could not find a reason why, when the final instruction that goes into the map is “draw 3 orange circles”, I am looking at one orange and two blue ones.

AI paid (only debugging same problem)

ChatGPT did not come up with something useful, but was happy as ever.
Claude came up with splitting the creation of lines and markers, an extra client side callback to make absolutely sure the instruction starts empty in the callback although debugging already proved that and some extra debugging code. The end result in the map was better but not perfect.

This morning I picked a backup and redid the split of instructions that the paid AI had proposed, and that makes the end result much better. Not perfect. Somehow splitting had not come to my mind, so…

Lot of hours (too many) for this one.

Like your Sankeys!

Ester · May 1, 2025, 8:09am

I liked trying out the dark/light mode.

ThomasD21M · May 1, 2025, 2:26pm

Hearing that you also use AI helps me with insecurity about using it. I’m continuing to learn the basics but relying on AI is like a gravity well that just keeps pulling me in.

Avacsiglo21 · May 1, 2025, 2:49pm

Hi Thomas,

Absolutely, but use it understanding what it does and why, so you can modify the things it often does wrong or unnecessarily. I don’t know your Python level, but it should be intermediate/advanced so you can get the most out of the code; that’s my opinion, of course. The key for me is to try to master the Python/Plotly/Dash fundamentals. On the other hand, I have some understanding of web page creation (HTML, Joomla, CSS), and that helps me understand more quickly.

Hopefully this encourages you or anyone else to use it wisely.

Avacsiglo21 · May 1, 2025, 2:52pm

I prefer the light background

Avacsiglo21 · May 1, 2025, 3:10pm

Hi Ester,

Your web app is really awesome! I love the variety of map options – the different types and colors are great, and the immigrant/migrant data and Wikipedia info are super useful. But what I like the most is how efficient and short the code is. And you know me, I always prefer the light theme!

adamschroeder · May 1, 2025, 3:21pm

hi @ranknovice
from my experience, AI can definitely save me some time, but the tricky part is knowing what questions to ask and how to ask them. Like @ThomasD21M said, AI can be a gravity well in which you can easily get lost.
