Figure Friday 2024 - week 48

adamschroeder · November 29, 2024, 8:10pm

Join the Figure Friday session on December 6, at noon Eastern Time, to showcase your creation and receive feedback from the community.

Did you know that in 2018 98% of the population in Bahrain was using the internet, while in Brazil it was 70% and in Bolivia it was 44%?

In this week’s Figure Friday we’ll look at the World Bank’s data on Individuals using the Internet (as a % of the population). It’s important to note that internet users are defined as individuals who have used the Internet (from any location) in the last 3 months. (The Internet can be used via a computer, mobile phone, personal digital assistant, games machine, digital TV, etc.)

Things to consider:

Can you improve the sample figure below (line chart docs)?
Would a different figure tell the data story better?
Can you create a Dash app instead?

Sample figure:

Code for sample figure:

import plotly.express as px
import pandas as pd

df = pd.read_csv("https://raw.githubusercontent.com/plotly/Figure-Friday/refs/heads/main/2024/week-48/API_IT.NET.USER.ZS_DS2_en_csv_v2_2160.csv")
df_filtered = df[df["Country Name"].isin(["Angola", "Albania", "Andorra", "Argentina"])]

melted_data = pd.melt(
    df_filtered,
    id_vars=['Country Name'],
    var_name='Year',
    value_name='Quantity'
)

melted_data['Year'] = pd.to_numeric(melted_data['Year'], errors='coerce')
# Drop rows where 'Year' is NaN (non-year columns) or 'Quantity' is NaN
melted_data = melted_data.dropna(subset=['Year', 'Quantity'])
print(melted_data)

fig = px.line(melted_data, y="Quantity", x="Year", color="Country Name", markers=True)
fig.update_layout(hovermode='x unified')
fig.show()

Participation Instructions:

Create - use the weekly data set to build your own Plotly visualization or Dash app. Or, enhance the sample figure provided in this post, using Plotly or Dash.
Submit - post your creation to LinkedIn or Twitter with the hashtags #FigureFriday and #plotly by midnight Thursday, your time zone. Please also submit your visualization as a new post in this thread.
Celebrate - join the Figure Friday sessions to showcase your creation and receive feedback from the community.

If you prefer to collaborate with others on Discord, join the Plotly Discord channel.

Data Source:

Thank you to the World Bank for the data.

JuanG · December 3, 2024, 2:11pm

Hi, I was looking for a metric to rank the countries by their internet growth, but I haven’t found one yet. Any ideas?
In the meantime, the app just picks a random year from 1992 to 2023 and displays the top 5 and bottom 5 countries. Still working on it…

Code

"""Just importing modules"""
from dash import Dash, dcc, html, Input, Output
import dash_bootstrap_components as dbc
import plotly.express as px
import pandas as pd
import numpy as np
import random

np.set_printoptions(suppress=True)

# df_arg = pd.read_csv(r"C:\Users\Juan Aguirre\Downloads\Book2.csv")
df = pd.read_csv("https://raw.githubusercontent.com/plotly/Figure-Friday/refs/heads/main/2024/week-48/API_IT.NET.USER.ZS_DS2_en_csv_v2_2160.csv")

exclude_cols = ['Country Code', 'Indicator Name', 'Indicator Code']
# df[df.columns.difference(exclude_cols, sort=False)]

df_melted = pd.melt(
    df[df.columns.difference(exclude_cols, sort=False)],
    id_vars=['Country Name'],
    var_name='Year',
    value_name='Quantity'
)

df_melted['Year'] = pd.to_numeric(df_melted['Year'], errors='coerce')
# Drop rows where 'Year' is NaN (non-year columns) or 'Quantity' is NaN
df_melted = df_melted.dropna(subset=['Year', 'Quantity'])

df_country_year_idxd = (df_melted.set_index(keys=['Country Name', 'Year']))
df_year_idxd = (df_melted.set_index(keys=['Year']))
# Initialize the Dash app with Bootstrap theme
app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])

# Layout definition with Bootstrap components
button = html.Div(
    [
        dbc.Button("Pick Random Year", id="submit-button", className="btn btn-primary me-2", n_clicks=0),
        html.Span(id="selected-year-output", style={"verticalAlign": "middle"}),
    ]
)

app.layout = dbc.Container(
    fluid=True,
    children=[
            dbc.Row([
                dbc.Col(
                    html.H3('Top and Bottom 5 countries by Internet Growth',  className="text-start text-primary mb-2"),
                    #width={'size': 8, 'offset': 2}
                ),
            ]),
            dbc.Row([
                dbc.Col(
                    button,
                    # width={"size": 2, "offset": 5}
                ),
            ]),
            dbc.Row([
                dbc.Col(dcc.Graph(id="top-5-graph", figure={}), width=6),
                dbc.Col(dcc.Graph(id="bottom-5-graph", figure={}), width=6)
            ]),
    ]
)

# Callback to handle random year selection and graph updates
@app.callback(
    [Output("selected-year-output", "children"),
     Output("top-5-graph", "figure"),
     Output("bottom-5-graph", "figure")],
    [Input("submit-button", "n_clicks")]
)
def update_graphs(n_clicks):
    if n_clicks == 0:
        return "Click the button to pick a random year.", {}, {}

    # Randomly select a year between 1992 and 2023
    selected_year = random.randint(1992, 2023)

    # Filter data for the selected year
    filtered = df_year_idxd.loc[selected_year]#.nsmallest(5, columns='Quantity')

    # Get Top 5 countries by Quantity
    top_5 = filtered.nlargest(5, "Quantity")

    # Get Bottom 5 countries by Quantity
    bottom_5 = filtered.nsmallest(5, "Quantity")

    # Create Top 5 bar plot
    fig_top_5 = px.bar(top_5, y="Country Name", x="Quantity", title=f"Top 5 Countries in {selected_year}",
                       text_auto=True, color_discrete_sequence=["#FF7F0E"], orientation='h', template='plotly_white',
                       labels={'Country Name': '', 'Quantity': ''})

    # Create Bottom 5 bar plot
    fig_bottom_5 = px.bar(bottom_5, y="Country Name", x="Quantity", title=f"Bottom 5 Countries in {selected_year}",
                          text_auto=True, color_discrete_sequence=['#BCBD22'], orientation='h', template='plotly_white',
                          labels={'Country Name': '', 'Quantity': ''})

    return f"Selected Year: {selected_year}", fig_top_5, fig_bottom_5


if __name__ == "__main__":
    app.run(debug=True, jupyter_mode='external')

adamschroeder · December 3, 2024, 5:04pm

Oh, I like the random year button; it’s like an addictive game.

What do you mean by:

number to rank the countries according to its growth indicator

JuanG · December 3, 2024, 5:34pm

… I tried a few formulas over ‘Quantity’ to rank countries by growth in internet penetration, such as year-over-year change with pandas pct_change() or something more complex like CAGR (Compound Annual Growth Rate). The results weren’t as accurate as I expected, although they were somewhat reasonable, e.g. Cuba having the largest growth rate. Anyway, my initial idea was to rank the countries as accurately as possible and plot those metrics for comparison.
I guess I will stick with the year-over-year metric and go that way.
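For anyone comparing these candidate metrics, here is a minimal sketch, not the code used in the app above. It assumes long-format data with the same 'Country Name', 'Year' and 'Quantity' columns as the melted frame; the two countries and their values are made up for illustration.

```python
import pandas as pd

# Hypothetical long-format data (NOT World Bank values), same columns as the melted frame
data = pd.DataFrame({
    "Country Name": ["A"] * 4 + ["B"] * 4,
    "Year": [2000, 2001, 2002, 2003] * 2,
    "Quantity": [1.0, 2.0, 4.0, 8.0, 40.0, 44.0, 48.0, 52.0],
}).sort_values(["Country Name", "Year"])

# Year-over-year change: relative (%) and absolute (percentage points)
data["YoY %"] = data.groupby("Country Name")["Quantity"].pct_change() * 100
data["YoY points"] = data.groupby("Country Name")["Quantity"].diff()

# First and last observation per country, used for CAGR and total gain
first = data.groupby("Country Name")[["Year", "Quantity"]].first()
last = data.groupby("Country Name")[["Year", "Quantity"]].last()
n_years = last["Year"] - first["Year"]

summary = pd.DataFrame({
    # CAGR = (last / first) ** (1 / years) - 1
    "CAGR %": ((last["Quantity"] / first["Quantity"]) ** (1 / n_years) - 1) * 100,
    "Gain (points)": last["Quantity"] - first["Quantity"],
    "Max YoY points": data.groupby("Country Name")["YoY points"].max(),
}).sort_values("CAGR %", ascending=False)

print(summary)
```

Relative metrics such as CAGR and pct_change favor countries that start from a very low base (country A here), while the gain in percentage points favors country B. That may be why a ranking based on growth rates alone looks odd.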

Tiga · December 4, 2024, 12:30am

I love this pandas function!

lumars · December 4, 2024, 1:09am

Hello! I tried to find the top and bottom 5 countries based on average internet usage over the past decade. I attempted to add a ‘%’ symbol to the y-axis in the top 5 graph, but it kept messing up the values (would display them as x.0000%). As a result, I decided to leave the y-axis without the percentage symbol for now. If you have any tips on how to fix this, I’d appreciate your suggestions!

Code

import pandas as pd
import plotly.express as px
import plotly.io as pio

df = pd.read_csv("https://raw.githubusercontent.com/plotly/Figure-Friday/refs/heads/main/2024/week-48/API_IT.NET.USER.ZS_DS2_en_csv_v2_2160.csv")
df.head()

melted_data = pd.melt(
    df,
    id_vars=['Country Name'],
    var_name='Year',
    value_name='Quantity'
)

melted_data['Year'] = pd.to_numeric(melted_data['Year'], errors='coerce')
melted_data = melted_data.dropna(subset=['Year', 'Quantity'])
print(melted_data)

latest_year = melted_data['Year'].max()
df_last_10_years = melted_data[melted_data['Year'] > latest_year - 10]  # exactly 10 years, including latest_year

average_quantity = df_last_10_years.groupby('Country Name')['Quantity'].mean().reset_index()

average_quantity_sorted = average_quantity.sort_values('Quantity', ascending=False).head(5)

fig = px.bar(average_quantity_sorted, 
             x='Country Name', 
             y='Quantity', 
             title='Top 5 Countries with the Highest Average Internet Users (as % of population) in the Last Decade',
             labels={'Quantity': 'Average Percentage of Population', 'Country Name': 'Country'},
             color='Quantity')

fig.update_layout(showlegend=False)

fig.update_layout(xaxis_tickangle=-45, yaxis=dict(range=[average_quantity_sorted['Quantity'].min() - 0.5, 100]))

fig.show()

average_quantity_sorted = average_quantity.sort_values('Quantity', ascending=True).head(5)

average_quantity_sorted['Quantity'] = average_quantity_sorted['Quantity'] / 100

fig2 = px.bar(average_quantity_sorted, 
              x='Country Name', 
              y='Quantity', 
              title='Bottom 5 Countries with the Lowest Average Internet Users (as % of population) in the Last Decade',
              labels={'Quantity': 'Average Percentage of Population', 'Country Name': 'Country'},
              color='Quantity')

fig2.update_layout(showlegend=False)

fig2.update_layout(
    yaxis=dict(
        range=[0, average_quantity_sorted['Quantity'].max() + 0.05],
        tickmode='array',
        tickvals=[i / 100 for i in range(0, int(average_quantity_sorted['Quantity'].max() * 100) + 1, 1)],
        ticktext=[f'{i}%' for i in range(0, int(average_quantity_sorted['Quantity'].max() * 100) + 1, 1)],
        tickformat='%'
    )
)

fig2.update_layout(xaxis_tickangle=-45)

fig2.show()

JuanG · December 4, 2024, 11:14am

Hi, if you are trying to change the labels, take a look at this: Automatic Labelling with Plotly Express. The website has a lot of examples to try out…

Code

import plotly.express as px

df = px.data.iris()
fig = px.scatter(df, x="sepal_length", y="sepal_width", color="species",
                 labels={
                     "sepal_length": "Sepal Length (cm)",
                     "sepal_width": "Sepal Width (cm)",
                     "species": "Species of Iris"
                 },
                title="Manually Specified Labels")
fig.show()

adamschroeder · December 4, 2024, 1:36pm

Nice job melting and grouping the data, @lumars . Welcome to the Plotly community.

It’s really surprising to me that Uganda had only an average of 6% of the population using the internet in the past decade. Seems so low. Did anything else stand out to you in the data?

U-Danny · December 5, 2024, 2:12pm

Hi, I have added a choropleth map that shows the distribution of the latest internet usage records for each country, along with each country’s population. I also attempted to add a projection using regression, based on recommendations for targets of 50%, 75%, and 90%, according to the data distribution for each country. Although this projection may not be entirely accurate due to the inherent limitations of the regression models used or the lack of sufficient data, it is valuable for identifying general trends.
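The application code is only linked, so the following is a rough sketch of the general idea rather than the model actually used. It swaps in a plain least-squares line from numpy (np.polyfit) and uses made-up values for a single hypothetical country to estimate when each target would be reached.

```python
import numpy as np

# Hypothetical usage series (% of population) for one country; not real data
years = np.array([2015, 2016, 2017, 2018, 2019, 2020])
usage = np.array([20.0, 24.0, 28.0, 32.0, 36.0, 40.0])

# Fit usage = slope * (year - first_year) + intercept.
# Centering the years keeps the least-squares problem well conditioned.
offset = years - years[0]
slope, intercept = np.polyfit(offset, usage, deg=1)


def year_reaching(target: float) -> float:
    """Year at which the fitted straight line crosses `target` percent."""
    return years[0] + (target - intercept) / slope


for target in (50, 75, 90):
    print(f"{target}% reached around {year_reaching(target):.1f}")
```

A straight line ignores saturation near 100%, so projections for the higher targets are likely to be optimistic; that is one of the limitations mentioned above.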

Application code

adamschroeder · December 5, 2024, 3:37pm

You used scikit-learn, that’s neat!

I love working with that library. They have really good packages and built-in modules.
Do you think RandomForestRegressor would have been more accurate?

Also, I read that the adoption of the internet followed a power law. I wonder what your graphs would look like if their x and y axes were on a logarithmic scale.

Are you planning to join the Figure Friday session tomorrow?

Ester · December 5, 2024, 3:43pm

Hi, as a beginner, I tried the task and added the slicer and data labels for better readability:)

Code: GitHubGist

adamschroeder · December 5, 2024, 4:14pm

For a beginner, this is great work, @Ester . Way to go!

What version of Dash are you using?

Regarding the graph, I would remove the x-axis title, since people will assume the x-axis shows years without you telling them. I would also update the y-axis title to “percentage of users”.

AIMPED · December 5, 2024, 4:38pm

@adamschroeder, I love your always constructive feedback.

Ester · December 5, 2024, 5:14pm

Thank you. Yes, I tried to remove the titles, but I don’t know why it hasn’t worked so far :)
I use Dash version 2.18.2. I just saw that I have to update it.

adamschroeder · December 5, 2024, 6:40pm

Cool, 2.18.2 is fairly recent. You can shorten your code by removing __name__, so it’s just app = Dash(). And the layout can be a simple list instead of html.Div, so it would be app.layout = [..]

Mike_Purtell · December 5, 2024, 7:24pm

For week 48 (% internet usage by country), I looked at the countries with the biggest year-over-year increases in internet usage. I only included countries whose usage rose by at least 25 percentage points between two consecutive years.

Focusing on the year-over-year change, or derivative, makes it easier to compare when each country had its peak growth. Here is a screenshot and the code. I recommend running the code on your system and using the Plotly legend to select and deselect traces.

import plotly.express as px
import polars as pl
import polars.selectors as cs
#-------------------------------------------------------------------------------
#   Read data set, drop columns with uniform data
#-------------------------------------------------------------------------------
df = (
    pl.read_csv('API_IT.NET.USER.ZS_DS2_en_csv_v2_2160.csv')
    .drop('Indicator Name', 'Indicator Code')
)

#-------------------------------------------------------------------------------
#   Data has years as columns, countries as rows. Transpose, then cast YEAR as
#   integer, and percentage columns as type float. Drop all years prior to 1990
#-------------------------------------------------------------------------------
df_transpose = (
    df
    .drop('Country Code')
    .transpose(column_names='Country Name', include_header=True, header_name='YEAR')
    .with_columns(pl.col('YEAR').cast(pl.UInt16))
    .with_columns(cs.string().cast(pl.Float32))
    .filter(pl.col('YEAR') >= 1990)
)

#-------------------------------------------------------------------------------
#   First level filter drops countries where all values are null
#-------------------------------------------------------------------------------
df_transpose = (
    df_transpose[
        [s.name for s in df_transpose 
            if not (s.null_count() == df_transpose.height)
            ]
        ]
)

#-------------------------------------------------------------------------------
#   Next step is to calculate year over year % increase with polars .diff
#   Filter out countries where diff values are all null, and filter out 
#   countries where the max year over year increase is < 25 percentage points
#-------------------------------------------------------------------------------
country_list = df_transpose.select(cs.float()).columns
df_diff = df_transpose
for country in country_list:
    df_diff = (
        df_diff
        .rename({country: country+'_ORG'})
        .with_columns(pl.col(country+'_ORG').diff(n=1).alias(country))
    )
    drop_this_country = False
    if df_diff.select(pl.col(country)).is_empty():
        drop_this_country = True
    else:
        if df_diff[country].max() < 25:
            drop_this_country = True
    if drop_this_country:
        df_diff = df_diff.drop(country, country+'_ORG')                 
data_col_list =  sorted(df_diff.select(cs.float()).columns)
df_diff = df_diff.select(['YEAR'] + data_col_list)

#-------------------------------------------------------------------------------
#   Prepare dataframe for plotting, use px.line() with Year as X-Axis
#-------------------------------------------------------------------------------

rename_col_list =  sorted(df_diff.drop('YEAR').select(~cs.ends_with('_ORG')).columns)
df_plot = df_diff.select(['YEAR'] + rename_col_list)
plot_col_list = df_plot.select(pl.all().exclude('YEAR')).columns    
fig=px.line(
    df_plot,
    'YEAR',
    plot_col_list,
    template='simple_white',
    width=800, height=400,
    title=(
        'Internet Percent Change, Year over Year<br>' +
        '<sup>Only countries who reached 25% growth for any year</sup>'
    )
)
fig.update_layout(
    xaxis=dict(title=dict(text='YEAR')),
    yaxis=dict(title=dict(text='% CHANGE')
    ),
    legend=dict(title=dict(text='Country')),
)

fig.show()

Mike_Purtell · December 5, 2024, 7:39pm

I encourage everyone who is interested in Figure Friday to join the weekly Zoom call where we wrap up the week, and @adamschroeder introduces the data set for the following week. All participants may comment and explain their contribution and can ask questions or offer comments to others. This greatly enhances the learning and insights for the past week, and just like these virtual discussions, the people who participate are some of the nicest to be found anywhere. Hope to see you there, the Zoom link can be found at the top of this page.

JuanG · December 5, 2024, 7:44pm

I tried to find a metric that would let us rank internet growth by country, but I couldn’t, so any ideas are welcome. In the meantime I’ve added some metadata from the same repository, just to plot a choropleth map. There are other ‘groupers’ to look at as well. If you copy this code, take into account the path to the World Data Bank metadata file.
The code has a lot of commented code, just in case I get back to it…

Int_pen_by_year_w48

Code

"""Just importing modules"""
from dash import Dash, dcc, html, Input, Output, State
import dash_bootstrap_components as dbc
import dash_mantine_components as dmc
import plotly.express as px
import plotly.io as pio
import pandas as pd
import numpy as np
from pathlib import Path

pio.templates.default = 'plotly_white'

# np.set_printoptions(suppress=True)

# Original data
df = pd.read_csv("https://raw.githubusercontent.com/plotly/Figure-Friday/refs/heads/main/2024/week-48/API_IT.NET.USER.ZS_DS2_en_csv_v2_2160.csv")

# metadata (complete with URL from WorldDataBank) //
# take into account that the reading in pd.read_csv, the encoding is 'ansi'
# This is a short version with only those columns: ['Code', 'Long Name', 'Income Group', 'Region']
path_meta = Path(r'..\data\World_data_bank_meta_short.csv')
df_meta = pd.read_csv(path_meta, encoding='ansi')
df_meta.dropna(subset='Income Group', axis=0, inplace=True)

# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
## First Part to melt data and plot a line_chart by country
exclude_cols = ['Country Code', 'Indicator Name', 'Indicator Code']
# df[df.columns.difference(exclude_cols, sort=False)]

df_melted = (pd.melt(
    df[df.columns.difference(exclude_cols, sort=False)],
    id_vars=['Country Name'],
    var_name='Year',
    value_name='Quantity'
))

df_melted['Year'] = pd.to_numeric(df_melted['Year'], errors='coerce')
# Drop rows where 'Year' is NaN (non-year columns) or 'Quantity' is NaN
df_melted = df_melted.dropna(subset=['Year', 'Quantity'])


# +++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
## Second Part to merge with Income and Region Grouper
df_merged = (df
             .merge(df_meta, how='left', left_on='Country Code', right_on='Code')
             )
row_to_drop = df_merged[df_merged['Code'].isna()].index.values
df_merged.drop(index=row_to_drop, axis=0, inplace=True) # type: ignore

# Dropping columns with threshold 9 non-null
dff = (df_merged.dropna(axis=1, thresh=9))

## Melted with groupers
dff2 = (pd.melt(
    dff[dff.columns.difference(exclude_cols+['Long Name', 'Code'], sort=False)],
    id_vars=['Income Group', 'Region', 'Country Name'],
    var_name='Year',
    value_name='Quantity'
))
dff2['Year'] = pd.to_numeric(dff2['Year'], errors='coerce')
dff2.dropna(axis=0, inplace=True, subset='Quantity')

# ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# This dff3 to plot choroplet map with 'Code'
dff3 = (pd.melt(
    dff[dff.columns.difference(exclude_cols+['Income Group', 'Region', 'Long Name'], sort=False)],
    id_vars=['Code','Country Name'],
    var_name='Year',
    value_name='Quantity'
))
dff3['Year'] = pd.to_numeric(dff3['Year'], errors='coerce')
dff3.dropna(axis=0, inplace=True, subset='Quantity')


# ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
# App Layout
# Initialize Dash app with Bootstrap
app = Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])

button_1 = dbc.Button(
      "Submit",
      id="submit-button",
      color="primary",
      n_clicks=0,
      className="mt-3")

# Layout definition
app.layout = dbc.Container([
    dbc.Row([
        dbc.Col(html.H3("Global Internet Penetration Dashboard", className="text-start text-primary mb-4"), width=12)
    ]),
    dbc.Row([
        dbc.Col([
            html.Label("Select Countries:"),
            dmc.MultiSelect(
                id="country-dropdown",
                data=[{"label": country, "value": country} for country in df_melted['Country Name'].sort_values().unique()],
                maxSelectedValues=3,
                clearable=True,
                searchable=True,
                placeholder="Select up to 3 countries"
            )
        ], width=5),
        dbc.Col([
            html.Label("Select Year Range:"),
            dcc.RangeSlider(
                id="year-slider",
                min=1990,
                max=2023,
                step=1,
                marks = None,
                # tooltip={
                #     "always_visible": True,
                #     "template": "{value}",
                #     "placement":'bottom',
                # },
                # marks={year: str(year) for year in df_melted['Year'] if (year >= 1990) and  (year%3 == 0)},
                value=[1990, 2023]
            )
        ], width=5),
        dbc.Col(button_1, 
            className="text-center",
            width=2
        )
    ]),
    dbc.Row([
        dbc.Col(dcc.Graph(id="time-series-chart"), width=12)
    ]),
    # dbc.Row([
    #     dbc.Col([
    #         html.Label("Select Region:"),
    #         dmc.Select(
    #             id="region-dropdown",
    #             data=[{"label": region, "value": region} for region in dff2['Region'].unique()],
    #             placeholder="Select a region"
    #         )
    #     ], width=6)
    # ]),
    dbc.Row([
        dbc.Col(dcc.Graph(id="region-treemap", figure={}), width=12)
    ]),
    dbc.Row([
        dbc.Col(dcc.Graph(id="yoy-bar-chart", figure={}), width=6),
        dbc.Col(dcc.Graph(id="choropleth-map", figure={}), width=6)
    ]),
    dbc.Row([
        dbc.Col(html.Div(id="summary-statistics"), width=12)
    ])
], fluid=True)

# Callback to update the time-series chart only when submit button is clicked
@app.callback(
    Output("time-series-chart", "figure"),
    [Input("submit-button", "n_clicks")],
    [State("country-dropdown", "value"),
     State("year-slider", "value")]
)
def update_time_series(n_clicks, selected_countries, year_range):
    if not selected_countries or not year_range:
        return px.line(title="Please select countries and year range.")
    
    # print(year_range, selected_countries)
    filtered_df = df_melted[(df_melted["Year"] >= year_range[0]) & (df_melted["Year"] <= year_range[1])]
    filtered_df = filtered_df[filtered_df["Country Name"].isin(selected_countries)]
    
    fig = px.line(
        filtered_df,
        x="Year", y="Quantity",
        color="Country Name",
        title="Internet Penetration Over Time",
        markers=True,
        labels={"Quantity": "Internet Penetration (%)", 'Year':''}
    )
    fig.update_layout(hovermode='x')
    return fig

# Callback to update the treemap chart based on the selected year range
@app.callback(
    Output("region-treemap", "figure"),
    # Input("region-dropdown", "value"),
    Input("year-slider", "value"),
    # prevent_initial_call = True
)
def update_treemap(year_range):
    # Use the slider's range instead of a hard-coded 1995-2000 window
    filtered_df = dff2[(dff2["Year"] >= year_range[0]) & (dff2["Year"] <= year_range[1])]
    agg_df = filtered_df.groupby("Region").agg({"Quantity": "sum"}).reset_index()
    total_quantity = agg_df["Quantity"].sum()
    agg_df["Proportion"] = (agg_df["Quantity"] / total_quantity) * 100

    fig = px.treemap(
        agg_df,
        path=["Region"],
        values="Proportion",
        title="Proportion of Global Internet Penetration by Continent by year-range",
        color="Proportion",
        color_continuous_scale="sunset_r",
        labels={"Proportion": "Global Share (%)"}
    )
    return fig

# def update_treemap(selected_region, year_range):
#     # print(str(selected_region), year_range)
#     # print(type(selected_region))
#     filtered_df2 = dff2[(dff2["Year"] >= year_range[0]) & (dff2["Year"] <= year_range[1])]
#     if selected_region:
#         filtered_df2 = filtered_df2[filtered_df2["Region"] == selected_region].sort_values(by='Quantity')
#     fig = px.treemap(
#         filtered_df2,
#         path=["Region", "Country Name"],
#         values="Quantity",
#         title=f"Internet Penetration in {selected_region}",
#         color="Quantity",
#         color_continuous_scale='sunset_r'
#     )
#     return fig

# Callback for updating the YoY growth bar chart
@app.callback(
    Output("yoy-bar-chart", "figure"),
    Input("submit-button", "n_clicks"),
    State("country-dropdown", "value"),
    State("year-slider", "value"),
    prevent_initial_call=True,
)
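def update_yoy_chart(n_clicks, selected_countries, year_range):
    # Guard: isin() raises a TypeError when no country is selected (value is None)
    if not selected_countries:
        return px.bar(title="Please select countries and year range.")
    filtered_df = dff2[dff2["Country Name"].isin(selected_countries)]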
    filtered_df3 = filtered_df.copy()
    filtered_df3["YoY Growth"] = filtered_df3.groupby("Country Name")["Quantity"].pct_change() * 100
    yoy_filtered = filtered_df3[(filtered_df3["Year"] >= year_range[0]) & (filtered_df3["Year"] <= year_range[1])]
    fig = px.bar(
        yoy_filtered, x="Year", y="YoY Growth", color="Country Name",
        barmode='group',
        title="Year-over-Year Growth"
    )
    return fig

# Callback for updating the choropleth map
@app.callback(
    Output("choropleth-map", "figure"),
    Input("submit-button", "n_clicks"),
    State("year-slider", "value"),
    prevent_initial_call=True,
)
def update_choropleth(n_clicks, year_range):
    filtered_df = dff3[dff3["Year"] == year_range[1]]
    fig = px.choropleth(
        filtered_df, locations="Code", locationmode="ISO-3",
        color="Quantity", title=f"Internet Penetration by Region in Y{year_range[1]}"
    )
    return fig

# # Callback for updating the summary statistics
# @app.callback(
#     Output("summary-statistics", "children"),
#     Input("submit-button", "n_clicks"),
#     State("country-dropdown", "value"),
#     State("year-slider", "value"),
#     prevent_initial_call=True,
# )
# def update_summary_stats(n_clicks, selected_countries, year_range):
#     filtered_df = df[df["Country"].isin(selected_countries)]
#     latest_data = filtered_df[filtered_df["Year"] == year_range[1]]
#     stats = []
#     for _, row in latest_data.iterrows():
#         stats.append(html.P(f"{row['Country']}: {row['Quantity']:.2f}% penetration"))
#     return stats

if __name__ == "__main__":
    app.run(debug=True)#, port=8099, jupyter_mode='external')

Mike_Purtell · December 5, 2024, 7:53pm

Hi @lumars, it is interesting that you could put % on the y-axis for the bottom 5 but not for the top 5. I have plotted with units of % in the past and as I recall it takes values of 1.0 as 100% so you may have to scale your data first. I will take a look. If you are available, can you please join the Friday Zoom call? We will surely figure this out.

Mike_Purtell · December 5, 2024, 7:57pm

Hi @JuanG, if you can join the Zoom call tomorrow, we should be able to brainstorm a good metric to rank the countries by internet growth. Like all of the data sets, there are so many ways that the data can be grouped and transformed to give insights that are not obvious at first glance. I like your thoughts on this.
