Figure Friday 2025 - week 1

Here’s my submission:

Feel free to review it and share your thoughts :)

Python Code:

Happy Learning, Community!

The New York City Marathon, held on the first Sunday of November, is one of the World Marathon Majors. The race starts on Staten Island at the western end of the Verrazzano-Narrows Bridge. After crossing the bridge, the course winds through the four other boroughs of New York City: Brooklyn, Queens, the Bronx, and Manhattan.

The code produces three gender-based visualizations (M, W, X) showing the median pace of age groups ranging from the 10s (10-19) up to the 80s (80-89). Aggregation by median was chosen over mean to minimize the effect of outliers. A library named pycountry_convert was used to get the continent name from the country name.
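As a rough illustration of the two choices described above (decade buckets and median aggregation), here is a minimal standard-library sketch. The pace values are made up for illustration and are not taken from the dataset, and `age_group` is a hypothetical helper, not part of the code further down.

```python
from statistics import mean, median

def age_group(age: int) -> str:
    """Map an age to its decade label, e.g. 27 -> '20s' (hypothetical helper)."""
    return f"{age // 10 * 10}s"

# Made-up paces in minutes per mile; the last value is a deliberate outlier.
paces = [8.0, 8.5, 9.0, 9.5, 20.0]

print(age_group(27))   # 20s
print(mean(paces))     # 11.0
print(median(paces))   # 9.0
```

A single slow outlier pulls the mean up by two minutes per mile, while the median stays at the middle value.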

For genders M and W, the pace patterns are what I would expect: the fastest runners are in their 20s and 30s, and paces slow as age increases. The only exception is the data for Africa, where it looks like the pace improves with age. This may be related to the low number of African runners (224 out of 55,524, or 0.4%) and the disproportionate representation of African runners among the top performers (5 of the top 10 and 9 of the top 100 runners came from Africa).
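The over-representation mentioned above can be checked with a few lines of arithmetic. This is only a sketch using the counts quoted in this post; the variable names are my own.

```python
total_finishers = 55_524
african_runners = 224

# Share of African runners in the whole field
field_share = african_runners / total_finishers
print(f"{field_share:.1%}")  # 0.4%

# Share of African runners among the top finishers, as quoted in the post
top10_share = 5 / 10
top100_share = 9 / 100

# How many times larger the top-finisher share is than the field share
print(round(top10_share / field_share))   # 124
print(round(top100_share / field_share))  # 22
```

With so few runners per age group, a handful of elite athletes can move a group's median noticeably.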

For gender X, the pace patterns are somewhat erratic. This may be due to the low number of participants: 199 out of 55,524, or 0.2%.

Here is the code:

'''
Plotly Figure Friday - 2025 week 01 - NYC Marathon 
The dataset has results of runners who finished the race. This notebook 
produces plots of median pace by age group, for 6 of the world's 7 continents. 
'''
import plotly.express as px
import polars as pl
import pycountry_convert as pc

def country_to_continent(country_name):
    ''' Return the continent name for a country, or None if it cannot be mapped'''
    try:
        country_alpha2 = pc.country_name_to_country_alpha2(country_name)
        country_continent_code = pc.country_alpha2_to_continent_code(country_alpha2)
        return pc.convert_continent_code_to_continent_name(country_continent_code)
    except KeyError:
        # pycountry_convert raises KeyError for values it cannot map; returning
        # None lets the later drop_nulls('CONTINENT') discard those rows
        return None

df = (
    # pl.scan_csv produces a lazy frame
    pl.scan_csv('NYC Marathon Results, 2024 - Marathon Runner Results.csv')
    .select(pl.col(['age', 'gender', 'countryCode', 'pace']))
    .with_columns(
        CONTINENT = pl.col('countryCode')
            .map_elements(country_to_continent, return_dtype=pl.String)
    )
    .drop_nulls('CONTINENT')
    .with_columns(
        AGE_GROUP = (pl.col('age')/10).cast(pl.UInt16).cast(pl.String) +'0s'
    )
    .with_row_index()
    .with_columns(PACE_SPLIT = pl.col('pace').str.split(':'))
    .with_columns(
        PACE_MINUTES = (pl.col('PACE_SPLIT').list.first().cast(pl.Float32))
    )
    .with_columns(
        PACE_SECONDS = (
            pl.col('PACE_SPLIT').list.slice(1).list.first().cast(pl.Float32) 
            )
    )
    .with_columns(
        PACE_FLOAT = (pl.col('PACE_MINUTES') + pl.col('PACE_SECONDS')/60.0)
    )
    .with_columns(
        PACE_MEDIAN = 
            pl.col('PACE_FLOAT')
            .median()
            .over(['AGE_GROUP', 'CONTINENT', 'gender'])
    )
    .select(pl.col(['CONTINENT', 'AGE_GROUP','PACE_MEDIAN', 'gender' ]))
    .unique(['CONTINENT', 'AGE_GROUP','gender'])
    .collect() # convert lazy frame to dataframe before the pivot
    .pivot(
        on = 'CONTINENT',
        index = ['AGE_GROUP', 'gender']
    )
    .sort('AGE_GROUP')
)

for gender in ['M', 'W', 'X']:
    df_gender = df.filter(pl.col('gender') == gender).drop('gender')
    fig = px.scatter(
        df_gender,
        'AGE_GROUP',
        ['Africa', 'Asia','Europe', 'North America','Oceania','South America'],
        template='simple_white',
        height=400, width=600
    )
    fig.update_layout(
        title=f'Median Pace, gender {gender}'.upper(),
        yaxis_title='Pace (Minutes per Mile)'.upper(),
        xaxis_title='Age group'.upper(),
        legend_title='Continent'
    )
    fig.update_traces(
        mode='lines+markers'
    )
    fig.show()
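The pace-parsing steps in the pipeline (split on ':', then minutes plus seconds divided by 60) can be sketched in plain Python. This assumes pace strings of the form 'MM:SS' with whole seconds, which is what the split logic above implies; both helper names are hypothetical. The reverse conversion might be handy for labelling the y-axis in minutes and seconds.

```python
def pace_to_minutes(pace: str) -> float:
    """Convert an 'MM:SS' pace string to decimal minutes per mile."""
    minutes, seconds = pace.split(":")[:2]
    return int(minutes) + int(seconds) / 60

def minutes_to_pace(value: float) -> str:
    """Convert decimal minutes back to an 'M:SS' label."""
    total_seconds = round(value * 60)
    minutes, seconds = divmod(total_seconds, 60)
    return f"{minutes}:{seconds:02d}"

print(pace_to_minutes("08:30"))  # 8.5
print(minutes_to_pace(8.5))      # 8:30
```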

Hi @Shail-Shukla,
what version of dash_mantine_components are you running? I’m getting an error when trying to run your code with the latest version of dmc.

0.12

I am yet to upgrade :slight_smile:
