Join the Figure Friday session on January 10, at noon Eastern Time, to showcase your creation and receive feedback from the community.
Welcome to the first week of Figure Friday 2025!
This week we’ll look at the results of the NYC Marathon that took place in November 2024. The data includes each runner’s name, age, gender, pace, final time, and much more.
Download data:
- Go to Joe Hovde’s Google Sheet and download it as a CSV file: click File → Download → Comma Separated Values (.csv).
- Save the CSV file in the same directory as the Python code provided (under the sample figure), and run the code.
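If the download or the file location goes wrong, the sample code fails with a fairly generic pandas error. Here is a minimal sketch of a loader that checks for the file and for the two columns the sample code relies on (`pace` and `age`). The helper name `load_results` and its `required` parameter are made up for this illustration; the file name is the one used in the sample code.

```python
from pathlib import Path

import pandas as pd

# File name as used in the sample code in this post.
CSV_NAME = "NYC Marathon Results, 2024 - Marathon Runner Results.csv"


def load_results(path=CSV_NAME, required=("pace", "age")):
    """Read the CSV and fail early with a readable message if something is off.

    `load_results` is a hypothetical helper, not part of pandas or Plotly.
    """
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"{path} not found; save the CSV next to this script.")
    df = pd.read_csv(path)
    # Check that the columns the sample code uses are actually present.
    missing = [col for col in required if col not in df.columns]
    if missing:
        raise KeyError(f"Missing expected column(s): {missing}")
    return df
```

With the CSV saved next to your script, `df = load_results()` could stand in for the `pd.read_csv(...)` line in the sample code.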
Things to consider:
- Can you improve the sample figure below (violin plot)?
- Would you like to tell a different data story using a different graph?
- Can you create a Dash app instead?
Sample figure:
Code for sample figure:
import plotly.express as px
import pandas as pd

df = pd.read_csv('NYC Marathon Results, 2024 - Marathon Runner Results.csv')


# Convert `pace` column from string format (minutes:seconds) to numeric (float) in minutes
def convert_pace_to_minutes(pace_str):
    try:
        minutes, seconds = map(int, pace_str.split(':'))
        return minutes + seconds / 60
    except (ValueError, AttributeError):
        # ValueError: malformed string; AttributeError: missing (NaN) value
        return None


# Apply conversion to the `pace` column
df['pace_minutes'] = df['pace'].apply(convert_pace_to_minutes)

# Drop rows where `pace_minutes` could not be calculated
# (.copy() avoids a possible SettingWithCopyWarning when adding a column below)
cleaned_data = df.dropna(subset=['pace_minutes']).copy()

# Define age groups
bins = [10, 20, 30, 40, 50, 60, 70, 80, 90]
labels = ['10-20', '20-30', '30-40', '40-50', '50-60', '60-70', '70-80', '80-90']

# Create a new column for age groups
# (right=False: each bin includes its lower edge and excludes its upper edge)
cleaned_data['age_group'] = pd.cut(cleaned_data['age'], bins=bins, labels=labels, right=False)

fig = px.violin(
    cleaned_data,
    x='age_group',
    y='pace_minutes',
    title='Distribution of Minutes per Mile, by Age Group',
    labels={'pace_minutes': 'Pace (minutes per mile)', 'age_group': 'Age group'},
    box=True
)
fig.update_xaxes(categoryorder='array', categoryarray=labels)
fig.show()
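If the row-by-row `apply` feels slow on the full results file, the pace conversion can also be done on the whole column at once. This is only a sketch of one alternative, not a replacement the post prescribes. `pace_to_minutes` is a made-up name, and the regular expression is slightly stricter than the sample code: it only accepts `minutes:seconds` with one or two digits of seconds.

```python
import pandas as pd


def pace_to_minutes(pace: pd.Series) -> pd.Series:
    """Convert 'minutes:seconds' strings to float minutes, column-wise.

    Hypothetical helper; values that do not look like 'M:SS' become NaN.
    """
    # Cast to str so missing values simply fail the pattern instead of raising.
    text = pace.astype(str).str.strip()
    parts = text.str.extract(r"^(\d+):(\d{1,2})$")
    minutes = pd.to_numeric(parts[0], errors="coerce")
    seconds = pd.to_numeric(parts[1], errors="coerce")
    return minutes + seconds / 60
```

In the sample code this would be used as `df['pace_minutes'] = pace_to_minutes(df['pace'])`, followed by the same `dropna` step.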
Participation Instructions:
- Create - use the weekly data set to build your own Plotly visualization or Dash app. Or, enhance the sample figure provided in this post, using Plotly or Dash.
- Submit - post your creation to LinkedIn or Twitter with the hashtags #FigureFriday and #plotly by midnight Thursday, your time zone. Please also submit your visualization as a new post in this thread.
- Celebrate - join the Figure Friday sessions to showcase your creation and receive feedback from the community.
If you prefer to collaborate with others on Discord, join the Plotly Discord channel.
Data Source:
Data on the November 2024 marathon results were scraped by researcher and data analyst Joe Hovde from the official NYC Marathon results page.