Figure Friday 2025 - week 32

join the Figure Friday session on August 15, at noon Eastern Time, to showcase your creation and receive feedback from the community.

US-Canada-Mexico: what items cross between the borders? what’s the volume of items passing through the borders?

Answer these questions and a few others by using Plotly on the Border Crossing dataset.

Things to consider:

  • what can you improve in the app or sample figure below (scatter map)?
  • would you like to tell a different data story using a different graph or Dash app?
  • how can you explore the data with Plotly Studio?

Sample figure:

Code for sample figure:
from dash import Dash, dcc
import dash_ag_grid as dag
import plotly.express as px
import pandas as pd

# download CSV sheet from Google Drive - https://drive.google.com/file/d/1bfFMhUftTyubSzX14tSP2qdoucneCSlK/view?usp=sharing
df = pd.read_csv("Border_Crossing_Entry_Data.csv")

df_filtered = df.groupby(["Port Name","Port Code", "Latitude", "Longitude"])[["Value"]].sum().reset_index()

fig = px.scatter_map(df_filtered, lat="Latitude", lon="Longitude", size="Value", hover_data=["Port Name"],
                     zoom=3, color_discrete_sequence=["fuchsia"], color="Value", height=650)
fig.update_layout(map_style="open-street-map")

grid = dag.AgGrid(
    rowData=df.to_dict("records"),
    columnDefs=[{"field": i, 'filter': True, 'sortable': True} for i in df.columns],
    dashGridOptions={"pagination": True},
    columnSize="sizeToFit"
)

app = Dash()
app.layout = [
    grid,
    dcc.Graph(figure=fig)
]


if __name__ == "__main__":
    app.run(debug=True)

For community members that would like to build the data app with Plotly Studio, but don’t have the application yet, simply click the Apply For Early Access button on Plotly.com/studio and fill out the form. You should get the invite and download emails shortly after. Please keep in mind that Plotly Studio is still in early access.

Below is a screenshot of the Plotly Studio app built on top of this dataset:

Participation Instructions:

  • Create - use the weekly data set to build your own Plotly visualization or Dash app. Or, enhance the sample figure provided in this post, using Plotly or Dash.
  • Submit - post your creation to LinkedIn or Twitter with the hashtags #FigureFriday and #plotly by midnight Thursday, your time zone. Please also submit your visualization as a new post in this thread.
  • Celebrate - join the Figure Friday sessions to showcase your creation and receive feedback from the community.

:point_right: If you prefer to collaborate with others on Discord, join the Plotly Discord channel.

Data Source:

Thank you to Data. Gov for the data.

1 Like

I’m far from an expert and have a lot to learn about visualizations but to me that looks like a very professional starting dashboard - clean, easily understandable, great layout and design/aesthetics and a reasonable amount of user input fields. Plotly Studio is going to be tough to beat! :grin:

2 Likes

Update: Initially I misunderstood VALUE as a monetary amount when it is really the number of entries into the US. I have updated axis labels, title, and hover info – no change to the graphs.

This dashboard looks at entries into United States over land borders with Canada and Mexico.

You can explore my dashboard on Plotly Cloud, here is the link:

Two take aways for me:

  • The onset of Covid-19 in March 2020 had profound effect on the number of entries across all land borders
  • The often cold & snowy northern border with Canada has much greater seasonal variation than the southern border with Mexico.

Regarding this dataset:

  • No data for New Hampshire, a state with only 1 border crossing, not sure why.
  • Maine has ports at Portland (100+ miles from Canada) and Bar Harbor (60+ miles from Canada). I left them in but don’t know why these are in the dataset.
  • California ports at Calexico and East Calexico have the same GPS coordinates. I merged them into just Calexico.
  • I ignored Measure, which describes the means of transport.

Upper left graph is a choropleth showing all states with border traffic. Hover on any active state sets up a bottom-left map libre scatter_map of the selected states port locations. The bottom-right shows a px.line chart of entry counts by month for the selected state’s ports.

Upper right graph is controlled by the pull-down menu, where 1 of 4 group by parameters (Border, State, top 10 or bottom 10) is selected.

Here is the code:

import polars as pl
pl.Config().set_tbl_cols(10)
import polars.selectors as cs
import plotly.express as px
import dash_ag_grid as dag
import dash
from dash import Dash, dcc, html, Input, Output
import dash_mantine_components as dmc
import dash_bootstrap_components as dbc
import os
dash._dash_renderer._set_react_version('18.2.0')
# ---- NOTES ABOUT THIS DATASET ------------------------------------------------
# California ports Calexico and Calexico East are at the same location. I merged
#     these to be noted as Calexico
# Maine ports Portland and Bar Harbor are not at or near the Canadien Border
# Seasonal variation is much greater with Canada than with Mexico
# Effect of Covid pandemic is very easy to see in these graphs.
#----- GLOBALS -----------------------------------------------------------------
style_horizontal_thick_line = {'border': 'none', 'height': '4px', 
    'background': 'linear-gradient(to right, #007bff, #ff7b00)', 
    'margin': '10px,', 'fontsize': 32}

style_horizontal_thin_line = {'border': 'none', 'height': '2px', 
    'background': 'linear-gradient(to right, #007bff, #ff7b00)', 
    'margin': '10px,', 'fontsize': 12}

style_h2 = {'text-align': 'center', 'font-size': '40px', 
            'fontFamily': 'Arial','font-weight': 'bold'}
style_h3 = {'text-align': 'center', 'font-size': '24px', 
            'fontFamily': 'Arial','font-weight': 'normal'}

parquet_data_source = 'Border_Crossing_Entry_Data.parquet'
csv_data_source = 'Border_Crossing_Entry_Data.csv' 
date_fmt ='%b-%y'
fig_template = 'presentation'

#----- Make dataframe with STATE name and abbreviation -------------------------
state_list = [
    'Alaska',  'Arizona', 'California', 'Idaho', 'Maine', 'Michigan',
    'Minnesota', 'Montana', 'New Mexico', 'New York', 'North Dakota', 'Texas',
    'Vermont', 'Washington'
    ]
state_abbr_list = [
    'AK', 'AZ', 'CA', 'ID', 'ME', 'MI',
    'MN', 'MT', 'NM', 'NY', 'ND', 'TX', 
    'VT', 'WA'
]
df_states = pl.DataFrame({  # will be  joined to add state abbreviations
    'STATE' :state_list,
    'STATE_ABBR' : state_abbr_list
})

zoom_level = {   # zoom level gives best view of all ports
    'Alaska'       : 3,
    'Arizona'      : 6, 
    'California'   : 7, 
    'Idaho'        : 9, # large state, where ports close to each other
    'Maine'        : 5, 
    'Michigan'     : 4,
    'Minnesota'    : 5, 
    'Montana'      : 5, 
    'New Mexico'   : 8, 
    'New York'     : 5, 
    'North Dakota' : 6, 
    'Texas'        : 4, # large state, where ports spread out
    'Vermont'      : 7, 
    'Washington'   : 5
}

#----- GATHER AND CLEAN DATA ---------------------------------------------------
if os.path.exists(parquet_data_source): # use pre-cleaned parquet if exists
    print(f'Reading data from {parquet_data_source}')
    df = (
        pl.scan_parquet(parquet_data_source)
        .with_columns(pl.col('LAT', 'LON').cast(pl.Float64))
        .collect()
    )
else:  # read data from csv and clean, and save to parquet
    print(f'Reading data from {csv_data_source}')
    df = (
        pl.scan_csv(csv_data_source,try_parse_dates=True)
        .rename(lambda c: c.upper()) # col names to upper case
        .select(
            PORT = pl.col('PORT NAME').str.replace('Calexico East', 'Calexico'),
            STATE = pl.col('STATE'),
            BORDER = pl.col('BORDER')
                .str.replace(' Border', ''),
            DATE = pl.col('DATE').str.to_date(format=date_fmt),
            ENTRY_NUM = (pl.col('VALUE').cast(pl.Float64)* 1e-6),
            POINT = pl.col('POINT')   # get rid or parens and word POINT
                .str.replace_all('POINT ', '')
                .str.strip_chars('(')
                .str.strip_chars(')')
        )
        .filter(pl.col('ENTRY_NUM') > 0.0)
        .with_columns(  # LONG and LAT from POINT column have better resolution
            LON = pl.col('POINT').str.split(' ').list.first().cast(pl.Float64),
            LAT = pl.col('POINT').str.split(' ').list.last().cast(pl.Float64),        
        )
        .drop('POINT')
        .drop_nulls(subset='STATE')
        .collect()
        .join(df_states, on='STATE', how='left')
        .with_columns(
            PORT_STATE = pl.col('PORT') + 
                pl.lit(', ') + 
                pl.col('STATE_ABBR')
        )
        .group_by(
            'BORDER', 'STATE', 'STATE_ABBR', 'PORT', 'PORT_STATE',
            'DATE', 'LAT', 'LON', 
            )
        .agg(pl.col('ENTRY_NUM').sum())
        .sort(['BORDER', 'STATE','PORT', 'DATE'])  
    )
    # define enum types, and apply to dataframe columns
    port_list = df.get_column('PORT').unique().sort()
    port_enum = pl.Enum(port_list)
    port_state_list = df.get_column('PORT_STATE').unique().sort()
    port_state_enum = pl.Enum(port_state_list)
    state_enum = pl.Enum(state_list)
    state_abbr_enum = pl.Enum(state_abbr_list)
    border_enum = pl.Enum(['US-Canada', 'US-Mexico'])
    df = (
        df
        .lazy()
        .with_columns(
            STATE=pl.col('STATE').cast(state_enum),
            STATE_ABBR=pl.col('STATE_ABBR').cast(state_abbr_enum),
            PORT=pl.col('PORT').cast(port_enum),
            PORT_STATE=pl.col('PORT_STATE').cast(port_state_enum),
            BORDER = pl.col('BORDER').cast(border_enum)
        )
        .collect()
    )
    df.write_parquet(parquet_data_source)

# #----- DASH COMPONENTS -------------------------------------------------------
dmc_select_group_by = (
    dmc.Select(
        label='Group By',
        placeholder='Select one',
        id='group-by',
        data=['BORDER', 'STATE', 'PORT (TOP 10)', 'PORT (BOTTOM 10)'],
        value='BORDER',
        size='sm',
    ),
)

fig_choro = px.choropleth(
    df.group_by('STATE', 'STATE_ABBR').agg(pl.col('ENTRY_NUM').sum()),
    locations='STATE_ABBR', 
    locationmode='USA-states', 
    scope='usa',
    color='ENTRY_NUM',
    template=fig_template,
    color_continuous_scale='Viridis_r',
    title=(
        'Total Number of Entries<br>' +
        '<sup>Hover over state filters Port Map</sup>'
    ),
    custom_data=['STATE', 'ENTRY_NUM',],
)
fig_choro.update_traces(
    hovertemplate =
        '%{customdata[0]}<br>' + 
        '%{customdata[1]:,.2f} Million<br>' + 
        '<extra></extra>'
)
fig_choro.update(layout_coloraxis_showscale=False)

# #----- FUNCTIONS ---------------------------------------------------------------
def get_port_map(selected_state):
    ''' returns scatter_map of selected state with ports '''
    df_state = (
        df
        .filter(pl.col('STATE') == selected_state)
        .group_by('STATE', 'STATE_ABBR', 'PORT', 'PORT_STATE', 'LAT', 'LON')
        .agg(pl.col('ENTRY_NUM').sum())
        .sort('PORT')
    )
    # Calculate bounding box
    min_lat, max_lat = df_state['LAT'].min(), df_state['LAT'].max()
    min_lon, max_lon = df_state['LON'].min(), df_state['LON'].max()
    # Calculate center
    center_lat = (min_lat + max_lat) / 2
    center_lon = (min_lon + max_lon) / 2
    # Create a scatter map
    fig_state_port_map = px.scatter_map(
        df_state,
        lat='LAT', lon='LON',
        text='PORT_STATE',
        center={'lat':center_lat, 'lon':center_lon},
        zoom=zoom_level[selected_state],
        title=(f'Port Map of {selected_state}'),
        map_style='basic',  # Map Libre
        opacity=0.75,
        template=fig_template,
        custom_data=['PORT_STATE', 'ENTRY_NUM',],
    )
    fig_state_port_map.update_traces(
        hovertemplate =
            '%{customdata[0]}<br>' + 
            '%{customdata[1]:,.2f} Million<br>' +
            '<extra></extra>'
    )
    fig_state_port_map.update_traces(marker={'size': 20})
    fig_state_port_map.update(layout_coloraxis_showscale=False)
    return fig_state_port_map

def get_line_group_by(group_by):
    ''' returns px.line plot of entry_num by selected group_by '''
    if group_by in ['BORDER', 'STATE']:
        df_group_by = (
            df
            .group_by('DATE', group_by)
            .agg(pl.col('ENTRY_NUM').sum())
            .sort(group_by)
        )
        custom_col = group_by
    else: # by port, top 10 or bottom 10
        custom_col='PORT_STATE'
        if 'TOP' in group_by:
            ten_port_list = (
                df
                .group_by('PORT_STATE')
                .agg(pl.col('ENTRY_NUM'))
                .sort('ENTRY_NUM', descending=True)
                .head(10)
                .get_column('PORT_STATE')
                .to_list()
            )
        if 'BOTTOM' in group_by:
            ten_port_list = (
                df
                .group_by('PORT_STATE')
                .agg(pl.col('ENTRY_NUM'))
                .sort('ENTRY_NUM', descending=True)
                .tail(10)
                .get_column('PORT_STATE')
                .to_list()
            )
        df_group_by = (
            df
            .filter(pl.col('PORT_STATE').is_in(ten_port_list))
            .group_by('DATE', 'PORT_STATE')
            .agg(pl.col('ENTRY_NUM').sum())
            .sort('PORT_STATE', 'DATE')
        )
    color_by = group_by
    sort_by = [group_by, 'DATE']
    if 'PORT' in group_by:
        color_by = 'PORT_STATE'
        sort_by = ['PORT_STATE', 'DATE']
    fig_line_group_by = px.line(
        df_group_by.sort(sort_by),
        x='DATE',
        y='ENTRY_NUM',
        color=color_by,
        template=fig_template,
        markers=True,
        line_shape='spline',
        custom_data=[custom_col, 'DATE', 'ENTRY_NUM',],
    )
    fig_line_group_by.update_traces(
        hovertemplate =
            '%{customdata[0]}<br>' + 
            '%{customdata[1]}<br>' + 
            '%{customdata[2]:,.2f} Million<br>' +
            '<extra></extra>',
        line={'width': 1}, marker={'size': 2},
    )
    legend_title=group_by
    if 'PORT' in group_by:
        legend_title = 'PORT'
    fig_line_group_by.update_layout(
        title= (
            f'Number of Entries (Millions) by {group_by} <br>' +
            '<sup> Each date represents a full month'
        ),
        xaxis_title='',  # x-axis is for date, label is obvious, keep blank
        yaxis_title='Entry Count (Millions)',
        legend_title_text=f'{legend_title}'
    )
    return fig_line_group_by

def get_state_ports(selected_state):
    ''' returns px.line plot of entries by selected group_by '''
    df_state = (
        df
        .filter(pl.col('STATE') == selected_state)
        .group_by('PORT_STATE','PORT', 'DATE')
        .agg(pl.col('ENTRY_NUM').sum())
        .sort('PORT_STATE', 'DATE')
    )
    fig_state_ports = px.line(
        df_state, x='DATE', y='ENTRY_NUM',
        color='PORT',
        template=fig_template,
        markers=True,
        line_shape='spline',
        custom_data=['PORT_STATE', 'DATE', 'ENTRY_NUM',],
    )
    fig_state_ports.update_traces(
        hovertemplate =
            '%{customdata[0]}<br>' + 
            '%{customdata[1]}<br>' + 
            '%{customdata[2]:,.2f} Million<br>' +
            '<extra></extra>',
        line={'width': 1},
        marker={'size': 2},
    )
    fig_state_ports.update_traces(line={'width': 1}, marker={'size': 2},)
    fig_state_ports.update_layout(
        title= (
            f'Number of Entries (Millions), {selected_state}<br>' +
            '<sup> Each date represents a full month'
        ),
        xaxis_title='',  # x-axis is for date, label is obvious, keep blank
        yaxis_title='ENTRY_NUM (Millions)',
        legend_title_text=f'{selected_state} Port'
    )
    return fig_state_ports

# #----- DASH APPLICATION STRUCTURE-----------------------------------------------
app = dash.Dash(__name__, external_stylesheets=[dbc.themes.BOOTSTRAP])
server = app.server
app.title = 'US Border Crossings, Number of Entries'

app.layout =  dmc.MantineProvider([
    dmc.Space(h=30),
    html.Hr(style=style_horizontal_thick_line),
    dmc.Text('US Border Crossings, Number of Entries', ta='center', style=style_h2),
    html.Hr(style=style_horizontal_thick_line),
    dmc.Grid(children = [
        dmc.GridCol(dmc.Text(''), span={"base": 8, "lg":7},),  # indent space
        dmc.GridCol(dmc_select_group_by, span={"base": 3, "lg":3}),
    ]),  
    dmc.Grid(  
        children = [
            dmc.GridCol(dcc.Graph(figure=fig_choro,id='choro'),
                span={"base": 6, "lg":6} 
            ),
            dmc.GridCol(dcc.Graph(id='line-group-by'), 
                span={"base": 6, "lg":6}
            ),          
        ]
    ),
    dmc.Grid(  
        children = [
            dmc.GridCol(dcc.Graph(id='port-map'), 
                span={"base": 6, "lg":6}
            ),
            dmc.GridCol(dcc.Graph(id='port-data'), 
                span={"base": 6, "lg":6}
            ),          
        ]
    ),
])

@app.callback(
    Output('port-map', 'figure'),
    Output('line-group-by', 'figure'),
    Output('port-data', 'figure'),
    Input('choro', 'hoverData'),
    Input('group-by', 'value'),
)
def update(selected_state, group_by):
    if selected_state is None:
        selected_state = 'California'
    else:
        selected_state = selected_state['points'][-1]['customdata'][0]
    port_map = get_port_map(selected_state)
    line_group_by = get_line_group_by(group_by)
    port_data_fig = get_state_ports(selected_state)
    return port_map, line_group_by, port_data_fig

if __name__ == '__main__':
    app.run_server(debug=True)

Here is a screenshot:

3 Likes

Hi Adams,

Just clarify me the Value column in this dataset what represents?

1 Like

Good question, @Avacsiglo21 .

the data [value column] are the total number of crossings that occur in each month or year. For example, one drayage truck operating on the southern border may cross through the port of Laredo, Texas five times a day as it loads and unloads shipments between the United States and Mexico. More here.

1 Like

@Mohamad_Osama I recommend reviewing a post about 5 years ago that you can search for, title is “How to Get your Questions Answered on the Plotly Forum”. Are you suggesting that Figure Friday should toss out the csv file released on Friday and replace it with your pre-cleaned parquet file? Good luck on that.

Hi @Mohamad_Osama ,
Thanks for using Figure Friday as your graduation project.
Can you share more about the error that you got on Plotly Studio?

  1. Did Plotly Studio show a blank page? Do you have a screenshot to share?
  2. Did it give you any error messages?
  3. At what stage of the app-building process did it get stuck exactly?

@Mike_Purtell
I really like the interactivity feature you implemented where hovering over a State updates the two visualizations below.

I didn’t realize Main had so many border crossings :open_mouth:
And it’s a bit unfortunate that for many states the amount of goods crossing the border post-COVID never recuperated.

2 Likes

@Avacsiglo21 @adamschroeder

If a row says:

  • Border: US-Mexico

  • Port Name: El Paso

  • Date: January 1, 2021

  • Measure: Trucks

  • Value: 150

It means 150 trucks crossed the border at El Paso on that day “150 of this traansport_category (measure)’.
i.e. “On Jan 1st, 2021, at the US-Mexico border in El Paso, 150 trucks crossed.”

During your cleaning and data preparation step, it might help if you changed the names of those two columns (‘measure’ & ‘value’) to a more descriptive ones, e.g.:

df = df.rename(columns = {'measure': 'transport_category', 'value': 'total_count'})

Kindly notice though that this dataset includes land ports only, both air and water ports / entries are excluded. It also don’t show the actual number of individuals entered through those ports.

N.B.

  • For more information about the cleaning process, kindly review the cleaning NBs Here. NB 4_4 summarizes the whole cleaning process.
  • If new data were released, please feel free to use my NBs as a reference framework, all you need is please open NB1_1, upload the new CSV file, then run all the cells, please do the same with NBs 2 & 3. NB 4 is a summary for the cleaning process, with suggestions to improve the quality of the dataset, and prepare it for ML models.

Best of luck :slight_smile:

2 Likes

No error messages, it just keeps loading / creating env until forever after uploading my parquet file.

I can do share the screen later today after I finish work.

Thank you @adamschroeder. I misunderstood VALUE as a monetary figure; it is actually the number of people entering the US. I will update my submission to change axis labels and title, with no change expected to the graphs. Regarding the number of ports in Maine, it is a very large state uniquely surrounded by Canada on 3 sides. I am guessing Portland and Bar Harbor have boat or ferry services that count as land borders.

2 Likes

hi @Mohamad_Osama hi @Mike_Purtell
I think Mohamad’s explanation is close but I don’t think it’s limited to the number of trucks.

“For example, one truck operating on the southern border may cross through the port of Laredo, Texas five times a day as it loads and unloads shipments between the United States and Mexico.” That should be considered 5 crossings although it’s only one truck.

2 Likes

This is what I meant actually, the number of crossings for this transport_category (measure).

2 Likes

Hi @adam, I have crossed the border at Champlain Rouses Point from Quebec into New York State 3 times: once in a car with 7 people, once in a car with 3 people and once on a train with 100+ people. My guess is the counts are based on the number of people, so instead of this counting as 3 entries it would be 10 entries by car plus the number of train passengers. Do you or others interpret this the same way, or would this count as 3 entries? Thank you.

I think this would count as 3 entries. But I wonder if anyone would interpret this differently here?

1 Like

Agree, this Is the way I see it.

1 Like

UPDATED: I published a Plotly Studio app to Plotly Cloud. It is very humbling to create a dashboard this way after many hours spent this weekend as a dashboard rookie making one the old way by creating the concept, writing the code and debugging it.

Here is the link:

The app was temporarily down after I asked the app to alter the color scheme on the header. To get it back I had it rebuild from scratch. A useful tip for others is to keep the text you provided as instructions to Plotly Studio. That will help you if you have to remake the app with modified instructions. A useful feature for Plotly Studio would be to enable developers to retrieve the original instructions used to build the app in cases where not available.

Many thanks to @adamschroeder , @mattbrown and @chriddyp for teaching me how to create dashboards and for allowing me to access this very impressive and mesmerizing tool.

2 Likes

wow, @Mike_Purtell , amazing first Plotly Studio app. I’m glad you got it on the Cloud as well.
Have you tried going into each graph’s module on the left side bar? Once you’re in there, you should be able to update the outline prompt (text field on the top) to get your desire modification.

Update: I see you updated your post. I’m glad you got the modifications working.

1 Like

@mattbrown this has come up before, correct?

1 Like

Yes, it’s a great idea and we have a feature card for it over on roadmap.plotly.com. The best part is that you can vote on those features, which helps the Product team gauge interest!

3 Likes