Callbacks with large data sets

Hi again. I'm still feeling fresh and raw, and I'm finding it hard to decode the docs to find what I'm after.

We have a relatively large data set - disk usage, per user, every 5 minutes, for a number of months.

I want to be able to provide a date slider so people can see their usage per date (day only, not every 5 mins) but I don’t want to have to load the complete usage data set first - just the dates.
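To make the "just the dates" idea concrete, here is a minimal sketch of the two queries, assuming a SQL database. The table name `usage`, its columns (`ts`, `user`, `bytes_used`), and the in-memory SQLite demo data are all hypothetical stand-ins for the real schema, and "usage per day" is assumed here to mean the peak reading per user on that day.

```python
import sqlite3


def make_demo_db():
    """Build a tiny in-memory table standing in for the real usage database."""
    conn = sqlite3.connect(":memory:")
    # Hypothetical schema: one row per user per 5-minute reading.
    conn.execute("CREATE TABLE usage (ts TEXT, user TEXT, bytes_used INTEGER)")
    rows = [
        ("2020-07-01 00:00:00", "alice", 100),
        ("2020-07-01 00:05:00", "alice", 110),
        ("2020-07-01 00:00:00", "bob", 50),
        ("2020-07-02 00:00:00", "alice", 120),
        ("2020-07-02 00:05:00", "bob", 70),
    ]
    conn.executemany("INSERT INTO usage VALUES (?, ?, ?)", rows)
    return conn


def get_distinct_dates(conn):
    """Return only the distinct days, without loading any readings."""
    rows = conn.execute(
        "SELECT DISTINCT date(ts) AS day FROM usage ORDER BY day"
    ).fetchall()
    return [row[0] for row in rows]


def get_usage_data(conn, day):
    """Return (user, peak bytes) for one day; run only when a day is picked."""
    return conn.execute(
        "SELECT user, MAX(bytes_used) FROM usage "
        "WHERE date(ts) = ? GROUP BY user ORDER BY user",
        (day,),
    ).fetchall()


conn = make_demo_db()
dates = get_distinct_dates(conn)          # small list, cheap to load up front
latest = get_usage_data(conn, dates[-1])  # data for the most recent day only
```

The point of the split is that the first query returns one row per day, while the second touches only the rows for a single day, so the full data set never has to be held in memory at once.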

The default graph can show today’s latest reading. Then users use the slider and that grabs the appropriate data from the database.

I’m finding the callback examples in the Advanced Callbacks and Clientside Callbacks documentation too abstract - I can’t see if they are actually what I need or not.

Could it be as easy as (per the Basic Callbacks example) just having two dataframes: an initial df with the list of distinct dates, and a second that's created (retrieved from the database) when the update_figure(selected_date) function is called?


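import dash
from dash import dcc, html, Input, Output
import plotly.express as px

# get_distinct_dates() and get_usage_data() are placeholders for the
# database helpers that still need to be written.
df = get_distinct_dates()

app = dash.Dash()

app.layout = html.Div([
    dcc.Graph(id='graph-with-slider'),
    dcc.Slider(
        id='year-slider',
        min=df['year'].min(),
        max=df['year'].max(),
        value=df['year'].min(),
        marks={str(year): str(year) for year in df['year'].unique()},
        step=None
    )
])

@app.callback(
    Output('graph-with-slider', 'figure'),
    [Input('year-slider', 'value')])
def update_figure(selected_year):
    # Fetch only the rows for the selected slider value.
    date_data = get_usage_data(selected_year)

    # Column names below are carried over from the Basic Callbacks example
    # and would be swapped for the usage columns.
    fig = px.scatter(date_data, x="gdpPercap", y="lifeExp",
                     size="pop", color="continent", hover_name="country",
                     log_x=True, size_max=55)

    return fig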
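
One wrinkle when moving from years to dates: dcc.Slider works on numeric values, so the dates would need to be mapped onto integer positions and translated back inside the callback. A minimal sketch of that mapping follows, assuming the dates arrive as ISO strings. The helper names `build_slider_props` and `date_for_index` and the `mark_every` parameter are hypothetical, not part of Dash.

```python
def build_slider_props(dates, mark_every=7):
    """Map ISO date strings onto integer slider positions.

    Returns keyword arguments that could be passed to dcc.Slider.
    The slider starts on the most recent date.
    """
    if not dates:
        raise ValueError("need at least one date")
    ordered = sorted(dates)  # ISO strings sort chronologically
    last = len(ordered) - 1
    return {
        "min": 0,
        "max": last,
        "value": last,  # default to the latest day
        "step": 1,
        # Label only every mark_every-th day so months of data stay readable.
        "marks": {i: ordered[i] for i in range(0, len(ordered), mark_every)},
    }


def date_for_index(dates, index):
    """Translate a slider position back into its ISO date string."""
    return sorted(dates)[index]


dates = ["2020-07-03", "2020-07-01", "2020-07-02"]
props = build_slider_props(dates, mark_every=2)
selected = date_for_index(dates, props["value"])
```

In the layout this would be used as dcc.Slider(id='year-slider', **props), and update_figure would call date_for_index on the incoming value before querying the database.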