Performance: How to increase loading speed when accessing dcc.Store data from a callback

There are two parts to my question.

First, when starting the Dash app as shown in the demo script below, I want to store some dataframes in dcc.Store objects so that later callbacks can process them. However, if the dataframe is big (say >10 MB), it takes a long time to load the dataframe before the layout loads. Is there any way to improve this?

Second, when I use the data of a large dcc.Store (say >10 MB) as a State in a callback, it takes a while for the callback to fire, especially when I set background=True to use running and progress. In that case, isn't the running argument supposed to take effect while the callback is working, to tell users that the process is running? In my script, it takes a while for the running argument to kick in after I click the Show button; I have to wait before the Spinner shows up.

Demo script:

from dash import html,callback,DiskcacheManager,Dash,dcc,dash_table,ctx
import dash_bootstrap_components as dbc
from dash.dependencies import Input,Output,State
import pandas as pd
import diskcache
import time

cache = diskcache.Cache("./cache")
background_callback_manager = DiskcacheManager(cache)

app = Dash(__name__,
           suppress_callback_exceptions=True,
           external_stylesheets=[dbc.themes.CYBORG,
                                 dbc.icons.BOOTSTRAP],
           background_callback_manager=background_callback_manager)
server = app.server

df = pd.read_csv('https://covid19.who.int/WHO-COVID-19-global-data.csv')

app.layout = html.Div(
    [dcc.Store(id="stored_data",data=df.to_dict('records')),
     html.Br(),
     dbc.Collapse([dbc.Row([dbc.Spinner(),
                            html.P("Processing...")]),
                   ],
                  id="collapse_1",
                  is_open=False),
     dbc.Button("Show",id="show_button"),
     dbc.Button("Clear",id="clear_button"),
     dash_table.DataTable(id='result')
     ])

@callback(Output("result","data"),
          Output("result","columns"),
          Input("show_button","n_clicks"),
          Input("clear_button","n_clicks"),
          State("stored_data","data"),
          background=True,
          running=[(Output("collapse_1","is_open"),True,False)],
          prevent_initial_call=True,
          )
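def show_data_test(button1,button2,stored_data):
    if ctx.triggered_id=="show_button":
        # Rebuild the dataframe from the stored records and show the first rows
        df = pd.DataFrame(stored_data).head()
        data = df.to_dict('records')
        columns = [{"name": i, "id": i} for i in df.columns]
        return data,columns
    else:
        # Clear button: pause briefly so the running state is visible, then empty the table
        time.sleep(1)
        return None,None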

if __name__ == '__main__':
    app.run(debug=True)

Hello @DeKhaos,

Have you looked into server-side caching for the stores? Browsers limit the amount of data you can store there to about 8 MB, and the network traffic involved would be pretty big.
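
To get a rough feel for the traffic, here is a small sketch that compares the JSON size of a records list with the size of a short key. The rows are made-up stand-ins, not the real WHO data, so the numbers are only indicative.

```python
import json
import uuid

# Made-up rows standing in for df.to_dict('records'); not the real dataset.
records = [
    {"Date_reported": "2020-01-03", "Country": "Example",
     "New_cases": i, "Cumulative_cases": i * 2}
    for i in range(100_000)
]

# Size of the JSON that would travel between browser and server for the store data.
payload_bytes = len(json.dumps(records).encode("utf-8"))

# Size of a short key that could be sent instead if the data stays on the server.
key_bytes = len(json.dumps(uuid.uuid4().hex).encode("utf-8"))

print(f"records payload: {payload_bytes / 1e6:.1f} MB, key payload: {key_bytes} bytes")
```

With a State on the store, a payload of the first size has to be uploaded before the callback can start, which would explain the delay before the spinner appears.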

Server-side caching would eliminate the need to transfer the data between the browser and the server on every callback.
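
A minimal sketch of the idea, using only the standard library: the data stays on disk on the server and only a short key goes into the dcc.Store. The names CACHE_DIR, save_to_server and load_from_server are hypothetical helpers made up for illustration, not part of Dash or dash-extensions.

```python
import pickle
import tempfile
import uuid
from pathlib import Path

# Hypothetical cache directory; a real app might use diskcache or Redis instead.
CACHE_DIR = Path(tempfile.gettempdir()) / "dash_server_cache"
CACHE_DIR.mkdir(exist_ok=True)


def save_to_server(obj) -> str:
    """Pickle obj to disk and return a short key that is cheap to send to the browser."""
    key = uuid.uuid4().hex
    with open(CACHE_DIR / f"{key}.pkl", "wb") as fh:
        pickle.dump(obj, fh)
    return key


def load_from_server(key: str):
    """Load the object stored under key, rejecting anything that is not a 32-char hex key."""
    if len(key) != 32 or any(c not in "0123456789abcdef" for c in key):
        raise ValueError("invalid cache key")
    with open(CACHE_DIR / f"{key}.pkl", "rb") as fh:
        return pickle.load(fh)


# Stand-in for the real data; in the app this could be the DataFrame itself.
records = [{"Country": "A", "New_cases": i} for i in range(1000)]
key = save_to_server(records)

# In the layout:   dcc.Store(id="stored_data", data=key)
# In the callback: df = pd.DataFrame(load_from_server(stored_key)).head()
restored = load_from_server(key)
```

The key is checked before it is used in a file path because it comes back from the browser and should not be trusted.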


Thank you @jinnyzor, great help. I looked into dash-extensions. It turns out that caching the data on the server machine instead of keeping it in a dcc.Store in the browser reduces the load on the app a lot :slight_smile:
