Load data once and retrieve in multiple pages

I am trying to host a multi-page Dash app that reports the performance of a machine learning model in the form of charts, tables, etc. The app needs models and datasets to be loaded in order to populate the charts on each page. These models and datasets are fairly large, so loading them independently in every page does not seem like an optimal solution.

What I am trying to achieve is this: load the models and datasets in the app.py file that hosts the Dash app, and then retrieve those objects in the child pages (e.g. page_1.py). I have not been able to figure out a way to do this. I can’t just import them from app.py, as that would cause a recursion error. So I am trying to use the Flask-Caching module to cache the objects locally in app.py and then retrieve them from the same cache inside page_1.py, but I am not sure how to do this, since the caching function is again declared in app.py. Is there another (better) way to achieve my objective?

Hi, have you tried the dash dcc.Store method? It can be used to save the data and use it on multiple pages.

Link to the documentation: Store | Dash for Python Documentation | Plotly

Best,
Saurabh

As far as I am aware, the dcc.Store method works well only for small amounts of data. In my case, I have about 10 datasets of about 100 MB each and about 10 models of similar size. I am not sure dcc.Store would work in this scenario.
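
To put a rough number on this: dcc.Store keeps its data as JSON, so what matters is the serialized payload size rather than the in-memory size. Here is a minimal sketch for estimating that, using only the standard library. The `json_payload_mb` helper and the sample rows are made up for illustration and are not part of Dash:

```python
import json


def json_payload_mb(records):
    """Return the size in MB of `records` once serialized to UTF-8 JSON."""
    payload = json.dumps(records).encode("utf-8")
    return len(payload) / 1_000_000


# Hypothetical sample: 1,000 rows shaped like the output of df.to_dict("records").
sample_rows = [
    {"id": i, "feature": 0.5, "label": "a"}
    for i in range(1_000)
]

size_mb = json_payload_mb(sample_rows)
print(f"{size_mb:.3f} MB for {len(sample_rows)} rows")
```

Scaling the per-row figure up to the real row count should give a first idea of whether the data is anywhere near small enough to ship to the browser.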

Hi @asanoop24 and @saurabh_joshi and welcome to the Dash community 🙂

This is a great question!

As mentioned by @saurabh_joshi , if you aren’t working with large datasets, then using dcc.Store is the easiest way to share data between pages of a multi-page app.

However, if you have larger data, then you may want to use caching as described in examples 3 and 4 of the "Sharing Data Between Callbacks" chapter of the Dash tutorial: Part 5. Sharing Data Between Callbacks | Dash for Python Documentation | Plotly

If you are using the pages/ feature to make a multi-page app, there are some new features in dash 2.5 (soon to be announced) that will make this much easier. There is a new get_app() function that can be used to access the app object from modules within the pages folder without running into the circular imports issue. You can see an example based on the dash-tutorial here: dash-multi-page-app-demos/multi_page_cache at main · AnnMarieW/dash-multi-page-app-demos · GitHub

Multi-page app (without using the pages/ feature)
With a multi-page app that does not use the pages/ feature, you can refactor your project so that the app and the cache are defined in a separate file (app.py). Then, in all the other modules, you can use from app import app without running into the circular imports issue.

The app structure would look something like:

- app.py 
- index.py
- pages
   |-- __init__.py
   |-- page1.py
   |-- page2.py


Hi @AnnMarieW, I was looking into sharing data across pages with the pages/ feature, and it looks like the link you previously shared is broken. Does this example live somewhere else?

Hi @kelly_gfc

I removed the cache demo because the background callbacks were easier to use with multi-page apps. Here’s an example: dash-multi-page-app-demos/multi_page_cache_background_callback at main · AnnMarieW/dash-multi-page-app-demos · GitHub

Also, if you don’t need to set up the cache, here is an example that just uses dcc.Store: dash-multi-page-app-demos/multi_page_store at main · AnnMarieW/dash-multi-page-app-demos · GitHub

I also expect that the server-side cache in Dash Extensions will work with Pages, but I haven’t tried it yet.

Hi @AnnMarieW,

I tried to reproduce your example from dash-multi-page-app-demos/multi_page_store at main · AnnMarieW/dash-multi-page-app-demos · GitHub, but using background callbacks (since my dataset is too large for dcc.Store) and was unsuccessful. Could you possibly provide some additional guidance? Many thanks in advance.

Hey @lorenzo did you try the last link @AnnMarieW provided?

The link is broken, but you can find the information here:

Hi @AIMPED, thanks for the hint. I tried to use ServersideOutputTransform, but in vain.

Here is an MVE of what I am trying to achieve. I’m probably making a mistake somewhere:

app.py:

import dash  # pip install dash
import dash_bootstrap_components as dbc  # pip install dash-bootstrap-components
# dcc is taken from dash_extensions.enrich, so it is not imported from dash as well
from dash_extensions.enrich import DashProxy, ServersideOutputTransform, dcc


app = DashProxy(
    __name__, 
    transforms=[ServersideOutputTransform()], 
    external_stylesheets=[dbc.themes.FLATLY], 
    suppress_callback_exceptions=True, 
    use_pages=True
)

navbar = dbc.NavbarSimple(
    dbc.DropdownMenu(
        [
            dbc.DropdownMenuItem(page["name"], href=page["path"])
            for page in dash.page_registry.values()
            if page["module"] != "pages.not_found_404"
        ],
        nav=True,
        label="More Pages",
    ),
    brand="Multi Page App Plugin Demo",
    color="primary",
    dark=True,
    className="mb-2",
)

app.layout = dbc.Container(
    [navbar,
     dash.page_container,
    #  dcc.Store(id="stored-data", data=df),
    #  dcc.Store(id="store-dropdown-value", data=None)
     dcc.Loading(dcc.Store(id='store'), fullscreen=True, type="dot")
     ],
    fluid=True)

if __name__ == "__main__":
    app.run_server(debug=True, port=8050)

page1.py

import dash
import dash_ag_grid as dag
import time

dash.register_page(__name__)

from dash import html, Input, Output, callback
import pandas as pd
from dash_extensions.enrich import Serverside
import plotly.express as px

df = px.data.carshare()

columnDefs = [
    {"field": "centroid_lat","maxWidth": 300},
    {"field": "centroid_lon","maxWidth": 300},
    {"field": "car_hours","filter": "agNumberColumnFilter", "maxWidth": 300},
    {"field": "peak_hour", "filter": "agNumberColumnFilter", "maxWidth": 300},
]

layout = html.Div(
    [   
        html.Div(
            [
            dag.AgGrid(
            id='table',
            rowData=df.to_dict("records"),
            columnSize="sizeToFit",
            columnDefs=columnDefs,                        
            defaultColDef={"resizable": True, "sortable": True, "filter": True},
            dashGridOptions={"pagination": True,
                            "enableCellTextSelection": True,
                            "ensureDomOrder": True,
                            "rowSelection": 'simple',},
            # getRowId="params.data.State",
            persistence_type ='session',
            persisted_props=["filterModel"],
            persistence=True,
            style={"height": "80vh",}
            )
            ]
        ),
    ]
)



@callback(
    Output('store', 'data'),
    Input('table', 'virtualRowData'),

)
def update_geojson(filtered_data):
    if not filtered_data:
        return Serverside(df)
    return Serverside(pd.DataFrame(filtered_data))

and page2.py

import dash
import dash_leaflet as dl
import dash_leaflet.express as dlx
dash.register_page(__name__, path="/")

from dash import html, Input, Output, callback

layout = html.Div(
    [
        html.Div(id="map-container", children=[]),
    ]
)

@callback(
    Output("map-container", "children"),
    Input('store', 'data'),
     
)
def graph_and_table(data):
    print(data)
    return [dl.Map([
            dl.TileLayer(),
            # From in-memory geojson. All markers at same point forces spiderfy at any zoom level.
            dl.GeoJSON(data=dlx.dicts_to_geojson(data.to_dict('records'), lon="centroid_lon", lat="centroid_lat"), cluster=True,zoomToBoundsOnClick=True,
                   superClusterOptions={"radius": 100}),

        ], center=(45.471549, -73.588684), zoom=11, style={'height': '50vh'})]
    

Unfortunately, it returns this error:

dash.exceptions.InvalidCallbackReturnValue: The callback for `<Output `store.data`>`
                returned a value having type `Serverside`
                which is not JSON serializable.

I think you have to adapt your imports on page1:

from dash import html
import pandas as pd
from dash_extensions.enrich import Serverside, Input, Output, callback

I’m not sure about callback, but I’m pretty sure Input and Output have to be imported from dash_extensions.enrich rather than from dash.

Absolutely @AIMPED,

It’s working well after adjusting the imports, which I’ll modify for each page. I’ll do more testing to confirm, but so far it looks promising. Thanks for your assistance.
