Shared variables between callbacks - DjangoDash

Hello,

I’m facing an issue with managing shared data between callbacks in a Django Dash application.

During local development, I used a global variable initialized at server startup to store my data, allowing all callbacks to access it. However, in production, the data remains static as it reflects the state at server startup.

To address this, I followed the documentation: Dash - Sharing Data Between Callbacks.

Solution 1:

I moved the data computation to a specific callback, serialized my Pandas DataFrames into JSON, and stored them in a dcc.Store. Each callback then reads this data and deserializes it back into a DataFrame. However, this process is slow (around 0.2 seconds per DataFrame), which significantly slows down the application.
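
For reference, the pattern I'm using looks roughly like this (component ids, the app object, and fetch_data are simplified placeholders):

from io import StringIO

import pandas as pd
from dash import Input, Output, dcc

# The layout contains dcc.Store(id="df-store") alongside the other components.

@app.callback(Output("df-store", "data"), Input("refresh-btn", "n_clicks"))
def compute(_):
    df = fetch_data()                     # expensive computation
    return df.to_json(orient="split")     # serialized JSON goes into the store

@app.callback(Output("row-count", "children"), Input("df-store", "data"))
def render(data):
    df = pd.read_json(StringIO(data), orient="split")  # ~0.2 s per DataFrame
    return f"{len(df)} rows"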

Solution 2:

I implemented caching using flask_caching, configured as follows:

CACHE_CONFIG = {
    'CACHE_TYPE': 'redis',
    'CACHE_REDIS_URL': REDIS_URL
}

My data-fetching function uses this cache:

@cache.memoize(timeout=60)  # Cache timeout: 60 seconds
def fetch_data():
    # Data computation
    pass
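
The cache object itself is created with flask-caching's standard API, roughly like this (how you reach the underlying Flask app depends on your Django/Dash integration):

from flask_caching import Cache

cache = Cache(config=CACHE_CONFIG)
cache.init_app(flask_app)  # flask_app: however your deployment exposes the Flask instance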

However, when serve_layout is loaded for the first time, the four callbacks that call this function fire simultaneously; each one misses the cache before the first computation has finished, so the data ends up being computed four times.

Do you have any recommendations to avoid these redundant computations and improve performance?

Thank you in advance!

Which callbacks are those, and do they need to be triggered? Are you referring to startup of the app? Also, is the serve_layout function within your app.py, or are you referring to pages?

You could take a look at ServersideOutput from dash-extensions. It lets you store any Python object server-side in pickle format (if I'm not mistaken).
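
Roughly like this (a sketch using the pre-1.0 ServersideOutput style for a plain Dash app; in dash-extensions 1.x the callback returns Serverside(value) instead, and wiring it into DjangoDash may need extra work):

from dash import dcc
from dash_extensions.enrich import (DashProxy, Input, Output,
                                    ServersideOutput, ServersideOutputTransform)

app = DashProxy(transforms=[ServersideOutputTransform()])
app.layout = dcc.Store(id="df-store")  # plus the rest of the layout

@app.callback(ServersideOutput("df-store", "data"), Input("refresh-btn", "n_clicks"))
def compute(_):
    return fetch_data()  # the object is pickled and kept server-side, not sent to the browser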

One approach, if you're inside Django, is to use its own caching framework. That should play nicely with whatever deployment approach you take for the overall server. You would have to think about how to key the cached values if they depend on user, session, etc., but that applies to any caching scheme.
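
A minimal sketch with Django's cache framework (the key name and compute_dataframe are placeholders):

from django.core.cache import cache

def fetch_data():
    # get_or_set recomputes only when the key is missing or expired; cache
    # backends pickle the value, so a DataFrame can be stored directly.
    # Include user/session identifiers in the key if the data depends on them.
    return cache.get_or_set("dashboard_df", compute_dataframe, timeout=60)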

You could also consider running the calculation on first use, when an auxiliary view is requested, or on some sort of timed basis, rather than at startup of one of the server processes. In production, process startup happens multiple times, often simultaneously both within and across machines, and is repeated reasonably often, so it's not always the best place for app-level intensive calculations.
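
For instance, a compute-on-first-use guard. This only deduplicates within a single process, so you'd still want a shared cache (Redis, Django's cache) to avoid recomputation across workers and machines:

import threading

_lock = threading.Lock()
_cached = None

def get_data():
    global _cached
    if _cached is None:
        with _lock:
            if _cached is None:                # double-checked: only one thread computes
                _cached = compute_dataframe()  # placeholder for the expensive step
    return _cached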