Show and Tell - Server Side Caching

The Store component in Dash makes it easy to share state between callbacks. Under the hood, the data are stored as JSON in the browser. This approach is chosen to keep the server stateless (I guess), but it has a few drawbacks:

  • As the data are stored as JSON, you must convert objects from/to JSON at the beginning/end of each callback
  • Since the callbacks are executed server side while the data are stored client side, the data will be sent across the wire every time a callback is invoked
  • The maximum storage size is limited by the browser (more than ~10 MB will probably cause trouble)
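To make the first two bullets concrete, here is a minimal sketch of the round trip that the standard dcc.Store pattern implies (plain Python functions and dicts stand in for the actual Dash callbacks and query results):

```python
import json

# Minimal sketch of the standard dcc.Store pattern: everything handed to
# the Store must survive a JSON round trip over the wire.
def producer_callback():
    records = [{"year": 2007, "pop": 1.2}]  # imagine a large query result
    return json.dumps(records)  # serialize before handing off to dcc.Store

def consumer_callback(stored):
    records = json.loads(stored)  # deserialize at the start of the callback
    return [r["year"] for r in records]
```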

For small amounts of data, none of these issues are significant. For large amounts of data, they can be deal breakers in terms of application performance. The solution to the problem is (yes, you guessed it) server side caching. While it’s already documented, I have always felt that the syntax was more complicated than it needs to be.

The CallbackCache is an attempt to address this challenge. To enable its magic, callbacks must be registered on this object rather than on the Dash app itself. At the end (before run_server), the object itself is registered on the app,

import dash
from dash_extensions.callback import CallbackCache

app = dash.Dash()
cc = CallbackCache()  # create callback cache
...

@cc.callback(...)  # register callback on cc instead of app
...

cc.register(app)  # this call registers the callbacks on the application

if __name__ == '__main__':
    app.run_server()

In addition to the normal callback decorator, it has a special cached_callback decorator, which saves the data in a server side cache. The cache takes care of serialization (typically via pickle), so you don’t need to convert from/to JSON in the beginning/end of each callback. Hence, you could do this

@cc.cached_callback(Output("store", "data"), [Trigger("btn", "n_clicks")])  # Trigger is like Input, but excluded from args
def query_data():
    time.sleep(1)  # sleep to emulate a database call / a long calculation
    return px.data.gapminder()  # no conversion, just return the data frame

@cc.callback(Output("dd", "options"), [Input("store", "data")])
def update_dd(df):
    return [{"label": year, "value": year} for year in df["year"].unique()]  # no conversion, just use the data frame

And since the cache is server side, there is no data transfer (apart from the cache key, which is a short string). The maximum storage size is limited only by the underlying cache, i.e. it can be on the order of GBs depending on the server hardware. Enough talk, here is a small self-contained example,

import dash
import dash_core_components as dcc
import dash_html_components as html
import time
import plotly.express as px

from dash.dependencies import Output, Input
from flask_caching.backends import FileSystemCache
from dash_extensions.callback import CallbackCache, Trigger

# Create app.
app = dash.Dash(prevent_initial_callbacks=True)
app.layout = html.Div([
    html.Button("Query data", id="btn"), dcc.Dropdown(id="dd"), dcc.Graph(id="graph"),
    dcc.Loading(dcc.Store(id="store"), fullscreen=True, type="dot")
])
# Create (server side) cache. Works with any flask caching backend.
cc = CallbackCache(cache=FileSystemCache(cache_dir="cache"))


@cc.cached_callback(Output("store", "data"), [Trigger("btn", "n_clicks")])  # Trigger is like Input, but excluded from args
def query_data():
    time.sleep(1)  # sleep to emulate a database call / a long calculation
    return px.data.gapminder()


@cc.callback(Output("dd", "options"), [Input("store", "data")])
def update_dd(df):
    return [{"label": year, "value": year} for year in df["year"].unique()]


@cc.callback(Output("graph", "figure"), [Input("store", "data"), Input("dd", "value")])
def update_graph(df, value):
    df = df.query("year == {}".format(value))
    return px.sunburst(df, path=['continent', 'country'], values='pop', color='lifeExp', hover_data=['iso_alpha'])


# This call registers the callbacks on the application.
cc.register(app)

if __name__ == '__main__':
    app.run_server()

To run the example, you’ll need the latest version of dash extensions,

pip install dash-extensions==0.0.23

The server cache (passed via the cache argument) can be any flask_caching backend, so there are lots of options to choose from. For most users, I guess the (default) FileSystemCache will do. If you would like to reuse the cached result when the inputs are unchanged, you can pass instant_refresh=False. If you would furthermore like to reuse cached results between sessions, also pass session_check=False.
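Assuming the keyword names described above, a configuration that reuses cached results across inputs and sessions could look like this:

```python
from flask_caching.backends import FileSystemCache
from dash_extensions.callback import CallbackCache

# Reuse cached results when inputs are unchanged, and share them across sessions.
cc = CallbackCache(
    cache=FileSystemCache(cache_dir="cache"),
    instant_refresh=False,
    session_check=False,
)
```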

If you have any questions and/or suggestions to improve the syntax, please let me know :slight_smile:

EDIT: I have now done some simple benchmarks. Assuming that your application is bottlenecked by serialization and/or data transfer of pandas data frames, the cached_callback tends to yield a performance improvement in the range of 10-100 times.

This is really sweet! Curious about the syntax: the callbacks are still triggered & linked to each other by the dcc.Store component, right? So I’m assuming that behind the scenes you’re setting some cache key / UID on this component that is then used to access the actual data from the cache in your new callback decorator?

Very very nice :+1:t3:

Thanks! Yes, that is essentially what I am doing. In a few more words,

  • I evaluate the callback value server side and calculate a key (md5 hash) based on the function name, input arguments and (optionally) session id. The (key, value) pair is inserted into the cache, and the key is returned to the client, where it is inserted into the Store component targeted as the output of the callback
  • When a callback has an input/state that has been cached, the key sent from the client is replaced by the value read from the cache before the callback function is invoked
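As a minimal sketch of that key/value mechanism (a plain dict stands in for the flask_caching backend, and the helper names are mine, not the actual dash-extensions internals):

```python
import hashlib
import json
import pickle

_cache = {}  # stand-in for a flask_caching backend, e.g. FileSystemCache

def _make_key(func_name, args, session_id=None):
    # md5 hash of the function name, input arguments and (optional) session id.
    raw = json.dumps([func_name, args, session_id], sort_keys=True)
    return hashlib.md5(raw.encode()).hexdigest()

def cache_output(func_name, args, value):
    # Server side: store the (possibly large) value, return only the key.
    key = _make_key(func_name, args)
    _cache[key] = pickle.dumps(value)
    return key  # this short string is what ends up in the dcc.Store

def resolve_input(key):
    # Server side: swap the key received from the client for the cached
    # value before the downstream callback function is invoked.
    return pickle.loads(_cache[key])
```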

One drawback of this approach is that you cannot use client side callbacks as they will only be able to access the key.

To get an idea of the performance of the cached_callback versus the default Dash callback, I have carried out a few benchmarks. For the purpose of these benchmarks I am using a FileSystemCache, and I consider the following case,

  • A data frame with a single column of size n is created server side (to emulate e.g. a fetch from a database) and inserted into a Store component with id store. Next, the mean of the column is calculated (to emulate a data processing step) in another callback that takes the store as input.

I measure the time from just after the data frame creation until just before the mean operation, i.e. it includes serialization as well as the transfer of data from the server to the client and back. For each value of n, I measured 5 times and took the average (and the std for error bars). Here are the numbers for my local desktop,

In the first chart we see that the standard callback (blue) works up to around 1 million rows, at which point the operation takes roughly 4 s. At 10 million rows, the browser crashes. The cached callback (yellow), on the other hand, just keeps on going. I stopped at 1 billion rows, at which point the operation took around 20 s. At this point, the pickle on disk was 8 GB (!).
The second chart illustrates the ratio between the runtimes. Rather surprisingly, the cached callback is around 50 times faster for a single-element data frame. Maybe this is due to the pandas serialization to/from JSON being slow? At 1 million rows, the cached callback is more than 200 (!) times faster.

Now, this is cool and all, but no one uses their localhost for deployment. So let’s move to the cloud (Heroku, free tier),

On Heroku, the standard callback (blue) still works up to around 1 million rows, but the cached callback (yellow) crashed at 100 million rows. From the logs I could see that the dyno ran out of memory, i.e. the limit can probably be pushed (much) further by purchasing a more beefy dyno. In the head-to-head comparison, the cached callback is still faster, but the performance gain is reduced to a factor of 10 for small data frames and 100 for large ones.

For reference, here is the benchmark code

import datetime
import dash
import dash_core_components as dcc
import dash_html_components as html
import numpy as np
import pandas as pd

from dash.dependencies import Output, Input, State
from flask_caching.backends import FileSystemCache
from dash_extensions.callback import CallbackCache

# region Benchmark data definition

options = [{"label": x, "value": x} for x in [1, 10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000, 1000000000]]


def make_data(n):
    return pd.DataFrame(data=np.random.rand(n), columns=["rnd"])


# endregion


# Create app.
app = dash.Dash(prevent_initial_callbacks=True)
server = app.server
app.layout = html.Div([
    # Standard implementation.
    html.Button("Run benchmark (no cache)", id="btn"), dcc.Dropdown(id="dd", options=options, value=1),
    dcc.Store(id="time"), dcc.Loading(dcc.Store(id="store"), fullscreen=True, type="dot"), html.Div(id="log"),
    # Cached implementation.
    html.Button("Run benchmark (with cache)", id="btn_wc"), dcc.Dropdown(id="dd_wc", options=options, value=1),
    dcc.Store(id="time_wc"), dcc.Loading(dcc.Store(id="store_wc"), fullscreen=True, type="dot"), html.Div(id="log_wc")
])


# region Standard implementation


@app.callback([Output("store", "data"), Output("time", "data")], [Input("btn", "n_clicks")], [State("dd", "value")])
def query(n_clicks, value):
    df = make_data(int(value))
    tic = datetime.datetime.now().timestamp()
    return df.to_json(), tic


@app.callback(Output("log", "children"), [Input("store", "data")], [State("time", "data")])
def calc(data, time):
    time = datetime.datetime.fromtimestamp(int(time))
    df = pd.read_json(data)
    toc = datetime.datetime.now()
    mean = df["rnd"].mean()
    return "ELAPSED = {}s (and mean is {:.3f})".format((toc - time).total_seconds(), mean)


# endregion

# region Cached implementation

# Create (server side) cache. Works with any flask caching backend.
cc = CallbackCache(cache=FileSystemCache(cache_dir="cache"))


@cc.cached_callback([Output("store_wc", "data"), Output("time_wc", "data")],
                    [Input("btn_wc", "n_clicks")], [State("dd_wc", "value")])
def query_wc(n_clicks, value):
    df = make_data(int(value))
    return df, datetime.datetime.now()


@cc.callback(Output("log_wc", "children"), [Input("store_wc", "data")], [State("time_wc", "data")])
def calc_wc(df, time):
    toc = datetime.datetime.now()
    mean = df["rnd"].mean()
    return "ELAPSED = {}s (and mean is {:.3f})".format((toc - time).total_seconds(), mean)


# This call registers the callbacks on the application.
cc.register(app)

# endregion


if __name__ == '__main__':
    app.run_server()

This is really awesome @Emil! Keep up the good work!

Hi @Emil,
I was trying to repeat the experiment where you plotted millions of rows in a matter of seconds. It takes the code below about 5 seconds to plot ~500,000 rows with px.scatter(). Do you know why that might be taking so long?

I used this data here.

And below, you can find the reproducible code.

import datetime
import dash
import dash_core_components as dcc
import dash_html_components as html
import pandas as pd
import plotly.express as px

from dash.dependencies import Output, Input, State
from flask_caching.backends import FileSystemCache
from dash_extensions.callback import CallbackCache

df_org = pd.read_csv("green_tripdata_2019-01.csv")

app = dash.Dash(prevent_initial_callbacks=True)
server = app.server
app.layout = html.Div([
    html.Button("Run benchmark (with cache)", id="btn_wc"),
    dcc.Dropdown(id="dd_wc", options=[{"label": x, "value": x} for x in df_org["passenger_count"].unique()], value=1),
    dcc.Store(id="time_wc"), dcc.Loading(dcc.Store(id="store_wc"), fullscreen=True, type="dot"), html.Div(id="log_wc"),
    dcc.Graph(id='mygrpah')
])

cc = CallbackCache(cache=FileSystemCache(cache_dir="cache"))


@cc.cached_callback([Output("store_wc", "data"), Output("time_wc", "data")],
                    [Input("btn_wc", "n_clicks")])
def query_wc(n_clicks):
    df = df_org[["passenger_count", "trip_distance", "total_amount"]]
    return df, datetime.datetime.now()


@cc.callback([Output("log_wc", "children"), Output("mygrpah", "figure")],
             [Input("store_wc", "data")], [State("dd_wc",'value'), State("time_wc", "data")])
def calc_wc(df, value, time):
    toc = datetime.datetime.now()
    df_filtered = df[df["passenger_count"] == value]
    print(df_filtered[:3])
    print(df_filtered.info())
    fig = px.scatter(df_filtered, x='trip_distance', y='total_amount')
    return ("ELAPSED = {} seconds".format((toc - time).total_seconds())), fig

cc.register(app)

if __name__ == '__main__':
    app.run_server(debug=True)

Running your example, I get ELAPSED = 0.056303 seconds with caching and ELAPSED = 4.035909 seconds without. The time to do the actual plotting (not measured, but it seems to be around a few seconds on my laptop) comes on top, but I didn’t include that in the benchmark as it is not affected by the caching mechanism.

Hello @Emil!

I looked at your dash-extensions code and it looks pretty neat!

The CallbackCache component seems to give a great performance improvement on large(r) datasets! Nice work :clap:t2: I really like the @cc.callback() syntax, and it is the same syntax I came up with for dash-uploader (@du.callback()). I think it’s good if the community standardizes on some syntax!

Thank you for explaining. But I don’t fully understand what can be affected by the caching. Can the plotting be affected by the caching mechanism?

If I pass a figure into Store with @cc.cached_callback, doesn’t that affect plotting?

I was thinking about the benefits of the Trigger / CallbackGrouper components (from dash-extensions) a bit more, but since they were not directly the topic of this thread, I created another one: @app.callback improvements? (Trigger component, same Output multiple times, callback without Output).

I also thought about the naming of the callbacks for CallbackCache: Would it make sense to have @cc.callback be the cached_callback() by default? Then, have @cc.callback_nocache for the case when a cache is not needed, since the class name indicates that there would be some cache used? Or, have a keyword argument for @cc.callback(), such as cached=True, which would control the caching? It’s really a matter of taste, though!

Hi @Emil

I have used your cached callback. I was fetching a large data frame from an SQL server and plotting graphs from it in my Dash app, but I wanted the graphs to be updated every 30 seconds. So I used your cached callback as you suggested. I also ran a scheduler in another file to keep fetching data separately from SQL and save it as a parquet file on my system. The cached callback would read the data from the file and then display it on all 10 graphs in my Dash app. And it was running perfectly.

I would love it if this callback will be documented on the dash’s official website.

Hi, I got time to make some testing and benchmarking for the CallbackCache. Here are the results:

Testing CallbackCache

Now I tested with the example code of adamschroeder, using his example data, and this is what I see in the dev tools:

Each time I pressed the “Run benchmark (with cache)” button, there would be two HTTP POST requests to http://127.0.0.1:8050/_dash-update-component (one after another, since they are chained). You’ll have one HTTP POST request per callback that is triggered.

  • The first _dash-update-component is fast and response size is very small, about 400Bytes, content is
    {"response": {"store_wc": {"data": "28de8ed6118d3dc73f39311bf5de1910"}, "time_wc": {"data": "83c5acc6308b231e3e56493fcf3774e1"}}, "multi": true}
    These correspond to the keys placed in the dcc.Store components. With such a key, the data is read from the cache, if I understand correctly.

    I think this is where the gain is: since only the key is saved in the browser, you do not have to send the data from the browser (dcc.Store) to the server each time a callback is called.

  • The second _dash-update-component is slow (with a large TTFB¹; about 5-7 seconds). This contains the Graph data as JSON, which is drawn in the browser. The response size is about 1.8 MB.

As a diagram, this would be something like (3x small JSON, 1x large JSON, df to JSON only once)

So, what is taking the time in the second part?

The creation of the figure with px.scatter(df_filtered) takes about 5 seconds. The funny thing is that df_filtered.to_json takes only ~0.5 seconds, so roughly 90% of the time used to create a Plotly figure is spent on something other than just creating a JSON object out of it. (Some optimization possible in px.scatter(), perhaps? Moreover, it could be possible to memoize calc_wc, too!)

Problem with caching?

I tried the mentioned code + this change

cc = CallbackCache(cache=FileSystemCache(cache_dir="cache"), instant_refresh=True)

@cc.cached_callback([Output("store_wc", "data"), Output("time_wc", "data")],
                    [Trigger("btn_wc", "n_clicks")])
def query_wc():
    print('Calculating')
    import time
    time.sleep(5) # Added sleep
    df = df_org[["passenger_count", "trip_distance", "total_amount"]]
    return df, datetime.datetime.now()

and I was hoping to see the time be ~5 seconds on the first callback call and ~0 seconds on subsequent ones. Unfortunately, it did not work like that. I tried both instant_refresh options (True/False). You can see this yourself in the dev console by looking at the Timing of the first _dash-update-component. It could be a bug, a configuration issue, or just how it is meant to work. My best guess is the last: the CallbackCache does not memoize the callback, but just serves as a server-side “Store”. That said, this could easily be memoized without dash-extensions, too (if there is no such functionality?), so I assume there could be additional speed gains, if needed.
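For reference, a memoization layer like the one hinted at above could be sketched roughly like this (the decorator name is hypothetical; in practice something like flask_caching's memoize could serve the same purpose):

```python
import functools
import hashlib
import json

_memo = {}  # stand-in for a persistent cache backend

def memoize_on_args(func):
    # Hypothetical helper: skip re-running an expensive callback body
    # when it fires again with the same arguments.
    @functools.wraps(func)
    def wrapper(*args):
        raw = json.dumps([func.__name__, args], sort_keys=True, default=str)
        key = hashlib.md5(raw.encode()).hexdigest()
        if key not in _memo:
            _memo[key] = func(*args)  # evaluated only on a cache miss
        return _memo[key]
    return wrapper
```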

Compare to without CallbackCache

The callbacks without CallbackCache would be roughly something like this

@app.callback([Output("store_wc", "data"), Output("time_wc", "data")],
              [Input("btn_wc", "n_clicks")])
def query_wc(n_clicks):
    df = df_org[["passenger_count", "trip_distance", "total_amount"]]
    return df.to_json(), datetime.datetime.now()


@app.callback([Output("log_wc", "children"), Output("mygrpah", "figure")],
              [Input("store_wc", "data")], [State("dd_wc", "value"), State("time_wc", "data")])
def calc_wc(df, value, time):
    df = pd.read_json(df)
    toc = datetime.datetime.now()
    df_filtered = df[df["passenger_count"] == value]
    fig = px.scatter(df_filtered, x='trip_distance', y='total_amount')
    return "ELAPSED", fig

As a diagram, this would be something like (1x small JSON, 3x large JSON. df to JSON 2 times, JSON to df 1 time)

Results

  • The first callback takes ~5.3 seconds and the response is 6.7 MB! (this is the data as a JSON string)
  • The second callback takes about ~10.5 seconds, since now it first sends the data as JSON from the browser to the server, and then gets the figure as the response.
  • Summary: The CallbackCache now saves about 2/3 of the time when there is a dcc.Store involved with two chained callbacks + a Graph.
  • By caching / using memoization of the callback functions it could be possible to make this even faster.

I hope this makes the gains more clear to everyone!

:bell: Note: The tests were done on localhost without throttling (the network speed is much faster than in a normal situation). In a real-world application, sending the big JSON packages back and forth would be even slower, and the difference between using CallbackCache vs. not using it would be even more dramatic.

- Niko


¹ The TTFB is defined as

The browser is waiting for the first byte of a response. TTFB stands for Time To First Byte. This timing includes 1 round trip of latency and the time the server took to prepare the response

Would it be better to use flask-caching directly?

Great analysis! I love your diagram in particular :blush:. I’ll take it as the starting point of my answer. Let’s denote the arrows (1,2,3,4). To sum up what happens,

  1. Client sends button click (small).
  2. Client receives the full data, i.e. around 6.7 MB in this example.
  3. Client sends the full data, 6.7 MB.
  4. Client receives the figure, around 1.8 MB.

As you note, (1) is fast. Due to the large payload (6.7 MB), both (2) and (3) are slow. And since the data are not used for anything by the client itself, this data transfer is in fact unnecessary. The figure transfer in (4) is also rather slow (as the figure is large), but unlike (2) and (3) it is necessary. Since the figure is rendered client side, without sending the figure JSON to the client, the client would not know what to draw.

The caching mechanism targets the unnecessary transfers (2, 3), but it cannot do anything about (4). Hence based on your results, I would say that the cache works as intended.

You can use Flask-Caching directly to avoid reevaluating the function multiple times, but it won’t save you the data round trip, which is the key point of the CallbackCache.

It would not make sense to use the cached callback by default. As noted in my previous post, it only makes sense to use the cache for callbacks that return data that is not used by the client.

It might be more intuitive to use the same callback decorator for all callbacks and add a cache keyword argument. However, I think this argument should take a cache object as input rather than a Boolean. This would make it possible to use different caches for different callbacks, e.g. a disk cache for large data blocks and a memory cache for smaller ones.
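As a rough sketch of that idea (hypothetical syntax, not an actual dash-extensions API; plain dicts stand in for the disk and memory backends):

```python
def callback_with_cache(cache):
    # Hypothetical decorator factory: each callback targets its own cache
    # object, so large and small outputs can use different backends.
    def decorator(func):
        def wrapper(*args):
            key = repr((func.__name__, args))
            if key not in cache:
                cache[key] = func(*args)
            return cache[key]
        return wrapper
    return decorator

disk_like = {}    # stand-in for e.g. a FileSystemCache (large data blocks)
memory_like = {}  # stand-in for e.g. an in-memory cache (small values)

@callback_with_cache(disk_like)
def query_large(n):
    return list(range(n))

@callback_with_cache(memory_like)
def query_small(n):
    return n + 1
```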

Regarding this, I posted my thoughts on the naming/default behaviour of a “cache” in the separate thread.

Based on inputs from @fohrloop and @chriddyp, I have come up with a new syntax (available in dash-extensions 0.0.28). The performance should be the same, but the syntax is simpler (at least that is the intention). Here is the benchmark example using the new syntax,

import datetime
import dash_core_components as dcc
import dash_html_components as html
import numpy as np
import pandas as pd

from dash_extensions.enrich import Dash, ServersideOutput, Output, Input, State, Trigger

# Drop down options.
options = [{"label": x, "value": x} for x in [1, 10, 100, 1000, 10000, 100000, 1000000, 10000000, 100000000]]
# Create app.
app = Dash(prevent_initial_callbacks=True)
server = app.server
app.layout = html.Div([
    html.Button("Run benchmark (with cache)", id="btn"), dcc.Dropdown(id="dd", options=options, value=1),
    dcc.Store(id="time"), dcc.Loading(dcc.Store(id="store"), fullscreen=True, type="dot"), html.Div(id="log")
])


@app.callback([ServersideOutput("store", "data"), ServersideOutput("time", "data")],
              Trigger("btn", "n_clicks"), State("dd", "value"))
def query(value):
    df = pd.DataFrame(data=np.random.rand(int(value)), columns=["rnd"])
    return df, datetime.datetime.now()


@app.callback(Output("log", "children"), Input("store", "data"), State("time", "data"))
def calc(df, time):
    toc = datetime.datetime.now()
    mean = df["rnd"].mean()
    return "ELAPSED = {}s (and mean is {:.3f})".format((toc - time).total_seconds(), mean)


if __name__ == '__main__':
    app.run_server()

So what has changed? Instead of having to register the callbacks on the Dash app object, you now just have to use the custom objects from dash_extensions.enrich. The cached_callback decorator has been abandoned. You now just use the normal callback decorator and indicate which outputs should stay server side by using ServersideOutput instead of Output.

To change the cache path, you should create a new FileSystemStore backend, i.e. something like this,

from dash_extensions.enrich import Dash, FileSystemStore

output_defaults=dict(backend=FileSystemStore(cache_dir="some_path"), session_check=True)
app = Dash(output_defaults=output_defaults)

As an initial debugging step, you could check if files are in fact being written to "some_path". If not, it would indicate that you are still having permission issues.
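A quick way to do that check, using the path from the snippet above:

```python
import os

def cache_files(cache_dir="some_path"):
    # List the files the FileSystemStore has written; an empty result after
    # a callback has fired suggests a permission (or configuration) problem.
    if not os.path.isdir(cache_dir):
        return []
    return os.listdir(cache_dir)
```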

Hi @Emil,

First, great job!!

I notice that in this version, you import Dash from dash_extensions.enrich, and you use app = Dash(...) instead of app = dash.Dash(...).

When I do the same, my app stops displaying / my callbacks don’t fire correctly anymore. Could you enlighten me on the reason for this change and whether there are any alternatives? Thanks!