Show and Tell - Server Side Caching

Thanks! Did you remember to import the Input, Output and State objects from enrich also? And what versions are you using?

@Emil Yes I have!

my versions:

dcc: 1.10.2
html: 1.0.3
dash_extensions: 0.0.31

Not even the app layout displays when I use app=Dash(...) instead of dash.Dash(...).

Also, is there a reason why you use prevent_initial_callbacks?

PS: When the callbacks don’t fire correctly (prevent_initial_callbacks=False), I actually get JSON serialization errors and the layout still doesn’t show

I tend to use prevent_initial_callbacks because initial callbacks with None values often need special handling (which seems unnecessary when you can just use the `prevent_initial_callbacks` flag). Hmm, I don’t see why the callbacks shouldn’t work. Do you get any error, or are the callbacks just not firing?
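
To illustrate the kind of special handling meant here, a minimal sketch follows. It uses a plain Python function in place of a real callback body; the name `click_label` and the returned strings are made up for the example.

```python
def click_label(n_clicks):
    """Plain function standing in for a callback body (hypothetical name)."""
    # On the initial call, a button that was never clicked reports None.
    if n_clicks is None:
        return "Not clicked yet"
    return f"Clicked {n_clicks} times"


print(click_label(None))  # initial call
print(click_label(3))     # after three clicks
```

With prevent_initial_callbacks=True the initial call does not happen, so the None branch would not be needed for this case.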


After inspecting in the browser, the callbacks do fire but nothing is displayed. It might only have to do with the interaction with the layout. I have tried without using a css template but the problem persists.

Is there any way I can still use your module while importing dash.Dash()? It seems like I’m able to use ServerSideOutput without it, and callbacks still fire and the layout does show.

If you use the ServerSideOutput object with a standard Dash object it just does nothing. Just to be sure, you are using the ServerSideOutput for Store objects only, right?

EDIT: If you could create a small, self-contained example, I could take a look at what goes wrong.

Yes, only for Store objects. So… Output isn’t server side in this case? My app works perfectly well with ServerSideOutput, but I haven’t benchmarked it in production. I guess I’ll have to troubleshoot the app from the start once I get a little time.

Also, one thing that might have an effect is that I am using a multipage app with a flat project layout. I’ll keep you updated on my progress, and you keep us all updated on yours!!

EDIT: I’ll definitely do that ASAP.
Thanks!

No, unless you use the Dash object from enrich, it will remain client side. It’s the Dash object from enrich that performs the “magic”. If you only need the ServerSideOutput feature, you could try disabling the other features. The syntax would be something like,

fs = FileSystemStore(cache_dir="path_that_you_can_write_to")
sot = ServersideOutputTransform(backend=fs)
app = DashTransformer(transforms=[sot])

where the app variable corresponds to the normal Dash object. Another thing to note is that you must have write permission to the directory where the cache is written. By default, it’s a folder created next to the app, but you can change it as per the code above if needed.
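
A quick way to check the write permission is sketched below, using only the standard library. The helper name `can_write_cache` is hypothetical and not part of dash_extensions; the directory used here is a throwaway one.

```python
import os
import tempfile


def can_write_cache(cache_dir):
    """Return True if a file can be created and removed inside cache_dir."""
    try:
        os.makedirs(cache_dir, exist_ok=True)
        # Creating (and automatically deleting) a temp file proves write access.
        with tempfile.NamedTemporaryFile(dir=cache_dir):
            pass
        return True
    except OSError:
        return False


# Example with a throwaway directory; replace it with the real cache path.
with tempfile.TemporaryDirectory() as tmp:
    print(can_write_cache(os.path.join(tmp, "cache")))
```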

I do have write permissions, but sadly when using this method the callbacks don’t even fire anymore and I get this error:

⛑️ A callback is missing Outputs
Please provide an output for this callback:
{
  "clientside_function": null,
  "inputs": [
    {
      "id": "url",
      "property": "pathname"
    }
  ],
  "output": "....",
  "prevent_initial_call": false,
  "state": [],
  "outputs": [
    {
      "id": "",
      "property": "",
      "out": true
    }
  ]
}

I’ll write a self-contained example next week. Thanks, I really appreciate it

Wow. This is great. Thank you, @Emil

I have some of the examples working well, but when trying to adapt a larger script I’m given the following error that I can’t quite understand:

Traceback (most recent call last):
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/flask/app.py", line 2464, in __call__
    return self.wsgi_app(environ, start_response)
  File "/Users/derricklewis/anaconda3/envs/dash/lib/python3.6/site-packages/flask/app.py", line 2450, in wsgi_app
    response = self.handle_exception(e)
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/flask/app.py", line 1867, in handle_exception
    reraise(exc_type, exc_value, tb)
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/flask/app.py", line 1945, in full_dispatch_request
    self.try_trigger_before_first_request_functions()
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/flask/app.py", line 1993, in try_trigger_before_first_request_functions
    func()
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/dash_extensions/enrich.py", line 82, in _setup_server
    super()._setup_server()
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/dash/dash.py", line 1089, in _setup_server
    _validate.validate_layout(self.layout, self._layout_value())
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/dash_extensions/enrich.py", line 69, in _layout_value
    layout = transform.layout(layout, self._layout_is_function)
  File "/Users/derrick/anaconda3/envs/dash/lib/python3.6/site-packages/dash_extensions/enrich.py", line 451, in layout
    children = layout.children + self.hidden_divs
TypeError: unsupported operand type(s) for +: 'Container' and 'list'

I’ll try to truncate the code I’m using:

from datetime import datetime, timedelta
import numpy as np
import pandas as pd
from plotly.subplots import make_subplots
import plotly.graph_objects as go
from flask import Flask
import dash_table
import dash_html_components as html
import dash_core_components as dcc
import dash_bootstrap_components as dbc
import dash_auth
from dash_extensions.enrich import Dash, ServersideOutput, Output, Input, State, Trigger, FileSystemStore

output_defaults = dict(backend=FileSystemStore(
    cache_dir="./some_path"), session_check=True)

df = pd.read_csv('posts.csv')

server = Flask(__name__)

app = Dash(name=__name__,
           prevent_initial_callbacks=True,
           server=server,
           output_defaults=output_defaults,
           external_stylesheets=[dbc.themes.GRID])


app.layout = html.Div(
    dbc.Container([
        dcc.Loading(id='loading_icon', children=[
            dbc.Row([
                dbc.Col([
                    dcc.Graph(
                        id='main_chart',

                    )
                ])
            ]),
            dcc.Store(id='filter_df'),
            dcc.Store(id='agg_df'),
            dcc.Store(id='user_df')
        ],
            type='default'
        ),
    ])
)

@app.callback(
              [ServersideOutput("filter_df", "data"),
               ServersideOutput("agg_df", "data"),
               ServersideOutput("user_df", "data")
               ],
              [Input('submit_button', 'n_clicks')],
              [State('region_dropdown', 'value'),
               State('category_dropdown', 'value'),
               State('type_dropdown', 'value'),
               State('sponsored', 'value'),
               State('follower_slider', 'value'),
               State('username_input_field', 'value')]
              )
def get_benchmark_data(clicks, region, category, type, sponsored, followers, user_target):
    #do some expensive calculations

    return filter_df, agg_df, user_df


@app.callback(
    Output('main_chart', 'figure'),
    [Input('type_dropdown', 'value'),
     Input('metric_dropdown', 'value'),
     Input('filter_df', 'data'),
     Input('agg_df', 'data'),
     Input('user_df', 'data')],
    [State('username_input_field', 'value')
     ]
)
def update_main_chart(type, metric, filter_df, agg_df, user_df, username):
    #build a figure

    return fig

if __name__ == '__main__':
    app.run_server()

Is there another way to use the dbc.Container component?

Using:
dash-extensions = 0.0.31
dash-html-components = 1.0.3
dash-core-components = 1.10.1

Thanks in advance if anyone has any thoughts to share.

I think I have a solution.

I can either wrap the children of the html.Div in a list:

app.layout = html.Div([
    dbc.Container([
        dcc.Loading(id='loading_icon', children=[

Or remove the div altogether.

app.layout = dbc.Container([
        dcc.Loading(id='loading_icon', children=[
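
For anyone hitting the same TypeError, here is a minimal sketch of why the list matters. The `Component` class is a hypothetical stand-in, not the real Dash class, and `hidden` only plays the role of the extra elements that the traceback shows being appended to `layout.children`.

```python
class Component:
    """Hypothetical stand-in for a Dash component; not the real class."""

    def __init__(self, children=None):
        self.children = children


hidden = [Component()]  # plays the role of the elements appended to the layout

# Children passed as a single component: concatenation with a list fails.
single = Component(children=Component())
try:
    single.children + hidden
except TypeError as err:
    print("fails:", err)

# Children passed as a list: concatenation works.
wrapped = Component(children=[Component()])
print(len(wrapped.children + hidden))
```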

One more question. Please let me know if this is the wrong topic.

I’m noticing the cache file getting big quickly. Is there a way to limit the cache collected with ServersideOutput?

I can see in the plotly docs on flask-caching something like this:

cache = Cache(app.server, config={
    'CACHE_TYPE': 'filesystem',
    'CACHE_DIR': 'cache-directory',
    # should be equal to maximum number of users on the app at a single time
    # higher numbers will store more data in the filesystem / redis cache
    'CACHE_THRESHOLD': 200})

Hey @Emil,

Just to let you know that you were right, and I had imported Input and Output somewhere else and had forgotten about it.
I just gave your package another try and it’s amazing. Many thanks for your efforts

Great! I am happy that it worked out :blush:

Yes, it is possible to adjust the cache settings. As a rule of thumb, the cache size should be > number of concurrent users times number of server side outputs :upside_down_face:

Hey @Emil, incredible work on this package!

The only issue I’m having right now is being able to limit the size of the cache. Is it different than the cache_threshold argument used for flask caching?

Thanks! No, it’s the same (in fact, the arguments are passed to the flask caching FileSystemCache class under the hood). You would simply create the store with the desired configuration,

from dash_extensions.enrich import FileSystemStore
fss = FileSystemStore(threshold=1)

and bind it either as the default backend (i.e. it will be used for all serverside outputs),

from dash_extensions.enrich import Dash
app = Dash(output_defaults=dict(backend=fss, session_check=True))

or to the desired outputs,

@app.callback(ServersideOutput("store", "data", backend=fss), Trigger("btn", "n_clicks"))

However, keep in mind that I haven’t implemented any graceful handling of missing (overwritten) cache values. Hence, if you make the cache so small that values of active clients are overwritten, their applications will crash.
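
As a rough illustration of the problem, here is a toy model of a bounded store. `ToyStore` is hypothetical and assumes oldest-first eviction; the real store’s eviction policy and failure mode may differ.

```python
from collections import OrderedDict


class ToyStore:
    """Toy bounded store (hypothetical); evicts the oldest entry when full."""

    def __init__(self, threshold):
        self.threshold = threshold
        self.items = OrderedDict()

    def set(self, key, value):
        self.items[key] = value
        while len(self.items) > self.threshold:
            self.items.popitem(last=False)  # drop the oldest entry

    def get(self, key):
        return self.items[key]  # raises KeyError if the entry was evicted


store = ToyStore(threshold=2)
store.set("client_a", "data A")
store.set("client_b", "data B")
store.set("client_c", "data C")  # pushes out client_a
print("client_a" in store.items)
```

In this toy model, a later read for client_a fails even though that client may still be using the app.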


Really appreciate the detailed response Emil. I just tried implementing this but unfortunately, it does not work how I expected. If the threshold is hit, I thought it would replace the oldest cache values with the new values of the active client, but it just immediately throws a “_pickle.UnpicklingError: invalid load key” error.

I should probably take a step back and explain what I’m trying to do here. My app loads the latest data on page refresh, processes it, and then filters that data in numerous callbacks. The data only totals ~5 MB right now but the files grow in size every week so I want to avoid using dcc.Store.

The ServerSideOutput works perfectly when hosting locally, but when I host it on Heroku, my concern is that I will either run out of temporary filesystem space or the cache will time out (is there a default timeout set?). I know Heroku uses an ephemeral filesystem, but I don’t think that will be an issue in my case since I just need the data to persist during an active session. If the dyno restarts (at least once every 24 hours), my understanding is that this will trigger a page refresh of any active session, and the data will be queried again.

Any suggestions to changing approach here or are my concerns unfounded?

Yes, that error is what I mean by “no graceful handling” :wink:

The timeout is infinite, so as long as you have enough disk space, you should be fine. Do you know how many clients are accessing the app? If you expect e.g. 5 clients and 2 server side callbacks, a threshold value of 10 should theoretically be enough. If you set the threshold an order of magnitude higher, i.e. 100, you should be safe.
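
The arithmetic can be written as a small helper. The function name `suggested_threshold` and the default safety factor of 10 are assumptions made for this sketch, following the rule of thumb above.

```python
def suggested_threshold(clients, serverside_outputs, safety_factor=10):
    """Minimum entries (clients x outputs) scaled by a safety factor."""
    return clients * serverside_outputs * safety_factor


print(suggested_threshold(5, 2, safety_factor=1))  # 10, the bare minimum
print(suggested_threshold(5, 2))                   # 100, with the default margin
```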

On Heroku the game is a little different. I am not sure exactly what happens when a dyno restarts, but your suggestion sounds reasonable; and in that case, I guess the app would recover after the page refresh. However, if you have > 1 dyno, storing anything on disk will break the app, as you might hit different dynos on different requests. Hence in that case, you would need to use a different storage medium, e.g. a Redis server.
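
A toy simulation of the multi-dyno problem is sketched below. Everything in it is hypothetical: the `Dyno` class models a process with a private disk as a dict, and round-robin routing is an assumption, not a statement about how Heroku routes requests.

```python
import itertools


class Dyno:
    """Toy server process with its own private disk (a dict here)."""

    def __init__(self):
        self.disk = {}


def run(dynos, shared=None):
    """Write on one request, read on the next; return whether the read hit."""
    route = itertools.cycle(dynos)  # round-robin routing, an assumption
    writer = next(route)
    write_store = shared if shared is not None else writer.disk
    write_store["key"] = "value"
    reader = next(route)
    read_store = shared if shared is not None else reader.disk
    return "key" in read_store


print(run([Dyno()]))                     # one dyno, local disk
print(run([Dyno(), Dyno()]))             # two dynos, local disks
print(run([Dyno(), Dyno()], shared={}))  # two dynos, shared store
```

The shared dict plays the role that an external store such as Redis would play: every dyno reads and writes the same data.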

Ah, I misunderstood the no graceful handling as, for example, if the fifth active client to load the app hits the threshold, then they would overwrite the cache values of the first active client, even if the first active client is still using the app. I thought I just needed to set the threshold high enough to handle (number of simultaneous active clients) x (ServerSideOutputs).

I’m expecting to have up to 50 clients using the app (generally only ~10) with six ServerSideOutputs, so a threshold of roughly 300 would be ideal. However, now that I understand how the threshold works, there doesn’t seem to be any reason to set it. Both going over the threshold and using up all the available disk space will throw an error to the end user.

If my understanding is correct, I think my best solution for now is using one dyno and hoping I do not hit the disk space limit before it resets. I’ve never used Redis, but I will definitely look into implementing it over the weekend. Really appreciate you helping me out and all the time/effort you put into this package Emil!


Has anybody tried to evaluate performance of using server side caching with Redis or Memcached instead of FileSystemCache?
