
Preventing Wasteful Parallel Callbacks When Running Multiple Processes with Caching

In my app, multiple callbacks depend on a shared result. Calculating this shared result once is slow enough, but having multiple processes (or threads) each repeat the calculation is even slower.

I think I am looking for a cache-like solution where, on a cache miss, the requesting callbacks simply wait until the first callback has finished and then grab the same result. I have tried Flask-Caching, but in my tests it seems that on a cache miss all workers/threads still execute the memoized function.

At the end of this post I have some code to demonstrate the problem. Notice that “Calculating” gets printed at least twice to the console (and perhaps more times if you are using more than one worker).

I have a solution in place using Python’s thread synchronisation tools (locks etc.); however, this only works for a single worker/process. What can I do to fix this when using multiple workers/processes? (A simplified sketch of the lock-based workaround follows the demo code below.)

import dash
import dash_html_components as html
from flask_caching import Cache
import time

app = dash.Dash(__name__)

server = app.server

cache = Cache(app.server, config={
    'CACHE_TYPE': 'filesystem',
    'CACHE_DIR': 'cache-directory'
})

app.layout = html.Div(children=[
    html.Button('Submit', id='button'),
    html.Div(id='output-container-button1', children=[]),
    html.Div(id='output-container-button2', children=[]),
])

# This should only ever be called once!
@cache.memoize()
def slow_function(argument_1):
    print("Calculating")
    time.sleep(3)
    return 1

@app.callback(
    dash.dependencies.Output('output-container-button1', 'children'),
    [dash.dependencies.Input('button', 'n_clicks')])
def update_output1(n_clicks):
    value = slow_function(3)
    return 'Output 1: slow_function returned {}'.format(value)

@app.callback(
    dash.dependencies.Output('output-container-button2', 'children'),
    [dash.dependencies.Input('button', 'n_clicks')])
def update_output2(n_clicks):
    value = slow_function(3)
    return 'Output 2: slow_function returned {}'.format(value)

if __name__ == '__main__':
    app.run_server(debug=True)
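
A simplified sketch of the kind of thread-lock workaround mentioned above (the names and the module-level result slot are illustrative, not the exact code):

import threading
import time

slow_function_lock = threading.Lock()
slow_function_result = None  # illustrative in-process cache slot

def slow_function_locked(argument_1):
    global slow_function_result
    with slow_function_lock:
        # Only the first caller computes; callers that were blocked on the
        # lock find the stored result and return immediately.
        if slow_function_result is None:
            print("Calculating")
            time.sleep(3)
            slow_function_result = 1
    return slow_function_result

Because both the lock and the stored result live in a single process’s memory, this breaks down as soon as multiple workers are forked.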

I was able to solve this using a combination of:

  • multiprocessing.Lock() (instead of threading.Lock())
  • gunicorn preloading

If you use the preload option with gunicorn, then the module is imported once before the workers are forked, so you can share objects between processes using the multiprocessing family of proxy objects.
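
A minimal sketch of the idea (the double-checked re-read and the cache key scheme here are illustrative, not necessarily the exact implementation):

import multiprocessing
import time

from flask import Flask
from flask_caching import Cache

server = Flask(__name__)
cache = Cache(server, config={
    'CACHE_TYPE': 'filesystem',
    'CACHE_DIR': 'cache-directory'
})

# Created at import time. With gunicorn preloading, the module is imported
# once in the master process before the workers are forked, so every worker
# inherits this same Lock object.
slow_function_lock = multiprocessing.Lock()

def slow_function(argument_1):
    key = 'slow_function/{}'.format(argument_1)  # illustrative key scheme

    result = cache.get(key)  # fast path: another process already stored it
    if result is not None:
        return result

    with slow_function_lock:
        # Re-check inside the lock: a caller that was blocked here while the
        # first one computed the value must not compute it again.
        result = cache.get(key)
        if result is None:
            print("Calculating")
            time.sleep(3)
            result = 1
            cache.set(key, result)
    return result

Run it with something like gunicorn --preload -w 4 app:server and “Calculating” should be printed only once across all workers.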


I’ve spun my code off into a module, which is available via pip: https://github.com/sjtrny/jitcache

I hope that others get some use out of this. I have an example using Dash here https://jitcache.readthedocs.io/en/latest/dash.html, which I have copied below:

REDACTED: CODE OUTDATED. Refer to the following post for updated code https://community.plotly.com/t/preventing-wasteful-parallel-callbacks/18956/5?u=sjtrny

Nice! Thanks for sharing!

Since the other day I have changed the design of jitcache to be more in line with lru_cache and Flask-Caching by using a decorator instead.

from jitcache import Cache

cache = Cache()

@cache.memoize
def slow_fn(input_1, input_2, input_3=10):
    return input_1 * input_2 * input_3

print(slow_fn(10, 2))

For plot.ly you can either decorate entire callbacks (just like in Dash’s Performance Docs) or decorate a subroutine. Below I demonstrate how to decorate callbacks (you can find more documentation here); a sketch of the subroutine variant follows the example.

import dash
import dash_html_components as html
from jitcache import Cache
import dash_core_components as dcc

cache = Cache()

app = dash.Dash(__name__)

server = app.server
app.layout = html.Div(
    children=[
        html.Div(id="output-container-dropdown1", children=[]),
        html.Div(id="output-container-dropdown2", children=[]),
        dcc.Dropdown(
            options=[
                {"label": "New York City", "value": "NYC"},
                {"label": "Montréal", "value": "MTL"},
                {"label": "San Francisco", "value": "SF"},
            ],
            value="MTL",
            id="dropdown",
        ),
    ]
)

@app.callback(
    dash.dependencies.Output("output-container-dropdown1", "children"),
    [dash.dependencies.Input("dropdown", "value")],
)
@cache.memoize
def update_output1(input_dropdown):
    print("run1")

    return input_dropdown

@app.callback(
    dash.dependencies.Output("output-container-dropdown2", "children"),
    [dash.dependencies.Input("dropdown", "value")],
)
@cache.memoize
def update_output2(input_dropdown):
    print("run2")

    return input_dropdown

if __name__ == "__main__":
    app.run_server(debug=True)
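
The subroutine variant looks roughly like this (a minimal sketch reusing the app, dropdown and cache from the example above; it would replace the cached update_output1, and expensive_subroutine is an illustrative name):

# Only the shared, expensive part is memoized; the callback itself stays uncached.
@cache.memoize
def expensive_subroutine(value):
    print("run_subroutine")
    return value

@app.callback(
    dash.dependencies.Output("output-container-dropdown1", "children"),
    [dash.dependencies.Input("dropdown", "value")],
)
def update_output1(input_dropdown):
    return expensive_subroutine(input_dropdown)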

This looks really great. Once we investigate this a little bit more, we’ll consider adding it to the docs. Thanks for sharing this and keep the thread updated with your progress!

I like jitcache; I’ve tried it and it works well. Cool build.

If you want an option that does this explicitly and feels like using a dictionary, I made https://github.com/russellromney/brain-plasma for this very purpose. Also available through pip.

from brain_plasma import Brain

brain = Brain()

@app.callback(...)
def create_slow_df_only_once(...):
    df = ...  # placeholder for the large, slow data
    brain['slow_df'] = df  # saves to the Plasma in-memory object store
    ...

@app.callback(...)
def access_large_df(...):
    df = brain['slow_df']  # reads the stored object back from Plasma
    ...
    

@sjtrny - I’m curious about how your @cache.memoize function works with .lock and concurrent requests.
Let’s imagine that a callback named update takes 5 seconds to run and two requests are made, the second arriving 2 seconds after the first.

What happens? Does the memoize decorator put some kind of lock on the update function, so that the second request “waits” for the first to finish and then reads the value from memory? In that case the second request would get the value after 3 seconds instead of recomputing the same result over 5 seconds?

Yes, that’s correct - the second request will only take 3 seconds.
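
A toy way to check the timing with two threads (a sketch, assuming the blocking behaviour described above; timings are approximate):

import threading
import time

from jitcache import Cache

cache = Cache()

@cache.memoize
def update(value):
    time.sleep(5)  # stands in for the slow computation
    return value

def timed_request(label, value):
    start = time.time()
    update(value)
    print("{} finished after {:.1f}s".format(label, time.time() - start))

first = threading.Thread(target=timed_request, args=("first", 1))
second = threading.Thread(target=timed_request, args=("second", 1))
first.start()
time.sleep(2)  # the second request arrives 2 seconds later
second.start()
first.join()
second.join()
# Expected: "first finished after 5.0s" then "second finished after 3.0s",
# and the 5-second computation runs only once.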


Very nice. @russellthehippo - Does your plasma integration work the same way or does the end user need to program in the “waits”?

@chriddyp sadly brain-plasma has no locking control for the same object, which I suppose means brain-plasma doesn’t solve the key problem of preventing the same work from being done multiple times in separate callbacks.

Some behavior notes for brain-plasma:

  • If the same client tries to access an object that is still being created, the request will fail until the create transaction is done (because the object doesn’t officially exist yet).
  • A request for any other object works fine, with either the same client or a different client.
  • A request by a separate client will simply return an object-not-found error until the object is created.
  • If the object already exists, it is immutable. This means that while an object is being changed, the previous value is returned until the old value is deleted (ending the copy-replace-delete update transaction).

I like this implementation: https://github.com/sjtrny/jitcache/blob/master/jitcache.py#L32
I may implement something similar for same-client get/put calls in brain-plasma. There’s no way to protect against it between clients, though.
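
For the same-client case, a double-checked get/put wrapper might look like this (a rough sketch using only the dictionary-style Brain access shown above; the broad except is a placeholder since the exact object-not-found exception type isn’t pinned down here, and expensive_computation is hypothetical):

import threading

from brain_plasma import Brain

brain = Brain()
brain_lock = threading.Lock()

def get_or_compute(name, compute_fn):
    # Double-checked get/put for a single client: only one caller computes,
    # the rest block on the lock and then read the stored object.
    try:
        return brain[name]  # fast path: the object already exists
    except Exception:  # placeholder for brain-plasma's object-not-found error
        pass
    with brain_lock:
        try:
            return brain[name]  # another caller may have stored it meanwhile
        except Exception:
            value = compute_fn()
            brain[name] = value
            return value

# Usage (expensive_computation is a hypothetical slow function):
df = get_or_compute('slow_df', lambda: expensive_computation())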
