Avoid running callback if inputs are the same

I was’t sure what to title this… in my app, I have some API calls which I’d like to minimize for financial implications. The user will have some text boxes to type into, I take those contents and some other options, build an API call and make the request. Currently, I’m using a button as an Input and the dcc.Input boxes are passed via State to avoid the callback trying to fire with every changed character.

I’d like to figure out how to only run the API call for changed elements when the button is pushed. Here’s a rough example, representing doing something based on a click with the box contents:

import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output, Event, State
import numpy as np


app = dash.Dash(__name__)
server = app.server
app.css.append_css({'external_url': 'https://codepen.io/chriddyp/pen/bWLwgP.css'})


app.layout = html.Div([
    html.H2('non-redundant callbacks'),
    html.Label('enter up to 2 numbers'),
    dcc.Input(id='box_1', type='number'),
    dcc.Input(id='box_2', type='number'),
    html.Button('run!', id='run', n_clicks=0),
    html.Div(id='out')])


@app.callback(
    Output('out', 'children'),
    [Input('run', 'n_clicks')],
    [State('box_1', 'value'),
     State('box_2', 'value')])
def contents(clicks, box_1, box_2):
    boxes = [box_1, box_2]

    message = list()
    for i in range(len(boxes)):
        if boxes[i] is not None:
            message.append('{}: {}'.format(i, boxes[i]**2))

    return(str(message))

    
if __name__ == '__main__':
    app.run_server(debug=True)

So, let’s say the user puts in 2 and 3, then clicks the button. Squaring represents an “expensive” computation (or API call). Let’s say the user changes 3 to 4. I’d like to keep the value of 2 and it’s associated return the same. After all, we did the work once already, why do it again? I can’t find it at the moment, but I know I’ve seen a post somewhere about global variables being frowned upon. What might be a different approach to this?

Ideally I’d find a solution that ignored order ([2, 3] has the same generated output as [3, 2]), but for now I’d just like it such that for a single given input, if it’s contents don’t change between button clicks, I don’t re-run the API call with it.

Thanks for any ideas!

i handled this kind situation using a sort of a simple cache (dictionary)
not sure it’s the best option but it works

just need to make sure to clear the cache when you can to avoid buildup

This is covered in the performance section of the Dash guide. You can cache the results of callbacks (aka memoization) so that callbacks will check the cache to see if it’s been previously computed before running the function. Note that if you have multiple worker processes, each worker process will have its own cache, unless you use a shared store service such as redis.

1 Like

is this session aware?

if i want to keep the results separately for each user session, can this work?

It is not and Dash does not have the concept of user sessions. Instead, Flask-Caching saves the data to the filesystem where it is accessible for all users and all processes. Since Dash callbacks are stateless, any dash callback function on any process serving any user can safely access the cache.

Note that if you are using multiple processes or workers, this cache won’t be shared across processes, causing duplication and incompleteness.

1 Like

This is covered in the performance section of the Dash guide.

Well, of course it would be :slight_smile: I should have tried more search terms!

That’s super helpful though I’m curious about typical strategies on a callback where perhaps one aspect of it is “expensive,” but perhaps others aren’t (and need to be updated). For an example, let’s say you have a weather visualization app with a box to enter the location, and checkboxes on what to show: rain, temperature, and humidity.

If the input box changes, I need to re-run the API call, but once I run it I already have all of the data and I only want to filter it based on the updates to the checkboxes. I’m not seeing that type of flexibility with this. What would you advise for that type of situation?

I feel like I’m back to hidden divs with the “real data” or the need for global variables.


Update: hmmm. I just noticed the very first example was not passing this directly to the callback. Would I do something like this:

@functools32.lru_cache(maxsize=32)
def slow_function(expensive_var):
    # API call
    return response

@app.callback(
    Output('foo', 'children'),
    [Input(...)])
def callback(expensive_var, var2, var3):
    resp = expensive_func(expensive_var)
    ...
    # stuff involving resp, var2, and var3

    return stuff

If I do that, expensive_func will cache the result of my API call, but I’d still be updating based on changes to other variables? If that works, I can just remove my API call from the callback itself and put it in a separate cached function.

Alright, I’ve created a further example now that I’ve read more about caching. This is is more representative to my real-world structure, but does not appear to be working:`

import dash
from dash.dependencies import Input, Output, Event, State
import dash_core_components as dcc
import dash_html_components as html
import flask
from flask_caching import Cache
import time

app = dash.Dash(__name__)
server = app.server
cache = Cache(app.server, config={
    'CACHE_TYPE': 'filesystem',
    'CACHE_DIR': '/tmp'
})

app.css.append_css({'external_url': 'https://codepen.io/chriddyp/pen/bWLwgP.css'})

class Thing:

    def __init__(self, foo, bar):
        self.foo = foo
        self.bar = bar
        self.prod = self.make_prod()

    @cache.memoize(timeout=600)
    def make_prod(self):

        val = self.foo * self.bar
        return 'product: {}, time: {}'.format(val, int(time.time()))


app.layout = html.Div([
    dcc.Input(id='foo', type='number', value=0),
    dcc.Input(id='bar', type='number', value=0),
    html.Button('run!', id='run', n_clicks=0),
    html.Div(id='product')])


@app.callback(
    Output('product', 'children'),
    [Input('run', 'n_clicks')],
    [State('foo', 'value'),
     State('bar', 'value')])
def callback(clicks, foo, bar):

    thing = Thing(foo, bar)
    return thing.prod


if __name__ == '__main__':
    app.run_server(debug=True)

If I just sit there are click the run button, the time still updates. Maybe this is due to it instantiating thing each time? But the arguments to Thing would be the same each time.

If I pull the function out, it works:

@cache.memoize(timeout=600)
def make_prod(foo, bar):
    val = foo*bar
    return 'product: {}, time: {}'.format(val, int(time.time()))

Then I do self.prod = make_prod(self.foo, self.bar). Any tips on this? I’m still getting the hang of classes, but it seemed a good fit for this application; basically a set of data defined by some unique inputs which create other data based on functions and then I can pass these around to do what I need to do.

Looks like caching might not be a friend of classes that get are getting instantiated vs. just called?

1 Like

From the Flask Cache docs for the memoize method:

Use this to cache the result of a function, taking its arguments into account in the cache key.

Which means that the arguments of the function are used to look up the cache to see if there’s a previously computed value for that function with the same input arguments. In your case your memoizing a method with the single input of self. Flask Cache apparently uses the identity function to hash self arguments of methods, which will of course be different for each Thing instance. You’ll need to rework things such that you have function (or method on an instance that isn’t being continually re-created) such that the arguments correspond to values you want to use as the hash key.

3 Likes

Thanks @nedned. I saw this from that same page, which I’d been following, and thought it would work:

class Person(db.Model):
    @cache.memoize(50)
    def has_membership(self, role_id):
        return Group.query.filter_by(user=self, role_id=role_id).count() >= 1

Now I think the issue is that having and instance of Person exist and calling person.has_membership(role_id) a bunch does work, but constantly doing person = Person() and then calling the same doesn’t.

Thanks for pointing me in the right direction. I’ll give my structure some more mulling over, but at least I understand where I need to go.

It all depends, here, on when you instantiate the class. If the Parent class is only ever instantiated once, then it’s gonna be cached as you expect. But if you create a new instance on each callback (as you do with Thing), then you won’t be getting cache hits.

1 Like