Server side caching dependent on user input

Hello!

I’ve been wondering whether, and how, the following situation can be solved. In a multi-user app where the “global” data query/processing is expensive, I’ve found server-side caching (example 4 in the documentation) to be very useful.

What I’m facing now is the same situation, but where the expensive data prep depends on user input. Is there any way to approach this? How can I make the cached user data update based on changes to the input?

Below is some code I’m playing with, based on the example in the documentation. At the moment, the data loads on page load/refresh and comes back empty, since the input is empty. After that, my callback fails to reload the data when the input changes.

Any ideas/hints would be appreciated :slight_smile:

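import uuid

import dash
import dash_html_components as html
import pandas as pd
from dash.dependencies import Input, Output
from flask_caching import Cache

app = dash.Dash(__name__)

cache = Cache(app.server, config={
    # 'CACHE_TYPE': 'redis',  # alternative backend; a dict keeps only one 'CACHE_TYPE'
    'CACHE_TYPE': 'filesystem',
    'CACHE_DIR': 'cache-directory',
    'CACHE_THRESHOLD': 200
})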

def get_dataframe(session_id, user_input_value):
    @cache.memoize()
    def query_and_serialize_data(session_id):
        df = # Here I make my query based on user_input_value 
             # If there is no input I just return an empty df
        return df.to_json()
    return pd.read_json(query_and_serialize_data(session_id))

def serve_layout():
    session_id = str(uuid.uuid4())

    return html.Div([
        html.Div(session_id, id='session-id', style={'display': 'none'}),
        # User input            
        # My output which doesn't behave as expected
    ])

app.layout = serve_layout

# Callback related to the user input

@app.callback(
    [
        Output('my-output', 'output-component')
    ],
    [
        Input('session-id', 'children'),
        Input('user-input', 'value')
    ]
)
def generate_output(session_id, user_input_value):
    df = get_dataframe(session_id, user_input_value)
    return # Output component

if __name__ == '__main__':
    app.run_server(debug=True)

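Based on your current code, I think it will work as intended if you also pass the user input to the query_and_serialize_data function. The memoized function only receives session_id, so the cache key never changes when the input does.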
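
To illustrate the idea, here is a minimal, self-contained sketch. It uses functools.lru_cache from the standard library as a stand-in for cache.memoize(), and plain lists instead of a DataFrame; both are assumptions made so the snippet runs without Flask-Caching or pandas. The CALLS list is a hypothetical helper that only records when the "expensive" query actually runs. The point is that every argument of the memoized function becomes part of the cache key, so a new input value triggers a new query, while a repeated one is served from the cache.

```python
import json
from functools import lru_cache

CALLS = []  # hypothetical helper: records each time the expensive query really runs


@lru_cache(maxsize=200)  # stand-in for @cache.memoize(); arguments form the cache key
def query_and_serialize_data(session_id, user_input_value):
    CALLS.append((session_id, user_input_value))
    if not user_input_value:
        # No input yet: return an empty result, as in the original post
        return json.dumps([])
    # Placeholder for the expensive, input-dependent query
    rows = [{"input": user_input_value, "n": i} for i in range(3)]
    return json.dumps(rows)


def get_dataframe(session_id, user_input_value):
    # Pass the user input through, so it takes part in the memoization
    return json.loads(query_and_serialize_data(session_id, user_input_value))


get_dataframe("session-1", "")     # runs the query (empty result)
get_dataframe("session-1", "abc")  # new input, so the query runs again
get_dataframe("session-1", "abc")  # same arguments, served from the cache
```

With Flask-Caching the shape should be the same: move user_input_value into the signature of the function decorated with @cache.memoize().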


True, thanks for pointing that out! But I am still a bit confused; if I do that, it means I will have to provide the user input to every callback that accesses “global” cached data - right? But something feels suboptimal here, as I need the cached data to update on user input, but apart from that, remaining callbacks accessing this data have no connection to the input. Or am I completely off here…?

Yes, that is correct. That’s more or less why I created the ServersideOutput component :slight_smile:. It takes care of all this stuff behind the scenes and yields an interface similar to what you are used to in Dash. Here is a small example of the syntax as per dash-extensions (0.0.27):

  # This code inserts the output of the callback into a cache on the server, similar to your code.
  @app.callback(ServersideOutput("store", "data"), Trigger("left", "n_clicks"), memoize=True) 
  def query():
      return pd.DataFrame(data=list(range(10)), columns=["value"])

  # This code retrieves the data from the cache and passes it to the callbacks.
  @app.callback(Output("log", "children"), Input("store", "data")) 
  def right(df):
      return df["value"].mean()
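
Conceptually, the idea behind this is roughly the following. This is only a sketch under my own assumptions, not the actual dash-extensions implementation: SERVER_STORE, put and get are hypothetical names, and a plain dict stands in for the server-side cache. The data stays on the server, only a key is handed out, and downstream functions look the data up by that key, so they never need to see the user input themselves.

```python
import hashlib
import pickle

SERVER_STORE = {}  # hypothetical stand-in for a server-side cache backend


def put(value, *key_parts):
    # Derive the key from the function name and its arguments (the memoize idea)
    key = hashlib.md5(repr(key_parts).encode("utf-8")).hexdigest()
    SERVER_STORE[key] = pickle.dumps(value)
    return key  # only this short string would travel to the client


def get(key):
    return pickle.loads(SERVER_STORE[key])


def query(user_input_value):
    # The "expensive" step depends on the user input
    rows = [{"value": i} for i in range(user_input_value)]
    return put(rows, "query", user_input_value)


def mean_value(key):
    # A downstream step needs only the key, not the user input
    rows = get(key)
    return sum(row["value"] for row in rows) / len(rows)


key = query(10)
print(mean_value(key))  # 4.5
```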

EDIT: For reference, here is a complete example.

import time
import dash_core_components as dcc
import dash_html_components as html
import plotly.express as px
from dash_extensions.enrich import Dash, Output, Input, Trigger, ServersideOutput

app = Dash(prevent_initial_callbacks=True)
app.layout = html.Div([
    html.Button("Query data", id="btn"), dcc.Dropdown(id="dd"), dcc.Graph(id="graph"),
    dcc.Loading(dcc.Store(id='store'), fullscreen=True, type="dot")
])


@app.callback(ServersideOutput("store", "data"), Trigger("btn", "n_clicks"), memoize=True)
def query_data():
    time.sleep(1)
    return px.data.gapminder()


@app.callback(Output("dd", "options"), Input("store", "data"))
def update_dd(df):
    return [{"label": column, "value": column} for column in df["year"]]


@app.callback(Output("graph", "figure"), [Input("store", "data"), Input("dd", "value")])
def update_graph(df, value):
    df = df.query("year == {}".format(value))
    return px.sunburst(df, path=['continent', 'country'], values='pop', color='lifeExp', hover_data=['iso_alpha'])


if __name__ == '__main__':
    app.run_server()

That sounds awesome! Gonna try it out ASAP :slight_smile:
