Patterned callbacks - Reduce data traffic over inputs/state and output

What is my goal? My goal is to reduce the data traffic between client and server, i.e. to limit the data transferred through Input/State and Output.

Initially I had 14 inputs, 8 states and 1 output, but too much unnecessary data travels between client and server on every callback call.

Since callbacks are not allowed to share the same output, I tried to create a generic callback:

    Output(component_id={'id': ALL, 'type': ALL, 'plot': 'map'}, component_property='figure'),
    [Input(component_id={'id': ALL, 'type': MATCH, 'plot': 'histogram'}, component_property='clickData'),
     Input(component_id={'id': ALL, 'type': MATCH, 'plot': 'scatter'}, component_property='clickData')],
    [State(component_id={'id': ALL, 'type': ALL, 'plot': 'map'}, component_property='figure'),
     State(component_id={'id': ALL, 'type': MATCH, 'plot': 'scatter'}, component_property='figure')],

I understand what the following callback error means, but is there a cheeky way around it? Or a way to create multiple callbacks for one output?

In the callback for output(s):
  {"id":"mapbox-graph","plot":"map","type":"mapbox"}.figure
State 1 ({"id":ALL,"plot":"scatter","type":MATCH}.figure)
has MATCH or ALLSMALLER on key(s) type
where Output 0 ({"id":"mapbox-graph","plot":"map","type":"mapbox"}.figure)
does not have a MATCH wildcard. Inputs and State do not
need every MATCH from the Output(s), but they cannot have
extras beyond the Output(s).
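The rule quoted in the error can be sketched as a small check. This is only an illustration, not Dash's own validation code: `extra_match_keys` is a hypothetical helper, and plain strings stand in for Dash's `MATCH`, `ALL` and `ALLSMALLER` wildcards so the snippet runs without Dash installed.

```python
# Stand-ins for dash.dependencies.MATCH / ALL / ALLSMALLER (assumption: plain
# strings are used here so the sketch runs without Dash installed).
MATCH, ALL, ALLSMALLER = "MATCH", "ALL", "ALLSMALLER"


def extra_match_keys(output_id, dependency_id):
    """Return keys where the dependency uses MATCH/ALLSMALLER but the output has no MATCH."""
    output_match_keys = {k for k, v in output_id.items() if v == MATCH}
    dependency_keys = {
        k for k, v in dependency_id.items() if v in (MATCH, ALLSMALLER)
    }
    return sorted(dependency_keys - output_match_keys)


output_id = {"id": ALL, "type": ALL, "plot": "map"}
state_id = {"id": ALL, "type": MATCH, "plot": "scatter"}

# The State from the question uses MATCH on "type", the Output does not.
print(extra_match_keys(output_id, state_id))  # ['type']

# Replacing MATCH with ALL on the State removes the conflict.
print(extra_match_keys(output_id, {"id": ALL, "type": ALL, "plot": "scatter"}))  # []
```

Under this reading, switching the offending Inputs and States from `MATCH` to `ALL` would satisfy the rule, at the price of receiving every matching component's value in each call.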

You could try the MultiplexerTransform from dash-extensions. It makes it possible to create multiple callbacks for one output. Here is a small example:

from dash import html  # dash_html_components is deprecated since Dash 2.0
from dash_extensions.enrich import Output, DashProxy, Input, MultiplexerTransform

app = DashProxy(prevent_initial_callbacks=True, transforms=[MultiplexerTransform()])
app.layout = html.Div([html.Button("left", id="left"), html.Button("right", id="right"), html.Div(id="log")])


@app.callback(Output("log", "children"), Input("left", "n_clicks"))
def left(_):
    return "left"


@app.callback(Output("log", "children"), Input("right", "n_clicks"))
def right(_):
    return "right"


if __name__ == '__main__':
    app.run_server(debug=True, port=7777)

Thank you for the quick reply, Emil! I'm a bit hesitant to use this third-party solution, because I might need to refactor the code I have already written.

I think you would only need to change the code which you are currently looking into changing anyway?

Anyhow, I would suggest that you try it out (i.e. in a feature branch) to test if it suits your use case. Worst case is that it doesn’t, then you can just go back :slight_smile:

I have the same question. I refresh three graphs in one callback, and the callback transfers about 3.8 MB each time. Between two consecutive triggers, roughly two-thirds of the data sent is identical.

Are there any plans to use caching to address this problem?

@stu If your intent is to reduce network traffic, I agree that caching is the best option. You could either use caching explicitly (as suggested in the docs), or you could take a look at the ServersideOutput component.
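The idea behind server-side caching can be sketched roughly as follows: keep the large object on the server and pass only a short key through the callback. This is not the ServersideOutput implementation; `_STORE`, `put` and `get` are hypothetical names, and the in-memory dict is an assumption that would not survive multiple worker processes.

```python
import json
import uuid

# Hypothetical in-memory store; a real deployment would more likely use a
# file system or Redis backend shared between worker processes.
_STORE = {}


def put(value):
    """Save value on the server and return a short key to send to the client."""
    key = uuid.uuid4().hex
    _STORE[key] = value
    return key


def get(key):
    """Look up a previously stored value by its key."""
    return _STORE[key]


big_result = {"x": list(range(100_000)), "y": list(range(100_000))}
key = put(big_result)

# Only the key would travel between client and server, not the data itself.
print(len(json.dumps(big_result)), "bytes if sent directly")
print(len(json.dumps(key)), "bytes when only the key is sent")
```

This helps when data is shared between callbacks; it does not reduce traffic for a figure that must end up in the browser anyway.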

Hi @Emil, thank you for the reply. But I don’t think it’s a good idea to always look for a compromise, especially when the problem comes from the component itself. I am also happy to try extensions when I need new features. However, this is not a new issue: it has been with us since we started using callbacks, and Dash relies heavily on callbacks. My difficulty is that my graphs consist of several traces, and I may change only one or two of them each time, but up to the current version a callback can only return the whole graph. When I mention caching, I don’t mean simply storing data on the server or client side. The multiplexer transform solution you provided is actually a good idea. But the question is, as a user, why should I have to split my callbacks manually? This issue is not even my fault; it’s a limitation of the callback design. So what we could discuss is a solution where the programs at both ends determine which part of the serialized data has changed between two triggers, and then send and receive only that part.

@stu I suggested the MultiplexerTransform because the initial poster had combined multiple callbacks into one to circumvent the general limitation of Dash not being able to target an output by multiple callbacks (which the MultiplexerTransform lifts). Not because I generally think you should split a callback into smaller ones. In fact, I really don’t think you should (unless you want to for readability).

I wouldn’t consider the use of caching (either explicit or via the ServersideOutputTransform) as a “compromise”. Rather, it’s a tool that’s useful for some use cases (e.g. sharing of data between callbacks). However, for your use case (partial updates of figures), it’s obviously not the right tool. It sounds like you are looking for something like the extendData property.
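As a rough illustration of what an `extendData` update looks like: the property of `dcc.Graph` takes a list of the form `[updateData, traceIndices, maxPoints]`, so a callback can send only the new points for one trace. `build_extend_payload` is a hypothetical helper, and the component ids mentioned in the comments are assumptions.

```python
def build_extend_payload(new_x, new_y, trace_index, max_points=None):
    """Build a value for dcc.Graph's extendData property for a single trace."""
    # One inner list per targeted trace; here only one trace is extended.
    update = {"x": [list(new_x)], "y": [list(new_y)]}
    payload = [update, [trace_index]]
    if max_points is not None:
        payload.append(max_points)  # keep at most this many points per trace
    return payload


# In a Dash app this value would be returned from a callback whose output is
# Output("some-graph", "extendData") (the id "some-graph" is hypothetical).
payload = build_extend_payload([10, 11], [0.4, 0.7], trace_index=1, max_points=500)
print(payload)  # [{'x': [[10, 11]], 'y': [[0.4, 0.7]]}, [1], 500]
```

Note that `extendData` appends points to existing traces; it does not replace a trace's data, so it fits streaming-style updates better than recomputed traces.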

Dash is designed to update properties. What they represent is up to the component designer. It could be a complete figure, or it could be some subpart of the figure. Hence, if the relevant properties are not available, I would consider it a missing feature of the component rather than a flaw in the design of Dash.

Conceptually, I think delta updates are a very interesting idea. However, they don’t really fit easily with the key design principle of the Dash server being stateless. If the client only sends the “delta” data, where should “the rest” come from?

You are right, statelessness is indeed a difficult threshold to cross. But what if the cost of callbacks is far greater than the benefit of being stateless? I mean, when Dash is deployed on a metered service.

Or we may not necessarily need delta data. As you mentioned, multiplexing amounts to splitting a large callback into several small ones. The splitting could be decided and handled by the program itself, with both ends following the same scheme. Once the serialized data is split into several small pieces, the server can send the hash value of each piece before sending the actual data. Then, theoretically, it could probably still remain stateless.
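One possible reading of this hash-first idea, as a minimal sketch: the server hashes each trace, the client compares those hashes with the ones it has cached, and only the mismatching traces are transferred. Nothing like this exists in Dash as far as this thread indicates; `trace_hashes` and `indices_to_fetch` are hypothetical names, and splitting by trace is an assumption.

```python
import hashlib
import json


def trace_hashes(figure):
    """Server side: hash each trace of a figure dict separately, in order."""
    return [
        hashlib.sha256(json.dumps(trace, sort_keys=True).encode()).hexdigest()
        for trace in figure["data"]
    ]


def indices_to_fetch(server_hashes, cached_hashes):
    """Client side: indices of traces whose hash differs from the cached one."""
    return [
        i
        for i, digest in enumerate(server_hashes)
        if i >= len(cached_hashes) or cached_hashes[i] != digest
    ]


old = {"data": [{"y": [1, 2, 3]}, {"y": [4, 5, 6]}, {"y": [7, 8, 9]}]}
new = {"data": [{"y": [1, 2, 3]}, {"y": [4, 5, 60]}, {"y": [7, 8, 9]}]}

cached = trace_hashes(old)                      # hashes the client already holds
wanted = indices_to_fetch(trace_hashes(new), cached)
delta = {i: new["data"][i] for i in wanted}     # the only traces worth sending
print(wanted)  # [1]
print(delta)   # {1: {'y': [4, 5, 60]}}
```

The server keeps no per-client memory here, but a stateless server would have to recompute (or cache) the figure to answer the follow-up request for the missing traces, which is the cost this scheme trades for less traffic.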

You knew that it was not the right tool for my use case, yet you still sent it to me, so it’s a compromise for me. :upside_down_face:

In fact, when I update the graphs, the data comes from sklearn, and the inputs are just a series of model parameter adjustments. Whether I split callbacks or target the relevant properties, it may make my code very complex. And I don’t think it should be up to me to do that.

I didn’t say that was a flaw, either. But it does introduce redundant data. So what I’m saying is: don’t push the problem back and forth between component designers and users, but find a way to optimize it.

This extendData property might be more suitable for my MQTT example.