Patterned callbacks - Reduce data traffic over inputs/state and output

Henrikege · May 11, 2021, 7:29am

What is my goal? My goal is to reduce the data traffic between client and server, and hence limit the data transferred through Input/State and Output.

Initially I had 14 inputs, 8 states and 1 output. But too much unnecessary data travels back and forth on each callback call.

Since callbacks are not allowed to share the same output, I tried to create a generic callback:

    Output(component_id={'id': ALL, 'type': ALL, 'plot':'map'},component_property= 'figure'),
    [Input(component_id={'id': ALL, 'type': MATCH, 'plot': 'histogram'}, component_property='clickData'),
     Input(component_id={'id': ALL, 'type': MATCH, 'plot': 'scatter'}, component_property='clickData')],
    [State(component_id={'id': ALL, 'type': ALL, 'plot': 'map'}, component_property='figure'),
    State(component_id={'id': ALL, 'type': MATCH, 'plot': 'scatter'}, component_property='figure')],

I understand what the following callback error means, but is there a cheeky way around it? Or a way to create multiple callbacks for one unique output?

In the callback for output(s):
  {"id":"mapbox-graph","plot":"map","type":"mapbox"}.figure
State 1 ({"id":ALL,"plot":"scatter","type":MATCH}.figure)
has MATCH or ALLSMALLER on key(s) type
where Output 0 ({"id":"mapbox-graph","plot":"map","type":"mapbox"}.figure)
does not have a MATCH wildcard. Inputs and State do not
need every MATCH from the Output(s), but they cannot have
extras beyond the Output(s).
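The rule quoted in the error can be illustrated with a small, Dash-free sketch. The sentinel strings and the helper `extra_match_keys` below are hypothetical stand-ins for illustration only; they are not part of Dash, which performs its own validation internally.

```python
# Hypothetical stand-ins for dash.dependencies.MATCH / ALL / ALLSMALLER.
MATCH, ALL, ALLSMALLER = "MATCH", "ALL", "ALLSMALLER"


def extra_match_keys(output_id, dependency_id):
    """Return keys where an Input/State uses MATCH or ALLSMALLER
    although the Output has no MATCH wildcard on that key."""
    output_match = {k for k, v in output_id.items() if v == MATCH}
    dependency_match = {
        k for k, v in dependency_id.items() if v in (MATCH, ALLSMALLER)
    }
    return sorted(dependency_match - output_match)


# IDs taken from the error message above.
resolved_output = {"id": "mapbox-graph", "plot": "map", "type": "mapbox"}
state_with_match = {"id": ALL, "plot": "scatter", "type": MATCH}
# Same State, but with ALL instead of MATCH on "type".
state_with_all = {"id": ALL, "plot": "scatter", "type": ALL}

print(extra_match_keys(resolved_output, state_with_match))  # ['type']
print(extra_match_keys(resolved_output, state_with_all))    # []
```

Swapping MATCH for ALL would satisfy this rule, but the callback would then receive the values of every matching component, so it probably does not help with the traffic problem.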

Emil · May 11, 2021, 11:57am

You could try the MultiplexerTransform from dash-extensions. It makes it possible to create multiple callbacks for one output. Here is a small example:

import dash_html_components as html
from dash_extensions.enrich import Output, DashProxy, Input, MultiplexerTransform

app = DashProxy(prevent_initial_callbacks=True, transforms=[MultiplexerTransform()])
app.layout = html.Div([html.Button("left", id="left"), html.Button("right", id="right"), html.Div(id="log")])


@app.callback(Output("log", "children"), Input("left", "n_clicks"))
def left(_):
    return "left"


@app.callback(Output("log", "children"), Input("right", "n_clicks"))
def right(_):
    return "right"


if __name__ == '__main__':
    app.run_server(debug=True, port=7777)

Henrikege · May 13, 2021, 6:00pm

Thank you for the quick reply Emil! I'm a bit hesitant to use this third-party solution, because I might need to refactor the code I have already written!

Emil · May 13, 2021, 7:23pm

I think you would only need to change the code which you are currently looking into changing anyway?

Anyhow, I would suggest that you try it out (e.g. in a feature branch) to test if it suits your use case. The worst case is that it doesn’t, and then you can just go back.

stu · April 21, 2022, 1:49pm

I have the same question. I refresh three graphs in one callback, and the data volume of the callback reaches 3.8 MB each time. In fact, between two consecutive triggers, two-thirds of the data sent is identical.

Are there any plans to use caching to address this problem?

Emil · April 22, 2022, 12:29pm

@stu If your intent is to reduce network traffic, I agree that caching is the best option. You could either use caching explicitly (as suggested in the docs), or you could take a look at the ServersideOutput component.
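The general idea behind server-side caching can be sketched without Dash: the large object stays on the server, and only a short key crosses the network. The module-level dict and the helpers `store` and `load` are hypothetical names used for illustration; a real deployment would more likely use a shared backend (file system, Redis) so that every worker process can see the data.

```python
import json
import uuid

# Hypothetical in-process cache, for illustration only.
_CACHE = {}


def store(value):
    """Keep the value on the server and return a short key for it."""
    key = uuid.uuid4().hex
    _CACHE[key] = value
    return key


def load(key):
    """Look up a previously stored value by its key."""
    return _CACHE[key]


# A first callback would compute the data and return only the key.
dataset = [{"row": i, "value": i * 0.5} for i in range(10_000)]
key = store(dataset)

# A second callback would receive the key and fetch the data server-side.
assert load(key) is dataset

print(len(json.dumps(key)), "bytes sent instead of", len(json.dumps(dataset)))
```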

stu · April 22, 2022, 2:28pm

Hi @Emil, thank you for the reply. But I don’t think it’s a good idea to always look for a compromise, especially when the problem comes from the component itself. I am also happy to try extensions when I need new features. However, this is not a new issue: it has been with us since we started using callbacks, and Dash relies heavily on callbacks.

The difficulty I have is that my graphs consist of several traces, and I may only change one or two of them each time, but as of the current version I can only return the whole graph. When I mention caching, I don’t mean simply storing data on the server or client side.

The MultiplexerTransform solution you provided is actually a good idea. But the question is, as a user, why should I manually split my callbacks? This issue is not even my fault; it’s a limitation of the callback design. So we could discuss a solution where the programs at both ends determine which part of the serialized data has changed between two triggers, and then send and receive only that part.

Emil · April 22, 2022, 3:57pm

@stu I suggested the MultiplexerTransform because the initial poster had combined multiple callbacks into one to circumvent the general limitation of Dash not being able to target an output by multiple callbacks (which the MultiplexerTransform lifts). Not because I generally think you should split a callback into smaller ones. In fact, I really don’t think you should (unless you want to for readability).

I wouldn’t consider the use of caching (either explicit or via the ServersideOutputTransform) as a “compromise”. Rather, it’s a tool that’s useful for some use cases (e.g. sharing of data between callbacks). However, for your use case (partial updates of figures), it’s obviously not the right tool. It sounds like you are looking for something like the extendData property.
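As a rough illustration of why a property like extendData can reduce traffic, the sketch below builds a payload in the `[updateData, traceIndices, maxPoints]` form that the `dcc.Graph` `extendData` property accepts and compares its serialized size with a full figure. The helper `extend_payload` and the sample figure are assumptions made for this example; the code only constructs plain Python data and does not run a Dash app.

```python
import json


def extend_payload(new_x, new_y, trace_index, max_points=None):
    """Build an extendData-style payload that appends points to one trace."""
    # One inner list per targeted trace, hence the extra nesting.
    update = {"x": [list(new_x)], "y": [list(new_y)]}
    payload = [update, [trace_index]]
    if max_points is not None:
        payload.append(max_points)
    return payload


# Made-up figure with three traces of 1000 points each.
full_figure = {
    "data": [
        {"type": "scatter", "x": list(range(1000)), "y": list(range(1000))}
        for _ in range(3)
    ],
    "layout": {},
}

payload = extend_payload([1000], [1000], trace_index=1, max_points=1000)
print(payload)  # [{'x': [[1000]], 'y': [[1000]]}, [1], 1000]
print(len(json.dumps(payload)) < len(json.dumps(full_figure)))  # True
```

Note that extendData appends points to existing traces; it does not replace a trace wholesale, so it covers streaming-style updates better than arbitrary trace changes.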

Dash is designed to update properties. What they represent is up to the component designer. It could be a complete figure, or it could be some subpart of the figure. Hence, if the relevant properties are not available, I would consider it a missing feature of the component rather than a flaw in the design of Dash.

Conceptually, I think delta updates are a very interesting idea. However, they don’t fit easily with the key design principle of the Dash server being stateless. If the client only sends the “delta” data, where should “the rest” come from?

stu · April 22, 2022, 5:47pm

You are right, statelessness is indeed a difficult hurdle to get past. But what if the cost of callbacks far outweighs the need to be stateless? I mean, when Dash is deployed on a metered service.

Or we may not necessarily need delta data. As you mentioned, multiplexing amounts to splitting a large callback into several small ones. That splitting could be decided and performed by the framework, with both ends following the same scheme. When the serialized data is split into several small pieces, the server can send each piece’s hash before sending the actual data. Then, in theory, the server could probably still remain stateless.
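One possible reading of the hash idea is sketched below. It is a variation in which the client reports the hashes of the traces it already holds, so the server needs no memory of earlier requests and returns only the traces whose hash differs. Nothing here is an existing Dash feature; `trace_hashes` and `changed_traces` are hypothetical helpers.

```python
import copy
import hashlib
import json


def trace_hashes(figure):
    """Hash each trace of a figure dict using a canonical JSON encoding."""
    return [
        hashlib.sha256(json.dumps(trace, sort_keys=True).encode()).hexdigest()
        for trace in figure["data"]
    ]


def changed_traces(new_figure, client_hashes):
    """Return {index: trace} for traces the client does not already hold."""
    changed = {}
    for index, digest in enumerate(trace_hashes(new_figure)):
        if index >= len(client_hashes) or client_hashes[index] != digest:
            changed[index] = new_figure["data"][index]
    return changed


# Figure currently held by the client (made-up data).
old_figure = {
    "data": [
        {"name": "a", "y": [1, 2, 3]},
        {"name": "b", "y": [4, 5, 6]},
        {"name": "c", "y": [7, 8, 9]},
    ]
}

# The server recomputes the figure; only the second trace differs.
new_figure = copy.deepcopy(old_figure)
new_figure["data"][1]["y"] = [4, 5, 60]

# The client sends its hashes along with the request.
delta = changed_traces(new_figure, trace_hashes(old_figure))
print(sorted(delta))  # [1]
```

The server still has to compute the whole figure on every request; the saving in this sketch is limited to the response size.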

You knew it wasn’t the right tool for my use case, yet you still suggested it, so it is a compromise for me.

In fact, when I update the graphs, the data comes from sklearn processing, and the inputs are just a series of model parameter adjustments. Whether I split callbacks or locate the relevant properties, either approach may make my code very complex. And I don’t think it should be up to me to do that.

I didn’t say it was a flaw, either. But it does bring in redundant data. So what I’m saying is: don’t push the problem back and forth between component designers and users, but find a way to optimize it.

This extendData property might be more suitable for my MQTT example.
