Multiple callbacks with the same input or one callback with multiple outputs

I have a dag.AgGrid table that can be filtered. The table is not large (about 6,000 rows). When the table is filtered, I update various graphs on the page. Everything works fine in testing, but when I deploy the app it takes forever for the graphs to update. In part, I think this is because some pages have 7+ graphs, each with its own callback.

For the best performance, is it better to have multiple callbacks with one output or one callback with multiple outputs? Based on some testing, my suspicion is the latter.

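import pandas as pd
import plotly.express as px
from dash import Input, Output, callback, no_update

@callback(Output('graph1', 'figure'),
          Input('datatable', 'virtualRowData'))
def graph1(virtual_row_data):
    """Create graph1 based on the filters in place"""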
    if virtual_row_data:
        df = pd.DataFrame(virtual_row_data)

        # do some transformation

        fig = px.pie(...)

        return fig
    return no_update

@callback(Output('graph2', 'figure'),
          Input('datatable', 'virtualRowData'))
def graph2(virtual_row_data):
    """Create graph2 based on the filters in place"""
    if virtual_row_data:
        df = pd.DataFrame(virtual_row_data)

        # do some transformation 

        fig = px.bar(...)

        return fig
    return no_update

VS

@callback(Output('graph1', 'figure'),
          Output('graph2', 'figure'),
          Input('datatable', 'virtualRowData'))
def graphs(virtual_row_data):
    """Create the graphs based on the filters in place"""
    if virtual_row_data:
        df = pd.DataFrame(virtual_row_data)

        # do some transformation 

        fig1 = px.pie(...)
        fig2 = px.bar(...)

        return fig1, fig2
    return no_update, no_update

My guess is that doing df = pd.DataFrame(virtual_row_data) on 6,000 rows of data seven-plus times is causing the slowness, but before I go and restructure all of my callbacks I wanted to ask here.

Hi @PyGuy, welcome to the forums.

Let me put it like this: creating the DataFrame seven times doesn’t help the app’s responsiveness, so I would rewrite the callbacks.


It depends on the amount of data being passed to/from the callback. If it is small, the performance should be similar. If it is large (which seems to be the case here), performance will be better with a single callback, as you only need to transfer the data once.
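To put rough numbers on that, here is a minimal sketch that estimates how much input data gets uploaded when the same table feeds several callbacks versus one. It assumes each callback sends its own copy of the input in its request, and it uses made-up rows (the `col_*` names and values are hypothetical) sized like the table in this thread: 6,000 rows by 20 columns, feeding 7 graphs.

```python
import json

# Hypothetical stand-in for virtualRowData: 6,000 rows x 20 columns,
# matching the sizes mentioned in this thread. Values are made up.
N_ROWS, N_COLS, N_GRAPHS = 6000, 20, 7
rows = [{f"col_{c}": r * c for c in range(N_COLS)} for r in range(N_ROWS)]

# Approximate size of the input as it would be serialized in one request body.
payload_bytes = len(json.dumps(rows).encode("utf-8"))

# One callback per graph: the same input is sent once per callback.
separate_total = payload_bytes * N_GRAPHS
# One callback with several outputs: the input is sent once.
combined_total = payload_bytes

print(f"per request: {payload_bytes / 1e6:.2f} MB")
print(f"{N_GRAPHS} separate callbacks: {separate_total / 1e6:.2f} MB uploaded")
print(f"1 combined callback: {combined_total / 1e6:.2f} MB uploaded")
```

Real column names and values will change the absolute sizes, but the ratio between the two layouts stays the same.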

@Emil thank you for the response. I restructured the callbacks into one, and it helped slightly, but it is still pretty slow. Things are fine when the user applies a filter, but when the user clears a filter and virtualRowData sends 6,000 rows and 20 columns via _dash-update-component, things slow down.

Is there a better alternative to virtualRowData - i.e., is there a way just to get the index of the filtered rows instead of all the data?

I have thought about changing the trigger of the callback to filterModel so less data is sent, but doing all of the filtering via SQL based on the filterModel would take a lot of work since I have a bunch of dashAgGridFunctions.

Hello @PyGuy,

You could make your own parser to pull in the index of the filtered data. This would require using the grid API and an event listener to go through and build the filtered index.

If you did this, then you would also need to work around allowing the data to sync to the server, which is interesting.

Alternatively, you could use dash-extensions with an infinite row model, build the filter model server side, and then store the data in a serverside output.

You may also be spamming the callbacks, especially if the filter is applied as you type. To combat this, you could introduce a debounce on your virtualRowData via a clientside callback and a button.
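To illustrate what a debounce buys you, here is a small plain-Python sketch of the idea. It is not the clientside callback itself, only a simulation of a trailing-edge debounce over a list of timestamps; the `debounce` helper and the sample keystroke times are hypothetical.

```python
def debounce(event_times_ms, wait_ms):
    """Return the times at which a trailing-edge debounce would fire.

    event_times_ms: sorted timestamps (milliseconds) of incoming updates.
    wait_ms: quiet period required after an update before firing.
    """
    fired = []
    for i, t in enumerate(event_times_ms):
        is_last = i == len(event_times_ms) - 1
        # Fire only if no newer update arrives within the quiet period.
        if is_last or event_times_ms[i + 1] - t >= wait_ms:
            fired.append(t + wait_ms)
    return fired

# Six filter updates while typing, in two bursts.
keystrokes_ms = [0, 100, 250, 300, 2000, 2050]
fire_times = debounce(keystrokes_ms, wait_ms=500)
print(fire_times)  # → [800, 2550]
```

Six grid updates collapse into two server round trips, one per burst of typing.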


If the issue is the amount of data being transferred, you could create a client side callback that calculates the indices, and then triggers a server side callback, passing only the necessary data.
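As a rough sketch of the server-side half of that idea, assume the server keeps its own copy of the table keyed by an id column, and the client-side step sends only the ids of the rows that survive the filter. The `row_id` column, the `table` dict, and the `rows_for_ids` helper are all hypothetical names; the Dash wiring is left out so that the snippet runs on its own.

```python
import json

# Hypothetical server-side copy of the table, keyed by a "row_id" column.
# 6,000 rows x 20 columns (row_id plus 19 made-up value columns).
N_ROWS, N_COLS = 6000, 20
table = {
    i: {"row_id": i, **{f"col_{c}": i * c for c in range(1, N_COLS)}}
    for i in range(N_ROWS)
}

def rows_for_ids(row_ids):
    """Return the full rows for the ids the client reports as visible."""
    return [table[i] for i in row_ids if i in table]

# What would be sent when no filter is active: every row, every column.
full_payload = json.dumps(list(table.values()))
# What an index-only client-side step would send instead.
id_payload = json.dumps(list(table))

print(len(full_payload), "bytes for full rows")
print(len(id_payload), "bytes for ids only")

# Simulate a filter that leaves three rows visible.
visible = rows_for_ids([5, 42, 5999])
```

The server then builds its DataFrame from `visible` (or selects by id from a DataFrame it already holds) instead of parsing rows sent from the browser.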


@Emil Thanks, after doing a bunch of searching that is kind of what I was thinking. I am thinking of adding an index column to the table that I will hide using columnState and use the client-side callback to just send the index rather than every column and row in the filter.

I will give that a try and report back.

I’m dumb. I forgot that I copied my Procfile from another project, which of course had the wrong gunicorn configs. Changing the gunicorn configs, along with combining the callbacks and using dash_extensions.enrich.ServersideOutputTransform to store my DataFrames in dcc.Store, has the callbacks responding in 400 ms.
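The post does not say which gunicorn settings were wrong, so the following is only an illustrative sketch of a Python config file that gunicorn can load with its `-c` option. `workers`, `threads`, and `timeout` are real gunicorn settings, but the values here are assumptions, not the ones used in this app.

```python
# gunicorn.conf.py: illustrative values only, not the poster's actual config.
import multiprocessing

# Gunicorn's docs suggest (2 x cores) + 1 workers as a starting point.
workers = multiprocessing.cpu_count() * 2 + 1
# More than one thread per worker lets a worker handle several
# callback requests concurrently.
threads = 2
# Seconds a worker may stay silent before it is killed and restarted.
timeout = 60
```

With a single synchronous worker, the requests fired by several callbacks at once are handled one after another, which is one way a copied Procfile could make a multi-graph page feel slow.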
