Interval Updating Data Pipeline for Dashboard

This Dash framework is awesome and I have been having a blast since I got a multi-page app up and running.

Now I am trying to scale up the app as I develop more pages and visualizations. Speed is my top priority. For example, I am looking into moving towards SQLite queries, Oracle queries, etc.

I’m trying to wrap my head around best practices for live-updating data. I have read everywhere that callbacks should not modify anything outside of their own scope. I have a multi-page app up and running that works great for simple plots and one-off CSV imports into DataFrames.

Please see the sample pseudo-code below. On this page, the data would live in a class instance that could have several DataFrame attributes, plus methods for updating them. Changing a dropdown would update a given figure so that it shows only one client’s data, taken from one of the instance’s DataFrames. A page could of course have multiple graphs, dropdowns, etc., all fed by the same data_object.

import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output

import pandas as pd
import plotly.graph_objs as go
from app import app  # Assumed: the Dash instance is defined in app.py of this multi-page app.
from data import SomeDataClass


# Initialize data object to be used in creating all figures, etc.
data_object = SomeDataClass()

# Create layout structure.
layout = html.Div(
    children = [
        dcc.Dropdown(id='client-dropdown-1'),
        dcc.Graph(id='graph-1')
    ]
)

@app.callback(
    Output('graph-1', 'figure'),
    Input('client-dropdown-1', 'value'),
)
def update_figure(client):

    # Grab data from data object, slice on client from dropdown.
    df_cut = data_object.df_client.copy()
    df_cut = df_cut.loc[df_cut['Client'] == client]

    # *** Generate updated figure with df_cut***
    figure = {
        'data': [
            go.Scatter(
                x = df_cut['Product'],
                y = df_cut['Revenue']
            )
        ],
        'layout': go.Layout(
            xaxis={...},
            yaxis={...}
        )
    }
    
    return figure

It seemed a logical next step to add a callback that updates this data_object, either by re-instantiating it entirely or by calling some method like data_object.update_client_data_only(). That callback would be triggered by an interval component. When the data refresh is very expensive, I don’t want it to run in a callback each time a graph is updated. It seems to make more sense to handle this within the class itself, so that only the sections I need are refreshed.
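
To make the idea concrete, here is a minimal sketch of what I have in mind. Everything in it is hypothetical: load_clients stands in for the expensive query, refresh_if_stale and min_interval are names I made up, and plain lists of dicts stand in for the DataFrames so the snippet runs on its own. It also assumes the app runs in a single process, since each worker process would otherwise hold its own copy of data_object.

```python
import threading
import time


def load_clients():
    """Hypothetical stand-in for an expensive SQLite/Oracle query."""
    return [
        {"Client": "A", "Product": "X", "Revenue": 10},
        {"Client": "B", "Product": "Y", "Revenue": 20},
    ]


class SomeDataClass:
    """Holds cached data and refreshes it at most once per min_interval seconds."""

    def __init__(self, loader=load_clients, min_interval=60.0):
        self._loader = loader
        self._min_interval = min_interval
        self._lock = threading.Lock()
        self.refresh_count = 0
        self.update_client_data_only()

    def update_client_data_only(self):
        """Reload the client data unconditionally."""
        rows = self._loader()  # run the slow query outside the lock
        with self._lock:
            self.df_client = rows
            self._last_refresh = time.monotonic()
            self.refresh_count += 1

    def refresh_if_stale(self):
        """Reload only if the cached data is older than min_interval.

        Two concurrent callers may both see stale data and both reload;
        that is wasteful but leaves the cache in a consistent state.
        """
        with self._lock:
            stale = time.monotonic() - self._last_refresh >= self._min_interval
        if stale:
            self.update_client_data_only()
        return stale

    def client_rows(self, client):
        """Return the cached rows for one client."""
        with self._lock:
            rows = self.df_client
        return [row for row in rows if row["Client"] == client]


data_object = SomeDataClass(min_interval=60.0)
```

A callback listening to the n_intervals property of a dcc.Interval (with whatever id the layout gives it) would then only call data_object.refresh_if_stale(), and the figure callbacks would keep reading from data_object as before.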

The interval method works great for updating what the user is seeing, but what about the actual core data itself?

I saw something about storing this object in a hidden div, but that seems like a lot of work and an indirect solution. It makes the most sense in my head to have all the callbacks reference the attributes of the class, which would also keep the callback functions less bloated.
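
For reference, my understanding of the hidden-div approach is roughly this: one callback serializes the data to a JSON string and writes it to the children of a hidden html.Div, and every other callback parses that string again. The sketch below shows only that round trip, with hypothetical helper names (serialize_for_div, rows_for_client) and a list of dicts standing in for a DataFrame.

```python
import json


def serialize_for_div(rows):
    """What the refresh callback would return as the hidden div's children."""
    return json.dumps(rows)


def rows_for_client(div_children, client):
    """What each figure callback would do with the hidden div's contents."""
    rows = json.loads(div_children)  # parsed again on every callback
    return [row for row in rows if row["Client"] == client]


# Made-up sample rows, used only to exercise the round trip.
stored = serialize_for_div([
    {"Client": "A", "Product": "X", "Revenue": 10},
    {"Client": "B", "Product": "Y", "Revenue": 20},
])
print(rows_for_client(stored, "B"))
```

With this pattern the whole dataset travels to the browser and back on each callback, which is the overhead I would like to avoid for large data.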

Could someone help me better understand this problem and best practices on this? Any advice or pointing out any flaws in why this won’t work would be really helpful. Thanks!