Need help with the architecture for a real-time dashboard spanning multiple pages / devices

Hi,

I need to create a real-time dashboard that displays the latest minute of MQTT data from several devices, updated every 100ms.

I’ve created a multi-page Dash application with a page for each device or process. I used the dash_mqtt component to subscribe to my topic on each page, with an extendData Output to stream the live changes to my Plotly charts.

Unfortunately, I have a problem: if I switch to page 2 for device #2, the subscription restarts from that point (it is another dash_mqtt component), so I don’t have the previous minute of data for device #2.

What I would like is for page 1 to display real-time data (the last 600 points) from device #1, then to switch to another page to see another device and have its 600 points already served.

To achieve this, I thought of two methods:

  1. Having a dash_mqtt component on the main page (always mounted) write to a dcc.Store component. Each page would have access to this shared dcc.Store and pick out the last 600 points for its device every 100ms with a dcc.Interval component. However, this method could be resource-intensive, because all of the MQTT traffic is sent directly to the client, along with the parallel requests from the dcc.Interval.
  2. Having the server run a background callback at startup that silently subscribes to the MQTT topics with the regular paho-mqtt library. It could put the data (the last 600 points of all devices) in its DiskCache, and the client would request its device’s data with a dcc.Interval component. Alternatively, the data could be stored in a dcc.Store() on the main page and the client could fetch it from there. This method creates extra traffic, but sending the data to the dcc.Store() in batches could help mitigate that.

Do you think either of these propositions is viable and scalable, or would you recommend something else?
Thanks in advance.

Hello @goulouboudou,

Welcome to the community!

If you are concerned about the dcc.Store holding too much info, then I would go with #2, as this allows you to pull only the info you need from the backend. Assuming these points are shared across all users, you should store them in a db that is independent of Dash, so that each page can query the applicable data as desired.

DiskCache is innately set up so that each session has its own cache, so I don’t know whether that would necessarily work for you or not.

If you are not going to have multiple users, then you could just use the DiskCache as it is designed and skip the db part.


Thank you for your answer!

So I could implement a solution that creates a DiskCache in main.py:

import diskcache
from dash import DiskcacheManager

cache = diskcache.Cache("./cache")
background_callback_manager = DiskcacheManager(cache)

Then create a background callback that fires at startup and essentially loops over MQTT messages, writing them to the server’s disk:

@dash.callback(
    output=Output("?", "?"),
    inputs=Input("?", "?"),
    background=True,
    manager=background_callback_manager,
)
def get_MQTT_data(?):
    # subscribes
    while True:
        # wait for message, then:
        cache['device_1'] = mqtt_msg
    return ?
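
For concreteness, here is how I imagine the placeholders could be filled in; the dummy output, the one-shot trigger, and the devices/<id> topic layout are all assumptions on my part, not a tested recipe:

import json

import dash
import diskcache
import paho.mqtt.client as mqtt
from dash import Input, Output

cache = diskcache.Cache("./cache")

@dash.callback(
    # hypothetical dummy output and trigger, only there to start the loop once
    output=Output("mqtt-status", "children"),
    inputs=Input("mqtt-start", "n_intervals"),
    background=True,
    manager=background_callback_manager,  # as defined above
)
def get_MQTT_data(_):
    def on_message(client, userdata, msg):
        device = msg.topic.split("/")[-1]      # assumes topics like devices/1
        points = cache.get(device, [])
        points.append(json.loads(msg.payload))
        cache[device] = points[-600:]          # keep only the last minute

    client = mqtt.Client()
    client.on_message = on_message
    client.connect("localhost", 1883)          # assumption: local broker
    client.subscribe("devices/#")
    client.loop_forever()                      # never returns; runs for the app’s lifetime

The read-modify-write on every message is not atomic, so with more than one writer I would wrap it in cache.transact().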

And then, in my tab, create a callback that reads that cache:

@app.callback(Output('graph-for-device-1', 'extendData'),
              Input('interval-component-100ms', 'n_intervals'))
def update_graph(n):
    values_to_plot = cache.get('device_1')
    # note: extendData expects [new_points, trace_indices, max_points]
    return values_to_plot

DiskCache also sounds like it is robust under concurrent access.

This may or may not work.

By creating your own keys inside the DiskCache, you would step on toes with multiple users and/or multiple tabs.

If you want to keep the background cache, you could use server-side caching from dash-extensions instead.


For this kind of use case, I would recommend an architecture with a separate worker process, which collects the data via MQTT and writes it to a state store (a file, a Redis cache, etc.). You would then simply read the data from the state store and display it in your Dash app.

With this approach, you should be able to support multiple devices (all connected to your data collection worker process), as well as multiple views/users.
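
As an illustration, the write side could be as small as the following standalone worker; Redis as the state store, the broker address, and the devices/<id> topic layout are assumptions:

import paho.mqtt.client as mqtt
import redis

r = redis.Redis()  # assumption: Redis on localhost:6379

def on_message(client, userdata, msg):
    device = msg.topic.split("/")[-1]
    r.lpush(f"points:{device}", msg.payload)   # newest sample first
    r.ltrim(f"points:{device}", 0, 599)        # keep one minute at 100ms

client = mqtt.Client()
client.on_message = on_message
client.connect("localhost", 1883)
client.subscribe("devices/#")
client.loop_forever()

Since this runs as its own process, it keeps collecting data no matter which page (or how many users) the Dash app is serving.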


Thank you both for your insights! I’ll try to implement a solution today and keep you updated!

I’ve been able to make some progress on my issue by creating a FastAPI service that forwards the MQTT data to my Dash app, leveraging the EventSource component from the dash_extensions library.
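
Roughly, the bridge looks like this; it is a simplified sketch (broker address, topic, and endpoint path are placeholders, and a single shared queue only really serves one connected client):

import asyncio

import paho.mqtt.client as mqtt
from fastapi import FastAPI
from fastapi.responses import StreamingResponse

api = FastAPI()
queue = asyncio.Queue()
loop = None

def on_message(client, userdata, msg):
    # paho runs on its own thread, so hand the payload to the event loop safely
    loop.call_soon_threadsafe(queue.put_nowait, msg.payload.decode())

@api.on_event("startup")
async def start_mqtt():
    global loop
    loop = asyncio.get_running_loop()
    client = mqtt.Client()
    client.on_message = on_message
    client.connect("localhost", 1883)   # assumption: local broker
    client.subscribe("devices/#")       # assumption: topic layout
    client.loop_start()                 # network loop in a background thread

@api.get("/stream")
async def stream():
    async def events():
        while True:
            data = await queue.get()
            yield f"data: {data}\n\n"   # SSE wire format
    return StreamingResponse(events(), media_type="text/event-stream")

On the Dash side, EventSource(id="sse", url="http://localhost:8000/stream") from dash_extensions feeds the callback below.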

The EventSource appends the data to a dcc.Store component on the main app:

@app.callback(
    Output("store", "data"),
    State("store", "data"),
    Input("sse", "message"),
)
def update_store(store_data, sse_data):
    if sse_data is None:
        return dash.no_update
    store_data.append(json.loads(sse_data))
    if len(store_data) >= MAX_DATA_POINTS:
        store_data.pop(0)
    return store_data

I have a callback that fills the whole chart from the dcc.Store when I open a page, by returning the whole figure:

@app.callback(
    Output('graph', 'figure'),
    State('graph', 'figure'),
    State("store", "data"),
    Input("div-page", "children")
)
def update_page_load(fig, store_data, div_children):
    traces = []
    for i in range(8):
        trace = go.Scattergl(
            x=...,  # [x data from my store]
            y=...,  # [y data from my store]
            mode='lines',
        )
        traces.append(trace)
    fig = go.Figure(data=traces)
    return fig

and another callback just appends new data points to the chart whenever the dcc.Store changes:

@app.callback(
    Output('graph', 'extendData'),
    Input("store", "data")
)
def update_chart(store_data):
    data = store_data[-1]
    # get x and y from data
    ...
    # extendData format: [new_points, trace_indices, max_points_to_keep]
    return [{'x': [x] * 8, 'y': y}, list(range(8)), MAX_DATA_POINTS]

I’m able to switch between my pages and have my last X data points loaded; however, I’m sometimes missing data, probably because of CPU usage spikes to 100%.

I’ll investigate how to lower my CPU usage (maybe I need to replace the dcc.Store with a Redis cache, or turn update_chart into a clientside callback) and keep you updated.
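
For what it’s worth, the clientside variant of update_chart could look like this sketch; the per-sample t and values fields are hypothetical names for whatever the store actually holds:

# Clientside version of update_chart: runs in the browser, so the store
# contents never travel back to the server on every update.
app.clientside_callback(
    """
    function(storeData) {
        if (!storeData || storeData.length === 0) {
            return window.dash_clientside.no_update;
        }
        const latest = storeData[storeData.length - 1];
        // assumption: each sample looks like {t: ..., values: [v0, ..., v7]}
        const x = Array(8).fill([latest.t]);   // one single-point array per trace
        const y = latest.values.map(v => [v]);
        return [{x: x, y: y}, [0, 1, 2, 3, 4, 5, 6, 7], 600];
    }
    """,
    Output('graph', 'extendData'),
    Input('store', 'data'),
)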

Hey @goulouboudou

I had a similar problem to yours at one time.
I’m echoing the sentiment of @Emil.

What your solution is doing is more than sufficient as a patch; it will likely work just fine. However, it also occupies an application server worker every 100ms to exchange the dcc.Store data, and it incurs the time penalty of moving the currently cached dcc.Store data from the browser to your server before transporting the new data back to the browser, where it is finally visualized.

You could simplify this data pipeline by having an asynchronous worker (something as simple as a concurrently running Python service) that grabs the MQTT data every 100ms and caches it under a known key in a shared Redis instance. Then, on the client side, you could have an Interval component trigger a re-render every 100ms or so, and in the executing callback you can check the Redis cache for updated data. (Redis is single-threaded, so you don’t have to worry about reading data while the Python process is writing.)

This would save you a lot of network and worker overhead.
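
A rough sketch of that read path, reusing the capped Redis lists from the worker sketch above (the key names and payload shape are assumptions):

import json

import dash
import redis
from dash import Input, Output

r = redis.Redis()        # assumption: local Redis instance
MAX_DATA_POINTS = 600    # one minute of samples at 100ms

@app.callback(
    Output('graph-for-device-1', 'extendData'),
    Input('interval-component-100ms', 'n_intervals'),
)
def update_graph(n):
    raw = r.lrange("points:1", 0, 0)   # newest sample for device 1
    if not raw:
        return dash.no_update
    sample = json.loads(raw[0])
    # assumption: each sample looks like {"t": ..., "values": [v0, ..., v7]}
    x = [[sample["t"]]] * 8            # one single-point array per trace
    y = [[v] for v in sample["values"]]
    return [{'x': x, 'y': y}, list(range(8)), MAX_DATA_POINTS]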

Hi @jkunstle ,

Yes, you’re right. I wanted to avoid installing another stack, so I was looking into ServersideOutput from the dash_extensions.enrich library to save a dcc.Store exchange, with the following code:

@app.callback(
    ServersideOutput("store", "data"),
    State("store", "data"),
    Input("sse", "message"),
)
def update_store(store_data, sse_data):
    ...
    return store_data

but I’m running into the same problem (I think) as this thread, although I’m on the latest version (available in Anaconda, anyway), and this whole series of patches might not be worth the hassle compared to installing Redis.

I’ll dig into Redis and post an update, thank you.

My experience is that learning Redis and using Redis are the easiest things about Redis (which is to say, everything about using it is easy).

Here’s a great reference piece on using Redis: https://www.openmymind.net/redis.pdf
