Dash-Charts: Reduce amount of data sent to browser for Chart (use same X-Axis Datapoints)

Hi, I have a problem: the callbacks for certain charts return huge amounts of data, which makes the charts very slow to load.

Looking at what the browser actually receives, the JSON contains multiple series (one per line in the chart) inside the list at "props": {"figure": {"data": [...]}}.

Looking a little closer, it becomes obvious that what creates megabytes of data here is simply the huge number of timestamps transported as strings like "2024-05-14T00:00:00+02:00".

I am aware that when we show series that come from different sources they might have different x-values.

But if, as is the case here (and OFTEN the case), the series come from the same pandas DataFrame and share its index, they ALL have the same x values.

Right now I get the same x values 10 times over in those wasteful strings, instead of transporting them once, telling all other series that they share the same x, and sending just the y values.

So the big question for now:
Is it possible to somehow tell dash/plotly to share the X-Axis Datapoints and only send lists of Y-datapoints?

And the question for later:
Why not transport each datetime as a single large integer (maybe plus the timezone info, also as an integer: 2 instead of "+02:00" in my case) and convert it back to a datetime in the browser (if even necessary)? That would hugely reduce the amount of data and speed up Dash in the process.
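
To put a rough number on that idea, here is a minimal sketch using only the Python standard library. It compares the JSON size of one year of daily timestamps as ISO strings against the same instants as epoch milliseconds. The 365-point daily range and the variable names are assumptions for illustration, and the integer form drops the UTC offset, which would have to be sent separately. It only measures payload size, not how the chart would then render the dates.

```python
import json
from datetime import datetime, timedelta, timezone

# One year of daily timestamps in a +02:00 zone, like the string quoted above.
tz = timezone(timedelta(hours=2))
start = datetime(2024, 5, 14, tzinfo=tz)
stamps = [start + timedelta(days=i) for i in range(365)]

# Same instants, two encodings.
iso_strings = [s.isoformat() for s in stamps]           # "2024-05-14T00:00:00+02:00"
epoch_ms = [int(s.timestamp() * 1000) for s in stamps]  # 1715637600000 for the first one

iso_bytes = len(json.dumps(iso_strings))
epoch_bytes = len(json.dumps(epoch_ms))
print(f"ISO strings: {iso_bytes} bytes, epoch ms: {epoch_bytes} bytes")
```

In this uncompressed comparison the integers come out at roughly half the size of the strings, so the saving is real but smaller than sending the x values only once.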

@marcmuc, I think you could save a bunch of bandwidth by sending a partially filled-out version of the figure to an intermediate dcc.Store and then chaining a clientside callback that copies the x values from the first trace to all the other traces and outputs the result to the figure you want to update. That way you only need to send the x data across the network once.

Additionally, I wonder if the responses from your Dash app are being gzip compressed. Compression is not enabled in Dash by default because Dash apps are often run behind nginx servers that manage compression, but you can also turn it on within Dash by installing the extra with pip install dash[compress] and then creating the app with app = dash.Dash(__name__, compress=True). I think with that enabled, the repeated timestamps would take significantly less space than the raw “2024-05-14T00:00:00+02:00” strings do.
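
To see why compression helps so much with repeated timestamps, here is a rough standard-library sketch. It builds a figure-like payload with ten traces that all repeat the same x strings and gzips it. The payload shape and the y values are made up for illustration, and this calls `gzip.compress` directly rather than going through Dash, so the real ratio for your app will differ.

```python
import gzip
import json
from datetime import datetime, timedelta, timezone

tz = timezone(timedelta(hours=2))
start = datetime(2024, 5, 14, tzinfo=tz)
x = [(start + timedelta(days=i)).isoformat() for i in range(365)]

# Ten traces that all repeat the same x values, with made-up y values.
traces = [
    {"x": x, "y": [(i * 7 + t) % 100 for i in range(365)]}
    for t in range(10)
]

raw = json.dumps({"data": traces}).encode("utf-8")
packed = gzip.compress(raw)
print(f"raw: {len(raw)} bytes, gzipped: {len(packed)} bytes")
```

The repeated x lists are exactly the kind of redundancy gzip removes, which is why the compressed payload is a small fraction of the raw one.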

Thanks @michaelbabyn, setting the compress option solves the issue for now: it brings the payload down from 2 MB to 100 kB.

I am still wondering:
A) Are there examples of the clientside callback you mention that show how to do this on the browser side?
B) Wouldn’t it generally be a good idea to improve dash/plotly here, by enabling shared axis data somehow and/or using integer timestamps instead of strings to reduce the overall data throughput? Compression helps, but the underlying problem remains…

I don’t think there’s one online, but here’s an example of what I meant:

import dash
from dash import dcc, html, Input, Output

import plotly.graph_objs as go
import pandas as pd
import numpy as np

app = dash.Dash(__name__)

exchanges = ['NYSE', 'NASDAQ', 'LSE', 'TSE']
all_stocks = {}

for exchange in exchanges:
    stock_names = [f"{exchange}{i:03d}" for i in range(1, 21)]
    dates = pd.date_range(end=pd.Timestamp.today(), periods=365, freq='D')
    values = np.random.randint(50, 500, size=(365, 20))
    df = pd.DataFrame(values, index=dates, columns=stock_names)
    all_stocks[exchange] = df

# App Layout
app.layout = html.Div([
    dcc.Dropdown(
        id='exchange-dropdown',
        options=[{'label': x, 'value': x} for x in exchanges],
        value=exchanges[0]
    ),
    dcc.Store(id='figure-store'),  # Store the initial figure
    dcc.Graph(id='stock-graph')
])

@app.callback(
    output=Output('figure-store', 'data'),
    inputs=Input('exchange-dropdown', 'value'),
)
def update_graph(exchange):
    df = all_stocks[exchange]
    fig = go.Figure()
    for stock in df.columns:
        fig.add_trace(go.Scatter(x=df.index, y=df[stock], mode='lines', name=stock))
    fig_dict = fig.to_dict()
    # All traces share df.index, so keep x only on the first trace;
    # the clientside callback below copies it back onto the others.
    for trace in fig_dict['data'][1:]:
        del trace['x']
    return fig_dict

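app.clientside_callback(
    """
    function(figureDict) {
        // Nothing stored yet: leave the graph untouched.
        if (!figureDict || !figureDict.data || !figureDict.data.length) {
            return window.dash_clientside.no_update;
        }
        const xData = figureDict.data[0].x;
        // Build new trace objects so the stored data is not mutated;
        // traces that arrived without x get the first trace's x.
        const data = figureDict.data.map(
            (trace) => ('x' in trace ? trace : {...trace, x: xData})
        );
        return {...figureDict, data: data};
    }
    """,
    Output('stock-graph', 'figure'),
    Input('figure-store', 'data'),
)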


if __name__ == '__main__':
    app.run(debug=True, port=8055)

This results in about 1/5 of the data transferred compared to a version of the app that repeats the x values (~50 KB vs. 250 KB), but after enabling compression it’s only about 10% better (13.8 KB vs. 15.3 KB). If you’re still interested in this feature, feel free to create an issue in the plotly/dash issue tracker on GitHub.

Inefficient app, for comparison:

import dash
from dash import dcc, html, Input, Output
import plotly.graph_objs as go
import pandas as pd
import numpy as np

app = dash.Dash(__name__)

# Generate Fake Stock Data
exchanges = ['NYSE', 'NASDAQ', 'LSE', 'TSE']
all_stocks = {}

for exchange in exchanges:
    stock_names = [f"{exchange}{i:03d}" for i in range(1, 21)]  # 20 stocks per exchange
    dates = pd.date_range(end=pd.Timestamp.today(), periods=365, freq='D')
    values = np.random.randint(50, 500, size=(365, 20))
    df = pd.DataFrame(values, index=dates, columns=stock_names)
    all_stocks[exchange] = df
# App Layout
app.layout = html.Div([
    dcc.Dropdown(
        id='exchange-dropdown',
        options=[{'label': x, 'value': x} for x in exchanges],
        value=exchanges[0]  # Default to the first exchange
    ),
    dcc.Graph(id='stock-graph')
])

@app.callback(
    Output('stock-graph', 'figure'),
    Input('exchange-dropdown', 'value')
)
def update_graph(exchange):
    df = all_stocks[exchange]  

    fig = go.Figure()
    for stock in df.columns:
        fig.add_trace(go.Scatter(x=df.index, y=df[stock], mode='lines', name=stock))

    fig.update_layout(
        title=f'Stock Values on {exchange} (Last Year)',
        xaxis_title='Date',
        yaxis_title='Value'
    )
    return fig

if __name__ == '__main__':
    app.run(debug=True, port=8056)
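
One caveat with the Store-based version is that it drops x from every trace after the first, which is only safe when all traces really share the same x values. Below is a minimal sketch of a more defensive variant in plain Python. The helper names `strip_shared_x` and `restore_shared_x` are hypothetical (they are not part of Dash or Plotly), and the sketch assumes the figure is a plain JSON-like dict whose x values are Python lists; numpy arrays or a pandas index would need a different comparison.

```python
# Hypothetical helpers, not part of Dash or Plotly. They assume a plain
# JSON-like figure dict whose x values are Python lists.

def strip_shared_x(fig_dict):
    """Return a copy where traces whose x equals the first trace's x lose their x."""
    traces = fig_dict.get("data", [])
    if not traces or "x" not in traces[0]:
        return fig_dict
    first_x = traces[0]["x"]
    slim = [traces[0]]
    for trace in traces[1:]:
        if trace.get("x") == first_x:
            # Same x as the first trace: keep every key except x.
            trace = {k: v for k, v in trace.items() if k != "x"}
        slim.append(trace)
    return {**fig_dict, "data": slim}


def restore_shared_x(fig_dict):
    """Python mirror of the clientside step: x-less traces get the first trace's x."""
    traces = fig_dict.get("data", [])
    if not traces or "x" not in traces[0]:
        return fig_dict
    first_x = traces[0]["x"]
    full = [t if "x" in t else {**t, "x": first_x} for t in traces]
    return {**fig_dict, "data": full}


fig = {
    "data": [
        {"x": ["2024-05-14", "2024-05-15"], "y": [1, 2], "name": "A"},
        {"x": ["2024-05-14", "2024-05-15"], "y": [3, 4], "name": "B"},
        {"x": ["2024-06-01", "2024-06-02"], "y": [5, 6], "name": "C"},
    ],
    "layout": {},
}
slim = strip_shared_x(fig)         # B loses its x, C keeps its own
restored = restore_shared_x(slim)  # B gets A's x back
```

Note that a trace that never had an x of its own would also receive the first trace's x on restore, so this only round-trips cleanly when every trace starts out with x values.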

Thanks for sharing this!