App displaying slices of large dataset (best practice)

bjonen · January 7, 2019, 9:18am

Dear all,

I have been playing around with ideas from https://dash.plot.ly/sharing-data-between-callbacks for a while now.

I have an application made up of ~10 charts that all depend on one dropdown. The data is essentially one very large pandas DataFrame. All the charts are just slices or other relatively fast computations of that same dataset along different dimensions, so the computation cost is rather low. The df is updated with new data over time and persisted to an HDF5 store, for fast reading, by a separate ‘data update process’.

The data from the persisted HDF5 file is pushed to the application on a regular basis (every few hours) using a layout that is re-evaluated on page reload and an interval component, along the lines of https://dash.plot.ly/live-updates.

I now see two ways to organize the data flow:

1. Have only one worker process in which the df is updated globally. I understand that this goes against the fundamental principles of Dash. But as long as I do not run the app with multiple workers, and all users are supposed to see the same (possibly updated) data, is this really an issue?
2. Pull the up-to-date version of the df from a global store in each graph callback. I am using a Redis server to hold the data in memory so that I do not have to read from HDF5 in every callback. However, it seems to me that the cost of loading the dataset each time far outweighs the performance gain from having several workers.

Are there other problems with 1) that I do not see right now? Are there maybe other approaches that you would recommend?
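
To make the question more concrete, here is a minimal sketch of one middle ground I could imagine: each worker process keeps its own in-memory copy and only re-reads the file when its modification time changes. The class name `ReloadingCache`, the file name `data.h5`, the key `df` and the use of `pandas.read_hdf` as the loader are all illustrative assumptions, not something from my actual app.

```python
import os
import threading


class ReloadingCache:
    """Hold one in-process copy of a file-backed dataset.

    The file is re-read only when its modification time changes, so most
    callbacks pay for a single os.stat() call instead of a full load.
    """

    def __init__(self, path, loader):
        self._path = path
        self._loader = loader          # hypothetical: any callable taking a path
        self._lock = threading.Lock()  # guards reloads under threaded servers
        self._mtime = None
        self._data = None

    def get(self):
        mtime = os.stat(self._path).st_mtime_ns
        with self._lock:
            if mtime != self._mtime:
                # File changed since the last load (or first access): reload.
                self._data = self._loader(self._path)
                self._mtime = mtime
            return self._data


# Hypothetical usage inside a Dash app (assumes pandas and an HDF5 key "df"):
#
#   import pandas as pd
#   cache = ReloadingCache("data.h5", lambda p: pd.read_hdf(p, "df"))
#
#   def update_chart(dropdown_value):
#       df = cache.get()               # treat as read-only inside callbacks
#       return make_figure(df[df["dim"] == dropdown_value])
```

With several workers, each process would hold its own copy, so memory use grows with the number of workers. This also assumes the update process replaces the file atomically (for example by writing to a temporary file and renaming it), so that a worker never reads a half-written store.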

Thanks!
