Dash datatable large dataset slow performance, scrolling callback?

notincontrol · February 10, 2020, 5:12pm

Hey guys,

I am using a Dash DataTable in a webpage that is updated from a dropdown callback, and the performance is very slow. The dataset is about 50 MB, with over 65,000 rows and 20 columns.

I have noticed the bottleneck is converting the pandas DataFrame to a dictionary to be consumed by the DataTable on every filter iteration of my data (I am performing server-side filtering). It took about 15-20 seconds to convert the data to a dictionary.

So I converted the dataset to a list of dictionaries once, kept a reference to it, and now filter that list on every callback. The performance is much faster, but the bottleneck now appears to be transferring the entire dataset to the client after every filter (about 8 seconds locally).
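For anyone following along, here is a minimal sketch of that "convert once, filter the cached list" idea. The sample rows, the column names, and the `filter_records` helper are all hypothetical; in the real app the list would presumably come from a single `df.to_dict("records")` call at startup.

```python
# Hypothetical stand-in for the cached result of df.to_dict("records"),
# built once at startup instead of inside every callback.
RECORDS = [
    {"region": "east", "product": "a", "sales": 10},
    {"region": "west", "product": "b", "sales": 25},
    {"region": "east", "product": "c", "sales": 7},
]


def filter_records(records, column, value):
    """Return the rows whose `column` equals `value`.

    A value of None (e.g. an empty dropdown) means "no filter".
    """
    if value is None:
        return records
    return [row for row in records if row.get(column) == value]


# Inside a dropdown callback, only this cheap list filter would run.
east_rows = filter_records(RECORDS, "region", "east")
```

This only removes the conversion cost; whatever the callback returns still has to be serialized and sent to the browser.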

Is there a way to stream the data to the client, or only load a subset and then when scrolling through the dataset, if the user gets outside of the dataset window, a callback on scrolling can be triggered to load more rows?

I want to avoid using pagination; I just want the user to be able to scroll through the entire dataset. To improve performance, I think I need to load only the rows that are in demand.

I am thinking of something similar to this example: https://datatables.net/extensions/scroller/examples/initialisation/server-side_processing.html

I do have virtualization enabled.

As a test of the above, I modified the callback to return only the first 2000 rows of the filtered list, to reduce the data sent to the client (all filtering is still done on the full set of items). The performance is amazing, which leads me to believe that if I can figure out how to stream the data, it will solve the issue.

Also, if there are any other recommendations for using larger datasets with data tables, I am open to all suggestions. If I should change my approach completely, please let me know.

Thanks,
Kevin

chriddyp · February 11, 2020, 3:18am

We don’t have server-side virtualization right now, but that’d be really cool. For now, maybe try pagination but with very large pages (like 5k rows).
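A rough sketch of the slicing that large-page pagination would need on the server side, assuming the table is set up with `page_action="custom"` and a callback that takes `page_current` and `page_size` as inputs. The helper names `page_slice` and `page_count` are hypothetical, and the rows below are placeholder data sized like the dataset in the question.

```python
def page_slice(records, page_current, page_size=5000):
    """Return only the rows for the requested zero-based page."""
    start = page_current * page_size
    return records[start:start + page_size]


def page_count(n_rows, page_size=5000):
    """Number of pages needed for n_rows (at least 1, so the table shows a page)."""
    return max(1, -(-n_rows // page_size))  # ceiling division


# Placeholder rows standing in for the filtered list of dictionaries.
rows = [{"id": i} for i in range(65000)]

first_page = page_slice(rows, 0)       # rows 0..4999
last_page = page_slice(rows, 12)       # rows 60000..64999
total_pages = page_count(len(rows))
```

In a callback, the filtered list would go through `page_slice` before being returned as the table's `data`, so only one page crosses the wire per update.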

notincontrol · February 11, 2020, 10:10pm

Thanks for the quick reply. Will go the pagination route for now.
