Bug in datatable virtualization when combined with cell dropdowns

When I combine the DataTable dropdown and virtualization examples from the Dash docs ("Dropdowns Inside DataTable" and "Virtualization", respectively), changing a cell dropdown completely breaks the table, but only for cells that are farther down and require scrolling to reach. Even though the assumptions for virtualization seem to be met, the table reports the following error:

Invalid argument `data` passed into DataTable with ID "table-dropdown".
Expected an array.
Was supplied type `object`.

An MWE to reproduce this error:


import dash
import dash_html_components as html
import dash_table
import pandas as pd
from collections import OrderedDict
import numpy as np

app = dash.Dash(__name__)

# Build enough rows that the table has to scroll (the bug only shows further down).
n = 100
climates = ['Sunny', 'Snowy', 'Rainy', 'Cloudy'] * (n // 4)
temperatures = [x for x in np.random.randint(0, 60, n)]
cities = ['NYC', 'Montreal', 'Miami', 'NYC'] * (n // 4)
df = pd.DataFrame(OrderedDict([
    ('climate', climates),
    ('temperature', temperatures),
    ('city',cities)
]))

app.layout = html.Div([
    dash_table.DataTable(
        id='table-dropdown',
        data=df.to_dict('records'),
        columns=[
            {'id': 'climate', 'name': 'climate', 'presentation': 'dropdown'},
            {'id': 'temperature', 'name': 'temperature'},
            {'id': 'city', 'name': 'city', 'presentation': 'dropdown'},
        ],
        editable=True,
        dropdown={
            'climate': {
                'options': [
                    {'label': i, 'value': i}
                    for i in df['climate'].unique()
                ]
            },
            'city': {
                'options': [
                    {'label': i, 'value': i}
                    for i in df['city'].unique()
                ]
            }
        },
        page_action='none',
        virtualization=True,
        fixed_rows={'headers': True, 'data': 0},
        style_cell={
            'whiteSpace': 'normal'
        },
        style_data_conditional=[
            {'if': {'column_id': 'climate'},
             'width': '50px'},
            {'if': {'column_id': 'temperature'},
             'width': '50px'},
            {'if': {'column_id': 'city'},
             'width': '100px'}
        ],
    )
])

if __name__ == '__main__':
    app.run_server(debug=True)

Am I missing something in my DataTable setup, or is this a bug in DataTable itself? If it is a bug, is it (easily) fixable?

Although it seems to have been quite a while since this question was posted, thank you @raypalmertech for asking it in the first place. Your diagnosis of virtualization as a causal factor saved me time and headache when I ran into this issue for the first time today. There are no answers posted (yet?), but sometimes just sharing what one knows about a problem can be immensely useful to others.

In-table dropdowns seem to rely on assumptions about the table data (e.g. “static” row indexing and associated data after scrolling) that do not hold when virtualization is in effect. This issue can also break diff-style callbacks that work out which table value(s) a user has just changed.
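
For readers who have not written one, here is a minimal sketch of what I mean by a diff-style callback. It is not taken from anyone's actual app: it assumes the callback receives the table's `data` and `data_previous` props as lists of row dicts, and `changed_cells` is a hypothetical helper name. Note that it matches rows by position, which is exactly the kind of assumption that seems to stop holding once virtualization is on.

```python
def changed_cells(previous, current):
    """Return one entry per cell whose value differs between two snapshots.

    Assumption: both snapshots are lists of row dicts in the same order,
    so row i in `previous` is the same record as row i in `current`.
    """
    changes = []
    for row_index, (old_row, new_row) in enumerate(zip(previous, current)):
        for column_id, new_value in new_row.items():
            old_value = old_row.get(column_id)
            if old_value != new_value:
                changes.append({
                    'row': row_index,
                    'column': column_id,
                    'old': old_value,
                    'new': new_value,
                })
    return changes


# Example snapshots: the user changed the city dropdown in the second row.
before = [
    {'climate': 'Sunny', 'temperature': 13, 'city': 'NYC'},
    {'climate': 'Rainy', 'temperature': 43, 'city': 'Miami'},
]
after = [
    {'climate': 'Sunny', 'temperature': 13, 'city': 'NYC'},
    {'climate': 'Rainy', 'temperature': 43, 'city': 'Montreal'},
]
print(changed_cells(before, after))
```

If `data` arrives in an unexpected shape or row order, a positional comparison like this either fails outright or reports the wrong cells, which matches what I saw.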

In my case I was able to simply remove virtualization from the affected table, and use custom paging and serverside outputs to get the necessary performance benefits in a different way. It’s a workaround rather than a solution, and won’t be adequate for all foreseeable situations or needs. But asking “What am I really trying to achieve here?” and considering alternate approaches was enough to semi-solve my issue… this time.
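
For anyone wanting to try the same workaround, a rough sketch of the custom paging part might look like the following. It assumes the table is created with `page_action='custom'` (plus `page_current` and `page_size`) instead of `virtualization=True`, and that a callback takes `page_current` and `page_size` as inputs and returns the slice as the table's `data`. `page_slice` and `page_count_for` are hypothetical helper names, and the serverside-output part of my setup is not shown.

```python
import math


def page_slice(records, page_current, page_size):
    """Return only the rows belonging to the requested zero-based page."""
    start = page_current * page_size
    return records[start:start + page_size]


def page_count_for(total_rows, page_size):
    """Number of pages needed to show every row (could feed `page_count`)."""
    return math.ceil(total_rows / page_size)


# Stand-in for df.to_dict('records') from the original example.
records = [{'row': i, 'city': 'NYC'} for i in range(100)]

third_page = page_slice(records, page_current=2, page_size=20)
print(len(third_page), third_page[0]['row'], third_page[-1]['row'])
print(page_count_for(len(records), 20))
```

With this approach the browser only ever holds one page of rows, so the dropdown cells work against a small, stable `data` array. The trade-off is that the user pages instead of scrolling, and edits have to be written back to the full dataset on the server side.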

Surely others have better ideas than mine. Maybe this note will bring new visibility to the issue.
