Dash-table: Graded Color Scale (conditional formatting) slow

bjonen · October 1, 2019, 9:45am

Hi all,

I am trying to replicate the MS Excel graded color scale in a dash table. The only way I see from the docs to create this behavior is to create a formatting rule that sets a backgroundcolor based on the id column and a filter query. I am dealing with a table of 1000 rows. The table becomes unusable, both with and without virtualization.

Is there a better way to proceed?

My code looks like this

sub = df[[idcol, fld]].sort_values(fld)
colors = pd.Series(data=sns.diverging_palette(h_neg=10, h_pos=251, s=64, l=48,
                                              sep=26, n=len(sub)), index=sub[idcol])
colors = colors.apply(matplotlib.colors.to_hex)
style_cond = [
    {
        'if': {
            'column_id': fld,
            'filter_query': '{} eq "{}"'.format('{'+idcol+'}', myid)
        },
        'backgroundColor': colors[myid],
        'color': 'black'
    } for myid in sub[idcol]
]

Emmanuelle · October 1, 2019, 1:54pm

Hi Benjamin, we would expect the table to behave nicely even with 1000 rows and 20-30 columns without virtualization. Could you please share

a standalone code example (much faster to reproduce your issue for us)
and also the version of Dash you’re using (some specific versions suffered from significant performance degradation )
Thanks!

bjonen · October 1, 2019, 5:08pm

Hey Emmanuelle,

attached please find a minimal example showing how the graded color scale makes the table very slow. On my machine with 1000 observations takes very long to load (if at all). If you then try to sort for example the table freezes.
When using virtualizaiton=True the table loads well but then scrolling is impossible.


import dash_html_components as html
import dash_core_components as dcc
import dash_bootstrap_components as dbc
import dash
import dash_table
import pandas as pd

# pip install -U --user dash
# dash versions:
# dash_html_components 1.0.1
# dash_core_components 1.2.1
# dash 1.3.1
# dash_table 4.3.0
# dash_daq 0.1.5
# dash_bootstrap_components 0.6.3

# ====== Number of rows in the table =====
# with 100 it runs on my machine, with 1000 it becomes very slow
Nobs = 1000


# SET UP SAMPLE TABLE
def get_id(N=15):
    import random
    import string
    return ''.join(random.choice(string.ascii_uppercase + string.digits) for _ in range(N))
def get_ids(N=1000):
    return [get_id() for _ in range(N)]
ids = get_ids(N=Nobs)
nvalcols = 10
index = ['id']+['val{}'.format(ix) for ix in range(nvalcols)]
data = [ids] + [list(pd.np.random.randn(Nobs)) for _ in range(nvalcols)]
df = pd.DataFrame(data=data, index=index).T


col_renames = {}
FORMATS = {}
TYPES = {}
cols = [
    {
        'name': col_renames.get(i, i),
        'id': i,
        'hideable': True,
        'format': FORMATS.get(i, {}),
        'type': TYPES.get(i, {}),
    } for i in df.columns
]

def add_graded_color_scale(df, idcol, fld, style_cell_conditional):
    import pandas as pd
    import matplotlib
    import seaborn as sns
    # creating a color palette sns.choose_diverging_palette()
    sub = df[[idcol, fld]].sort_values(fld)
    colors = pd.Series(data=sns.diverging_palette(h_neg=10, h_pos=251, s=64, l=48,
                                                  sep=26, n=len(sub)), index=sub[idcol])
    colors = colors.apply(matplotlib.colors.to_hex)
    style_cond = [
        {
            'if': {
                'column_id': fld,
                'filter_query': '{} eq "{}"'.format('{'+idcol+'}', myid)
            },
            'backgroundColor': colors[myid],
            'color': 'black'
        } for myid in sub[idcol]
    ]
    style_cell_conditional.extend(style_cond)
    return style_cell_conditional

style_data_conditional = []
style_cell_conditional = []
for fld in ['val{}'.format(ix) for ix in range(nvalcols)]:
    style_cell_conditional = add_graded_color_scale(df=df, idcol='id', fld=fld,
                                                    style_cell_conditional=style_cell_conditional)

print(df.head())


table = dash_table.DataTable(
        id='color-scale-table',
        data=df.to_dict('records'),
        columns=cols,
        style_data={
            'fontSize': '12px',
            'height': 'auto',
            ###################### elipsis block
            'overflow': 'hidden',
            'textOverflow': 'ellipsis',
            'maxWidth': 0,
            #####################
            #'minWidth': '100px',
        },
        fixed_rows={'headers': True, 'data': 0},

        style_header={
            'whiteSpace': 'normal',
            'textAlign': 'center',
            # Coloring
            'backgroundColor': 'black',
            'color': 'white',
        },
        style_filter={
            'backgroundColor': '#B2BEB5',
        },
        style_data_conditional=style_data_conditional,
        style_cell_conditional=style_cell_conditional,
        sort_action='native',
        sort_mode='multi',
        export_format='xlsx',
        style_as_list_view=True,
        hidden_columns=['sid'],
        tooltip={
            'property': 'sec_sl',
            'type': 'text',
            'value': 'Hello World',
        },

        # virtualization=True,
        page_action='none',
        filter_action='native',
        style_table={
            'minHeight': '80vh',
            'height': '80vh',
        },

)

def get_layout():
    row1 = html.Div([
        table
    ], style={
        'padding': '1%',
    })
    return row1


def gen_callbacks(app):
    pass


def getapp(server=True):
    app = dash.Dash(name=__name__, server=server,
                    url_base_pathname='/')
    app.layout = get_layout()
    gen_callbacks(app=app)
    return app


if __name__ == '__main__':
    app = getapp()
    app.run_server(host='0.0.0.0', debug=False, processes=1,
                   threaded=False, port=8050)

Emmanuelle · October 3, 2019, 9:14pm

Thank you for sharing this code @bjonen, it makes it much easier to test what’s going on, So what happens if that for each cell you have a very large number of queries to run (10000 I think), which is why it takes such a long time (on my laptop it took about one minute for the app to load). What you would like here I think is a gradient, that is a mapping between cell values and another property (background color): This would be only one simple operation per cell, so much faster. It is a feature that we’re interested in implementing here at plotly, but it’s likely that we’ll need to find first a sponsor interested in funding this development (https://plot.ly/products/consulting-and-oem/). So please stay tuned…

bjonen · November 25, 2020, 11:44pm

In case someone else runs into the same problem: The documentation now contains an example of the gradient solution to this issue:

Topic		Replies	Views
Automatic Gradient Highlighting for a data frame value Dash Python	5	2838	February 10, 2022
Row-wise ColorScale - Conditional Formatting Dash Python	0	745	November 30, 2021
Display large number of columns in Dash datatable Dash Python	3	830	June 1, 2022
Dash DataTable - Conditional Formatting Dash Python	11	10970	May 2, 2020
DataTable formulaic backgroundColor setting Dash Python	5	1275	November 22, 2020

Dash-table: Graded Color Scale (conditional formatting) slow

Related topics