Black Lives Matter. Please consider donating to Black Girls Code today.

Merging dash datatables - best way to do this?

My default view is a datatable with a list of items. Each item has statistics associated with it - for example, max, min, stddev, etc… Because the statistics take a while to execute, they are disabled by default and only loaded if the user checks a checkbox to show statistics.

If the box is checked I load a second dataframe with the raw data, calculate the statistics, merge it with the master dataframe, then display that as a Dash datatable.

Several columns are formatted differently (some as currency, others a number, others a link, etc) so I cannot use the snippet of code from the documentation:

        columns=[
            {"name": i, "id": i} for i in sorted(df.columns)
        ],

If there is a column that exists in the columns= variable, but not in the dataframe, and error will occur.

I intend to have several other sources pulled from the database and merged into the master df in the same way as the stats data, so the columns being loaded/displayed will depend on what the user has selected.

What is the best way to format the resultant datatable columns if I am not sure which columns are going to exist in the final output? Is there a function that allows two separate datatables to be merged together?

If not, my thought is to have a dictionary with all possible column names and how they are formatted, and look this up during the call and format accordingly. I wasn’t sure if there was a better way to do this.

I got this working with the following code:

    table = dash_table.DataTable(
        id='contracts_table',
        columns=[
            next((item for item in all_cols if item["id"] == c), {"name": c, "id": c}) for c in df.columns
        ]

all_cols will define how everything looks. These columns don’t need to be in the final dataframe in order for this to work:

all_cols = [
{
    'id': 'itemname',
    'name': 'Item Name',
    'type': 'text'
}, {
    'id': 'max',
    'name': 'Max',
    'type': 'numeric',
    'format': FormatTemplate.money(4)
}, {
    'id': 'url',
    'name': 'URL',
    'type': 'text',
    'presentation': 'markdown',
}
]

If you want the columns to appear in the order you listed in all_cols, use this function:

# Sort dataframe columns
sort_order = []
for item in all_cols:
    id = item['id']
    if id in df.columns:
        sort_order.append(id)
df = reorder_df_columns(df, first_cols=sort_order)

Reordering function:

    def reorder_df_columns(df: pd.DataFrame, first_cols=[], last_cols=[], drop_cols=[]):
        columns = df.columns
        columns = list(set(columns) - set(first_cols))
        columns = list(set(columns) - set(drop_cols))
        columns = list(set(columns) - set(last_cols))
        new_order = first_cols + columns + last_cols
        df = df[new_order]
        return df