How can I save a big DataFrame into a hidden Div

Hi!

I have a big DataFrame with data and I would like to:

  1. select some rows with customized filtering settings
  2. save this filtered table into a hidden Div for later use (displaying graphs, …)
  3. display the filtered table with dash_table.DataTable and its pagination settings

Problem: My DataFrame is big (more than 2500 rows), and when I try to save the entire table as JSON in the hidden Div, it does not work. But when it is filtered down to be ‘small enough’, everything works perfectly. What should I do? Has anyone had the same issue?
I was thinking about changing the way I share my filtered table, either with a dcc.Store or by saving it to a .txt file linked to the user's IP address. Is this a good idea?

Here is an example of code:

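import dash
import dash_core_components as dcc
import dash_html_components as html
import dash_table
import pandas as pd
from dash.dependencies import Input, Output, State

# get_data(), filtering() and liste_options are defined elsewhere (not shown)
dff = get_data()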

external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
app = dash.Dash(__name__, external_stylesheets=external_stylesheets)

app.layout = html.Div(children=[
    
    html.Div(id='filtered-tab', style={'display': 'none'}),

    dcc.Dropdown(
        id='drop-example-filter',
        options=[{'label': i, 'value': i} for i in liste_options],
        multi=True,
        value=[],
    ),
    html.Button('Submit', id='button'),
    dash_table.DataTable(
        id='tab',
        columns=[{'name': i, 'id': i} for i in dff.columns],
        pagination_settings={
            'current_page': 0,
            'page_size': 25
        },
        pagination_mode='be'
    )
])

@app.callback(
    Output('filtered-tab', 'children'),
    [Input('button', 'n_clicks')],
    [State('drop-example-filter', 'value')])
def update_filtered_tab(n_clicks, filter_value):
    # Filter the full table and store it as JSON in the hidden Div
    filtered_dff = filtering(dff, filter_value)
    return filtered_dff.to_json(date_format='iso', orient='split')

@app.callback(
    Output('tab', 'data'),
    [Input('filtered-tab', 'children'),
     Input('tab', 'pagination_settings')])
def update_table(filtered_tab, pagination_settings):
    # Rebuild the DataFrame from the JSON stored in the hidden Div
    filtered_dff = pd.read_json(filtered_tab, orient='split')
    # Return only the rows of the requested page
    start = pagination_settings['current_page'] * pagination_settings['page_size']
    end = start + pagination_settings['page_size']
    return filtered_dff.iloc[start:end].to_dict('records')

I tried with a dcc.Store and with a data file, and I have the same problem. Has anyone faced the same problem? Or does somebody have another idea I can try?

Thanks!

One option could be to save the data in a file server-side. I guess that performance-wise this might even be better than the hidden Div, as you exchange less data between the client and the server. The file name could be, e.g., a session ID or a custom ID stored in a hidden Div. An efficient approach for writing the data to the file could be via NumPy.
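
To make the idea concrete, here is a rough sketch of the server-side file approach using only the standard library. The names `CACHE_DIR`, `new_session_id`, `save_filtered` and `load_page` are made up for illustration, and it assumes the payload is the string produced by `filtered_dff.to_json(orient='split')`. The hidden Div would then hold only the session ID, not the table.

```python
import json
import os
import tempfile
import uuid

# Hypothetical cache directory; a real deployment would pick a managed location.
CACHE_DIR = tempfile.mkdtemp(prefix="dash_filtered_")


def new_session_id():
    """Return a random ID that the hidden Div can hold instead of the data."""
    return uuid.uuid4().hex


def _path_for(session_id):
    """Build the cache file path for one session."""
    return os.path.join(CACHE_DIR, session_id + ".json")


def save_filtered(session_id, split_json):
    """Write the string from DataFrame.to_json(orient='split') to a file."""
    with open(_path_for(session_id), "w", encoding="utf-8") as fh:
        fh.write(split_json)


def load_page(session_id, current_page, page_size):
    """Read the cached table and return one page as a list of row dicts."""
    with open(_path_for(session_id), encoding="utf-8") as fh:
        table = json.load(fh)
    start = current_page * page_size
    rows = table["data"][start:start + page_size]
    return [dict(zip(table["columns"], row)) for row in rows]
```

In the app, the first callback would call `save_filtered` and return the session ID, and the table callback would call `load_page` with the current pagination settings. This sketch re-reads the whole file for every page, which is simple but not the fastest option for very large tables.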

However, if you need to do lots of querying, a DB might be a better option.
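
As a hedged illustration of the DB route, the sketch below uses the standard-library `sqlite3` module. The table layout (`filtered_rows` with `name` and `score` columns) and the helper names `open_store`, `save_rows` and `query_page` are assumptions for the example, not part of the original app.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open the database and create the (hypothetical) table if needed."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS filtered_rows ("
        "session_id TEXT, row_no INTEGER, name TEXT, score REAL)"
    )
    return conn


def save_rows(conn, session_id, rows):
    """Replace the stored rows of one session with a new filtered result."""
    conn.execute("DELETE FROM filtered_rows WHERE session_id = ?", (session_id,))
    conn.executemany(
        "INSERT INTO filtered_rows VALUES (?, ?, ?, ?)",
        [(session_id, i, name, score) for i, (name, score) in enumerate(rows)],
    )
    conn.commit()


def query_page(conn, session_id, current_page, page_size):
    """Let the database do the paging with LIMIT/OFFSET."""
    cur = conn.execute(
        "SELECT name, score FROM filtered_rows WHERE session_id = ? "
        "ORDER BY row_no LIMIT ? OFFSET ?",
        (session_id, page_size, current_page * page_size),
    )
    return [{"name": n, "score": s} for n, s in cur.fetchall()]
```

Only the requested page leaves the database, so nothing large has to pass through the browser. In a multi-threaded server it is probably safer to open a connection inside each callback rather than share one.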

Thank you Emil!

But the problem was actually coming from Git… I was not able to store more than 10 MB (?) in a Div. So I split the data I need across several hidden Divs and filtered the data upfront (see Example 2).