I am using dash-table-experiments (https://github.com/plotly/dash-table-experiments). I have a large dataframe (~150,000 rows) that I use to fill the rows of the DataTable. Unfortunately, when I pass in all the rows I get a memory error or other problems with the sortable/filterable features of the DataTable. I don’t need to display all 150,000 rows in the DataTable; a more manageable number like 1,000 or 10,000 is fine. But I do need to be able to search/filter the entire dataframe, not just the 1,000 or 10,000 rows passed as the rows value. It would be awesome to have an option on the DataTable like “max_rows_to_present” that would allow filtering the entire dataframe while presenting at most “max_rows_to_present” rows. Is this possible already? Could this feature be implemented? Do you have any ideas about how I could implement this myself?
Dash and the DataTables feature are super super super cool! @chriddyp
I’m a React noob, but do you think it would be possible to change this._absolute.rows to read from some new parameter (which doesn’t exist yet) that represents all the rows, perhaps something like this._absolute.all_rows? I’m going to clone the repo and play around with this!
I resorted to something like the following, where the filtering/sorting is handled separately in a callback that has knowledge of the whole dataframe, and only the first 100 matching rows are passed to the DataTable:
import dash
import dash_table_experiments as dt
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html
import pandas as pd

# Demo dataframe: 100,000 rows, far more than the table should render at once.
df = pd.DataFrame({
    'number': list(range(100000)),
    'value': [str(i) + '_value' for i in range(100000)],
    'value1': [str(i) + '_value1' for i in range(100000)],
})

app = dash.Dash(__name__)
app.scripts.config.serve_locally = True

app.layout = html.Div([
    html.H4('Fake Table'),
    html.Div([
        html.P('Search a number (up to 100,000): ', style={'display': 'inline-block'}),
        dcc.Input(type='text', value='', id='input1'),
        html.Button('SORT', id='input2'),
    ], style={'display': 'inline-block'}),
    dt.DataTable(
        # Only the first 100 rows are ever sent to the browser.
        rows=df.head(100).to_dict('records'),
        columns=sorted(df.columns),
        filterable=False,
        sortable=True,
        selected_row_indices=[],
        id='datatable'
    ),
])


def filter_dataframe(val, clicks):
    """
    For user selections, return the relevant in-memory data frame.
    """
    result = df
    if clicks:
        # Odd click counts sort descending, even counts ascending.
        # No inplace=True, so the shared global df is never mutated.
        result = df.sort_values('number', ascending=(clicks % 2 == 0))
    # regex=False: treat the search text as a literal substring.
    return result.loc[result.number.astype(str).str.contains(val or '', regex=False)]


@app.callback(Output('datatable', 'rows'),
              [Input('input1', 'value'), Input('input2', 'n_clicks')])
def update(val, clicks):
    filtered = filter_dataframe(val, clicks)
    return filtered.head(100).to_dict('records')


if __name__ == "__main__":
    app.run_server(debug=True)
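
The core of this workaround (search every row on the server, hand only a capped slice to the table) can be pulled out into a plain pandas helper that is easy to try without running Dash. This is just a minimal sketch: the function name filter_and_cap and its max_rows_to_present argument are made up for illustration and are not part of dash-table-experiments.

```python
import pandas as pd


def filter_and_cap(df, query, max_rows_to_present=100, column="number"):
    """Search every row of df, but return at most max_rows_to_present records.

    Hypothetical helper: neither the name nor the arguments come from
    dash-table-experiments.
    """
    query = query or ""
    # Literal substring match (regex=False), so "." or "(" are not treated as regex.
    mask = df[column].astype(str).str.contains(query, regex=False)
    matches = df.loc[mask]
    # Only the capped slice would be serialized and sent to the browser;
    # the total match count is returned so the UI could report it.
    return matches.head(max_rows_to_present).to_dict("records"), len(matches)


df = pd.DataFrame({
    "number": range(100000),
    "value": [f"{i}_value" for i in range(100000)],
})

rows, total_matches = filter_and_cap(df, "99999", max_rows_to_present=10)
```

In a callback like update above, the first return value could be used as the rows output, and the second could feed a small "showing N of M matches" label if you want to tell the user that the view is truncated.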