I took the generic crossfiltering recipe from Part 4 of the Dash documentation ("Interactive Graphing and Crossfiltering") and tried it on my dashboard.
However, I cannot make it work if my dataframe is pre-filtered and there is no longer any correspondence between pointNumber and df.index.
Here is a minimal working example which illustrates the problem:
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.express as px
import numpy as np
import pandas as pd

external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
app = dash.Dash(__name__, external_stylesheets=external_stylesheets)
df = pd.read_csv('https://raw.githubusercontent.com/ChrisG60/Diams/master/diamonds.csv')

app.layout = html.Div([
    dcc.Graph(id='carat-graph', config={'displayModeBar': False}),
])

@app.callback(
    Output('carat-graph', 'figure'),
    Input('carat-graph', 'selectedData'),
)
def update_carat(selection):
    filtered_df = df.copy()
    # FIXME: Breaks crossfiltering:
    filtered_df = filtered_df[filtered_df.cut == 'Ideal']
    selectedpoints = filtered_df.index
    if selection and selection['points']:
        selectedpoints = np.intersect1d(selectedpoints, [p['customdata'] for p in selection['points']])
    fig = px.scatter(filtered_df,
                     x='carat',
                     y='price',
                     marginal_x='histogram',
                     marginal_y='histogram',
                     )
    fig.update_traces(selectedpoints=selectedpoints,
                      customdata=filtered_df.index,
                      )
    fig.update_layout(dragmode='select')
    return fig

if __name__ == '__main__':
    app.run_server(debug=True)
Everything works perfectly unless the dataframe is filtered (see the line marked # FIXME).
If I do this, there are two problems:
1. The scatter plot shows a weird selection pattern on loading: usually only a few points show up, and I have to double-click the plot to reset it.
2. After making a selection, different points show up as selected than the ones I picked.
From the documentation, I see that selectedpoints takes a list of pointNumbers, i.e. positions within the trace, but after filtering the dataframe these positions no longer match df.index.
Is there any way around this issue? Why do I have to use customdata in the first place if this field cannot be used to show the selected points anyway?
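One possible way around this, sketched here under assumptions rather than taken from the Dash documentation, is to translate the row labels that come back in customdata into positions within the filtered frame before passing them to selectedpoints. The helper name labels_to_positions is hypothetical, and plain lists stand in for filtered_df.index and the selection payload so that the logic can be checked without Dash:

```python
def labels_to_positions(index_labels, selected_labels):
    """Map original row labels to 0-based positions in the filtered data.

    index_labels: labels of the rows that are actually plotted, in plot order
                  (for example list(filtered_df.index)).
    selected_labels: labels reported back via customdata; None or empty means
                     nothing has been brushed yet, so every point is selected.
    """
    if not selected_labels:
        return list(range(len(index_labels)))
    # Position of each label inside the filtered frame.
    position_of = {label: pos for pos, label in enumerate(index_labels)}
    # Labels that did not survive the filter are silently dropped.
    return sorted(position_of[lab] for lab in selected_labels if lab in position_of)


# Rows 2, 5, 7 and 9 survive the filter; the user brushed rows 5 and 9.
print(labels_to_positions([2, 5, 7, 9], [9, 5]))  # [1, 3]
print(labels_to_positions([2, 5, 7, 9], None))    # [0, 1, 2, 3]
```

In the callback this would stand in for both the initial selectedpoints = filtered_df.index and the np.intersect1d call, which currently mix labels with positions. That mix is probably what causes the odd selection on load. With pandas, filtered_df.index.get_indexer(labels) should give the same mapping, returning -1 for labels that are not present.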
In the case with several traces (e.g. when using color), there are different curveNumbers, and if you specify a list of selectedpoints it is applied to every curve. That is, if there are 6 curves and 4 selected points, you get 6*4 selected points, 4 in each curve.
See this example:
import dash
import dash_core_components as dcc
import dash_html_components as html
from dash.dependencies import Input, Output
import plotly.express as px
import numpy as np
import pandas as pd

external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
app = dash.Dash(__name__, external_stylesheets=external_stylesheets)
df = pd.read_csv('https://raw.githubusercontent.com/ChrisG60/Diams/master/diamonds.csv')

app.layout = html.Div([
    dcc.Graph(id='carat-graph', config={'displayModeBar': False}),
])

@app.callback(
    Output('carat-graph', 'figure'),
    Input('carat-graph', 'selectedData'),
)
def update_carat(selection):
    # Reset Index is necessary to create a correspondence between index and selectedpoints
    filtered_df = df.copy().sample(100, random_state=1337).reset_index()
    selectedpoints = filtered_df.index
    if selection and selection['points']:
        selectedpoints = np.intersect1d(selectedpoints, [p['pointNumber'] for p in selection['points']])
    fig = px.scatter(filtered_df,
                     x='carat',
                     y='price',
                     color='cut',
                     marginal_x='histogram',
                     marginal_y='histogram',
                     )
    fig.update_traces(selectedpoints=selectedpoints)
    fig.update_layout(dragmode='select')
    return fig

if __name__ == '__main__':
    app.run_server(debug=True)
How can you properly do the selection in that case?
I read that the plotly.graph_objects.Figure documentation (4.14.3) has some options to select the trace.
But for each curve the numbering starts at zero, so there is again no correspondence between pointNumber and the index, and I do not know which point I'm looking at.
Also, setting customdata does not work here, because I would need to know beforehand which points are in which trace.
How should this work? Does anyone have a hint for me?
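A rough sketch of how the multi-trace case might be handled, assuming every point carries a stable row ID and that the IDs of each trace are known in drawing order: compute a separate selectedpoints list for each trace instead of sharing one list across all curves. The function name selected_positions_per_trace is hypothetical, and nested lists stand in for the per-trace customdata:

```python
def selected_positions_per_trace(trace_ids, selected_ids):
    """For each trace, list the positions of its points whose ID was selected.

    trace_ids: one list of row IDs per trace, in the order the points are drawn.
    selected_ids: IDs collected from the selection event; None or empty means
                  nothing has been brushed, so every point stays selected.
    """
    if not selected_ids:
        return [list(range(len(ids))) for ids in trace_ids]
    wanted = set(selected_ids)
    # Numbering restarts at zero in every trace, matching pointNumber.
    return [[pos for pos, rid in enumerate(ids) if rid in wanted]
            for ids in trace_ids]


# Two traces (say two values of 'cut') holding rows 10, 12 and rows 11, 13, 14.
traces = [[10, 12], [11, 13, 14]]
print(selected_positions_per_trace(traces, [12, 13, 14]))  # [[1], [1, 2]]
```

As far as I understand, passing custom_data=['index'] to px.scatter (the column created by reset_index()) attaches the matching IDs to each trace, so the points would not need to be sorted into traces by hand. The per-trace lists could then be assigned one trace at a time rather than through a single fig.update_traces(selectedpoints=...) call. I have not checked how the marginal histogram traces behave with this.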
I think the last option left was to have something that can be tracked through the brushing, such as an ID.
However, I could not figure out how to do that reliably, so I did not bother with it any longer. It was just coursework after all, and I got a grade for it even without that.
I think this is something that has to be implemented in plotly directly and is hard to work around.