
Crossfiltering on timeseries plots

I’m trying to create multiple time-series line plots (with range slider) in Dash, and I’d like to be able to cross-filter on them. What’s the best way to do this? I want the other time-series plots to update when I drag and select a time range within a given plot. Naively I would expect a drag and select of a range to count as “selectedData”, but that fails to trigger the callback. clickData seems to work, but that’s not the desired behaviour I think for a time-series plot. Here’s my code so far. Any thoughts?

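import dash
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html
import plotly.graph_objs as go

# mdf is a pandas DataFrame with 'date', 'counts' and 'uniqvisits' columns (loading not shown)

app = dash.Dash()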

layout = dict(
    xaxis=dict(
        rangeselector=dict(
            buttons=list([
                dict(count=1,
                     label='1m',
                     step='month',
                     stepmode='backward'),
                dict(count=6,
                     label='6m',
                     step='month',
                     stepmode='backward'),
                dict(step='all')
            ])
        ),
        rangeslider=dict(),
        type='date'
    )
)

app.layout = html.Div([
    dcc.Graph(id='metrics', figure={'data': [go.Scatter(
          x=mdf['date'],
          y=mdf['counts'])], 'layout': layout}),
    dcc.Graph(id='metrics2', figure={'data': [go.Scatter(
          x=mdf['date'],
          y=mdf['uniqvisits'])], 'layout': layout})
])

@app.callback(
    Output('metrics2', 'figure'),
    [Input('metrics', 'selectedData')])
def display_selected_data(selectedData):
    traces = [go.Scatter(
          x=mdf['date'],
          y=mdf['uniqvisits'])]
    return {
        'data': traces,
        'layout': layout
    }

Good question, I can see how this is confusing. The rangeselector doesn’t actually fire the selectedData event. Instead, you could try drawing a line chart without the range slider and use the “Lasso Select” or the “Box Select” in the plot toolbar. For an example, see https://plot.ly/dash/gallery/new-york-oil-and-gas/ and select a region in the bottom-left histogram time series. You’ll also want to change the default drag mode (layout.dragmode) to be select or lasso: https://plot.ly/python/reference/#layout-dragmode
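A minimal sketch of the box-select suggestion above, written with plain dicts so it needs no imports. The helper name make_select_figure and the sample values are made up for illustration. It also sets the trace mode to 'lines+markers', on the assumption (from the plotly reference, as far as I know) that select and lasso only pick up scatter traces that draw markers or text.

```python
# Hypothetical helper; plain dicts are used so the sketch needs no imports.
def make_select_figure(x, y):
    """Build a figure dict whose default drag action is box selection."""
    trace = {
        'type': 'scatter',
        'x': list(x),
        'y': list(y),
        # Assumption: selection needs markers or text, so avoid a lines-only trace.
        'mode': 'lines+markers',
    }
    layout = {
        'dragmode': 'select',      # 'lasso' is the other selection mode
        'xaxis': {'type': 'date'},
    }
    return {'data': [trace], 'layout': layout}


figure = make_select_figure(['2017-01-01', '2017-01-02'], [3, 5])
```

The resulting dict can be passed as the figure argument of dcc.Graph in place of the range-slider layout.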

Thanks for the reply. I’d like to be able to use the rangeSelector/slider though. I like the slider at the bottom since it lets you still see the entire series. The rangeSelector should be selecting data within that range. I’d call it more of a bug than a confusion that it doesn’t trigger the proper event. Unless you’re saying that you can just never cross-filter time-series data in a plot that uses the rangeSlider.

I also tried what you said, turning off the rangeSlider completely, and changing layout.dragmode to “select”. I can now drag a select box around my data but when I do so and print the selectedData inside my display_selected_data function, I get nothing.

selectedData {u'points': []}

I also tried dragmode='zoom'. That mode seems to behave the same as the rangeSelector zoom. Can we make zooming trigger the same selectedData as a ‘select’ or ‘lasso’?

Is there a method I can call that grabs all the data visible within a current figure window? Or does the rangeSelector have its own events that I can trigger off of. At least then I could write some custom code.

I’d like to be able to use the rangeSelector/slider though

Yeah, makes sense. I’ll create an issue in the plotly.js repo about it.

This data may come through the relayoutData property. I haven’t tested this myself for the rangeSelector events but I know that it updates data on zoom.

I also tried what you said, turning off the rangeSlider completely, and changing layout.dragmode to “select”. I can now drag a select box around my data but when I do so and print the selectedData inside my display_selected_data function, I get nothing.

Hm, that sounds like a bug. What chart type are you using?

Yeah, makes sense. I’ll create an issue in the plotly.js repo about it.

Awesome, thanks.

This data may come through the relayoutData property. I haven’t tested this myself for the rangeSelector events but I know that it updates data on zoom.

Ok, I’ll try the relayoutData and see if that does anything.

Hm, that sounds like a bug. What chart type are you using?

I’m using the scatter plot chart, go.Scatter. I basically just followed the last example here https://plot.ly/python/time-series/ “Time Series with Range Slider”

Here’s an update on this. Cross-filtering with time-series plots works if you trigger off the relayoutData. The data returned is the x-axis bounds of the selected range. Thanks for the tip on this. It might be a good idea to update the user guide documentation on interactivity to include a section on relayoutData. It only mentions click, hover, and selected at the moment.
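For anyone landing here, a rough sketch of reading those bounds out of relayoutData. The key names 'xaxis.range[0]', 'xaxis.range[1]' and 'xaxis.range' are the ones seen in this thread; the function name and the full_range argument are made up for illustration, and any other payload (for example an autorange reset) is assumed to mean "show everything".

```python
def x_range_from_relayout(relayout_data, full_range):
    """Return [start, end] for the x-axis, falling back to full_range.

    relayout_data is the dict Dash passes for the relayoutData property
    (it may be None before the first interaction).
    """
    if not relayout_data:
        return list(full_range)
    if 'xaxis.range' in relayout_data:
        # Both bounds arrive together as a two-item list.
        return list(relayout_data['xaxis.range'])
    # A zoom may report one or both bounds; fill in the missing one.
    start = relayout_data.get('xaxis.range[0]', full_range[0])
    end = relayout_data.get('xaxis.range[1]', full_range[1])
    return [start, end]


full = ['2017-01-01', '2017-12-31']
zoomed = x_range_from_relayout(
    {'xaxis.range[0]': '2017-03-01', 'xaxis.range[1]': '2017-04-01'}, full)
```

Inside a callback with Input('metrics', 'relayoutData'), the returned pair can be used to filter the data for the other plots.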

Hi havok

I think I’m trying to do what you are talking about but I simply cannot get my head around it.

What I was thinking about was to use the rangeselector and use the zoomed window data to perform a calculation on. Like looking at audio data and doing an FFT on the zoomed data.

As I have trouble getting my head around how to solve this, would it be possible for you to post some example code showing how you get the zoomed data onto a different subplot?

Regards
Tarl0ck

Hi Tarlock,

It should be something like this. out_plot is the figure you want to update, and in_plot is the figure that you’re zooming around in to select data. relayoutData describes the zoom: it is a dict that holds the x-axis start and end values under keys such as 'xaxis.range[0]' and 'xaxis.range[1]'. You read the zoomed range from it, do whatever you want with your data in that range, then create a new plot and figure, and return the figure.

@app.callback(
    Output('out_plot', 'figure'),
    [Input('in_plot', 'relayoutData')])
def display_selected_data(data):

    startx = 'xaxis.range[0]' in data if data else None
    endx = 'xaxis.range[1]' in data if data else None
    sliderange = 'xaxis.range' in data if data else None

    # get the x-range of the zoomed in data
    if startx and endx:
        xrange = [data['xaxis.range[0]'], data['xaxis.range[1]']]
    elif startx and not endx:
        xrange = [data['xaxis.range[0]'], thedates.max()]
    elif not startx and endx:
        xrange = [thedates.min(), data['xaxis.range[1]']]
    elif sliderange:
        xrange = data['xaxis.range']
    else:
        xrange = None
   
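    # grab your y-value data of the new x-range and perform your computations
    # (assumes my_data is a pandas Series indexed by date; adapt to your own data)
    new_data = my_data.loc[xrange[0]:xrange[1]] if xrange else my_data
    # ... do stuff with new_data ...

    # make a new plot
    traces = [go.Scatter(
        x=new_data.index,
        y=new_data)]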

    # return a new figure
    return {
        'data': traces,
        'layout':  dict(
            title=title,
            xaxis=dict(
                title='My Title',
                rangeselector=dict(
                    buttons=list([
                        dict(count=1,
                             label='1m',
                             step='month',
                             stepmode='backward'),
                        dict(count=6,
                             label='6m',
                             step='month',
                             stepmode='backward'),
                        dict(step='all')
                    ])
                ),
                rangeslider=dict(),
                type='date',
                range=xrange
            )   
         )
    }
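For the audio/FFT use case asked about above, the "do stuff" step could look roughly like this. It is only a sketch: it assumes a numeric time axis in seconds with uniform sampling, and the function name fft_of_window is made up for illustration.

```python
import numpy as np


def fft_of_window(t, y, x_range):
    """Return (freqs, magnitudes) for samples with x_range[0] <= t <= x_range[1]."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    mask = (t >= x_range[0]) & (t <= x_range[1])
    window = y[mask]
    if window.size < 2:
        # Not enough samples in the zoomed window for a spectrum.
        return np.array([]), np.array([])
    dt = t[1] - t[0]  # assumes uniformly spaced samples
    freqs = np.fft.rfftfreq(window.size, d=dt)
    mags = np.abs(np.fft.rfft(window))
    return freqs, mags


# Example: a 5 Hz sine sampled at 100 Hz for 2 s, zoomed to the first second.
t = np.arange(0, 2, 0.01)
y = np.sin(2 * np.pi * 5 * t)
freqs, mags = fft_of_window(t, y, [0.0, 0.995])
peak = freqs[np.argmax(mags)]  # should land close to 5 Hz
```

In the callback, x_range would be the two values read from relayoutData, and freqs and mags would become the x and y of the trace in the output figure.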

And just for completeness, here is a recipe for doing crossfiltering with a box-select instead of through zoom:

import dash
from dash.dependencies import Input, Output
import dash_core_components as dcc
import dash_html_components as html

import numpy as np
import pandas as pd

app = dash.Dash()

df = pd.DataFrame({
    'Column {}'.format(i): np.random.rand(50) + i*10
    for i in range(6)
})

app.layout = html.Div([
    html.Div(dcc.Graph(id='g1', selectedData={'points': [], 'range': None}), className="four columns"),
    html.Div(dcc.Graph(id='g2', selectedData={'points': [], 'range': None}), className="four columns"),
    html.Div(dcc.Graph(id='g3', selectedData={'points': [], 'range': None}), className="four columns"),
], className="row")

def highlight(x, y):
    def callback(*selectedDatas):

        index = df.index
        for selected_data in selectedDatas:
            selected_index = [
                p['customdata'] for p in selected_data['points']
                if p['curveNumber'] == 0 # the first trace that includes all the data
            ]
            if len(selected_index) > 0:
                index = np.intersect1d(index, selected_index)

        dff = df.iloc[index, :]

        color = 'rgb(125, 58, 235)'

        trace_template = {
            'marker': {
                'color': color,
                'size': 12,
                'line': {'width': 0.5, 'color': 'white'}
            }
        }
        figure = {
            'data': [
                dict({
                    'x': df[x], 'y': df[y], 'text': df.index, 'customdata': df.index,
                    'mode': 'markers', 'opacity': 0.1
                }, **trace_template),
                dict({
                    'x': dff[x], 'y': dff[y], 'text': dff.index,
                    'mode': 'markers+text', 'textposition': 'top',
                }, **trace_template),
            ],
            'layout': {
                'margin': {'l': 20, 'r': 0, 'b': 20, 't': 5},
                'dragmode': 'select',
                'hovermode': 'closest',
                'showlegend': False
            }
        }

        shape = {
            'type': 'rect',
            'line': {
                'width': 1,
                'dash': 'dot',
                'color': 'darkgrey'
            }
        }
        if selectedDatas[0]['range']:
            figure['layout']['shapes'] = [dict({
                'x0': selectedDatas[0]['range']['x'][0],
                'x1': selectedDatas[0]['range']['x'][1],
                'y0': selectedDatas[0]['range']['y'][0],
                'y1': selectedDatas[0]['range']['y'][1]
            }, **shape)]
        else:
            figure['layout']['shapes'] = [dict({
                'type': 'rect',
                'x0': np.min(df[x]),
                'x1': np.max(df[x]),
                'y0': np.min(df[y]),
                'y1': np.max(df[y])
            }, **shape)]

        return figure

    return callback

app.css.append_css({"external_url": "https://codepen.io/chriddyp/pen/bWLwgP.css"})

app.callback(
    Output('g1', 'figure'),
    [Input('g1', 'selectedData'), Input('g2', 'selectedData'), Input('g3', 'selectedData')]
)(highlight('Column 0', 'Column 1'))

app.callback(
    Output('g2', 'figure'),
    [Input('g2', 'selectedData'), Input('g1', 'selectedData'), Input('g3', 'selectedData')]
)(highlight('Column 2', 'Column 3'))

app.callback(
    Output('g3', 'figure'),
    [Input('g3', 'selectedData'), Input('g1', 'selectedData'), Input('g2', 'selectedData')]
)(highlight('Column 4', 'Column 5'))

if __name__ == '__main__':
    app.run_server(debug=True)

Hi folks. Late to the party here. So I’m using Plotly.js and I have several separate plots on a page that are loaded from csv. They all have separate rangesliders, but I’d like folk to be able to change the range on one plot and it updates the others. Is this possible?

@gnasher - This thread is for Dash and crossfiltering with Dash. Are you using Dash or are you only using plotly.js? If you’re just using plotly.js, then could you create a separate issue in the plotly.js room? https://community.plotly.com/c/plotly-js

I’ve created a small dash app that implements bostock’s http://square.github.io/crossfilter crossfilter.js example here: https://gist.github.com/nite/aff146e2b161c19f6d553dc0a4ce3622 - not quite the same level of realtime & slick UI/UX as the original, but good enough for a PoC.
Currently hosted on https://crossfilter-dash.herokuapp.com, otherwise create a venv, pip install -r requirements.txt & run app.py


Also worth linking this in here: https://github.com/plotly/plotly.js/issues/1316

@chriddyp - can you suggest any improvements to my crossfilter app?

I need to do more of this, across a dashboard with many plots, many tabs etc. Essentially the app looks similar in structure, but my ‘update plots’ and ‘update datagrid’ callbacks are separate. I’ve got another callback where I store user selections in a hidden div to also overlay those selections on the plots. So this is essentially what I’m after: how would you refactor/implement the above app if these were separated? I’ve just added crossfiltering to a few of the plots and am only updating the grids currently, so I’ll need to extract my cross-filtering code & wire it into the update viz section as well.

In js-land everything could be working off a shared store & wiring it up would be fairly straight forward - non-trivial, but at least not duplicated code & cpu cycles.

It’d also help if I didn’t have to put empty plots & empty grids in at load time to avoid js exceptions for missing inputs etc!

One thing that jumps to mind as I think of this is to take all of the crossfilters & output a crossfilter selection to a hidden div that becomes the input to both update grids & update plots - but again, this is 2 hops & still re-filtering for both.
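A rough sketch of that hidden-div idea, with the Dash wiring left out so it runs on its own. The function names and the sample rows are made up for illustration. The store holds only the surviving row indices as JSON, so the intersection is computed once and each consumer does a cheap lookup rather than re-filtering.

```python
import json


def combine_selections(selections, n_rows):
    """Intersect per-plot selections into one JSON string of row indices.

    selections: one list of selected row indices per plot;
    an empty list means "no filter from that plot".
    """
    selected = set(range(n_rows))
    for sel in selections:
        if sel:
            selected &= set(sel)
    return json.dumps(sorted(selected))


def rows_from_store(store_json, rows):
    """Apply the stored selection to any sequence of rows."""
    keep = json.loads(store_json)
    return [rows[i] for i in keep]


rows = ['a', 'b', 'c', 'd', 'e']
store = combine_selections([[0, 1, 2, 3], [], [2, 3, 4]], len(rows))
filtered = rows_from_store(store, rows)
```

In the app, one callback would take every graph's selectedData as Input and write the output of combine_selections to the hidden div's children; the 'update plots' and 'update datagrid' callbacks would then each take that div as their only Input and call rows_from_store. It is still two hops, as noted above.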