Scattergl points disappear on FigureWidget update

I’m trying to make a plot that can handle large amounts of data. The following example is based on the datashader case study, but instead of plotting an image I update the points being shown based on the datashader mapping.

The issue is that when I use Scattergl the points turn invisible, although they are still hoverable. Try pasting the code below into a notebook cell and see for yourself.

import plotly.offline as py  # 3.7.1
import plotly.graph_objs as go
import datashader as ds  # 0.6.9
import numpy as np
import pandas as pd


def ds_image_to_data(x_range, y_range, plot_width, plot_height):
    if x_range is None or y_range is None or plot_width is None or plot_height is None:
        return None

    cvs = ds.Canvas(x_range=x_range, y_range=y_range, plot_height=plot_height, plot_width=plot_width)
    agg_scatter = cvs.points(df, 'x', 'y', ds.any())

    # get a boolean pixel mapping with index x and columns y
    agg_scatter = agg_scatter.to_pandas().transpose()
    # get a dataframe with columns x, y and boolean for pixel state
    agg_scatter = agg_scatter.stack().reset_index()
    # get only values with pixel set to True
    agg_scatter = agg_scatter.loc[agg_scatter[agg_scatter.columns[2]]]
    print(f'Plotting {len(agg_scatter)} points')
    return agg_scatter['x'], agg_scatter['y']

def update_layout(layout, x_range, y_range, plot_width, plot_height):
    # Update with batch_update so all updates happen simultaneously
    with fig.batch_update():
        fig.layout.xaxis.range = (x_range[0], x_range[-1])
        fig.layout.yaxis.range = (y_range[0], y_range[-1])
        fig.data[0].x, fig.data[0].y = ds_image_to_data(x_range, y_range, plot_width, plot_height)


size = 20000
df = pd.DataFrame({'x': np.arange(0, size),
                   'y': np.sin(np.arange(0, size))})
x_range = [df.x.min(), df.x.max()]
y_range = [df.y.min(), df.y.max()]
plot_height = 400
plot_width = 800

trace = go.Scattergl(
    x=df['x'],
    y=df['y'],
    mode='markers',
)
layout = {'width': plot_width, 'height': plot_height,
          'xaxis': {'range': x_range},
          'yaxis': {'range': y_range}
          }
fig = go.FigureWidget(data=[trace], layout=layout)
fig.layout.on_change(update_layout, 'xaxis.range', 'yaxis.range', 'width', 'height')

fig

If we change size to 5000, the plot works fine and you can even see the points being recomputed. If we keep 20000 points and switch to Scatter (removing WebGL), the plot works but is extremely slow. This was the smallest reproducible example I was able to create. I know that 20000 points work fine with plain WebGL and no datashader, but the amount of data I’m trying to plot is much bigger than this.
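
To make the mask-to-points step inside ds_image_to_data concrete, here is a small sketch that runs the same stack / reset_index / filter sequence on a hand-made boolean grid instead of real datashader output. The grid values and the names mask, long_form and occupied are made up for illustration; only the three pandas steps come from the example above.

```python
import pandas as pd

# Hypothetical stand-in for agg_scatter.to_pandas().transpose():
# rows are x pixel centers, columns are y pixel centers, and each value
# says whether at least one data point fell into that pixel.
mask = pd.DataFrame(
    [[True, False],
     [False, False],
     [True, True]],
    index=pd.Index([0.5, 1.5, 2.5], name='x'),
    columns=pd.Index([10.0, 20.0], name='y'),
)

# One row per pixel: columns 'x', 'y' and the boolean pixel state
long_form = mask.stack().reset_index()
# Keep only the pixels that are set to True
occupied = long_form.loc[long_form[long_form.columns[2]]]

# These are the coordinates that would be handed to the Scattergl trace
points = sorted(zip(occupied['x'], occupied['y']))
print(points)
```

The number of plotted points is therefore bounded by plot_width * plot_height, regardless of how many rows the original dataframe has.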

Here’s a GIF showing what is happening (I didn’t share the non-WebGL version because it was really slow and took too long for a GIF).

[GIF: 20k points with Scattergl (webgl20k)]

[GIF: 5k points with Scattergl]

I’ve also got some related questions:

  1. Is there any way I can disable the on_change callback or add some delay between the update calls? In my implementation, if the first update has not finished and I start zooming again, the plot breaks.
  2. Is there any callback that lets me change what the Reset Axes or Autoscale buttons do?

Hi @neuronist,

When I tried your example using plotly.py 3.7.1 I didn’t see the point-disappearing behavior, even with 20k points. There have been some scattergl fixes going into plotly.js recently, so if you’re not on the latest plotly.py I’d recommend upgrading to see if that helps. Another thing to try is changing browsers; I’m working on Chrome at the moment, FWIW.

There’s nothing built-in to throttle the callbacks, but they should arrive in sequence and shouldn’t, in theory, cause anything to break. What exactly is breaking?
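
If you do want to space out the updates yourself, one option on the Python side is to debounce the handler so that only the last call in a rapid burst actually runs. Below is a minimal sketch, assuming a hypothetical debounce helper built on the standard library's threading.Timer; it is not part of plotly. Note that the wrapped function then runs on a timer thread, and whether FigureWidget updates behave well from that thread is an assumption here, not something confirmed in this thread.

```python
import threading


def debounce(wait_seconds):
    """Hypothetical helper: run the wrapped function only after
    `wait_seconds` have passed without another call."""
    def decorator(func):
        timer = None
        lock = threading.Lock()

        def wrapper(*args, **kwargs):
            nonlocal timer
            with lock:
                # A newer call replaces any call that is still waiting
                if timer is not None:
                    timer.cancel()
                timer = threading.Timer(wait_seconds, func, args, kwargs)
                timer.daemon = True
                timer.start()

        return wrapper
    return decorator


# Stand-in for an expensive update function
calls = []

@debounce(0.05)
def record(value):
    calls.append(value)
```

With something like this, the registration could become fig.layout.on_change(debounce(0.3)(update_layout), 'xaxis.range', 'yaxis.range', 'width', 'height'), where 0.3 seconds is an arbitrary delay to tune.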

There aren’t callbacks for reset_axes/autoscale. What I usually do in cases like this, where the data changes based on zoom level, is add an extra invisible 2-point scatter trace with points at the outer corners of where I want Reset Axes to reset to.

Here’s an example

import plotly.graph_objs as go
import datashader as ds  # 0.6.9
import numpy as np
import pandas as pd


def ds_image_to_data(x_range, y_range, plot_width, plot_height):
    if x_range is None or y_range is None or plot_width is None or plot_height is None:
        return None

    cvs = ds.Canvas(x_range=x_range, y_range=y_range, plot_height=plot_height, plot_width=plot_width)
    agg_scatter = cvs.points(df, 'x', 'y', ds.any())

    # get a boolean pixel mapping with index x and columns y
    agg_scatter = agg_scatter.to_pandas().transpose()
    # get a dataframe with columns x, y and boolean for pixel state
    agg_scatter = agg_scatter.stack().reset_index()
    # get only values with pixel set to True
    agg_scatter = agg_scatter.loc[agg_scatter[agg_scatter.columns[2]]]
    print(f'Plotting {len(agg_scatter)} points')
    return agg_scatter['x'], agg_scatter['y']

def update_layout(layout, x_range, y_range, plot_width, plot_height):    
    # Update with batch_update so all updates happen simultaneously
    with fig.batch_update():
        fig.layout.xaxis.range = (x_range[0], x_range[-1])
        fig.layout.yaxis.range = (y_range[0], y_range[-1])
        fig.data[0].x, fig.data[0].y = ds_image_to_data(x_range, y_range, plot_width, plot_height)


size = 20000
df = pd.DataFrame({'x': np.arange(0, size),
               'y': np.sin(np.arange(0, size))})
x_range=[df.x.min(), df.x.max()]
y_range=[df.y.min(), df.y.max()]
plot_height=400
plot_width=800        

trace = go.Scattergl(
    x = df['x'],
    y = df['y'],
    mode = 'markers',)

trace_bounds = go.Scatter(
    x = [df['x'].min(), df['x'].max()],
    y = [df['y'].min(), df['y'].max()],
    mode='markers',
    marker = {'opacity': 0},
    showlegend=False
)

layout = {'width': plot_width, 'height': plot_height,
     'xaxis': {'range': x_range},
     'yaxis': {'range': y_range}
     }

fig = go.FigureWidget(data=[trace, trace_bounds], layout=layout)
fig.layout.on_change(update_layout, 'xaxis.range', 'yaxis.range', 'width', 'height')

fig

-Jon

Hi! Thanks for the reply @jmmease.

Strangely, I re-ran my example and it worked fine for 20k points, and I didn’t change anything. I was also using Chrome, and at the time I already had 3.7.1. Maybe I updated Chrome, or my computer was slower when I was preparing the first example.

Anyway, the problem persists if you start increasing the size; for me it started failing at around 100k points. For my use case I think datashader shouldn’t plot that many points anyway, but the bug is still there, as you can see from the GIF below.

[GIF: 100k]

Thanks for the tip on the reset_axes/autoscale.

Regarding callback throttling, I might write another post if necessary, as I think it needs a better example. I’m running datashader on dask dataframes of 20M+ points (dask is only the input to datashader; plotly still only sees the converted pandas dataframes, since it is not compatible with dask) and plotting them with code similar to the example. In these cases the datashader operations take a bit longer to complete, and I notice that if I don’t wait for the computation to finish before zooming again, the plot eventually breaks. I haven’t dug too deep into that issue yet. The only thing I have tried is putting a sleep in the update function and zooming really fast: interestingly, it worked fine in that case, but without the sleep it broke, so maybe it has to do with datashader.

[GIF: buggedzoom]

Hi @neuronist,

Ok, when I bump up the number of points to 100k I do start to see the behavior where the points disappear. I also get this error in the browser console:

This is probably something that will need to be addressed in the JavaScript library, but could you open a quick bug report with plotly.py at https://github.com/plotly/plotly.py/issues? I’ll work on translating the bug into pure plotly.js when I have some time.

Could you check the Chrome JavaScript console when the second example breaks? I’m wondering if it’s the same error cropping up.

Thanks!
-Jon

Hi @jmmease,

I always forget to check the console (I deal way more with Python than JS) :confused:. I get the same error.

I got the following errors for the second bug

Hmm, I don’t see any reference to the plotly widget code in that stack trace. Thanks for opening the issue for the first problem (https://github.com/plotly/plotly.py/issues/1478).

If you are able to reproduce the second issue as well feel free to open a second issue.

Thanks!
-Jon