How to use Mapbox clusters?

Hey,

I am building a Dash app that contains a map with multiple markers.

I am currently doing it with the scattermapbox trace type, but I would like to use the clustering functionality that Mapbox has.

Is it possible to do a map with this?

Cheers

Have you found a solution to this? :smile:

That looks really cool! I am not sure how to do it with scattermapbox, but you can do it in Dash Leaflet like this:

http://dash-leaflet.herokuapp.com/#marker_cluster

Made an account specifically to leave a comment on this - would be SO SO nice to have enumerated clustering for mapbox maps!

These are very useful and beautifully handled by mapbox as in this example I made in the past using ClusteredCircleViz via

from mapboxgl.viz import *
from mapboxgl.utils import *

Or like this example direct from the mapbox team:

For those wondering what sort of interest there is in this functionality beyond just my own… leaflet’s markercluster library’s 3,000 github stars shows that there is a lot of love and demand for this. If you have any tips for implementing would love to hear em!! :slight_smile:

Unfortunately, dash-leaflet does not handle clustering of more than 13,000 points… In truth… it does… But it locks up the browser for over 60 seconds before showing the page. So unusable in any practical sense. It also appears to be dropping thousands of points. Sadness.

So yeah, if anyone can guide us on how to cluster with mapbox that would be lovely :slight_smile: looked into making a custom-dash component for this but currently seems daunting.

To be clear - when I referred to clusteredcircleviz earlier, I was referring to the Jupyter notebook method created by the mapbox team.

This would definitely be great!

I am going to have to dive in a little more on all your links (thank you for sharing those!) but as I do, I have a hunch maybe the most cost/time effective approach to this would be to assist with improving dash-leaflet's implementation. I have posted about it in their issue tracker here.

The reason I suggest this is because leaflet has demonstrated being able to support upwards of 50 thousand clustered data points by leveraging its chunked loading feature. Alas, it may simply be the case here that chunked loading didn’t make it into their wrapper. Will report back with findings.

Alternatively - is there a way to delay the rendering of a specific component so that it is not blocking the load of all others? If so, that could be a good way to achieve 90% of DOM paint with the trade off of having to wait a little for, in our case here, dash-leaflet to finish rendering. The user would be able to interact with the page in the meantime.

Looks like a confounding factor on clustering via leaflet is the 6 to 8 second load time of async-plotlyjs.js. Any thoughts on how to bring that time down?

Hey Chris,
Just wanted to say thanks for your help in this and other threads related to my project.

In working with the maintainers behind dash-leaflet, we did manage to find that simplifying the marker JSON payload was a way to speed up time to first render in the dash app, especially at 20k+ markers. Even with these tweaks, dash still takes a long time to show anything, apparently due to time spent constructing Marker objects in React and then in Leaflet.
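
For anyone wondering what "simplifying the marker JSON payload" might look like, here is a rough sketch in plain Python. The extra fields (`popup`, `icon`) and the `slim_markers` helper are made up for illustration; this is not the exact change that was made, just the general idea of sending only rounded coordinates to the browser.

```python
import json
import random


def slim_markers(markers, precision=5):
    """Keep only rounded lat/lon per marker and drop every other field."""
    return [
        {"lat": round(m["lat"], precision), "lon": round(m["lon"], precision)}
        for m in markers
    ]


random.seed(0)
# Hypothetical "fat" markers carrying fields the map does not need up front.
fat = [
    {
        "lat": 56 + 0.015 * random.random(),
        "lon": 10 + 0.015 * random.random(),
        "popup": "Some long description " * 5,
        "icon": {"iconUrl": "/assets/marker.png", "iconSize": [25, 41]},
    }
    for _ in range(20000)
]
slim = slim_markers(fat)

# Compare the size of the serialized payloads.
fat_bytes = len(json.dumps(fat))
slim_bytes = len(json.dumps(slim))
print(f"full payload: {fat_bytes} bytes, slim payload: {slim_bytes} bytes")
```

Five decimal places is roughly metre-level precision, which is usually plenty for a marker map.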

So I was wondering if there was any way to instruct react / dash client to delay rendering of a particular graph to only after all other graphs have been rendered? That way the user can still see and hopefully interact with something while the rest of larger load visuals are rendering.

hm good question. I believe this would be done entirely in the React layer outside of Dash’s front end, so I’d recommend looking into something like “asynchronous rendering” or “non blocking rendering”

I’d also recommend verifying the rendering performance in a bare bones React example and trying to optimize that first, just to verify it’s not something due to the Dash layer.

Finally, I’d make sure that these 20k components aren’t Dash children. Rendering a large number of components as children is still a bit slower than React itself, see the discussion here for details: Slow rendering of large tables

good luck!

As indicated by @chriddyp, the major performance issues are related to the sheer number of components involved. I am currently drafting a new component that omits the construction of Marker objects in React and renders the markers on-the-fly on the map based on the view port state, i.e. only visible markers will be constructed. The performance difference is huge. My initial test (the leaflet 50k marker examples), by far too many for the MarkerClusterGroup component, renders in the blink of an eye.
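
To illustrate the "only visible markers" idea, here is a minimal sketch in plain Python. It is not the component's actual code (that runs in JavaScript on the client); the `visible_markers` helper and the bounds used below are assumptions for the example.

```python
import random


def visible_markers(markers, south, west, north, east):
    """Return only the markers inside the current map bounds.

    Simplification: assumes the viewport does not cross the antimeridian.
    """
    return [
        m for m in markers
        if south <= m["lat"] <= north and west <= m["lon"] <= east
    ]


random.seed(1)
markers = [
    dict(lat=56 + 0.015 * random.random(), lon=10 + 0.015 * random.random())
    for _ in range(50000)
]

# Hypothetical viewport covering the lower-left quarter of the data extent.
in_view = visible_markers(markers, south=56.0, west=10.0, north=56.0075, east=10.0075)
print(f"{len(in_view)} of {len(markers)} markers fall inside the viewport")
```

Only the filtered subset would need marker objects, which is where the saving comes from when the user is zoomed in.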

I’ve already thanked @Emil but I just wanted to say thank you again so much for making this! The dashboard I am still working on loads blazing fast considering the nearly 30,000 markers that are being drawn. You can see for yourself here. For those of you new to the conversation, Emil implemented Mapbox’s SuperCluster component in dash-leaflet. It’s a lean and mean cluster mapping machine :slight_smile: As far as I can tell this is the best large data clustering dash component available, so definitely give it a try!
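
For readers new to clustering, the sketch below shows the general idea of zoom-dependent clustering using a simple grid. To be clear, this is a simpler stand-in and not Supercluster's actual algorithm; the `grid_clusters` function and the `cells_per_tile` parameter are invented for this example.

```python
import math
from collections import defaultdict


def grid_clusters(points, zoom, cells_per_tile=4):
    """Group (lat, lon) points into grid cells whose size halves per zoom level."""
    cell_deg = 360.0 / (2 ** zoom * cells_per_tile)
    buckets = defaultdict(list)
    for lat, lon in points:
        key = (math.floor(lat / cell_deg), math.floor(lon / cell_deg))
        buckets[key].append((lat, lon))

    # Represent each non-empty cell by the mean position and a point count.
    clusters = []
    for members in buckets.values():
        n = len(members)
        clusters.append({
            "lat": sum(p[0] for p in members) / n,
            "lon": sum(p[1] for p in members) / n,
            "count": n,
        })
    return clusters


points = [(56.001, 10.001), (56.002, 10.002), (56.5, 10.5)]
for zoom in (2, 10, 20):
    print(zoom, len(grid_clusters(points, zoom)))
```

Zooming in shrinks the cells, so clusters split apart until every point stands alone; the count on each cluster is what gets drawn as the number on the cluster bubble.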

Sweet! would love to see some screenshots and code examples of the cluster once it’s ready :slight_smile:

I have now released the first version of the component. It does not support the full feature set of the leaflet marker cluster, but the basics are in place. In terms of performance, I have tested up to 1 million markers. It takes a few seconds to load (on top of the time it takes to transfer the data), but it works just fine. In terms of code, a minimal working example would look something like this:

import dash
import dash_html_components as html
import dash_leaflet as dl
import dash_leaflet.express as dlx
import random

# Create some markers.
markers = [dict(lat=56 + 0.015*random.random(), lon=10 + 0.015*random.random()) for i in range(10000)]
geojson = dlx.markers_to_geojson(markers)
# Create example app.
app = dash.Dash()
app.layout = html.Div([
    dl.Map([dl.TileLayer(), dl.SuperCluster(data=geojson, superclusterOptions={"radius": 100})],
           center=(56, 10), zoom=10, style={'width': '100%', 'height': '50vh', 'margin': "auto", "display": "block"})
])

if __name__ == '__main__':
    app.run_server()

I have added an interactive example to the documentation that you can try out.
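
In case the GeoJSON step in the example is unclear, here is a hand-rolled sketch of roughly what a lat/lon-to-GeoJSON conversion does. This is a stand-in written for illustration, not the dash-leaflet implementation, and `markers_to_feature_collection` is a made-up name. The one detail worth remembering is that GeoJSON stores coordinates as [lon, lat], not [lat, lon].

```python
import json


def markers_to_feature_collection(markers):
    """Hand-rolled stand-in: turn lat/lon dicts into a GeoJSON FeatureCollection."""
    features = []
    for m in markers:
        # Everything except the coordinates is carried along as properties.
        props = {k: v for k, v in m.items() if k not in ("lat", "lon")}
        features.append({
            "type": "Feature",
            # GeoJSON positions are ordered longitude first, then latitude.
            "geometry": {"type": "Point", "coordinates": [m["lon"], m["lat"]]},
            "properties": props,
        })
    return {"type": "FeatureCollection", "features": features}


example = markers_to_feature_collection([dict(lat=56, lon=10, name="demo")])
print(json.dumps(example, indent=2))
```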

Is it possible to integrate this feature into the plotly mapbox scatter instance? There are some advantages in terms of callbacks that Mapbox still has over the dash leaflet plug in, but this is a pretty key feature for large data sets.

Thanks

What are you missing? Selection callbacks?

Sorry for reviving a dead thread potentially, but I’m very happy with the super cluster function of the dash-leaflet wrapper.
The feature that I am indeed missing is the selection callback from plotly. I have a selection interaction between coordinate points on a scattermapbox and a graph that graphs the underlying data attribute of these points with plotly-dash. I tried doing the same with the EditControl feature but I am not sure if this achieves the same goal.
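
One possible workaround, sketched under assumptions: if the drawn shape can be obtained as a GeoJSON polygon (I have not verified exactly what EditControl exposes), the selection could be computed in a callback by testing each point against that polygon. The helpers below are hypothetical and use plain ray casting on the outer ring only, so holes are ignored.

```python
def point_in_polygon(lon, lat, ring):
    """Ray-casting test; ring is a list of [lon, lat] vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > lat) != (y2 > lat):
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def select_points(features, polygon):
    """Return the point features that lie inside the polygon's outer ring."""
    ring = polygon["geometry"]["coordinates"][0]
    selected = []
    for f in features:
        lon, lat = f["geometry"]["coordinates"][:2]
        if point_in_polygon(lon, lat, ring):
            selected.append(f)
    return selected


# Hypothetical drawn square and two example points.
square = {
    "type": "Feature",
    "geometry": {
        "type": "Polygon",
        "coordinates": [[[10.0, 56.0], [10.01, 56.0], [10.01, 56.01],
                         [10.0, 56.01], [10.0, 56.0]]],
    },
}
pts = [
    {"type": "Feature", "geometry": {"type": "Point", "coordinates": [10.005, 56.005]}},
    {"type": "Feature", "geometry": {"type": "Point", "coordinates": [10.02, 56.005]}},
]
selected = select_points(pts, square)
print(len(selected), "point(s) selected")
```

The selected features could then feed the linked graph in the same way selectedData does with scattermapbox.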