Hi Remi,
I’ll try to break it down by topic:
Heat map
In Plotly terms, a heatmap is a 2D grid in which the color of each cell represents a value. This is the description of the necessary data for a regular heatmap.
And here’s the most basic example with minimal data.
There are other examples here, showing the data needs.
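To make the data needs concrete, here is a rough sketch of a minimal heatmap figure written as a plain Python dict. The values in `z` and the `width`/`height` are made up for illustration; the only thing a regular heatmap really needs is `z`, a list of rows where each row is a list of cell values.

```python
import json

# Minimal heatmap figure spec: `z` is a list of rows, each row a list of cell values.
# The numbers are illustrative sample data.
figure = {
    "data": [
        {
            "type": "heatmap",
            "z": [[1, 20, 30], [20, 1, 60], [30, 60, 1]],
        }
    ],
    "layout": {"width": 500, "height": 400},
}

# Plotly figures are JSON-serializable, so the spec can be shipped to the browser as-is.
figure_json = json.dumps(figure)
```

A dict like this can be handed to `plotly.graph_objects.Figure(figure)` in Python, or its `data` and `layout` parts to `Plotly.newPlot` in plotly.js.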
If the heatmap is based on something physical, e.g. geography, then you probably need square cells, which can currently be achieved by setting the axis domains in sync with the chart width/height, as in this example.
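The domain arithmetic for square cells could look roughly like the sketch below. The helper name `square_cell_domains` is hypothetical, and it assumes the margins are fixed and that nothing else (such as a colorbar) changes the size of the plotting area; axis domains are fractions of that area.

```python
def square_cell_domains(width, height, margin, n_cols, n_rows):
    """Return (x_domain, y_domain) so that heatmap cells come out square on screen.

    Assumes the plotting area is exactly the chart size minus the given margins.
    """
    plot_w = width - margin["l"] - margin["r"]
    plot_h = height - margin["t"] - margin["b"]
    # Largest cell edge (in pixels) that fits in both directions.
    cell = min(plot_w / n_cols, plot_h / n_rows)
    x_domain = [0.0, cell * n_cols / plot_w]
    y_domain = [0.0, cell * n_rows / plot_h]
    return x_domain, y_domain


margin = {"l": 50, "r": 50, "t": 50, "b": 50}
x_domain, y_domain = square_cell_domains(700, 500, margin, n_cols=10, n_rows=10)

layout = {
    "width": 700,
    "height": 500,
    "margin": margin,
    "xaxis": {"domain": x_domain},
    "yaxis": {"domain": y_domain},
}
```

With these numbers the plotting area is 600 by 400 pixels, so the 10 by 10 grid gets 40 pixel cells: the y axis uses the full height and the x axis uses two thirds of the width.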
Heat map on a map
While there are choropleth trace types (https://plot.ly/python/choropleth-maps/) with which a heat map can be approximated, there is currently no dedicated heat map that overlays a geo or Mapbox map. Map projections, especially those covering larger areas, have distortions, so a heatmap grid with uniform pitch in screen space will have tiles that cover land areas of varying size and shape. Rendering the grid cells as a choropleth map might therefore be better than just overlaying a map with a heatmap. If the places are discrete and point-like, then scatter points are another alternative: lowering the opacity of the points can make the result look more like a density map or heatmap than a scatterplot with very salient points, a bit like this.
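As a sketch of the low-opacity scatter idea, the figure below uses a `scattergeo` trace with semi-transparent markers, so that overlapping points add up to darker areas. The coordinates, marker size, opacity and the `usa` scope are illustrative assumptions to be tuned against real data.

```python
# Hypothetical sample coordinates; in practice these come from your dataset.
lons = [-122.42, -122.41, -122.43, -118.24, -118.25]
lats = [37.77, 37.78, 37.76, 34.05, 34.06]

trace = {
    "type": "scattergeo",
    "lon": lons,
    "lat": lats,
    "mode": "markers",
    # Low opacity lets overlapping markers accumulate into denser-looking areas.
    "marker": {"size": 12, "opacity": 0.15, "color": "red"},
}

figure = {"data": [trace], "layout": {"geo": {"scope": "usa"}}}
```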
Also, if a scatter plot is sufficient or preferred, Plotly has a Mapbox scatter integration, though the user needs to provide their own key.
Scalability, aggregation
Regarding dynamic overlay creation based on larger datasets with hundreds of thousands of points: the first bottleneck will be the network pipe, and the second will be the scalability of the particular plot type. Most of our current plots don’t support hundreds of thousands or millions of points, but some do, e.g. WebGL based scatter plots. Even so, it would be best to aggregate the spatial distribution data into a heatmap structure (rectangular grid cells or county / municipality shapes) on the server side.
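Server-side aggregation into rectangular grid cells can be as simple as the sketch below. The function name `bin_points`, the bounding box and the sample points are all hypothetical; the returned list of rows has the same shape as a heatmap `z` array.

```python
from collections import Counter


def bin_points(lons, lats, lon_min, lon_max, lat_min, lat_max, n_cols, n_rows):
    """Count points per grid cell and return the counts as a list of rows."""
    counts = Counter()
    for lon, lat in zip(lons, lats):
        # Skip points outside the bounding box.
        if not (lon_min <= lon <= lon_max and lat_min <= lat <= lat_max):
            continue
        # Points exactly on the upper edge are clamped into the last cell.
        col = min(int((lon - lon_min) / (lon_max - lon_min) * n_cols), n_cols - 1)
        row = min(int((lat - lat_min) / (lat_max - lat_min) * n_rows), n_rows - 1)
        counts[(row, col)] += 1
    return [[counts[(r, c)] for c in range(n_cols)] for r in range(n_rows)]


# Five sample points binned into a 2 x 2 grid.
z = bin_points(
    [0.5, 0.6, 1.5, 1.9, 2.0],
    [0.5, 0.4, 1.5, 0.1, 2.0],
    lon_min=0, lon_max=2, lat_min=0, lat_max=2,
    n_cols=2, n_rows=2,
)
# z == [[2, 1], [0, 2]]
```

Only the grid of counts then needs to cross the network, regardless of how many raw points were aggregated.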
It’s also possible for the client side code (not Plotly.js code, just general userland code) to hold the entire dataset, since transmitting hundreds of thousands of data points over the network is resource intensive but feasible, and then filter this data on the client side. The result of the filtering is that the heatmap / choropleth data (per grid cell or per county / municipality shape) gets updated. The current version of plotly.js doesn’t do interactive multidimensional filtering on its own, but we can suggest solutions.
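The filter-then-reaggregate step is sketched below in Python for brevity; in the browser the same logic would live in userland JavaScript. The record fields (`county`, `year`, `value`) and the helper name are assumptions made up for this example.

```python
def aggregate_by_region(records, predicate):
    """Sum `value` per county over the records that pass the filter.

    Returns (locations, z): two parallel lists, one entry per county.
    """
    totals = {}
    for record in records:
        if predicate(record):
            totals[record["county"]] = totals.get(record["county"], 0) + record["value"]
    locations = sorted(totals)
    return locations, [totals[name] for name in locations]


# Hypothetical records held on the client.
records = [
    {"county": "A", "year": 2015, "value": 3},
    {"county": "A", "year": 2016, "value": 4},
    {"county": "B", "year": 2016, "value": 5},
    {"county": "B", "year": 2016, "value": 1},
]

# Re-run this whenever the user changes a filter, then update the chart data.
locations, z = aggregate_by_region(records, lambda r: r["year"] == 2016)
```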
Plotly also has filter and groupby options, but with a crossfiltering solution external to the plotly.js charts themselves, they may not be needed.
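For reference, a trace using these options looks roughly like the dict below. This is a sketch with made-up data, and it assumes a plotly.js version that still ships the `transforms` attribute, which may not be available in every release.

```python
# Scatter trace with a filter transform followed by a groupby transform.
trace = {
    "type": "scatter",
    "mode": "markers",
    "x": [1, 2, 3, 4, 5, 6],
    "y": [2, 7, 3, 9, 5, 8],
    "transforms": [
        # Keep only the points whose y value is greater than 4.
        {"type": "filter", "target": "y", "operation": ">", "value": 4},
        # Split the remaining points into one group per label.
        {"type": "groupby", "groups": ["a", "b", "a", "b", "a", "b"]},
    ],
}
```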
Events
It’s possible to register event handlers for mouse actions such as clicking, documented here. The event handler receives enough information to tell which county was clicked, so that appropriate action can be taken. The action itself, such as drilling down or including or excluding that county, needs to be developed. There’s a wide range of possibilities; for example, clicking on one chart can trigger a rerendering of another chart on the same page.
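The include / exclude action could be modelled along the lines of the sketch below. The payload shape, a `points` list whose entries carry a `location` key, is an assumption about what a choropleth click delivers, and the function name is hypothetical; in the browser this would be the body of the click handler.

```python
def toggle_clicked_regions(event, selected):
    """Return a new selection with every clicked region toggled in or out."""
    updated = set(selected)
    for point in event.get("points", []):
        location = point.get("location")
        if location is None:
            continue  # not a region click, nothing to toggle
        # Add the region if it was absent, remove it if it was present.
        updated.symmetric_difference_update({location})
    return updated


# Assumed event payload for a click on one county.
click = {"points": [{"location": "06037", "z": 12}]}

selected = toggle_clicked_regions(click, set())          # county is now included
selected_again = toggle_clicked_regions(click, selected)  # second click excludes it
```

The updated selection can then drive a re-aggregation and a rerender of any other chart on the page.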