I have a question about the performance of some plotly express figures. If I use parallel coordinates or density heatmap with a dataframe of 100k rows and 4 columns, it's not possible to show the figure: Jupyter and JupyterLab freeze. Is there any possibility to use some method arguments to disable some interactivity, bin some points, or anything else to make this possible?
Hi @Varlor, could you help us narrow down the diagnosis by benchmarking on dummy data? For example, the code below (corresponding to one million rows) executes correctly on my Ubuntu laptop, on Firefox. How is it for you? What is the size limit that causes Jupyter/JupyterLab to freeze?
```python
import plotly.express as px
import numpy as np

N = 1000000
x, y = np.random.randn(2, N)
fig = px.density_heatmap(x=x, y=y)
fig.show()
```
In order to downsample your data, you can either slice it (`x[::5]`) or take random samples from your data:
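As a minimal sketch of the slicing approach on the same dummy arrays as above (the factor 5 is arbitrary):

```python
import numpy as np

N = 1000000
x, y = np.random.randn(2, N)

# Keep every 5th point; deterministic and order-preserving
x_small, y_small = x[::5], y[::5]
print(len(x_small))  # 200000 points instead of 1000000
```

You can then pass `x_small` and `y_small` to `px.density_heatmap` exactly as before.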
```python
import plotly.express as px
import numpy as np

N = 1000000
x, y = np.random.randn(2, N)
mask = np.random.random(N) > 0.9  # keep roughly 1/10th of the data
fig = px.density_heatmap(x=x[mask], y=y[mask])
fig.show()
```
or do some binning of the data (`x = 0.5 * (x[1:] + x[:-1])`), if it makes sense to average neighbouring points together.
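Note that the averaging snippet above smooths the data but keeps N-1 points; if the goal is to shrink the array, one variant (my suggestion, not from the thread) reshapes it into non-overlapping pairs and averages each pair:

```python
import numpy as np

N = 1000000
x, y = np.random.randn(2, N)

# Moving average as in the snippet above: smooths, but keeps N-1 points
x_smooth = 0.5 * (x[1:] + x[:-1])

# Pairwise binning: average non-overlapping pairs, halving the point count
# (assumes N is even; otherwise drop the last sample first)
x_binned = x.reshape(-1, 2).mean(axis=1)
y_binned = y.reshape(-1, 2).mean(axis=1)
print(len(x_smooth), len(x_binned))  # 999999 500000
```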
The best method depends on the type of data you have :-).