Hi, I’m working with a large amount of data (on the order of 1e5 to 1e6 data points) and was wondering how Plotly goes about rendering its charts. I was thinking of reducing the number of points (and thus the load on Plotly) by filtering out unimportant data points with a line-simplification algorithm (https://en.wikipedia.org/wiki/Ramer–Douglas–Peucker_algorithm). Does Plotly already implement something like this? What quantities of data, generally speaking, is Plotly designed to handle?
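For reference, this is the kind of simplification I mean — a minimal toy implementation of Ramer–Douglas–Peucker I put together myself (not Plotly code; the threshold and data are made up):

```python
import math

def _perp_dist(pt, a, b):
    """Perpendicular distance from pt to the line through a and b."""
    (x, y), (x1, y1), (x2, y2) = pt, a, b
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy)
    if norm == 0:  # a and b coincide; fall back to point distance
        return math.hypot(x - x1, y - y1)
    return abs(dy * x - dx * y + x2 * y1 - y2 * x1) / norm

def rdp(points, epsilon):
    """Ramer-Douglas-Peucker: drop points closer than epsilon to the chord."""
    if len(points) < 3:
        return list(points)
    # Find the point farthest from the chord between the two endpoints.
    dmax, idx = 0.0, 0
    for i in range(1, len(points) - 1):
        d = _perp_dist(points[i], points[0], points[-1])
        if d > dmax:
            dmax, idx = d, i
    # If everything is within epsilon, the chord alone is enough.
    if dmax <= epsilon:
        return [points[0], points[-1]]
    # Otherwise recurse on both halves and stitch them together.
    left = rdp(points[: idx + 1], epsilon)
    right = rdp(points[idx:], epsilon)
    return left[:-1] + right

# Example: the spike at (3, 5) survives, the near-flat points do not.
simplified = rdp([(0, 0), (1, 0.1), (2, 0), (3, 5), (4, 0)], epsilon=2.0)
```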
Thanks for writing in.
plotly.js can draw lines using either SVG (via the scatter trace type) or WebGL (via the scattergl trace type - see docs).
In the SVG trace type, we do simplify the lines using a clustering algorithm. I personally don’t know its details, but you can take a look at the source code here if you’re interested. The SVG trace type offers solid performance for lines of up to about 1e5 points.
In the WebGL trace type, the algorithm is a little more brute force and designed to take advantage of the GPU’s capabilities. The source code is here. It can easily handle plotting lines of 1e6 points on most hardware.
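To make the distinction concrete: a trace is just a JSON-style object, and its type key selects the renderer, so switching is a one-key change. A minimal sketch (the data here is made up):

```python
# A plotly trace is a plain JSON-style object; "type" selects the renderer.
x = list(range(5))
y = [v * v for v in x]

svg_trace = {"type": "scatter", "mode": "lines", "x": x, "y": y}  # SVG
webgl_trace = {**svg_trace, "type": "scattergl"}                  # WebGL

# Both are passed to Plotly the same way, e.g.
#   Plotly.newPlot('chart', [webgl_trace])        in JavaScript, or
#   plotly.graph_objects.Figure([webgl_trace])    in Python.
```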
Hey, thanks for the info! How exactly do I switch trace types? Can both trace types be used for all types of plots? I’m having a bit of trouble finding any info on them in the docs.
Hi, I’m having the same trouble. I’m using RDP simplification before passing the data to Plotly.
My main trouble now, after the simplification, is the size of the Plotly output file (a 5 MB compressed HDF file gives me 32+ MB of HTML/JS).
The reason is that Plotly’s JSON serialization of the data is far from minimal (it adds a lot of unnecessary decimal digits and whitespace).
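Two things that helped me shrink the output: rounding the data before handing it to Plotly (most of the bulk is excess float digits), and loading plotly.js from a CDN instead of embedding it (the include_plotlyjs="cdn" option of write_html in the Python package). A sketch of the rounding effect on plain JSON, with made-up data:

```python
import json
import math

# 10k samples of a made-up signal with long float representations.
y = [math.sin(i / 100.0) * math.cos(i / 7.0) for i in range(10_000)]

full = json.dumps(y, separators=(",", ":"))           # full float repr
trimmed = json.dumps([round(v, 4) for v in y],        # 4 decimals is often
                     separators=(",", ":"))           # plenty for plotting

print(len(full), len(trimmed))  # trimmed is a fraction of the size
```

The separators argument also drops the spaces json.dumps inserts by default, which is the same kind of whitespace overhead showing up in the HTML.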