>>> import plotly.graph_objs as go
>>> t = go.Scatter(x=[0,1,2,3], y=[0,3,3,3], mode='markers')
>>> t.x
(0, 1, 2, 3)
This is very problematic if you want to do live graphs where data arrives over time and you want to continuously append new data to existing data in the graph.
This design causes O(n^2) performance for live graphs because, for each new data point, Plotly has to rebuild the entire list. Is there a reason it is done this way? Is there any way to avoid this so that trace.x remains an appendable list?
I also want to point out that Matplotlib doesn’t have this performance issue.
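To make the O(n^2) claim concrete, here is a minimal pure-Python sketch that does not use plotly at all. It assumes that each append to an immutable tuple has to copy every existing element, and simply counts those copies. The function names are made up for illustration.

```python
def append_by_rebuilding(n):
    """Grow a tuple one point at a time; returns (data, elements_copied)."""
    data = ()
    copied = 0
    for i in range(n):
        copied += len(data)   # every existing element is copied into the new tuple
        data = data + (i,)
    return data, copied


def append_in_place(n):
    """Grow a list one point at a time; existing elements stay where they are."""
    data = []
    for i in range(n):
        data.append(i)        # amortized O(1), no rebuild
    return data


points, copied = append_by_rebuilding(1000)
print(copied)  # 499500, i.e. n*(n-1)/2 for n = 1000
```

The list version ends up with the same 1000 values, but the total work grows linearly with the number of points instead of quadratically.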
The reason this was changed in version 3 was to support the improved validation logic and to support the
FigureWidget class. In each of these cases, plotly has to be aware of the state of each property and this can’t be done if the properties are presented back to the user as mutable objects.
When performance is a concern, you should fall back on constructing figures from standard
dict and list instances. For example:
>>> t = dict(type='scatter', x=[0,1,2,3], y=[0,3,3,3], mode='markers')
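As a rough sketch of that workflow: keep the trace as a plain dict with list values while data arrives, and only hand it to plotly when the figure is needed. The helper names add_point and to_figure are hypothetical, and passing the dict to go.Figure is assumed here to be the point where validation happens once.

```python
# Plain dict/list trace: nothing is validated or copied while data accumulates.
trace = dict(type='scatter', x=[], y=[], mode='markers')


def add_point(trace, x, y):
    """Append one point in place (hypothetical helper, not a plotly function)."""
    trace['x'].append(x)
    trace['y'].append(y)


for i in range(1000):
    add_point(trace, i, i * i)


def to_figure(trace):
    """Build a validated figure once, at the end (requires plotly)."""
    import plotly.graph_objs as go  # lazy import: accumulation above needs no plotly
    return go.Figure(data=[trace])
```

Calling to_figure(trace) pays the validation cost a single time, however many points were appended beforehand.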
If you don’t want to pay the validation cost even once at the end, you can set the
validate=False argument to the
Hope that helps clear things up a bit.
Thanks for your response. I’m using FigureWidget with imperative code to create live graphs in offline mode. One big issue is that whatever trace I pass to FigureWidget, it recreates the object, so FigureWidget.data is no longer the same as the trace instance I originally passed in. To modify the data I must therefore go through FigureWidget.data, which again has converted the entire list to a tuple. So each time I add a new data point to the graph, the x and y tuples of the trace in FigureWidget.data must be recreated, which is a huge issue due to the O(n^2) performance.
Yeah, this is a current limitation of streaming data with a
FigureWidget. Out of curiosity, for your use case how many points are displayed before the performance degrades significantly? Also, have you tried using a
go.Scattergl trace? This won’t change the n^2 nature of the trace data construction, but it will improve the render time of the figure itself.
Towards providing better support for streaming, I think the way forward would be to provide an operation in the Python API that wraps the Plotly.js
y values are stored as lists (unless they are numpy arrays), so this could involve an efficient
list.extend operation on the Python side.
Would this kind of API meet your needs?
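To illustrate the idea only, here is a hypothetical sketch of what an extend-style operation could look like on the Python side. extend_trace is an invented name and not part of plotly; it extends list-based storage in place and returns just the new elements, which is the part that would need to be sent to the front end.

```python
def extend_trace(trace, new_x, new_y):
    """Hypothetical helper: extend a dict-based trace in place, return only the delta."""
    if len(new_x) != len(new_y):
        raise ValueError("new_x and new_y must have the same length")
    trace['x'].extend(new_x)   # list.extend: cost proportional to the new points only
    trace['y'].extend(new_y)
    return {'x': list(new_x), 'y': list(new_y)}


trace = dict(type='scatter', x=[0, 1], y=[0, 3], mode='markers')
delta = extend_trace(trace, [2, 3], [3, 3])
print(trace['x'])  # [0, 1, 2, 3]
print(delta)       # {'x': [2, 3], 'y': [3, 3]}
```

With this shape, the cost of each update depends on how many points arrive, not on how many are already in the trace.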
It might also be a good idea to think about this architecture. I think it would be confusing to many users if you give an instance of class A to class B, but class B makes a deep copy instead of simply saving the reference. It is obviously not great for performance either. Maybe a more Pythonic approach is to have everything duck-typed and not make deep copies. You could then simply add a redraw() or update() method that accepts flags indicating whether the trace or the layout should be redrawn.
Internally, fig shouldn’t care whether trace data is a tuple or a list, as long as it is array-like. The caller can then update their array-like objects as they prefer and occasionally call redraw on fig.
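A hypothetical sketch of such a reference-holding design, in plain Python. ReferenceFigure, its redraw method, and the flag names are invented for illustration and are not plotly's API.

```python
class ReferenceFigure:
    """Hypothetical figure that keeps references to caller-owned data."""

    def __init__(self, traces, layout=None):
        self.traces = traces                      # stored by reference, no deep copy
        self.layout = layout if layout is not None else {}
        self.rendered = {'traces': None, 'layout': None}

    def redraw(self, traces=True, layout=True):
        """Re-read the caller's objects; each flag selects what gets refreshed."""
        if traces:
            # Note: this re-reads the full arrays on every call, not just new points.
            self.rendered['traces'] = [
                {'x': list(t['x']), 'y': list(t['y'])} for t in self.traces
            ]
        if layout:
            self.rendered['layout'] = dict(self.layout)
        return self.rendered


xs, ys = [0, 1], [0, 3]
fig = ReferenceFigure([{'x': xs, 'y': ys}])
xs.append(2)
ys.append(3)                                      # caller mutates its own lists
snapshot = fig.redraw(layout=False)
print(snapshot['traces'][0]['x'])                 # [0, 1, 2]
```

The appends are cheap here, but every redraw still walks the whole dataset, so the cost moves from the append to the redraw.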
Thanks for taking the time to write up the suggestion. Here are some thoughts off the top of my head:
redraw call, rather than serializing only the new elements in an