Hello,
I am plotting large time series (20 traces of 8760 data points each) with px.area,
and it takes around 3 s to generate the figure.
I found a way to speed up figure generation significantly (5x to 6x) with the following optimisation:
- generate the figure with only the first row of data, so that we end up with a figure whose traces, legend, etc. are properly defined
- loop through the traces and replace each trace's x and y with the index and the matching column of the original dataframe
This gives the following decorator, which you can apply to any function that generates a figure from a dataframe passed as its first argument:
import inspect
from functools import wraps


def two_step_plotting(f):
    # get name of first argument of function f (expected to be a dataframe)
    df_arg_name = next(iter(inspect.signature(f).parameters))

    @wraps(f)
    def wrapper(*args, **kwargs):
        # get the dataframe argument, whether passed positionally or by keyword
        if args:
            df, *args = args
        else:
            df = kwargs.pop(df_arg_name)

        # generate the figure with only the first row of df
        kwargs[df_arg_name] = df.iloc[:1]
        fig = f(*args, **kwargs)

        # inject the full dataframe into the traces (matched by column name)
        traces = {trace["name"]: trace for trace in fig["data"]}
        x = df.index
        for col, y in df.items():
            trace = traces[str(col)]
            trace["x"] = x
            trace["y"] = y

        return fig

    return wrapper
# to speed up the function my_figure_generator_function, just use the decorator
@two_step_plotting
def my_figure_generator_function(df, ...):
    ...
    return fig
If you have the same performance issue as me, would this work for you?
For the plotly team: could plotly natively be as performant, to avoid needing such a decorator? Or are there some intrinsic complexities that make this impossible?