I wanted to share my experience finding a faster way to generate plots for large datasets. I'm by no means an expert on these things, so I'd caution anyone who tries the same. I just found that it worked for my specific purposes without any obvious drawbacks.
I have a large GeoJSON dataset featuring some rather weird shapes that I wanted to plot with the px.choropleth_mapbox() function. Ideally I wanted the plot to build quickly, but building and rendering it took a while. To save time, I generated the plot once with px.choropleth_mapbox, saved it as a JSON file, and then used ujson.load along with go.Figure() to rebuild it more quickly. The speed gains were limited.
I ran fig.show() through a profiler and found that 30% of the time was being spent in copy.deepcopy and its list helper, _deepcopy_list. I read on Stack Overflow that deepcopy is fairly slow, and that pickle or ujson could be used as alternatives.
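For anyone who wants to repeat the profiling step, here is a minimal sketch using the standard library's cProfile and pstats. The render function and the data are made-up stand-ins for fig.show() and a real figure, so treat this as an illustration of the approach rather than the exact session described above.

```python
import cProfile
import copy
import io
import pstats


def render(data):
    """Hypothetical stand-in for fig.show(): deep-copies the data, then does a little work."""
    copied = copy.deepcopy(data)
    return sum(len(row) for row in copied)


# Made-up nested data, roughly shaped like a list of coordinate rows.
data = [[float(i + j) for j in range(100)] for i in range(500)]

profiler = cProfile.Profile()
profiler.enable()
render(data)  # with a real figure, this would be fig.show()
profiler.disable()

# Collect the ten most expensive entries by cumulative time into a string.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream).sort_stats("cumulative")
stats.print_stats(10)
report = stream.getvalue()
print(report)
```

Sorting by cumulative time puts functions such as deepcopy near the top when they dominate the run, which is how a hotspot like this one tends to show up.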
To test this, I went into plotly's basedatatypes.py, traced the calls being made, and replaced the deepcopy calls with ujson.loads(ujson.dumps(var)).
Here are the results of my test:
I'm still testing, but so far the results don't seem to be adversely affected by the change, and it has helped me work much faster.
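If you want to try the deepcopy swap outside of plotly first, the sketch below compares copy.deepcopy with a JSON round trip on some made-up nested data. The json_copy helper and the sample data are my own illustrative names, not anything from plotly, and the code falls back to the standard json module if ujson is not installed. Timings will vary by machine and data, so none are shown here.

```python
import copy
import json
import timeit

try:
    import ujson as fast_json  # optional third-party package
except ImportError:
    fast_json = json  # standard-library fallback so the sketch still runs


def json_copy(obj):
    """Copy a JSON-compatible object by serializing it and parsing it back (hypothetical helper)."""
    return fast_json.loads(fast_json.dumps(obj))


# Made-up stand-in for figure data: nested dicts and lists of numbers.
data = {
    "features": [
        {"id": str(i), "coordinates": [[float(i), float(j)] for j in range(50)]}
        for i in range(200)
    ]
}

# Both approaches should give an equal but independent copy.
deep = copy.deepcopy(data)
fast = json_copy(data)
assert deep == fast == data
assert fast["features"] is not data["features"]

# Time 20 copies with each approach.
print("deepcopy :", timeit.timeit(lambda: copy.deepcopy(data), number=20))
print("json copy:", timeit.timeit(lambda: json_copy(data), number=20))

# Caveat: the round trip only preserves JSON types, so tuples come back as lists.
tuple_result = json_copy({"point": (1, 2)})
print(tuple_result)
```

The round trip is only a safe replacement when everything being copied is plain JSON data. Tuples become lists, non-string dictionary keys become strings, and objects that cannot be serialized raise an error, so it is worth checking what actually flows through the patched calls before relying on it.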