I have a large dataset (over 5 million rows) and would like to draw a violin plot of the length, in seconds, of a certain event. The problem is that my Python script is extremely slow when importing all of that data into a pandas DataFrame. My workaround so far has been to create a "pre-binned" table that holds the count of events in each minute, and importing that table is extremely quick. This works for Plotly's histograms, because the 'histfunc' and 'histnorm' attributes let me plot the pre-binned data directly.
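For reference, this is roughly what the pre-binned histogram looks like. It is only a minimal sketch: the function name `binned_histogram`, the 60-second bin width, and the choice of `histfunc="sum"` with `histnorm="probability"` are illustrative assumptions, not my exact code.

```python
def binned_histogram(bin_starts, counts, bin_width=60, histnorm="probability"):
    """Return a Plotly figure that treats (bin_starts, counts) as pre-binned data.

    bin_starts: lower edge of each bin in seconds (assumed layout of the table)
    counts:     number of events that fall in each bin
    """
    import plotly.graph_objects as go  # imported here; only needed for plotting

    trace = go.Histogram(
        x=bin_starts,                # one x value per pre-computed bin
        y=counts,                    # the pre-computed count for that bin
        histfunc="sum",              # sum y within each bin instead of counting rows
        histnorm=histnorm,           # "probability" rescales the bars to sum to 1
        xbins=dict(size=bin_width),  # match Plotly's bin width to the table's
    )
    return go.Figure(trace)
```

Calling `binned_histogram([0, 60, 120], [5, 3, 2]).show()` would draw three bars from three rows, however many events the counts stand for.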
Is there a way to do something similar for violin plots? My current solution is to import a 5% sample of the data, but that is obviously not ideal.
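The only workaround I can think of is to rebuild approximate samples from the pre-binned counts and pass those to the violin trace, since as far as I can tell `go.Violin` has no counterpart to 'histfunc'. Below is a minimal sketch of that idea. It assumes the table is binned by event length and that each bin is represented by a single value in seconds (for example its midpoint). The names `expand_bins`, `violin_from_bins`, and `max_points` are hypothetical.

```python
def expand_bins(bin_values, counts, max_points=100_000):
    """Turn (bin value, count) pairs into a flat list of approximate samples.

    Each bin value is repeated in proportion to its count. When the total
    count exceeds max_points, every count is scaled down by the same factor,
    so the shape of the distribution is kept while the list stays small.
    Bins whose scaled count rounds to zero are dropped, which can thin the tails.
    """
    total = sum(counts)
    samples = []
    for value, count in zip(bin_values, counts):
        if total <= max_points:
            n = count  # small enough: keep the exact count
        else:
            n = round(count * max_points / total)  # proportional downscaling
        samples.extend([value] * n)
    return samples


def violin_from_bins(bin_values, counts, max_points=100_000):
    """Build a violin plot from pre-binned data via the expanded samples."""
    import plotly.graph_objects as go  # imported here; only needed for plotting

    samples = expand_bins(bin_values, counts, max_points)
    trace = go.Violin(
        y=samples,
        box_visible=True,       # draw the inner box plot
        meanline_visible=True,  # draw the mean line
        points=False,           # do not draw individual points
    )
    return go.Figure(trace)
```

Unlike a random 5% sample, this uses every bin of the full table, but all events in a bin collapse onto one value, so the density estimate and quartiles are only as precise as the bin width.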