Speeding up pandas with modin

I found this the other day and wanted to post it here in case it helps anyone. I have not used it myself yet, but it looks promising.


There are many tools out there to optimize or improve your workflow when you need to.

For example, Dask lets you run your dataframe operations lazily, across multiple processes, and across multiple machines: http://docs.dask.org/en/latest/

It also has a parallel reader for CSV files.

I just tried it in my Dash/Plotly app. When I feed a modin.pandas.dataframe.DataFrame into px.scatter(), the DataFrame loses its column names. Does anyone have experience with modin + plotly?