I have some callback functions that take forever, and I was already using RQ as a task queue. But some things were still taking too long due to processing power, and you can’t call a `multiprocessing.Pool.apply_async` from gunicorn when using eventlet workers… so I made multi-rq to do the same thing, but with RQ.
It’s a little different, in that you don’t create a list of result objects and then call `.get()` on them, and unlike plain RQ you don’t have to poll manually to wait until the jobs finish. Instead, you just provide the function to apply and a list of argument iterables. You can optionally provide your own completion-checking and result-processing functions.
```python
# basic_test.py
import time

def wait(i, j):
    print(i)
    return sum((i, j))
```
```python
import rq
from multi_rq import MultiRQ
from basic_test import wait

mrq = MultiRQ()
nums = [(i, j) for i, j in zip(range(10), range(10))]
mrq.apply_async(wait, nums, mode='results')
# >>> [0, 2, 4, 6, 8, 10, 12, 14, 16, 18]
```

`mode='results'` means you get the results of the jobs; `'jobs'` means you get the jobs themselves.
Use with Dash
Example: Doing a big processing task asynchronously in chunks of 5000
```python
import pandas as pd
from multi_rq import MultiRQ

def df_process_func(df, my_arg):
    # do something with a df
    return df

@app.callback(...)
def do_this(...):
    mrq = MultiRQ()
    df = pd.read_csv('large_data.csv')
    # split into chunks of 5000 rows, each paired with the extra argument
    chunks = [(df.iloc[i:i + 5000], 'arg') for i in range(0, df.shape[0], 5000)]
    # a list of the outputs of df_process_func, 5000 rows at a time
    output = mrq.apply_async(df_process_func, chunks)
    # put them all back together, for example
    df = pd.concat(output)
```
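The chunk arithmetic in the list comprehension is easy to get wrong, so here is the same slicing pattern sketched with a plain list and a chunk size of 5 standing in for the DataFrame and its 5000-row chunks:

```python
data = list(range(12))   # stands in for the DataFrame's rows
size = 5                 # stands in for the 5000-row chunk size

# range(0, len(data), size) yields each chunk's starting index: 0, 5, 10
chunks = [data[i:i + size] for i in range(0, len(data), size)]
print(chunks)  # [[0, 1, 2, 3, 4], [5, 6, 7, 8, 9], [10, 11]]

# recombining the chunks recovers the original data, like pd.concat does
flat = [x for c in chunks for x in c]
print(flat == data)  # True
```

The last chunk is simply shorter when the length isn’t an exact multiple of the chunk size; slicing past the end is safe in both plain lists and `DataFrame.iloc`.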
This is a little more about web development than about Dash, but it’s an issue I ran into when trying to learn how to deploy Dash. I welcome feedback, issues, and pull requests.