I just finished going through the docs about using gunicorn, redis, celery and flower. I’ve managed to set everything up where I can see the tasks being queued in Flower. Cool. Dev environment is an M2 with 8 cores. `uvloop` is installed. `orjson` is installed. Python versions I tested with: 3.10.0, 3.11.2. Dash version is the latest one.
I have an app that runs locally, a portfolio optimizer that runs some heavy computations and returns a JSON to a `dcc.Store`. Some plotting callbacks then pull data from there and graph stuff out. Using `tqdm` on the loops inside the app, I see that for scenario X, I have 50 iterations/second. I’m using `polars` dataframes so I can see all cores are at 100%.
Adding `server=app.server` in `app.py` and running the server with `gunicorn -w 2 app:server`, I can see the performance drop to a maximum of 2 iterations/second. CPUs are mostly idle. Changing the number of workers to 2*cores + 1 (17) yields the exact same results. I guess it’s important to mention here that I don’t return anything until the end of the computation.
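For reference, the 2*cores + 1 worker count can be written as a small gunicorn config file instead of being passed on the command line. This is only a sketch: the bind address is an assumption, and `workers` and `bind` are the standard gunicorn setting names.

```python
# gunicorn.conf.py -- gunicorn reads this file from the working directory by default
import multiprocessing

# Number of logical cores reported by the OS (8 on the M2 described above)
cores = multiprocessing.cpu_count()

# The common "2 * cores + 1" rule of thumb; gives 17 on an 8-core machine
workers = 2 * cores + 1

# Assumed address for illustration only
bind = "127.0.0.1:8000"
```

With this file in place, `gunicorn app:server` would pick up the settings. Note that the worker count controls how many requests can be served concurrently; it does not by itself spread a single request's computation over more cores.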
To test this differently I created a main.py that spawns a FastAPI() server. I have a GET method there that just replicates some compute logic from the Dash app. I call that method from my Dash app with a `background_callback` since it takes longer than 30s.
Using `uvicorn main:app --log-level info --workers 17 --port 8001` for FastAPI(), the iterations are the same as before, i.e. ~50, and everything runs error-free, CPUs at full blast.
Using the `gunicorn` production server, `gunicorn -w 17 -k uvicorn.workers.UvicornWorker -b '127.0.0.1:8001' main:app`, as suggested in their production-ready documentation (using gunicorn with Uvicorn workers), this again slows down to a max of 2 iterations per second, not to mention I get timeouts and errors like `resource_tracker: There appear to be 1 leaked semaphore objects to clean up at shutdown`.
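To compare the servers on equal footing, something like the following could be called from inside the compute path under each server. It is a rough diagnostic sketch, not a fix: `runtime_snapshot` and `iterations_per_second` are hypothetical helper names, and the list of thread-related environment variables is an assumption about what might differ between processes.

```python
import os
import time

# Environment variables that commonly cap native thread pools (assumed list)
THREAD_ENV_VARS = (
    "POLARS_MAX_THREADS",
    "OMP_NUM_THREADS",
    "OPENBLAS_NUM_THREADS",
    "MKL_NUM_THREADS",
)


def runtime_snapshot():
    """Collect process-level facts that may differ between servers."""
    snapshot = {"pid": os.getpid(), "cpu_count": os.cpu_count()}
    for name in THREAD_ENV_VARS:
        # None means the variable is not set in this process
        snapshot[name] = os.environ.get(name)
    return snapshot


def iterations_per_second(step, iterations=1000):
    """Call `step` `iterations` times and return the observed rate."""
    start = time.perf_counter()
    for _ in range(iterations):
        step()
    elapsed = time.perf_counter() - start
    return iterations / elapsed if elapsed > 0 else float("inf")


if __name__ == "__main__":
    print(runtime_snapshot())
    # Stand-in workload; replace with one iteration of the real computation
    print(iterations_per_second(lambda: sum(range(10_000)), iterations=200))
```

Logging the snapshot and the rate from the same function under the dev server, uvicorn, and gunicorn would show whether the slowdown comes with a different process environment or is present even for a trivial loop.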
I’ve tried all sorts of combinations from this forum and SO with worker types (including gevent) and threads, and the results are the same. I’m sure I’m missing something here.
I understand from the docs and forums that the dev servers are not configured to have workers and just use everything that’s available, but I don’t see a performance increase or drop from fiddling with the -w flag in gunicorn.