Hello all,
With the help of the excellent Charming Data YouTube channel I have successfully deployed several small webapps to PythonAnywhere, and I have spent several months working on my latest webapp, which is also successfully deployed there. However, on load testing it I found that as soon as the number of concurrent users exceeds my number of PythonAnywhere web workers, the site freezes completely and any additional visitors cannot even view the landing page, which is not good!
Through searching this forum I found the announcement post about Background Callbacks, which seem to be designed to solve exactly this problem.
However, I note that the post says:
To deploy an app with background callbacks, you'll need:
- A deployment platform that can run two commands:
  - `gunicorn app:server --workers 4` - for running the Dash app and "regular" callbacks. In this case, 4 CPUs are serving these requests.
  - `celery app:server --concurrency=2` - for running the background job workers. In this case, 2 CPUs will be running the background callbacks in the order that they are submitted.
- A Redis database with network access available to both of those commands. Dash will submit and read jobs to and from Redis.
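From reading the Dash docs, my rough (untested) understanding is that the app-side setup behind that quote would look something like the sketch below; REDIS_URL, the module layout and the names are my assumptions, not something I have working:

```python
# Rough, untested sketch of what I think the quoted setup looks like on the
# app side (Dash >= 2.6). REDIS_URL is a placeholder for wherever the Redis
# instance would actually live.
import os

from celery import Celery
from dash import CeleryManager, Dash

REDIS_URL = os.environ.get("REDIS_URL", "redis://localhost:6379")

# Celery app backed by Redis: Dash submits background jobs to the broker
# and reads the results back from the backend.
celery_app = Celery(__name__, broker=REDIS_URL, backend=REDIS_URL)
background_callback_manager = CeleryManager(celery_app)

app = Dash(__name__, background_callback_manager=background_callback_manager)
server = app.server  # what "gunicorn app:server --workers 4" would point at
```

As far as I can tell, the second command then runs a Celery worker against the same module (something like `celery -A app:celery_app worker --concurrency=2`), which is why the host needs to support both commands plus Redis - hence my question below.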
My question:
I am not familiar with Redis or Celery, but a quick Google search suggests that neither is supported on PythonAnywhere. Does this mean I will not be able to use the Background Callback feature to solve my problem? If so, could anyone suggest other hosts that do allow this, and point me to a deployment guide suitable for a beginner like me?
Explanation of my app for context:
- My home page is a Dash layout consisting of welcome text and an image, a dcc.Input component (see below), a "Help" button which triggers a modal when clicked, and a "Share" button whose target URL is updated by callback (see below).
- The user provides an ID for analysis, for example 123456. Currently I allow input into the dcc.Input component box, or alternatively into the address bar using dcc.Location; in that case the user would enter mydomain.com/123456.
- My Dash callback checks whether this is a returning user whose ID I have already processed this week. If so, a pickled finished dictionary is already stored on local disk, and the callback reads this file and serves the dictionary content back to the user, updating the text on the landing page and the share button's target URL. This reading/serving seems to be very quick, on the order of 0.1 seconds.
- If the dictionary does not exist on local disk, I have not yet done the processing for this ID. I then call my fetching function, which gets the relevant ID data from an external API (speed varies, but circa 5 seconds), then my processing function, which processes it (circa another 5 seconds) and saves the processed dictionary to disk for any future returning users (see point 3). So the user sees roughly a 10-second total "loading" time (fetching + processing), which is acceptable. The callback output updates the text boxes and the share button target URL with the processed data. (A simplified sketch of this callback is just after this list.)
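To make points 3 and 4 concrete, my callback currently looks roughly like the simplified sketch below; fetch_from_api, process and the component ids stand in for my real names, and I have trimmed the outputs down to a single text component:

```python
# Simplified version of the callback described in points 3 and 4.
import pickle
from pathlib import Path

from dash import Input, Output, callback

CACHE_DIR = Path("processed")  # one pickled results dict per ID, refreshed weekly


def fetch_from_api(user_id: str) -> dict:
    ...  # external API call, roughly 5 seconds


def process(raw: dict) -> dict:
    ...  # processing step, roughly another 5 seconds


def get_result(user_id: str) -> dict:
    cache_file = CACHE_DIR / f"{user_id}.pkl"
    if cache_file.exists():
        # Returning user: the finished dictionary is already on disk (~0.1 s to read)
        with cache_file.open("rb") as f:
            return pickle.load(f)
    # New ID: fetch + process (~10 s total), then cache for future visitors
    result = process(fetch_from_api(user_id))
    with cache_file.open("wb") as f:
        pickle.dump(result, f)
    return result


@callback(
    Output("results-text", "children"),
    Input("id-input", "value"),   # the dcc.Input box
    Input("url", "pathname"),     # the dcc.Location path, e.g. /123456
    prevent_initial_call=True,
)
def show_results(input_id, pathname):
    user_id = input_id or pathname.strip("/")
    return str(get_result(user_id))
```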
From the user's perspective, after they input their ID number a dcc.spinner spins for 10 seconds in the top corner to show that the app is doing something, and then the landing page content is replaced with their processed content. This is the desired behaviour.
Once launched, I don't know how many visitors my app will have, but if the number of users becomes high, I would like the thing that varies to be the fetching/processing time each user faces - increasing from 10 seconds to 100 seconds, for example - while the landing page and hopefully the buttons/modals remain accessible at all times. I imagine that paying for additional web workers would help to reduce that extra fetching/processing time. It would also be great if returning users (those whose processing is already done and saved to disk, as described in point 3 above) could somehow be prioritised, so that they get served their processed data immediately and do not have to wait in the queue behind all of the unprocessed requests.
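If background callbacks do turn out to be usable, my beginner's guess at how the slow callback would change (assuming the CeleryManager setup from the earlier sketch) is below; the running= part is just to disable the input box while a job runs, and I would welcome corrections if I have misunderstood how this behaves:

```python
# My guess at the background-callback version of the same callback
# (Dash >= 2.6 syntax, untested). background=True should move the slow
# fetch/process work onto the Celery workers instead of the web workers.
from dash import Input, Output, callback


@callback(
    Output("results-text", "children"),
    Input("id-input", "value"),
    Input("url", "pathname"),
    background=True,  # run on a Celery worker rather than a gunicorn web worker
    running=[(Output("id-input", "disabled"), True, False)],  # lock the box while running
    prevent_initial_call=True,
)
def show_results(input_id, pathname):
    user_id = input_id or pathname.strip("/")
    return str(get_result(user_id))  # get_result as sketched after point 4 above
```

My hope is that the landing page and the Help modal would then stay responsive on the gunicorn workers while the slow fetch/process jobs queue on the Celery side. Presumably the quick returning-user reads would also go through the Celery queue in this form, though; if there is a way to keep the fast disk reads in a regular callback and only push new IDs to the background queue, that would address my prioritisation wish above, but I am not sure how to structure that.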
Many thanks for any help anyone is able to provide!