Dash Background Callbacks on PythonAnywhere

Hello all,

With the help of the excellent Charming Data YouTube channel I have successfully deployed several small webapps to PythonAnywhere. I have spent several months working on my latest webapp and have successfully deployed it too. However, upon load testing it, I found that as soon as my number of concurrent users exceeded my number of PythonAnywhere web workers, my website totally freezes and any additional visitors are not even able to view the landing page, which is not good!

Through searching this forum I found the announcement post about Background Callbacks, which seems designed to solve exactly this problem.

However, I note that the post says:

To deploy an app with background callbacks, you’ll need:

  1. A deployment platform that can run two commands:
  • `gunicorn app:server --workers 4` - for running the Dash app and “regular” callbacks. In this case, 4 CPUs serve these requests.
  • `celery app:server --concurrency=2` - for running the background job workers. In this case, 2 CPUs run the background callbacks in the order in which they are submitted.
  2. A Redis database with network access available to both of those commands. Dash will submit jobs to Redis and read results back from it.
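
For reference, piecing this together from the announcement and docs, I believe the wiring would look roughly like this (just a sketch; the REDIS_URL environment variable and the toy layout are my placeholders):

```python
# Sketch of a Dash app with a Celery-backed background callback (Dash >= 2.6).
# REDIS_URL is a placeholder environment variable for the Redis connection string.
import os
import time

from celery import Celery
from dash import CeleryManager, Dash, Input, Output, html

# Celery submits jobs to Redis and reads results back from it.
celery_app = Celery(
    __name__,
    broker=os.environ["REDIS_URL"],
    backend=os.environ["REDIS_URL"],
)

app = Dash(__name__, background_callback_manager=CeleryManager(celery_app))
server = app.server  # exposed for gunicorn

app.layout = html.Div([html.Button("Run", id="run"), html.Div(id="out")])

@app.callback(
    Output("out", "children"),
    Input("run", "n_clicks"),
    background=True,  # runs in a Celery worker, not a gunicorn web worker
    prevent_initial_call=True,
)
def slow_job(n_clicks):
    time.sleep(10)  # stands in for slow fetching/processing
    return f"Done after click {n_clicks}"

if __name__ == "__main__":
    app.run_server(debug=True)
```

(I believe the Celery side is actually launched with something like `celery -A app.celery_app worker --concurrency=2` under the current Celery CLI.)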

My question:

I am not familiar with Redis or Celery, but a quick Google search suggests that neither is supported on PythonAnywhere. Does this mean I will not be able to use the Background Callback feature to solve my problem? If so, could anyone suggest other hosts that do allow it, and any deployment guide suitable for beginners like me?

Explanation of my app for context:

  1. My home page is a Dash layout consisting of welcome text and an image, a dcc.Input component (see below), a “Help” button which triggers a modal when clicked, and a “Share” button whose target URL is updated by a callback (see below).

  2. The user provides an ID for analysis, for example 123456. Currently I allow input into the dcc.Input box, or alternatively into the address bar via dcc.Location, in which case the user would enter mydomain.com/123456.

  3. My Dash callback checks whether this is a returning user whose ID I have already processed this week. If so, a pickled, finished dictionary is already stored on local disk; the callback reads this file and serves the dictionary content back to the user via the callback outputs, updating the text on the landing page and the Share button target URL. This read-and-serve path is very quick, on the order of 0.1 seconds.

  4. If the dictionary does not exist on local disk, then I have not yet processed this ID. I then call my fetching function, which gets the relevant ID data from an external API (speed varies, but circa 5 seconds), and then my processing function, which processes the data (circa another 5 seconds) and saves the processed dictionary to disk for any future returning users (see step 3, and the sketch after this list). So the user sees roughly a 10-second total “loading” time (fetching + processing), which is acceptable. The callback outputs update the text boxes and the Share button target URL with the processed data.
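
In code, the flow of steps 3 and 4 is roughly this (a simplified sketch; fetch_from_api and process stand in for my real functions):

```python
# Simplified sketch of steps 3-4: serve a cached result when it exists,
# otherwise fetch, process, and cache it for future returning users.
import pickle
import time
from pathlib import Path

CACHE_DIR = Path("cache")
CACHE_DIR.mkdir(exist_ok=True)

def fetch_from_api(user_id: str) -> dict:
    time.sleep(5)  # placeholder for the ~5 s external API call
    return {"id": user_id}

def process(raw: dict) -> dict:
    time.sleep(5)  # placeholder for the ~5 s processing step
    return {"id": raw["id"], "result": "processed"}

def get_result(user_id: str) -> dict:
    cache_file = CACHE_DIR / f"{user_id}.pkl"
    if cache_file.exists():
        # Step 3 - returning user: read the pickled dictionary (~0.1 s).
        with open(cache_file, "rb") as f:
            return pickle.load(f)
    # Step 4 - new user: ~10 s fetch + process, then cache to disk.
    result = process(fetch_from_api(user_id))
    with open(cache_file, "wb") as f:
        pickle.dump(result, f)
    return result
```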

From the user’s perspective, after they input their ID number, there is a dcc.spinner which spins for 10 seconds in the top corner to show that the app is doing something, and then the landing page content is replaced with their processed content. This is the desired behaviour.

Once launched, I don’t know how many visitors my app will have, but if the number of users becomes high, I would like the thing that varies to be the fetching/processing time each user faces - increasing from 10 seconds to 100 seconds, for example - while the landing page and hopefully the buttons/modals remain accessible. I imagine that paying for additional web workers would then help to bring this fetching/processing time back down. It would be great if returning users (those with the processing already done and saved to disk, as described in point 3 above) could somehow be prioritised, so that they are served their processed data immediately and do not have to wait in the queue behind all of the unprocessed requests.

Many thanks for any help anyone is able to provide!

I’ve now been reading that Render is a recommended host for Dash apps. I would prefer it to Heroku as it seems to allow local disk storage, whereas I understand Heroku does not - and because I was expecting to host on PythonAnywhere, my app has quite a few reads and writes to local disk. (Also, I don’t know anything about external cloud storage!)

Please could anyone confirm whether Render would allow me to use this Background Callback feature? Many thanks again !

hi @Pippo

Sounds like an interesting project you’ve got there. From my experience, PythonAnywhere is meant more for smaller projects. You might be able to use Diskcache, although it might not work if there is no persistent storage, and we don’t generally recommend it for production environments.
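
For completeness, the Diskcache variant looks something like this minimal sketch (fine for trying things out locally):

```python
# Minimal sketch: Diskcache-backed background callbacks (Dash >= 2.6).
# Needs persistent disk storage and is not recommended for production.
import diskcache
from dash import Dash, DiskcacheManager

cache = diskcache.Cache("./cache")
app = Dash(__name__, background_callback_manager=DiskcacheManager(cache))
```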

Regarding Render, I found this info on background workers in their docs.

Thanks Adam. PythonAnywhere has confirmed that indeed anything using Celery will not work on PA.

I will now try to add Background Callbacks to my app and deploy on Render, and will start a new topic if I have further issues (as it will no longer relate to PA). Thanks!


I believe background callbacks would suit your needs - or simply ‘pure’ Celery. However, the Diskcache approach suggested above for background callbacks is not recommended in production. It may work if you only have a single instance, but if you have multiple, you will get spurious, hard-to-debug errors, as the state (of their respective local disks) will not be in sync.


Hi Emil,
Is there a difference between Background Callbacks and the “pure” Celery that you mention?

As mentioned in my initial post, ideally it would be great to have “normal” callbacks always available to quickly open modals and serve returning users with processed data from dictionaries saved on disk, i.e. this can happen quickly even if there is a queue in the background for new users whose processing is still to be done. Is this possible with Background Callbacks hosted at Render (I have made a separate topic asking if anyone knows how to set this up), or would I need to learn about “pure” Celery for this?

Many thanks!

Yes, by using Celery you offload the work from the webserver to Celery worker processes. Hence, the webserver can remain responsive, even if work is piling up for the Celery workers.

By scaling the number of webserver processes, you can control the number of concurrent users you can serve, and by scaling the number of Celery workers, you can control how fast the queue of ‘large jobs’ is executed.

It shouldn’t matter if you use background callbacks, or if you schedule the jobs yourself using Celery. That’s more a matter of syntax.
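
For example, a rough sketch of scheduling the jobs yourself with Celery (the Redis URL and the cache lookup here are placeholders for your own setup):

```python
# Sketch of the 'pure' Celery pattern: the regular Dash callback stays fast
# (cache hits are answered immediately by the web worker), and only cache
# misses are queued for a Celery worker.
import time

from celery import Celery
from celery.result import AsyncResult

# Placeholder broker/backend URL; point it at whatever Redis your host provides.
celery_app = Celery(
    __name__,
    broker="redis://localhost:6379/0",
    backend="redis://localhost:6379/0",
)

@celery_app.task
def fetch_and_process(user_id: str) -> dict:
    time.sleep(10)  # stands in for the ~10 s fetch + process
    return {"id": user_id, "result": "processed"}

# Inside a regular (non-background) Dash callback, roughly:
def handle_request(user_id: str, cache: dict):
    if user_id in cache:                      # returning user: served at once
        return cache[user_id]
    task = fetch_and_process.delay(user_id)   # new user: queued for a worker
    return {"pending": task.id}               # poll later, e.g. via dcc.Interval

def check_status(task_id: str):
    # Poll the queued job from another callback until it is ready.
    res = AsyncResult(task_id, app=celery_app)
    return res.result if res.ready() else None
```

This also gives you the prioritisation you asked about: returning users never enter the queue at all.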
