So I have a multi-page app where I generate different reports of interest. Since we use the same filter inputs on each page, I have a function that dynamically creates two callbacks for each figure.
The first receives all the filter inputs and calls a function, passed as a parameter, which returns a dataframe; that dataframe is stored in a hidden div. The second callback receives this hidden div as an input and, based on other parameters, updates the figure every time the data changes.
This approach works fine, but whenever the function that creates the dataframe in the first callback is too heavy (it takes about 3-4 minutes), the callback just dies: the loading animation disappears but no figures are refreshed. The curious thing is that the data function is still executing in the backend (my cache stores the generated dataframe, so the next time the page loads it loads immediately).
This sounds to me like the web page has a timeout of some sort, but I can't find a way to manipulate it from Python. Does anybody know something about this?
In my case I have a callback that sends out multiple emails in parallel using Dask. When the callback takes more time (because I send more emails, say 1000+), after a while it simply “snaps” and seems not to return any value, yet it keeps running “in the background”, because emails keep being sent.
I run it on gunicorn with gthread workers, and I have also tried increasing the number of workers and threads. The thing is, with the same Dockerfile, this happens only when I deploy the app with AWS Elastic Beanstalk; if I run it locally it works smoothly.
At last I got to the end of the rabbit hole: I saw in the browser that my request gets a 504 Gateway Timeout.
To the best of my understanding, the picture is as follows:
my browser sends an HTTP request to my app;
the request goes through a load balancer (AWS Elastic Beanstalk) and then hits my Dash callback, which is dispatched to a gunicorn worker;
the worker starts working, sending emails, without interruptions or timeouts, since it is an async worker;
(on the client side) the browser, meanwhile, is still waiting for the response; after 60 seconds the load balancer sends back a timeout, and from then on the app keeps working but no longer listens for the output of that callback;
(on the server side) on gunicorn the callback is still running and ultimately completes the sending.
The problem here is that the server has all the time it needs to process the request, but it is not safe to leave the client with an HTTP request open for so long. I do not know how to get out of this. I am stuck, again.
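One general way out of this bind, independent of Dash, is to stop holding the HTTP request open at all: have the request-handling code enqueue the work and return immediately, then let the client poll a cheap status check. A stdlib-only sketch of that shape (the `JOBS` registry and job-id scheme are illustrative, and a real deployment would use an out-of-process queue so jobs survive worker restarts):

```python
# Enqueue-and-poll pattern: the slow work runs in a background thread,
# the "request" returns a job id at once, and status checks are cheap.
import uuid
from concurrent.futures import ThreadPoolExecutor

EXECUTOR = ThreadPoolExecutor(max_workers=4)
JOBS = {}  # job_id -> Future; illustrative in-process registry

def submit_job(fn, *args):
    """Start fn in the background; return a job id immediately."""
    job_id = str(uuid.uuid4())
    JOBS[job_id] = EXECUTOR.submit(fn, *args)
    return job_id

def poll_job(job_id):
    """Cheap status check, safe to call every few seconds from the client."""
    future = JOBS[job_id]
    if not future.done():
        return {"state": "running"}
    return {"state": "done", "result": future.result()}

# Usage: the slow "callback" becomes one submit plus repeated polls.
job = submit_job(lambda: sum(range(1000)))
```

Because each poll completes in milliseconds, no intermediary in the network stack ever sees a request that lasts minutes.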
We have the same issue. Same setup, gunicorn timeout set to infinite. But there is no way to set/increase the callback timeout. There are multiple open (!) questions in this forum concerning this issue.
Does anyone know how to at least catch this error and display it to the user?
I am having the same problem using an EKS cluster and running the application with gunicorn and gevent as the worker: after one minute the frontend stops responding and doesn’t display the computation outputs, which still run in the background.
The “timeout” doesn’t happen when running the Flask/Dash server directly from Python.
While it might not be what you want to hear, I would recommend moving calculations that take that long to a separate process. If you do so, the timeout will no longer be an issue.
It’s not clear to me why this behaviour only happens when using gunicorn on EKS/Kubernetes.
Running the same application from a Docker image built on an EC2 machine works as expected.
The same thing happens when using multiprocessing.Process.
If it matters, the callback does a computation that runs for 1 to 3 minutes.
Could this be an issue with the Kubernetes Ingress config?
I'm not working with Dash anymore, I moved to another job. But as I remember, I was actually having this issue on EC2, not on Kubernetes. After a while I gave up and scaled up the instance so the computation took less time. If your data is static, you may try storing it in a cache.
There are often several servers in between Dash’s Flask server and the browser: gunicorn, Nginx, K8s Ingress, load balancers, etc. Each of them often has a 30 s timeout by default. If you’re using Dash Enterprise, these timeouts are configurable in the Server Manager. Otherwise you’ll have to dig into your network stack. On some platforms, like Heroku, the 30 second timeout is not configurable.
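For reference, these are the knobs typically involved for gunicorn, Nginx, and an AWS ALB; the 300 s value and the `<arn>` placeholder are illustrative, and other intermediaries have their own equivalents:

```shell
# gunicorn: worker silent timeout, in seconds (default 30)
gunicorn app:server --timeout 300

# nginx: how long to wait for the upstream response
# (goes in the relevant server/location block of nginx.conf):
#   proxy_read_timeout 300s;

# AWS ALB: idle timeout, 60 s by default
aws elbv2 modify-load-balancer-attributes \
    --load-balancer-arn <arn> \
    --attributes Key=idle_timeout.timeout_seconds,Value=300
```

Note that raising every timeout only postpones the problem; the polling approaches below avoid long-lived requests entirely.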
To get around this, you can use long_callback, which internally uses polling and background job queues.
As noted already, it depends on the configuration of your infrastructure. As pointed out by Chris, a good approach is to schedule the jobs through Celery, either via long_callback or manually. Here is a small example of the latter: