So I have a multi-page app where I generate different reports. Since we use the same filter inputs on each page, I have a function that dynamically creates two callbacks for each figure.
One receives all the filter inputs and calls a function (passed as a parameter) that returns a DataFrame. This DataFrame is stored in a hidden div. The second callback takes this hidden div as an input and, based on other parameters, updates the figure every time the data changes.
This approach works fine, but whenever the function that creates the DataFrame in the first callback is too heavy (it takes about 3-4 minutes), the callback just dies: the loading animation disappears but no figures are refreshed. The curious thing is that the data function is still executing in the backend (my cache stores the generated DataFrame, so the next time the page loads, it loads immediately).
This sounds to me like the page has a timeout of some sort, but I can't find a way to control it from Python. Does anybody know anything about this?
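For reference, the two-callback pattern I described can be sketched roughly like this. This is a self-contained stand-in, not my actual app: `register` imitates Dash's `app.callback` decorator so the sketch runs on its own, and all IDs and names are illustrative.

```python
# Minimal, self-contained sketch of the "two callbacks per figure" factory.
# `register` stands in for Dash's app.callback decorator; names are illustrative.
registry = {}

def register(output_id):
    def decorator(fn):
        registry[output_id] = fn
        return fn
    return decorator

def make_report_callbacks(page_id, data_fn):
    @register(f"{page_id}-hidden-div")
    def update_data(*filter_values):
        # heavy step: build the DataFrame from the shared filter inputs
        return data_fn(*filter_values)

    @register(f"{page_id}-figure")
    def update_figure(stored_data, **options):
        # cheap step: re-draw the figure whenever the stored data changes
        return {"data": stored_data, "layout": options}

# wire up one report page with a trivial stand-in data function
make_report_callbacks("sales", lambda *filters: list(filters))
```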
Having the same problem. Has anyone found a solution?
I’m also having this issue. Any updates?
Could it be something about websocket / ingress timeout?
If you deploy your app on Google Kubernetes Engine, try creating a BackendConfig and setting the following values to e.g. 5 minutes:
spec.timeoutSec: 300
spec.connectionDraining.drainingTimeoutSec: 300
More info on https://cloud.google.com/kubernetes-engine/docs/how-to/configure-backend-service#creating_a_backendconfig
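A sketch of such a BackendConfig, with the field names taken from the linked GKE docs (the metadata name is a placeholder):

```yaml
apiVersion: cloud.google.com/v1
kind: BackendConfig
metadata:
  name: my-backendconfig   # placeholder name
spec:
  timeoutSec: 300                  # backend response timeout, in seconds
  connectionDraining:
    drainingTimeoutSec: 300        # time allowed for in-flight requests to finish
```

You then attach it to the Service via the `cloud.google.com/backend-config` annotation, as described in the linked page.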
In my case I have a callback that sends out multiple emails in parallel, using Dask. When the callback takes more time (because I send more emails, say 1000+), after some time it simply “snaps” and it seems it does not return any value, yet it keeps running somehow “in the background” because emails keep being sent.
I run it on gunicorn with gthread workers, and I have also tried increasing the number of workers and threads. The thing is, with the same Dockerfile, it happens only when I deploy the app with AWS Elastic Beanstalk, while if I run it locally it works smoothly.
I run it on gunicorn with gthread workers…
I’m no expert in gunicorn, but couldn’t the `--timeout` parameter help? See the Settings page of the Gunicorn 20.1.0 documentation.
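For instance, assuming the worker is being killed by gunicorn itself (`timeout` is in seconds, and `app:server` is a placeholder for your own WSGI entry point):

```shell
# raise gunicorn's worker timeout to 5 minutes (the default is 30 s)
gunicorn --timeout 300 --graceful-timeout 30 app:server
```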
I am not an expert either, but shouldn’t this be solved by using asynchronous workers? Also:
- (since I am using the async workers) I am no longer getting timeout warnings;
- in my mind a timeout is not scalable: I can set it high enough to send 2000 emails, but what if one day they want to send 3k or more?
The thing that drives me crazy is that this does not leave any trace/message in any log; I really do not know where to look.
At last I got to the end of the rabbit hole: I saw in the browser that my request gets a 504 Gateway Timeout.
To the best of my understanding, the picture is as follows:
- my browser sends an HTTP request to my app;
- the request goes through a load balancer (AWS Elastic Beanstalk) and then hits my Dash callback, which is dispatched to a gunicorn worker;
- the worker starts working, sending emails, without interruptions or timeouts, since it is an async worker;
- (on the client side) the browser meanwhile is still waiting for the response; after 60 seconds the load balancer sends back a timeout, and the app keeps working but no longer listens for the output of that callback;
- (on the server side) on gunicorn the callback is still running and ultimately completes the sending.
The problem is that the server has all the time it needs to process the request, but it is not safe to leave the client with an HTTP request open for so long. I do not know how to get out of this. I am stuck, again.
You could use an interval component to poll for updates, or alternatively a websocket.
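The polling idea can be sketched framework-agnostically: the long job runs in a background thread and returns a job id immediately, and a cheap periodic call only reads the job's status. In Dash the trigger would be a button callback and the poll a `dcc.Interval` callback; here plain functions stand in for both.

```python
import threading
import uuid

# job_id -> {"status": "running" | "done", "result": ...}
jobs = {}

def start_job(fn, *args):
    """Kick off the long computation in a background thread and return at once."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"status": "running", "result": None}

    def worker():
        result = fn(*args)  # the multi-minute computation
        jobs[job_id].update(status="done", result=result)

    threading.Thread(target=worker, daemon=True).start()
    return job_id  # hand this back to the client, e.g. in a dcc.Store

def poll_job(job_id):
    """Called periodically, e.g. from a dcc.Interval callback."""
    job = jobs[job_id]
    return job["status"], job["result"]
```

Each poll request finishes in milliseconds, so no intermediary ever sees a long-lived open request.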
We have the same issue. Same setup, with the gunicorn timeout set to infinite, but there is no way to set/increase the callback timeout. There are multiple open (!) questions on this forum concerning this issue.
Does anyone know how to at least catch this error and display it to the user?
Any update on this issue?
I am having the same problem on an EKS cluster, running the application with gunicorn and gevent workers: after one minute the frontend stops responding and doesn’t display the computation outputs, even though the computation keeps running in the background.
The “timeout” doesn’t happen when running the Flask/Dash server directly from Python.
While it might not be what you want to hear, I would recommend moving calculations that long to a separate process. If you do so, the timeout will no longer be an issue.
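One way to sketch that, using a process pool as the "separate process" (in production a task queue such as Celery or RQ would play the same role): the request handler only enqueues work and returns, and a later request checks progress.

```python
from concurrent.futures import ProcessPoolExecutor

def heavy_computation(n):
    # stand-in for the 1-3 minute report calculation
    return sum(i * i for i in range(n))

pool = ProcessPoolExecutor(max_workers=2)
futures = {}

def submit(job_id, n):
    """The request handler only enqueues the work and returns immediately."""
    futures[job_id] = pool.submit(heavy_computation, n)

def status(job_id):
    """A cheap follow-up request checks progress without blocking."""
    f = futures[job_id]
    return ("done", f.result()) if f.done() else ("running", None)
```

Because the HTTP request itself never waits on the computation, no proxy or load-balancer timeout is ever hit.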
It’s not clear to me why this behaviour only happens when using gunicorn on EKS/Kubernetes.
Running the same application from a Docker image built on an EC2 machine works as expected.
The same thing happens when using
If it matters, the callback does a computation that runs for 1 to 3 minutes.
Could this be an issue with the Kubernetes Ingress config?
I’m not working with Dash anymore, I moved to another job. But as I remember, I was actually having this issue on EC2, not on Kubernetes. After a while I gave up and scaled up the instance so it took less time. If your data is static, you may try storing it in a cache.
There are often several servers between Dash’s Flask server and the browser: Gunicorn, Nginx, a K8s Ingress, load balancers, etc. Each of them often has a 30 s timeout by default. If you’re using Dash Enterprise, these timeouts are configurable in the Server Manager. Otherwise, you’ll have to dig into your network stack. On some platforms, like Heroku, the 30 second timeout is not configurable.
To get around this, you can use long_callback, which internally uses polling and background job queues.
As noted already, it depends on the configuration of your infrastructure. As pointed out by Chris, a good approach is to schedule the jobs through Celery, either via long_callbacks or manually. Here is a small example of the latter:
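A sketch of that wiring. This is not runnable as-is: it assumes a Redis broker at the placeholder URL, and `build_dataframe` stands in for whatever heavy function builds the report data.

```python
from celery import Celery

# assumes a Redis broker/result backend at this placeholder URL
celery_app = Celery("reports",
                    broker="redis://localhost:6379/0",
                    backend="redis://localhost:6379/0")

@celery_app.task
def heavy_job(params):
    # placeholder for the 3-4 minute DataFrame computation
    return build_dataframe(params)

# In the Dash callback that triggers the work: enqueue and return immediately.
#     task = heavy_job.delay(params)          # returns at once with a task id
# In a dcc.Interval callback: poll until the result is ready.
#     res = celery_app.AsyncResult(task_id)
#     if res.ready():
#         dataframe = res.get()
```

The HTTP request that enqueues the task returns within milliseconds, so none of the intermediary timeouts discussed above apply; only the cheap polling requests go through the stack.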