Flask Rate Limiter does not seem compatible with Background Callbacks - any other options?

Hi,
I have been using Flask Limiter in my Dash apps to limit the number of requests from one IP address in a given timeframe to stop malicious users or bots from overwhelming my site with requests. This seemed to work quite well by including the below lines in my Dash app:

from dash import Dash, Input, Output, callback
from flask import Flask
from flask_limiter import Limiter
from flask_limiter.util import get_remote_address

server = Flask(__name__)
app = Dash(__name__, title='myapp', server=server)

# Limit by the client's IP address
limiter = Limiter(app=app.server, key_func=get_remote_address)
app.server.config['RATELIMIT_HEADERS_ENABLED'] = True
app.server.config['RATELIMIT_STORAGE_URL'] = 'memory://'

@callback(Output(mydashoutputs), Input(Mydashinputs))
@limiter.limit("10/minute")  # Change the rate limit according to your needs
def mydashcallback():
    ...

However I am now experimenting with Dash Background Callbacks and it seems that I get an error if I use a Background Callback with the Flask Rate limiter. If I set background=False then it works, or if I remove the @limiter decorator and have background=True then it works, but I immediately get errors when my callback fires if I have both background=True and the @limiter decorator.

The error messages are long but the main one seems to be “TypeError: cannot pickle ‘_contextvars.ContextVar’ object”
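The error text itself can be reproduced with the standard library alone. The sketch below only shows that a `ContextVar` cannot be pickled. Whether the limiter decorator drags such an object into what the background callback manager tries to serialize is an assumption on my part, not something confirmed in this thread. The names `request_state` and `try_pickle` are made up for the illustration.

```python
import contextvars
import pickle

# A context variable, similar in kind to the ones recent Flask/Werkzeug
# versions use to hold per-request state (name is hypothetical).
request_state = contextvars.ContextVar("request_state")


def try_pickle(obj):
    """Return None if obj pickles, otherwise the TypeError message."""
    try:
        pickle.dumps(obj)
    except TypeError as exc:
        return str(exc)
    return None


# Plain data pickles without complaint.
print(try_pickle({"id": "123456"}))

# A ContextVar does not; the message printed here is the TypeError text.
print(try_pickle(request_state))
```

If that reading is right, the failure would be independent of the limit itself, which would fit the observation that it occurs before any request is actually rate limited.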

To make a reliable app I think it is important to be able to have Background Callbacks to put requests into a queue, and also some kind of rate limiter to stop one IP address overwhelming or filling up your queue.

Is there a way to make Flask Limiter work with Background Callbacks, or is there any other way to rate limit on Dash?

Many thanks !

Hello @Pippo,

Background callbacks work by pinging the server for updates every second by default.

I also agree that this can be an issue to think about with limiting requests. It still doesn't keep the requests from spamming your load balancer, however. You can prevent the background callback from being triggered again in several ways, for example by disabling the button that triggered it.

@Emil also has some ways that you can keep requests from the same session stacking up.

You could also look into some form of clientside caching (each callback goes to a specific endpoint, with different arguments of course). This would be tricky to manage with callbacks, but it would keep your server from getting spammed by users.

Another thing you could do is to use a load balancer and configure your settings that way. (Not sure you can access payloads with that though)


Hi and thank you for your response! I have several questions below. But just in case it makes a difference I will explain my specific use case first:

  1. My home page is a Dash layout consisting of welcome text and an image, a dcc.Input component (see below), a “Help” button which triggers a modal when clicked, and a “Share” button whose target URL is updated by a callback (see below).
  2. The user provides an ID for analysis, for example 123456. Currently I allow input into the dcc.Input box, or alternatively into the address bar using dcc.Location. In this case the user would enter mydomain.com/123456.
  3. Upon any change of the Input box or the URL, my Standard Callback checks whether I have already done the processing for this ID this week. If so, there will be a pickled finished dictionary already stored on local disk, and the callback reads this file and serves the dictionary content back to the user through the callback output, updating the text on the landing page and the share button target URL. If the dictionary does not exist on local disk, then I need to do the fetching+processing, so I put the ID in a dcc.Store to trigger my Background Callback.
  4. My Background Callback reads the ID from the dcc.Store and runs my 10 second fetching+processing function and saves the processed dictionary to disk for any future returning users (see step 3). The callback output updates the text boxes and the share button target URL with the processed data.
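The disk check in step 3 and the save in step 4 could look roughly like the sketch below. It is only an illustration under assumptions: the `cache` directory, the `.pkl` file naming, treating "this week" as "the last 7 days", and requiring IDs to be purely numeric (so that a value typed into the URL cannot point outside the cache directory) are all my choices, not details from the post.

```python
import pickle
import time
from pathlib import Path

CACHE_DIR = Path("cache")        # hypothetical location of the pickled results
MAX_AGE_SECONDS = 7 * 24 * 3600  # "this week" approximated as the last 7 days


def load_cached_result(user_id, cache_dir=CACHE_DIR, now=None):
    """Return the cached dictionary for user_id, or None if missing or stale."""
    # Assumption: IDs are numeric, like 123456. Rejecting anything else keeps
    # URL input such as "../something" from being used as a file path.
    if not str(user_id).isdigit():
        return None
    path = Path(cache_dir) / f"{user_id}.pkl"
    if not path.exists():
        return None
    now = time.time() if now is None else now
    if now - path.stat().st_mtime > MAX_AGE_SECONDS:
        return None  # too old, so the processing should run again
    with path.open("rb") as fh:
        return pickle.load(fh)


def save_result(user_id, result, cache_dir=CACHE_DIR):
    """Pickle the processed dictionary so returning users hit the cache."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    with (cache_dir / f"{user_id}.pkl").open("wb") as fh:
        pickle.dump(result, fh)
```

In this sketch the Standard Callback would call `load_cached_result` first and only write the ID to the dcc.Store when it returns `None`; the Background Callback would call `save_result` once the processing finishes.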

I have been able to get the fetching/processing described in step 4 working as a Background Callback, but as mentioned I do not seem able to use Flask Rate limiter on a background callback. As a workaround I have found I can still rate limit the Standard Callback which effectively triggers the Background Callback, although an advanced user could probably still trigger my Background Callback if they tried.
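Since the limit only needs to be enforced before the ID reaches the dcc.Store, the same gatekeeping could also be done with a small check in plain Python inside the Standard Callback. The sketch below is a generic sliding-window limiter, not Flask-Limiter's implementation; the class name and parameters are hypothetical. It keeps its state in the memory of one process, so it would not be shared across several workers.

```python
import threading
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most max_calls per key within the last period_seconds."""

    def __init__(self, max_calls, period_seconds, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period_seconds
        self.clock = clock  # injectable so the behaviour can be tested
        self._calls = defaultdict(deque)
        self._lock = threading.Lock()

    def allow(self, key):
        """Record a call for key and return True, or return False if over the limit."""
        now = self.clock()
        with self._lock:
            calls = self._calls[key]
            # Forget calls that have fallen out of the window.
            while calls and now - calls[0] >= self.period:
                calls.popleft()
            if len(calls) >= self.max_calls:
                return False
            calls.append(now)
            return True


# Same budget as the "10/minute" used with the decorator earlier.
ip_limiter = SlidingWindowLimiter(max_calls=10, period_seconds=60)
```

Inside the Standard Callback one would call something like `ip_limiter.allow(flask.request.remote_addr)` and skip writing to the dcc.Store when it returns `False`. As noted above, this does not stop someone who calls the Background Callback endpoint directly.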

What do you mean by load balancer? If you mean the celery queue then I think it would be ok, as the limiter stops the callback from firing and I believe the celery queue is only working on the fired Callbacks.

Yes, that sounds like a good idea. Although, as described above, I am currently allowing input via the URL with dcc.Location, so I am not sure if I can disable or limit this.

I am not sure I follow what you mean, please could you explain your idea further?

Do you mean it’s fundamentally incompatible with Flask rate limiter? Just to be clear the error I received was not a rate limited error, and it was the same error even if I set an infinite limit - that’s what made me think it might be more of a bug rather than a fundamental incompatibility.

Is there anywhere I can read about these methods?

Many thanks again (and apologies for all the questions! I am pretty new to programming but Dash has been a great entry point for me )

Sure, you can check out here:

This will block multiple requests for the same callback from the same session.

I’d use this in combination with somewhere to store what IDs are being run currently. Since, even with using caching and a limiter, requests from a different session would still cause the work to happen.

To do this, you could potentially use a database to store which IDs are being processed. When an ID is done being processed, you add the date it was completed. If an ID has a valid completion date within the date range you explained, the result is pulled from the file. If the completion date is not within the correct range, a new line is added with the ID and a blank completion date, to keep other sessions' requests from trying to begin the process again.

If a user requests one that has a blank completion date, then the background callback just waits until the completion date has been filled in, instead of starting the process again.
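A rough sketch of that bookkeeping with SQLite follows. The table layout and the function names `init_db`, `claim`, and `mark_done` are hypothetical, and for brevity it keeps one row per ID (overwritten on reprocessing) instead of appending a new line each time. It is not hardened against two processes claiming the same ID at the same instant, and a job that crashes would stay "in progress" until some timeout logic is added.

```python
import sqlite3
import time

MAX_AGE_SECONDS = 7 * 24 * 3600  # assumed validity window for a finished result


def init_db(conn):
    conn.execute(
        "CREATE TABLE IF NOT EXISTS jobs ("
        " id TEXT PRIMARY KEY,"
        " started_at REAL NOT NULL,"
        " completed_at REAL)"  # NULL means still being processed
    )


def claim(conn, job_id, now=None):
    """Return 'cached', 'in_progress', or 'start' for the given ID."""
    now = time.time() if now is None else now
    with conn:  # commits on success, rolls back on error
        row = conn.execute(
            "SELECT completed_at FROM jobs WHERE id = ?", (job_id,)
        ).fetchone()
        if row is not None:
            completed_at = row[0]
            if completed_at is None:
                return "in_progress"  # someone else is already working on it
            if now - completed_at <= MAX_AGE_SECONDS:
                return "cached"  # finished recently, read the stored file
        # Unknown or stale: register the ID with a blank completion date.
        conn.execute(
            "INSERT OR REPLACE INTO jobs (id, started_at, completed_at)"
            " VALUES (?, ?, NULL)",
            (job_id, now),
        )
        return "start"


def mark_done(conn, job_id, now=None):
    """Fill in the completion date once processing has finished."""
    now = time.time() if now is None else now
    with conn:
        conn.execute(
            "UPDATE jobs SET completed_at = ? WHERE id = ?", (now, job_id)
        )
```

A callback would act on the return value of `claim`: read the file for `'cached'`, wait and re-check for `'in_progress'`, and run the processing followed by `mark_done` for `'start'`.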

Hope this helps you and gets your juices going for how you could potentially solve this issue.


Thanks, that is interesting! I will certainly take a look at that and the other functions on that page. If I understand correctly though, this serves a slightly different purpose to a rate limiter, as I imagine a malicious user could still invoke the callback many times in separate sessions (i.e. it limits per session, not per IP). So it would still be great to know if the Background Callback incompatibility with Flask Limiter is a fixable bug or a fundamental incompatibility. It would also be interesting to know if the Dash app/layout itself could have a similar IP limit applied in the same fashion as the callbacks, to stop any malicious attacks on the homepage itself.

Yes, those are along the lines I am thinking regarding the database of IDs being processed. As my files will be stored on Amazon S3, I am also looking at the option of having a Callback try to load the finished file from S3, and if it does not exist, send a short message with the ID to Amazon SQS (Simple Queue Service), with an AWS Lambda function doing the long processing (so that I can better control scaling). I believe SQS can drop duplicate requests if multiple Dash callbacks send the same user ID before it reaches the Lambda processing. If I go down this route I might not need Background Callbacks, or maybe I would, to somehow poll the S3 file storage to find out when the file is ready. That’s a question for another day 🙂

Personally, I wouldn't rely on Flask/Dash to be the limiting factor; that is really what load balancers are for. Load balancers can cache as well, and I think that is better than the server itself doing it, since it doesn't take any of the server's resources.