Global variables sharing with mutex

Hi,

After reading the tutorial on sharing data between callbacks, I would like to ask why sharing global variables is not safe in Dash.

My understanding after reading it is that the reasons are:

  1. Because data is not shared between processes in Dash (according to the author of Dash). But in my code I have actually used globals a lot (even with very complex logic), and for a single user in local mode everything seems to work properly.

  2. Because of non-atomic modification of global variables from different callback processes. That’s a basic issue in programming and can be handled well with a mutex (or Lock from the multiprocessing lib).

  3. Because of the multi-user case, for example in a deployment scenario. But I don’t see why the global variables would be corrupted in this case. From my understanding: as soon as a new user arrives, a new Dash instance is created (new process and new resources), so if the code does a perfect job of synchronization, the global variables of user_1 should not be affected by other users.

I don’t know if I have understood correctly, especially point 3. It would be very nice if someone could clear up my confusion. Thanks in advance.

Best regards,
Zhiwei


On point 1: data management becomes different once you decide to scale up your app and use multiple processes, which is necessary for handling more than one callback at once (multiple users interacting with the app at once, or multiple callbacks being fired at once). Modifications to your global data won’t persist across processes.

On point 2: this is actually a non-issue because the data isn’t shared across processes. If you want to share data across processes (which isn’t recommended, because of point 3 and point 4), you could use the multiprocessing objects (see 16.6. multiprocessing — Process-based “threading” interface — Python 2.7.18 documentation), but these data structures are simple: you couldn’t share modifications to e.g. a pandas dataframe.
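A minimal sketch of that difference, not taken from Dash itself: a plain module-level global changed in a child process is not seen by the parent, while a `multiprocessing.Value` is. The names `plain_counter`, `shared_counter`, `worker` and `run_workers` are made up for illustration, and the sketch assumes a POSIX system where the `fork` start method is available.

```python
import multiprocessing as mp

# Assumption: POSIX system; the "fork" start method is not available on Windows.
ctx = mp.get_context("fork")

plain_counter = 0                    # ordinary global, copied into each child
shared_counter = ctx.Value("i", 0)   # C int stored in shared memory


def worker():
    """Hypothetical stand-in for a callback running in another process."""
    global plain_counter
    plain_counter += 1               # changes only this child's private copy
    with shared_counter.get_lock():  # lock makes the read-modify-write atomic
        shared_counter.value += 1


def run_workers(n=4):
    """Start n child processes, wait for them, report what the parent sees."""
    procs = [ctx.Process(target=worker) for _ in range(n)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    return plain_counter, shared_counter.value
```

Calling `run_workers(4)` in the parent returns `(0, 4)`: the plain global is untouched, and only the shared integer reflects the children's increments. This works for simple types like an int; it does not extend to something like a pandas dataframe.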

On point 3: this doesn’t happen. Dash processes are shared across all users. A request can be handled by any of the available Dash processes. There is no concept of “sessions” on the backend. This allows Dash to scale really well: 2-4 Dash processes could handle hundreds of simultaneous users (depending on the computations required in the callbacks), with each user’s “app state” stored in the user’s browser (instead of hundreds of separate session stores in the backend).


There is also
4. Multi-session case. If you modify a global variable in your session, then when you reload your page you will see the modified version of the variable instead of the initial value of the variable (and if there are multiple users, then point 3 applies - the next user might see your modifications instead of their modifications).


Hi @chriddyp,

Thank you for your kind answer!

This doesn’t happen. Dash processes are shared across all users. A request can be handled by any of the available Dash processes. There is no concept of “sessions” on the backend. This allows Dash to scale really well: 2-4 dash processes could handle hundreds of simultaneous users (depending on the computations required in the callbacks), with each user’s “app state” stored in the user’s browser (instead of hundreds of separate session stores in the backend).

I don’t know if I understand correctly: no matter how many users reach the webpage (assume a deployed app), there should be only one Dash application process on the server. If a callback is fired because of a request from a user, a Dash subprocess is created to run the callback function and send the result back to the user. This result, or the new “app state” after the callback returns, is then kept by the user’s browser, and the Dash process on the server is idle while it waits for the next request from the user.

  4. Multi-session case. If you modify a global variable in your session, then when you reload your page you will see the modified version of the variable instead of the initial value of the variable (and if there are multiple users, then point 3 applies - the next user might see your modifications instead of their modifications).

This explanation is super helpful!

Kind regards,
Zhiwei

This isn’t how Dash works. When you launch a Dash app, there will be a single Python process which handles all callbacks from all users – no sub-processes are created. This means that all of your callbacks share the same memory. So if one user were to trigger a callback that changed a global variable, every other request from any user would then make use of the changed global. So your callbacks have to treat the data structures as being immutable.

While this may seem annoying, as @chriddyp mentioned, it is this design decision that allows Dash to scale to a large number of requests from potentially many users, as they are being serviced by a single process.
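To make the shared-memory point concrete, here is a small sketch in plain Python (no Dash import, so it runs on its own). The global `selected_rows` and the function `add_row_callback` are hypothetical stand-ins for a module-level variable and a callback body.

```python
# Module-level state: it exists once per Python process, not once per user.
selected_rows = []


def add_row_callback(row):
    """Hypothetical callback body that mutates the global in place."""
    selected_rows.append(row)
    return list(selected_rows)  # what this request would send back


# Two different users each trigger the callback once in the same process.
seen_by_user_1 = add_row_callback("row picked by user 1")
seen_by_user_2 = add_row_callback("row picked by user 2")
```

Here `seen_by_user_2` contains both rows: the second user sees the first user's change, because both requests touched the same list.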


Greetings,

After reading the above I am wondering, then, what alternatives are left for managing would-be global variables? For instance, I am using the Dash interface (inputs and outputs) to alter an object from an external Python module. That is, the component properties available to me in Dash would not match the state of the module object. I did see the multiprocessing mention, but is this the only solution? If so, can it reasonably be assumed that it is safe to use global variables ONLY for single-user local mode?

If you can guarantee that the app is going to be used by a single user, and you only run with one process (which is the default) then I believe it should be OK. Of course the potential concern is that you may forget or a subsequent person running the app may not realise this, resulting in unhappy times.

If you don’t need to persist the changes to the object in the other variable, you could also make a copy of it in your callback, rather than modifying it.
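A rough sketch of the copy approach, assuming the shared object can be deep-copied. `DEFAULT_CONFIG` and `update_threshold` are invented names standing in for the external module's object and a callback body.

```python
import copy

# Hypothetical object from an external module, created once at startup.
DEFAULT_CONFIG = {"threshold": 0.5, "columns": ["a", "b"]}


def update_threshold(new_threshold):
    """Hypothetical callback body: change a private copy and return it."""
    config = copy.deepcopy(DEFAULT_CONFIG)  # the global itself is never touched
    config["threshold"] = new_threshold
    config["columns"].append("c")           # nested list is copied too
    return config


result = update_threshold(0.9)
```

After the call, `result` holds the modified values while `DEFAULT_CONFIG` is unchanged. A shallow copy would not be enough here, since the nested `columns` list would still be shared.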

So it sounds like there are no other options for using global variables in multi-user mode. Would it be possible at least to create a memory sharing feature between callbacks (feature request)? More specifically, memory sharing between callbacks that isn’t integral to the interface at all. Perhaps this would go well with the multiple outputs in callback that I have seen suggested as well?

Still good to know that everything will still work in single user mode, thanks for confirming!

There are a few options for sharing data between callbacks. One is to save and read information from disk; another is to use a database or an in-memory object store like Redis. This is discussed in the Dash User Guide.
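As an illustration of the disk option only, here is a minimal sketch using the standard library. The helpers `save_state` and `load_state`, the `STORE_DIR` location and the idea of one JSON file per session id are all assumptions made for this example, not an API from Dash or its User Guide.

```python
import json
import tempfile
import uuid
from pathlib import Path

# Assumption: a directory that every worker process can read and write.
STORE_DIR = Path(tempfile.mkdtemp(prefix="dash_store_"))


def save_state(session_id, state):
    """Write one session's state to its own JSON file."""
    path = STORE_DIR / f"{session_id}.json"
    path.write_text(json.dumps(state))


def load_state(session_id, default=None):
    """Read a session's state back, or return default if none was saved."""
    path = STORE_DIR / f"{session_id}.json"
    if not path.exists():
        return default
    return json.loads(path.read_text())


# Hypothetical ids, one per browser session; how the id reaches a
# callback is left out of this sketch.
user_1, user_2 = str(uuid.uuid4()), str(uuid.uuid4())
save_state(user_1, {"threshold": 0.9})
save_state(user_2, {"threshold": 0.1})
```

Because the state lives on disk and is keyed per session, it does not matter which process handles a request, and one user's data does not overwrite another's. The sketch does not guard against two processes writing the same file at the same moment.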
