What is the best way to pass a class containing dataframes across callbacks initialized from URL path arguments?

Hey everybody. I’m brand new to Dash and trying to learn how to create a multi-page application. In my case I have pages with path arguments that are defined in the path template and passed in through a layout(*args) function. I started out by setting a global class instance there (singleton pattern), but that does not work in Flask, which is what I will be using to serve the app.

I have looked into Stores, but those seem to be for client-side data in JSON format. That’s not what I need. I need something that lives on the backend as a shared class that is initialized during the request based on the parameters and then used by other callbacks. I don’t know where to place that logic or how to share it.

Any recommendations on how to accomplish this? I am not really seeing anything online that answers my question, and I have spent many hours looking. I would greatly appreciate any advice you could offer!

Also, if you by any chance have any information on how to integrate Dash into an object-oriented class structure, I would really appreciate any pointers there as well. I like Dash so far, but I am not a fan of the purely functional programming style. I’m hoping there are other Dash users who feel the same way, and that there are good ways to integrate the Dash registration and callback process into a class hierarchy, with the only exception being the Gunicorn app gateway.

Thanks!

Serverside outputs could mitigate some of these challenges.
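The basic pattern is roughly the following (an untested sketch; the component ids and the iris example data are just placeholders):

from dash_extensions.enrich import (
    DashProxy, ServersideOutputTransform, ServersideOutput, Output, Input, dcc, html
)
import plotly.express as px

app = DashProxy(__name__, transforms = [ServersideOutputTransform()])
app.layout = html.Div([
    html.Button("Load", id = "load"),
    dcc.Store(id = "store"),
    dcc.Graph(id = "graph")
])

@app.callback(
    ServersideOutput("store", "data"),
    Input("load", "n_clicks"),
    prevent_initial_call = True
)
def load(_):
    # The return value is kept server side; only a reference is sent to the client,
    # so it does not need to be JSON serializable.
    return px.data.iris()

@app.callback(
    Output("graph", "figure"),
    Input("store", "data"),
    prevent_initial_call = True
)
def plot(df):
    # The data arrives here as the original DataFrame, not as JSON.
    return px.scatter(df, x = "sepal_width", y = "sepal_length")

if __name__ == "__main__":
    app.run(debug = True)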

That does look interesting. I’ll play around with it. Thanks for that recommendation @Emil !

For anyone who has the same problem, I gave the ServersideOutputTransform a try, and along with the RedisStore backend (also in the dash-extensions package), it solved my needs nicely.

  1. I have a DashProxy with the ServersideOutputTransform and a RedisStore backend defined for the app.
  2. Then I add hidden inputs holding the URL parameters.
  3. A callback takes those parameters as Inputs, loads the data, instantiates the class, and returns it to the store component. I did have to add a base class with from_json() and to_json() methods because my dataframes were not JSON serializable, which apparently is still a requirement.
  4. I then pass the store data to other callbacks as an Input and use a from_json() method that converts the parameter back into the original class instance.

I then have a great way to pass dynamic objects, initialized from URL parameters, across callbacks. I recommend using Redis over the default file store, as that keeps the Dash container stateless and you don’t have to deal with mounting shared storage in autoscaling situations. Hope this helps.


Could you elaborate on the JSON serialization issue? I am asking because it was not intended to be a requirement.

Sure @Emil. I first tried to return the class instance (which contains a dataframe property along with other numbers and strings) and I was getting a JSON parse error because the dataframe is not JSON serializable. When I dumped the class dict and rendered the dataframe with to_json(), I no longer got the issue. This created a dictionary with a JSON-encoded dataframe in a data property.

Now things got a little stranger when I tried to turn that dictionary with the JSON dataframe back into a class instance. I first tried to load the dictionary and then call pandas.read_json() to create the dataframe, but after some dataframe conversion errors I realized on inspection that something (probably the ServersideOutputTransform class) was already converting the JSON back into a dataframe, even though it was nested inside a dictionary.

So I have to explicitly call dataframe.to_json() when I return, but the JSON is automatically parsed back into a dataframe on the way in, even when nested. It seems to me the nested processing is missing from the ServersideOutput return path but is applied during the callback input processing.

I am happy to answer any more questions or to file an issue if you think this is a bug; at the very least it seems like a weird design pattern, if not a bug.

Could you post a small example demonstrating the issue? It should not be necessary to convert to JSON at all, i.e. it sounds like you are doing something unintended, or that you have encountered a bug.

Sure @Emil:

This is a small, untested example that strips out everything that is not relevant but “should” be functional:

objects/base.py

import copy


class Base(object):

    @classmethod
    def from_dict(cls, attributes):
        return cls(**attributes)


    def __init__(self, **attributes):
        for key, value in attributes.items():
            setattr(self, key, value)


    def validate(self, attributes, *fields):
        missing_fields = []

        for field in fields:
            if field not in attributes or attributes[field] is None:
                missing_fields.append(field)

        if missing_fields:
            raise AttributeError("Attributes {} are required for object {}".format(
                ", ".join(missing_fields),
                self.__class__.__name__
            ))
        return attributes


    def to_dict(self):
        return copy.deepcopy(self.__dict__)

objects/my_class.py

from .base import Base

import pandas


class MyClass(Base):

    def __init__(self, **attributes):
        super().__init__(**self.validate(attributes, 'id'))

        # In my application CSVs are accessed via ids passed from parameters
        # This is just a random example to theoretically make the app functional so no params needed
        self.data_url = "https://data.wa.gov/api/views/f6w7-q2d2/rows.csv?accessType=DOWNLOAD"
        self.data = pandas.read_csv(self.data_url)


    def to_dict(self):
        attributes = super().to_dict()
        attributes['data'] = self.data.to_json() # This is the part I wish I didn't have to do
        return attributes

pages/test.py

from dash import html, dcc, callback, Input, Output
from dash_extensions.enrich import ServersideOutput

from objects.my_class import MyClass

import dash


dash.register_page(__name__, path_template="/test/<instance_id>")


def layout(instance_id = None):
    return html.Div(children = [
            html.H3("This should be replaced!", id = 'app_title'),
            dcc.Input(
                id = 'instance_id',
                value = instance_id,
                style = {
                    'display': 'none'
                }
            ),
            dcc.Store(id = 'instance')
        ]
    )

@callback(
    ServersideOutput('instance', 'data'),
    Input('instance_id', 'value'),
    memoize = True
)
def load_data(instance_id):
    return MyClass(
        id = instance_id
    ).to_dict() # Can't return class instance itself because dataframe parsing fails (not JSON serializable)


@callback(
    Output('app_title', 'children'),
    Input('instance', 'data'),
    prevent_initial_call = True
)
def set_title(instance):
    instance = MyClass.from_dict(instance) # MyClass comes in as a dictionary to be converted to class instance
    return "Instance: {}".format(instance.id)

app.py

from dash_extensions.enrich import DashProxy, ServersideOutputTransform, RedisStore

app = DashProxy(__name__,
    transforms = [ServersideOutputTransform(backend = RedisStore())],
    use_pages = True
)
if __name__ == '__main__':
    app.run(port = 5000, debug = True)

The error is caused by incorrect imports. In general, all imports from dash should be replaced by imports from dash_extensions.enrich. Specifically, it’s the import of callback that causes the problem. Hence if you replace,

from dash import html, dcc, callback, Input, Output
from dash_extensions.enrich import ServersideOutput

by

from dash_extensions.enrich import html, dcc, callback, Input, Output, ServersideOutput

I believe it should work without the need for JSON serialization (and faster too).
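That is, with the corrected imports, your two callbacks in pages/test.py should reduce to something like this (untested, based on your example):

@callback(
    ServersideOutput('instance', 'data'),
    Input('instance_id', 'value'),
    memoize = True
)
def load_data(instance_id):
    return MyClass(id = instance_id)  # the instance itself can be returned, no to_dict() needed


@callback(
    Output('app_title', 'children'),
    Input('instance', 'data'),
    prevent_initial_call = True
)
def set_title(instance):
    return "Instance: {}".format(instance.id)  # arrives as a MyClass instance, no from_dict() needed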

That would be awesome. I’ll give that a try and let you know how it goes.

I tried it but I am getting an error:

Traceback (most recent call last):
  File "/home/adrian/.local/lib/python3.10/site-packages/dash_extensions/enrich.py", line 1230, in decorated_function
    unique_id = _get_cache_id(f, output, list(filtered_args), output.session_check, output.arg_check)
  File "/home/adrian/.local/lib/python3.10/site-packages/dash_extensions/enrich.py", line 1276, in _get_cache_id
    all_args += [_get_session_id()]
  File "/home/adrian/.local/lib/python3.10/site-packages/dash_extensions/enrich.py", line 414, in _get_session_id
    session[session_key] = secrets.token_urlsafe(16)
  File "/usr/local/lib/python3.10/dist-packages/flask/sessions.py", line 98, in _fail
    raise RuntimeError(
RuntimeError: The session is unavailable because no secret key was set.  Set the secret_key on the application to something unique and secret.

Do you know what could be going on there? It seems like dash_extensions is requiring a session id?

By default, dash-extensions creates a session id if missing (which is used to enable session-dependent caching). However, it fails because the secret key has not been set. That’s weird though, as it should be set explicitly here.

You could try setting it manually, i.e. something like

import secrets

app = DashProxy(__name__, transforms=[ServersideOutputTransform()], suppress_callback_exceptions=True)
app.server.secret_key = secrets.token_urlsafe(16)  # add this line

However, if you end up needing to do that, please file a bug as it should be done automatically.

So after switching all the Dash imports, except for a top-level import dash used to register pages, I am getting weird connection errors with Redis (which I am guessing could have been there all along). Does the following stack trace mean anything to you? I see the request passes through the dash_extensions package, but I am not sure if this is something I am inadvertently doing.

redis://data:6379
Exception on /_dash-update-component [POST]
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 624, in connect
    sock = self.retry.call_with_retry(
  File "/usr/local/lib/python3.10/site-packages/redis/retry.py", line 46, in call_with_retry
    return do()
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 625, in <lambda>
    lambda: self._connect(), lambda error: self.disconnect(error)
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 690, in _connect
    raise err
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 678, in _connect
    sock.connect(socket_address)
OSError: [Errno 99] Cannot assign requested address

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 2525, in wsgi_app
    response = self.full_dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1822, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1820, in full_dispatch_request
    rv = self.dispatch_request()
  File "/usr/local/lib/python3.10/site-packages/flask/app.py", line 1796, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)
  File "/usr/local/lib/python3.10/site-packages/dash/dash.py", line 1274, in dispatch
    ctx.run(
  File "/usr/local/lib/python3.10/site-packages/dash/_callback.py", line 440, in add_context
    output_value = func(*func_args, **func_kwargs)  # %% callback invoked %%
  File "/usr/local/lib/python3.10/site-packages/dash_extensions/enrich.py", line 1232, in decorated_function
    if not output.backend.has(unique_id):
  File "/usr/local/lib/python3.10/site-packages/cachelib/redis.py", line 133, in has
    return bool(self._read_client.exists(self.key_prefix + key))
  File "/usr/local/lib/python3.10/site-packages/redis/commands/core.py", line 1697, in exists
    return self.execute_command("EXISTS", *names)
  File "/usr/local/lib/python3.10/site-packages/redis/client.py", line 1255, in execute_command
    conn = self.connection or pool.get_connection(command_name, **options)
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 1427, in get_connection
    connection.connect()
  File "/usr/local/lib/python3.10/site-packages/redis/connection.py", line 630, in connect
    raise ConnectionError(self._error_message(e))
redis.exceptions.ConnectionError: Error 99 connecting to localhost:6379. Cannot assign requested address.

The first line is the printed connection string. I eliminated the password temporarily to test if it could be an authentication issue, which it does not seem to be.

One thing I did notice is that it is trying to connect to localhost instead of the data host that the Flask cache Redis uses. Could it be using a different Redis than the Flask cache Redis?

That looks like a Redis configuration error. I would recommend starting without Redis (i.e. just using the default disk backend), just to make sure everything else works as intended. When it does, debugging the Redis configuration is the next step.
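That is, just drop the backend argument in your app.py, e.g.

app = DashProxy(__name__,
    transforms = [ServersideOutputTransform()],  # defaults to the disk (file system) store
    use_pages = True
)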

Ok, so I found the issue. This might be of interest to you. I did try the file system backend and, sure enough, it worked just fine, exactly as you described and as I hoped. So the issue was Redis: I needed to pass the Redis connection information into the RedisStore constructor, so I guess it does not share the Flask cache configuration. I noticed that RedisStore extends the Flask-Caching RedisCache class, which is what gave me the idea to pass in the same Redis connection information I was already passing to the Flask cache, and that did the trick.
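For reference, this is roughly what the working configuration looks like. The host/port values are placeholders for my environment, and I am assuming the RedisStore constructor forwards its keyword arguments to the underlying RedisCache:

from dash_extensions.enrich import DashProxy, ServersideOutputTransform, RedisStore

# RedisStore extends the Flask-Caching/cachelib RedisCache, so it appears to accept
# the same connection arguments; host/port/password here are placeholders.
backend = RedisStore(host = "data", port = 6379, password = None)

app = DashProxy(__name__,
    transforms = [ServersideOutputTransform(backend = backend)],
    use_pages = True
)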

You might consider adding (since you appear to be a maintainer) the ability to share the RedisCache instance somehow, or the option of passing that instance into the RedisStore constructor. It works as it is, so I am happy, but I think the examples might need an update at the very least, because I believe I got my initial code from an example that constructed RedisStore with no parameters.

Yes, the Redis configuration must be passed separately. This is not a bug, but a design choice. I made it this way to make the configuration more explicit/transparent, to reduce coupling with Flask-Caching, and to enable greater flexibility (e.g. using different Redis instances for Flask caching and callbacks, or even different Redis instances and/or other caching mechanisms for different callbacks). But I agree that more/better documentation on the usage would be good :slight_smile:
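For instance, since the serverside output backend is configured independently of Flask-Caching, the two can point at different Redis instances. A rough sketch (the hostnames are just placeholders):

from flask_caching import Cache
from dash_extensions.enrich import DashProxy, ServersideOutputTransform, RedisStore

# Serverside outputs use one Redis instance...
app = DashProxy(__name__, transforms = [ServersideOutputTransform(backend = RedisStore(host = "redis-callbacks"))])

# ...while Flask-Caching is configured separately and can use another.
cache = Cache(app.server, config = {
    "CACHE_TYPE": "RedisCache",
    "CACHE_REDIS_HOST": "redis-flask"
})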

So, just to sum up, did you get it working without the need for JSON serialization?

That makes sense. I did indeed get it to work after that discovery. Great work!
