Dash 2.0 Prerelease Candidate Available!

We’re excited to share that a Dash 2.0 Prerelease Candidate is now available. Upgrade with:

pip install dash==2.0.0rc2

This is a 99% Backwards Compatible Feature Prerelease.

Dash 2.0 emerged out of the best ideas from dash-labs, our new testbed for prototyping and sharing ideas for the Dash framework. Many thanks to everyone in the community who contributed to the various dash-labs discussions over the last 8 months and to all who joined the Dash 2.0 webinar discussion this afternoon :bow:

Simplified Imports

:books: Documentation Reference: Imports Migration Guide

In Dash 1, every app would have several lines of the same boilerplate:

import dash
from dash.dependencies import Input, Output, State
import dash_html_components as html
import dash_core_components as dcc
import dash_table

In Dash 2, we’ve merged these imports to a single line:

from dash import Dash, Input, Output, State, html, dcc, dash_table, callback

The old imports will continue to work but you will get a warning asking you to move to the new import statement.

Like in the 1.x series, installing dash will install dash-html-components, dash-core-components, and dash-table. However, these packages will become “stub” packages that simply re-export dash.html, dash.dcc, dash.dash_table for backwards compatibility. Their version will be pinned to 2.0.0.
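
For example, you can confirm the stub re-exports the same objects with a quick, hedged check (running the stub import will emit the deprecation warning):

import dash_html_components as html_stub  # the 2.0.0 stub package
from dash import html

assert html_stub.Div is html.Div  # both names point at the same class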

Flexible Callback Signature

:books: View the docs: Flexible Callback Signatures User Guide

In Dash 1, app.callback(...) was based on positional arguments:

  • Your functions could only accept positional arguments
  • The order of your function’s positional arguments needed to match the order of the Input and State declarations
  • Your functions could only return a positional list of outputs
  • The order of your function’s outputs needed to match the order of the Output declarations.

This positional matching became unwieldy in callbacks with many inputs or outputs.

In Dash 2, app.callback accepts a wide variety of input and output argument shapes like dictionaries, grouped tuples, and more. Named dictionary arguments are particularly nice when dealing with a large number of outputs:

@app.callback( 
    output=dict(x=Output(...), y=Output(...)),
    inputs=dict(a=Input(...), b=Input(...)),
    state=dict(c=State(...))
)
def callback(a, b, c):
    return dict(x=a + b, y=b + c)

Dash 1-style positional callback arguments are still supported.
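
Grouped tuples work similarly. Here's a hedged sketch (the component ids are hypothetical): a tuple of Input declarations arrives in the function as a single tuple argument.

@app.callback(
    output=Output("summary", "children"),
    inputs=dict(
        clicks=Input("button", "n_clicks"),
        date_range=(Input("start", "date"), Input("end", "date")),
    ),
)
def update(clicks, date_range):
    start, end = date_range  # the tuple group arrives as one argument
    return f"{clicks or 0} clicks between {start} and {end}"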

@app.long_callback

:books: View the docs: Dash Long Callback User Guide

@app.long_callback is a simple API for executing long-running callbacks at scale in Production. Its API is a superset of @app.callback, allowing you to easily switch from callback to long_callback:

import time

@app.long_callback(
    output=Output("paragraph_id", "children"),
    inputs=Input("button_id", "n_clicks"),
    running=[(Output("button_id", "disabled"), True, False),
             (Output("cancel_button_id", "disabled"), False, True)],
    cancel=[Input("cancel_button_id", "n_clicks")],
)
def callback(n_clicks):
    time.sleep(2.0)
    return f"Clicked {n_clicks} times"

@app.long_callback runs the callback in a separate Job Queue process (or set of processes). If long_callback is executed multiple times, the computations are queued up and executed one by one. Users can view progress bars while their callback is being executed.

In addition to providing a simple interface to scalable Job Queues, long_callback also provides some common Job Queue UI features like the ability to cancel jobs and custom & updatable progress bars.
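
For instance, here's a hedged sketch of an updatable progress bar using the progress= argument (the component ids are hypothetical, and it assumes an html.Progress(id="progress_bar") in the layout); the callback receives a set_progress function as its first argument:

import time

@app.long_callback(
    output=Output("result", "children"),
    inputs=Input("run_button", "n_clicks"),
    progress=[Output("progress_bar", "value"), Output("progress_bar", "max")],
    prevent_initial_call=True,
)
def run_job(set_progress, n_clicks):
    total = 10
    for i in range(total):
        time.sleep(0.5)                         # stand-in for a chunk of real work
        set_progress((str(i + 1), str(total)))  # update the progress bar in the UI
    return f"Done after {total} steps"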

Why Job Queues?

When your app is deployed in Production, a finite number of CPUs will be serving requests for that app. Dash apps can be scaled vertically with the --workers flag in gunicorn (e.g. gunicorn app:server --workers 4) or horizontally across multiple docker containers or nodes. (Dash Enterprise supports both modes of scaling).

Callbacks that take longer than 30s will often run into timeouts when deployed in Production. And even a callback that completes in under 30s can tie up all of the available server resources if multiple people are accessing the app at the same time. When all CPUs are processing the callbacks, new visitors to your app will see a blank screen and eventually a “Server Timed Out” message.

The answer to long-running computations is Job Queues. Like the web processes serving your Dash app, Job Queues run with a dedicated number of CPU workers. These workers crunch through the jobs one at a time and aren’t subject to timeouts. While the Job Queue workers are processing the data, the web processes serving the Dash app and the regular callbacks can display informative loading screens, progress bars, and the results of the Job Queues. End users will never see a timeout and will always see a responsive app.

Job Queue Library

Under the hood, app.long_callback uses Celery, an open source, production-ready, and widely adopted Job Queue library.

How Celery and long_callback Work Under the Hood

When @app.long_callback is invoked:

  1. The web processes running Dash will “submit” the job to Celery
  2. Submitting a job involves serializing the input arguments of the long_callback function (via pickle) and saving them in Redis, along with the timestamp when the job was submitted.
  3. A separate set of Celery workers poll Redis waiting for new tasks. They will take the oldest task, deserialize the input, and run the Python code within long_callback with those input parameters and save the output results back in Redis. Once finished, the worker polls Redis asking for another job.
  4. Meanwhile, the Dash app processes periodically poll Redis waiting for a result via a built-in dcc.Interval component. While it polls, it displays a customizable loading screen and progress bar. Once the result is available, it stops the interval and displays the result :tada:

The “state” of the system is stored in Redis. The Dash app workers and the Job Queue Celery workers are stateless. This allows you to easily scale the Dash app and the Job Queue CPUs independently. Long Live Stateless Architectures!

This is a classic & widely adopted architecture when scaling websites. It’s also complex! Our goal with app.long_callback is to abstract away all of these details and provide a best-in-class integration between Dash and Celery.
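
In practice, the production wiring looks roughly like this (the Redis URLs are assumptions):

from celery import Celery
from dash import Dash
from dash.long_callback import CeleryLongCallbackManager

# Celery submits jobs to the broker DB and writes results to the backend DB
celery_app = Celery(
    __name__, broker="redis://localhost:6379/0", backend="redis://localhost:6379/1"
)
app = Dash(__name__, long_callback_manager=CeleryLongCallbackManager(celery_app))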

Development Environment

In development, we’ve implemented a “poor-man’s Celery” that uses multiprocessing and saves data to disk with the diskcache library. This lets you build your Dash app without running a separate Celery Job Queue process in your terminal or installing and configuring Redis locally. This DiskcacheLongCallbackManager is not suitable for Production deployments!
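
A minimal development setup looks roughly like this (the cache directory is an assumption):

import diskcache
from dash import Dash
from dash.long_callback import DiskcacheLongCallbackManager

# Development only: callbacks run in local processes, results are stored on disk
cache = diskcache.Cache("./cache")
app = Dash(__name__, long_callback_manager=DiskcacheLongCallbackManager(cache))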

Note that if you are using Dash Enterprise, Redis is available in Workspaces, the built-in web IDE. This allows you to develop in a system that more closely resembles Production.

Built-in Callback Caching

:books: View the docs: See “Caching Results” at the bottom of the Dash Long Callback User Guide

In Dash 1.0, we recommended caching data using 3rd party libraries like flask-caching. This is still supported.

In Dash 2.0, we’ve added built-in caching support to long_callback. Caching improves performance by skipping computations if they have already been performed before with the same input arguments. The results of the computations are stored in a shared Redis database available to all workers and the “cache keys” are formed from the callback’s input parameters.
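
Roughly, caching is switched on through the manager constructor. Continuing the Celery setup sketched earlier (the expire value here is an assumption):

from uuid import uuid4
from dash.long_callback import CeleryLongCallbackManager

# Mixing a per-launch uuid into the cache key invalidates the cache on each
# fresh launch; cached values expire one hour after their last use
launch_uid = uuid4()
long_callback_manager = CeleryLongCallbackManager(
    celery_app, cache_by=[lambda: launch_uid], expire=3600
)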

from dash import callback, clientside_callback

In Dash 1.0, callback and clientside_callback were bound to app:

@app.callback(...)
def update(...):
    # ...

app.clientside_callback(
    # ...
)

This is still supported in 2.0.

In Dash 2.0, callback and clientside_callback are now also available from the dash module and can be defined without app:

from dash import callback, clientside_callback

@callback(...)
def update(...):
    # ...

clientside_callback(
    # ...
)

Or equivalently:

import dash

@dash.callback(...)
def update(...):
    # ...

dash.clientside_callback(
    # ...
)

This is particularly useful in two cases:

  1. Organizing your Dash app into multiple files, like when creating multi-page apps. In Dash 1.0, it was easy to run into a circular import error, so we recommended creating a new entry point to your app called index.py while keeping your app inside app.py. In 2.0, you no longer need to reorganize your code this way (see the sketch after this list).
  2. Packaging components & callbacks for others to use. See All-in-One Components below :slightly_smiling_face:
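
For example, here's a hedged sketch of a page module that never imports the app object (the file name and component ids are hypothetical):

# pages/overview.py
from dash import callback, html, dcc, Input, Output

layout = html.Div([dcc.Input(id="name"), html.Div(id="greeting")])

@callback(Output("greeting", "children"), Input("name", "value"))
def greet(name):
    # registered via dash.callback — no reference to the app object needed
    return f"Hello, {name or 'world'}!"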

There are three limitations with dash.callback:

  1. The global-level prevent_initial_callbacks setting (app = dash.Dash(__name__, prevent_initial_callbacks=True)) is not supported and effectively defaults to False. prevent_initial_call is still configurable on a per-callback level.
  2. @dash.callback will not work if your project has multiple app declarations. Some members of the community have used this pattern to create multi-page apps instead of the official dcc.Location multi-page app solution. The multi-app pattern was never officially documented or supported by our team.
  3. dash.long_callback is not yet supported.

All-in-One Components

:books: View the docs: All-in-One Components User Guide

All-in-One Components is a convention for encapsulating layout and callbacks into a reusable structure. This pattern uses standard Dash components with Pattern-Matching Callbacks and Dash 2.0’s dash.callback interface.

All-in-One Components can be useful when:

  • You have an interactive component and callback that you use multiple times across your app or project.
  • Your component’s interactivity is best performed in Python on the server rather than written in JavaScript and encapsulated in React. For example:
    • Python-specific Libraries - Your component may use Python-specific libraries (like pandas or scipy)
    • Large Data - Your component may filter or process large datasets and it would be inefficient, slow, or infeasible to send the data to the browser client.
    • Datastores - Your component may connect directly to databases or external datastores.
  • Your component is useful beyond your application and you want to contribute back to the Dash Community :tada:

The documentation provides:

  • :book: An outline of the All-in-One component convention
  • :wrench: An example of a rich DataFrame All-in-One component that performs filtering, paging, and sorting in the backend.
  • :floppy_disk: A redis_store implementation that hashes and saves Pandas dataframes to Redis so that they can be shared across callbacks. This is useful for stateless All-in-One components that deal with too much data to send to the client via dcc.Store. We expect to make this a first-class member of the Dash API.
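
To give a flavor of the convention, here is a condensed, hedged sketch of an All-in-One component modeled on the documented pattern (the ids and props are illustrative):

import uuid
from dash import html, dcc, callback, Input, Output, MATCH

class MarkdownWithColorAIO(html.Div):
    # Pattern-matching ids, namespaced by component, subcomponent, and instance
    class ids:
        dropdown = lambda aio_id: dict(
            component='MarkdownWithColorAIO', subcomponent='dropdown', aio_id=aio_id)
        markdown = lambda aio_id: dict(
            component='MarkdownWithColorAIO', subcomponent='markdown', aio_id=aio_id)

    def __init__(self, text, colors=None, aio_id=None):
        aio_id = aio_id or str(uuid.uuid4())
        colors = colors or ['red', 'green', 'blue']
        super().__init__([
            dcc.Dropdown(
                options=[{'label': c, 'value': c} for c in colors],
                value=colors[0], id=self.ids.dropdown(aio_id)),
            dcc.Markdown(text, id=self.ids.markdown(aio_id)),
        ])

    # dash.callback lets the component register its interactivity at import time
    @callback(Output(ids.markdown(MATCH), 'style'),
              Input(ids.dropdown(MATCH), 'value'))
    def update_color(color):
        return {'color': color}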

During our draft proposal phase, community member @RenaudLN shared an awesome All-in-One pagination component for their app.

I believe that there is a lot of opportunity for the community to build and share Python-enhanced component libraries. Imagine DataTables that automatically connect to databases and S3, image processing components that run scikit-image routines, or chart builders that take a dataframe and return a rich set of controls. Looking forward to seeing what everyone builds!

Faster Serialization

Dash will now use the orjson library if it is installed in the environment to serialize data to JSON before sending over the network. This can result in 50ms to 750ms improvements in callback performance.

orjson is not a dependency of Dash as some members of the community have had difficulty installing it with older versions of pip. So, for now it is opt-in: if it is installed then it will be used.

pip install orjson

Breaking Changes

There is only a single breaking change in Dash 2.0.0: @app.callback no longer accepts the state= keyword argument unless input= is also passed as a keyword argument. Passing State positionally (without state=) is still supported.

For example, this is no longer supported:

@app.callback(
    Output('graph', 'figure'),
    Input('dropdown', 'value'),
    state=State('store', 'data'))

But this is still supported:

@app.callback(
    Output('graph', 'figure'),
    Input('dropdown', 'value'),
    State('store', 'data'))

and so is this:

@app.callback(
    Output('graph', 'figure'),
    input=Input('dropdown', 'value'),
    state=State('store', 'data'))

Templates and Other dash-labs Ideas

dash-labs contains a prototype for templates & plugins and we’ve discussed these ideas at length in the community forum.

Ultimately, we found that our prototype of templates had two shortcomings:

  1. Users were frustrated that they couldn’t introspect the templates, and they found the experience of extending or customizing a template too opaque.
  2. Templates and plugins “generated” a layout from a callback declaration. Unfortunately, this meant they could not be declared within another callback, since all callbacks need to be declared before the app starts. As a result, templates couldn’t be used in apps with dynamic content, including multi-tab or multi-page applications. Most apps start out as single-page apps, and we didn’t want users to get far with a template only to run into this limitation when they made their app multi-page.

We really appreciate the feedback on these ideas. We believe that there is still something there and that, with your help, we’ll find a sufficiently flexible solution through the All-in-One Components pattern. We encourage you to start with All-in-One components and explore the space!

dash-labs itself is published with the permissive MIT license on PyPI. If you loved templates, you are welcome to use the code as-is or fork it as you see fit :fist: We won’t be publishing fixes there, but we will publish new ideas there in the future.

Dash OSS 2.0.0 and Dash Enterprise

There were many questions in today’s webinar about whether these features are open source. The short answer is Yes! All of the features in Dash 2.0.0 installable with pip are open source, available for free, and permissively licensed under the MIT License.

That being said, Dash 2.0.0 relies on two sets of 3rd party, open source software:

  1. Celery for Production use of long_callback
  2. A Redis database as the shared, in-memory datastore for Production use of Celery, long_callback, caching, and sharing data between callbacks. Redis is not installed with dash and must be installed and configured separately.

Deploying an app with long_callback requires two processes to be run independently:

  1. The Dash app process (e.g. $ gunicorn app:server --workers 4)
  2. The Celery Job Queue process (e.g. celery -A app:celery_app worker).

as well as a Redis database available to both of these processes.

Dash Enterprise 4.x and the forthcoming 5.0 support deployments of Dash Apps along with Celery processes, 1-click Redis databases, and a configurable number of CPUs & memory for each process. We may be biased, but we believe it’s the best platform for building, deploying, and scaling Dash Apps within an organization. These are turn-key features, no Dev-Ops required.

Feedback

Many thanks to everyone involved in this release :heart:

So, what features are you excited about? :slight_smile: Please give the Prerelease a try and let us know how it goes!

28 Likes

Very excited to try this out! Particularly the from dash import callback pattern, which will reduce the mental gymnastics I need for complex apps. Is support for from dash import long_callback on the roadmap?

1 Like

Mental gymnastics is a great way to put it :joy:

Yes, we expect long_callback to join the dash import family soon.

Amazing work as always! Please note, with respect to the AIO components doc, I had to change

app.layout = MarkdownWithColorAIO('## Hello World')

for

app.layout = html.Div(MarkdownWithColorAIO('## Hello World'))

1 Like

Quick question regarding the DataTableAIO component: how would you call, from the main page app, a function like page_df() of a specific, already created DataTableAIO? I was thinking of something like:

dtAIO_objt = DataTableAIO(df)
app.layout = html.Div(dtAIO_objt)

@callback()…
dtAIO_objt.page_df(page_current=1, page_size=10)

Dropped HTML components

Just a heads up that a few of the obsolete html components were dropped in Dash 2.0.0:

html.Command
html.Element
html.Isindex
html.Listing
html.Multicol
html.Nextid

The obsolete and deprecated components are noted in the documentation and can be dropped at any time. Note that if you use them, there are no warnings in the console – so it’s important to review the docs periodically.

For an example see: html.Element | Dash for Python Documentation | Plotly


1 Like

Sorry to bother you guys with another not so simple question :see_no_evil:. What if I want to return a dynamically generated component to, for example, a Div from an All-in-One component’s callback? Returning something like html.Div('Hello', id=ids.someid(aio)) would obviously not work. To clarify, all would be done inside the same AIO component’s class.

Hello,

I have a question regarding @long_callback.

Can I control how long the results of the callback are stored in the DB, and can I gain direct access to the results?

My use-case is the following: I have a long process that I want to execute asynchronously with the @long_callback. After the process finishes, I want to store all the result data in Redis and send only a summary table to the client.

The user can then select a row in the summary table to see details of that particular record. Selecting the row triggers a callback that fetches detailed results from Redis and sends them to the client in the form of another table.

In order to fetch the data from Redis, I need the ID (key?) of the respective record. How do I get it?

The intention is for these methods to be stateless class methods so that you don’t need to instantiate an object: DataTableAIO.page_df()

@jorge243 - Sorry I don’t quite follow, could you write a simple example demonstrating what you’d like to do? It’s OK if the example doesn’t work

Hi @sislvacl,

Can I control how long the results of the callback are stored in the DB… ?

Yes, you can use the expire argument to the CeleryLongCallbackManager constructor to control how long results are stored in the database. This represents the number of seconds that the cached value will be kept after its last use (when the value is accessed, the timer restarts).

…and can I gain direct access to the results?

By itself, @app.long_callback only caches the return value of the function in the database. So in your case, this would only be the summary table. If you want to store more raw data you can still take advantage of @long_callback to run the job asynchronously, but you would need to store the intermediary value to the database yourself (and generate your own key).

Here’s a toy example of the idea, using the Celery Redis long callback manager.

  • The long_calculation generates a string where the character and the length are determined based on the n_clicks value of the “Run Job!” button.
  • This string is stored in Redis using the set method of the celery_app.backend object.
  • The key used to store this value is generated by hashing the inputs to the long_calculation function (n_clicks in this case).
  • This key is stored in a Store component.
  • long_callback returns the summary value, which in this case is the length of the string.
  • A regular @app.callback is used to display the raw result when the “Show Details” button is clicked.

import string
import time
import hashlib
from dash import Dash, html, dcc, Input, State, Output
from dash.long_callback import CeleryLongCallbackManager
from celery import Celery

celery_app = Celery(
    __name__, broker="redis://localhost:6379/0", backend="redis://localhost:6379/1"
)
long_callback_manager = CeleryLongCallbackManager(celery_app)
cache = celery_app.backend

app = Dash(__name__)
app.layout = html.Div(
    [
        html.Button(id="job_button_id", children="Run Job!"),
        html.Div([html.P(id="summary_id", children=["Button not clicked"])]),
        dcc.Store(id="calculation_store", data={}),
        html.Hr(),
        html.Button(id="details_button_id", children="Show Details"),
        html.Div([html.P(id="details_id", children=["Button not clicked"])]),
    ]
)

@app.long_callback(
    manager=long_callback_manager,
    output=[Output("summary_id", "children"), Output("calculation_store", "data")],
    inputs=Input("job_button_id", "n_clicks"),
    running=[
        (Output("job_button_id", "disabled"), True, False),
        (Output("details_button_id", "disabled"), True, False),
    ],
    prevent_initial_call=True,
)
def update(n_clicks):
    n_clicks = n_clicks or 0
    raw_result = long_calculation(n_clicks)
    summary = generate_summary(raw_result)
    result_key = hashlib.sha1(bytes(n_clicks)).hexdigest()
    cache.set(result_key, raw_result.encode("utf-8"))
    return [f"Summary: {summary}", dict(result_key=result_key)]


@app.callback(
    Output("details_id", "children"),
    Input("details_button_id", "n_clicks"),
    State("calculation_store", "data")
)
def update(n_clicks, calc_data):
    result_key = calc_data.get("result_key", None)
    if result_key is not None:
        raw_result = cache.get(result_key)
        if raw_result is not None:
            return raw_result.decode('utf-8')

    return "No result available"


def long_calculation(n_clicks):
    time.sleep(2.0)
    return string.ascii_letters[n_clicks % len(string.ascii_letters)] * n_clicks


def generate_summary(raw_result):
    return len(raw_result)


if __name__ == "__main__":
    app.run_server(debug=True)


Hope that helps!
-Jon

3 Likes

very cool stuff. Looking forward to trying them out

1 Like

Dash 2 appears to be released / the default version on PyPI. Is the documentation going to be updated soon to reflect Dash 2.0 syntax, simplified imports, etc.?

7 Likes

Question about long_callback and async multiprocessing:

If I wanted to run a function to parse a lot of data from one source in the background (like a function pulling RSS feeds and then running classifications), and then store it to Redis, but I want to do it every hour to fill Redis, while using dcc.Interval to check Redis… not a button push by the user, how would I do that? Is it possible?

What I see so far with long_callback is for each user to be doing this process through an upload or button press… I want it done in the background globally for the Dash app, not every time a person interacts. The person will visit the site and simply see an updating feed on an interval.

The function(s) I have performing this involve asynchronous multiprocessing (pool.imap that yields a generator of dictionaries I hope to serve to Redis every hour for retrieval by dcc.Interval).

I know I can make this work in Heroku as they have a special way you set up background processes that interact with Redis… My questions are: Can (and should) something like this be done using long_callback instead? (it would be easier, I think…) If it can be done, how? And if I can’t/shouldn’t do it this way, why?

Sorry if this is too specific; I tried to keep it general to help future searchers. Thank you!!!

Hey @z.adamg ,

I think you actually don’t need long_callback for this. But you still need an async process, for which you can use Celery and Redis (long_callback uses the same technologies).

What you will need is to set up a periodic async process to read and analyse the RSS feeds and store the results in Redis (or another database). This process will be completely independent of your dash app. It will not require any trigger; it will just run in the background with a certain period.

Then, you need to use the dcc.Interval component in your dash layout. This component will periodically load the data from Redis. You can have the dcc.Interval component always active (just set disabled=False).
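
Here's a rough sketch of that setup (the module names, schedule, and Redis key are my assumptions):

# tasks.py — runs independently of Dash, e.g. celery -A tasks worker --beat
import json
from celery import Celery

celery_app = Celery(
    __name__, broker="redis://localhost:6379/0", backend="redis://localhost:6379/1"
)
celery_app.conf.beat_schedule = {
    "refresh-feeds-hourly": {"task": "tasks.refresh_feeds", "schedule": 3600.0},
}

@celery_app.task(name="tasks.refresh_feeds")
def refresh_feeds():
    # placeholder for your real RSS fetch + classification work
    results = [{"title": "example headline", "label": "news"}]
    # store the latest results under a well-known key
    celery_app.backend.set("latest_feed", json.dumps(results).encode("utf-8"))

On the Dash side, a regular @app.callback triggered by dcc.Interval reads the data back (e.g. celery_app.backend.get("latest_feed")) and renders it — no long_callback involved.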