Dash Club Dispatch #6: WebAssembly, Summer App Challenge & Show & Tell

We’re back! After a long hiatus, @adamschroeder and I are excited to bring you Dash Club Dispatch #6. Adam joined us in the fall as Plotly’s Community Manager, and you may know him as Charming Data from his Dash YouTube channel.

We also have this Dash Club newsletter on Medium, in case you prefer to read or share it there.

Dash Club brings essays and updates to your inbox every 8 weeks. Sign up to stay in touch! :blue_heart:

In Dispatch #6

  • Version Check
  • WebAssembly in Dash Essay
  • App of the month
  • Component of the month
  • Coming soon in Dash
  • June Community App Challenge
  • Things happen

Version Check

WebAssembly in Dash

There’s been a lot of buzz around WebAssembly (WASM) in the last few years. You can think of WebAssembly as a new language that runs in the browser, an alternative to JavaScript. Folks are excited about it because other languages can compile into WASM executables, which opens up the possibility of running various programming languages in the browser client without making network requests to a server. This includes Python! ICYMI, websites can now serve an executable that will run Python (see PyScript), Jupyter notebooks (see Pyodide), and Dash apps (see below!) natively within the browser client without making any network requests to the server.

Yes, WebAssembly Dash apps! Almost two years ago, community member Itay Dafna published WebDash, a WASM-powered interface to Dash, as part of a research project at Bloomberg. In WebDash, the Python callbacks and variables are compiled to WASM and executed in the app visitor’s browser client without making any additional network requests (Dash callbacks are typically executed on the server which requires sending data from the browser over the network and running on one of the available server CPUs).

Ditch the server? WebDash converts the entire Dash app to WebAssembly so that you don’t even need to run a Python web server to deploy the app. A simple static file hosting service could serve the WASM bundle and this bundle would contain all of the Python variables loaded in memory, the callback functions to process that data, and the library code. Serving static files is cheap! The software industry has spent decades reducing the cost of serving files and now many services, like GitHub Pages, offer hosting & serving these files for free with an account. Of course there is still a server somewhere, but it’s abstracted away from the developer and cheap enough for GitHub to run that they’ve made it free for users, under the expectation that the cost of hosting & bandwidth is outweighed by the benefit of growth and retention enabled by the feature. On the other hand, when hosting server-side code, hardware is dedicated to run the user’s code, load the data into memory, and always be available to process a request. The price of hosting server code is so much higher than serving static files that you’ll rarely see unlimited free hosting without restrictions (e.g. AWS’s 750 free hours or Heroku’s sleeping apps after 30 min of inactivity). The availability of free static file hosting services, and the fact that everyone can run code in their browser without downloading anything, helped make JavaScript the world’s most popular programming language. With WebAssembly, there is promise that Python will become even more accessible and widely adopted.

Are statically hosted, self-contained executable data apps the future? Not everywhere. Static websites have their own set of inherent limitations that can become especially apparent for data-intensive web apps.

The network cost tradeoff. In a self-contained WASM-ified Dash app all callback requests are made within the browser without a server. The entire dataset, Python code, and runtime need to be transported over the network in the initial request that loads the executable. Consider a Dash app where a 1GB dataset is loaded into memory when the app starts and all subsequent callbacks filter, aggregate, or run derived computations on that dataset. In “regular” Dash, the original dataset is held in Python memory on the server or memory mapped to disk and only the derived computations are sent to the client. These derived computations can be small; if the app is displaying a sum of a column it’s only sending a single number (8 bytes!) over the network to the client. In the WebDash model, the entire 1GB dataset is sent to the client along with the WASM-ified callback code on page load. Summing that column is done in the client instead of the server (no network request!) but you’re trading an 8-byte network request at runtime for a 1GB network request on page load. That’s going to be a heckuva lot slower. And we’re only talking about data so far, not even the cost of shipping the WASM-ified Python callback code, runtime, and dependencies.
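To put rough numbers on that tradeoff, here is a back-of-the-envelope sketch. The 50 Mbit/s connection speed is an assumed figure, 1GB is taken as 10^9 bytes, and latency and compression are ignored, so treat the output as an order-of-magnitude estimate rather than a benchmark.

```python
# Sizes from the example above; the bandwidth below is an assumption.
DATASET_BYTES = 1_000_000_000  # the 1GB dataset, taken as 10**9 bytes
RESULT_BYTES = 8               # a single 8-byte number (the column sum)


def transfer_seconds(num_bytes: int, bandwidth_mbit_per_s: float) -> float:
    """Time to move num_bytes over a link, ignoring latency and compression."""
    bits = num_bytes * 8
    return bits / (bandwidth_mbit_per_s * 1_000_000)


bandwidth = 50.0  # assumed visitor connection speed in Mbit/s

page_load_cost = transfer_seconds(DATASET_BYTES, bandwidth)
per_callback_cost = transfer_seconds(RESULT_BYTES, bandwidth)

print(f"Shipping the dataset on page load: {page_load_cost:.0f} s")
print(f"Shipping one summed value: {per_callback_cost * 1e6:.2f} microseconds")
```

On this assumed connection the upfront download takes minutes, while the server-side payload is effectively free; in practice the round-trip latency of the request, not the 8 bytes, would dominate the server-side cost.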

Another challenge is fetching the data to begin with. Almost no databases allow connections from a browser; there is (almost) always server code involved in the middle that handles database authentication and authorization (not all viewers of the app should be able to view all of the data in the database!). The exchange between the browser and the database is typically done by custom server code packaged as an HTTP API. You could do the same with WebDash by writing an external HTTP API that the WASM-ified Dash app makes requests to, but then we’re back where we started with a Python server and we might as well just let the Dash framework abstract away all of this complexity for us.

Refreshing the data presents another challenge. With traditional Dash, your callbacks can fetch data from the original or cached data source on-the-fly in callbacks. And if fetching that data is too slow to do on-the-fly then you can deploy a background job queue to periodically fetch the data and cache the data nearby in Redis. When everything is self-contained in a WASM executable you can’t employ the same techniques.
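As a rough illustration of the server-side caching idea, here is a minimal sketch of a time-based cache. It is an in-memory stand-in for something like Redis, and it refreshes on demand when an entry expires rather than from a background job queue. The names `TTLCache`, `get_or_fetch`, and `slow_query` are hypothetical.

```python
import time
from typing import Any, Callable, Dict, Tuple


class TTLCache:
    """In-memory stand-in for a cache such as Redis; entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock: Callable[[], float] = time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock
        self._store: Dict[str, Tuple[float, Any]] = {}

    def get_or_fetch(self, key: str, fetch: Callable[[], Any]) -> Any:
        now = self._clock()
        entry = self._store.get(key)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]  # fresh enough: skip the slow fetch
        value = fetch()      # missing or stale: go back to the data source
        self._store[key] = (now, value)
        return value


calls = []


def slow_query():
    """Hypothetical stand-in for an expensive database query."""
    calls.append(1)
    return {"rows": len(calls)}


fake_now = [0.0]  # controllable clock so the example does not need to wait
cache = TTLCache(ttl_seconds=60, clock=lambda: fake_now[0])

first = cache.get_or_fetch("sales", slow_query)   # hits the data source
second = cache.get_or_fetch("sales", slow_query)  # served from the cache
fake_now[0] = 61.0
third = cache.get_or_fetch("sales", slow_query)   # expired, fetched again
```

A server-side callback can call something like `get_or_fetch` on every request and always see reasonably fresh data. A self-contained WASM bundle has no equivalent, because its data was fixed when the bundle was built.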

So there’s a sweet spot for the self-contained WASM app. First, the data must be relatively small and it can’t load from databases on-the-fly. Second, the time saved by running callback computations in the browser must be greater than the time lost by sending the dataset to the browser on page load.
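The second condition can be turned into a quick break-even estimate. This is a sketch under simplifying assumptions: every callback is assumed to save the same amount of time, and the function name and the timings in the example are made up for illustration.

```python
import math
from typing import Optional


def break_even_callbacks(
    upfront_seconds: float,
    server_callback_seconds: float,
    browser_callback_seconds: float,
) -> Optional[int]:
    """Smallest number of callbacks after which the time saved in the browser
    exceeds the extra time spent downloading the dataset on page load.

    Returns None when in-browser callbacks are not faster than server ones.
    """
    saved_per_callback = server_callback_seconds - browser_callback_seconds
    if saved_per_callback <= 0:
        return None
    # Need n * saved_per_callback > upfront_seconds, with n a whole number.
    return math.floor(upfront_seconds / saved_per_callback) + 1


# Assumed timings: 4 s of extra download, 0.5 s per server round trip,
# 0.25 s per in-browser callback.
print(break_even_callbacks(4.0, 0.5, 0.25))  # 17
```

If a typical visitor fires fewer callbacks than that number, the server-side version feels faster overall, even though each individual interaction is slower.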

There’s a middle ground, though! Imagine a world where some but not all callbacks are compiled to WASM; think clientside callbacks (dash.clientside_callback) written in Python instead of JavaScript. A hypothetical @dash.wasm_callback! So long as the callback isn’t pulling too much data from its broader scope then it’d be a pretty straightforward performance boost. Heck, we could even turn on WASM callbacks by default if the compiled callback and data were smaller than a certain threshold. That’d be sweet!

Another opportunity for WASM is to reduce barriers in the development experience. In the JavaScript world, many documentation sites are interactive and editable. Our Dash documentation at dash.plotly.com is interactive (it’s famously written in Dash itself) but the code examples themselves aren’t editable in the browser. I can imagine a world where we have an interactive console to edit the code in the browser just like these Ramda.js docs. Since the code editor would execute the code in the visitor’s browser, we wouldn’t have to pay for the server costs of spinning up an interactive IDE every time a visitor of the docs wants to mess around with the examples.

Will this make its way into the general code editing experience? Definitely in the global education market. The cost of serving a self-contained IDE in a WASM executable is so much lower than spinning up servers on-the-fly and, crucially, students don’t need to download anything! They only need a browser. But it’s less useful for app development in a company setting because you need that server in-the-loop to connect to databases, aggregate large datasets, and store your code. In Dash Enterprise Workspaces, we provide an in-browser IDE where the Python code is written in the browser but is executed on the server hosting the platform. You get the benefit of writing code in your browser (don’t need to download Python!) without the limitations of WASM (you can connect to databases!). Plus, the environment that’s executing the code in development is almost identical to the environment running the deployed applications; you have the same network access to data sources, same container with the same version of Linux, same Python versions, same packages, same disk, same Python package server, same security proxy, etc. dev = prod.

Will WASM make its way into the UI layer? It’s hard to say. WASM doesn’t interact directly with the DOM (the DOM is the actual HTML & CSS that you see rendered when viewing a web page), so for now JavaScript will always be in the loop (WASM code will have to serialize the UI as data and “send” it to JavaScript which will read it and render it as HTML). Some high performance web UIs do away with the DOM and render everything in canvas or WebGL but it’s neither easy nor inherently accessible (see Google Docs’s canvas-based renderer announcement or Plotly’s WebGL graphs). Some say WASM-powered DOM-less UIs will revive an era reminiscent of Applets or Flash but better. But even if WASM could write to the DOM, most of us will still need a higher level UI framework to be productive. It’s rare to build a UI without a framework these days; you’re almost always using something like React, Vue, Angular, Svelte, D3, etc. Besides reducing effort, frameworks enable community, which is why we built Dash upon React. And building sophisticated UI is always going to be complex, regardless of the language. Our graphing library plotly.js is 200 releases in the making; writing it in Python compiled to WASM wouldn’t have accelerated this timeline.

There may be an opportunity for Dash components to expose WASM Python hooks that are executed in the browser. Think of the filtering logic in the Dash DataTable: the native logic provides built-in filtering and sorting executed in JavaScript but the filter_query property enables you to write your own Python callbacks for custom filtering code. In the future, perhaps you could embed your WASM-compiled Python code for custom filtering in the component itself rather than attaching separate pattern-matching callbacks. Keeping UI code within the component rendering code is a pattern that React popularized and could simplify Dash’s AIO pattern one day, too.
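For a feel of what that custom filtering code has to do, here is a simplified sketch that applies a filter_query string to a list of rows. It handles only a small subset of the syntax (clauses joined by " && " and a handful of operators), and the helper names and the sample rows are made up for illustration; it is not the DataTable's own parser.

```python
import operator
import re

# Simplified subset of the operators a filter_query string can contain.
OPERATORS = {
    ">=": operator.ge,
    "<=": operator.le,
    "!=": operator.ne,
    ">": operator.gt,
    "<": operator.lt,
    "=": operator.eq,
    "contains": lambda cell, needle: str(needle) in str(cell),
}

# Matches clauses such as "{pop} > 1000000" or "{country} contains Al".
CLAUSE = re.compile(
    r"^\{(?P<column>[^}]+)\}\s*(?P<op>>=|<=|!=|>|<|=|contains)\s*(?P<value>.+)$"
)


def parse_value(text):
    """Treat the value as a number when possible, otherwise as a string."""
    text = text.strip().strip("\"'")
    try:
        return float(text)
    except ValueError:
        return text


def filter_rows(rows, filter_query):
    """Keep the rows that satisfy every ' && '-joined clause in filter_query."""
    for clause in filter(None, (c.strip() for c in filter_query.split(" && "))):
        match = CLAUSE.match(clause)
        if match is None:
            raise ValueError(f"Unsupported clause: {clause!r}")
        compare = OPERATORS[match["op"]]
        column, value = match["column"], parse_value(match["value"])
        rows = [row for row in rows if compare(row[column], value)]
    return rows


# Illustrative rows; the figures are rounded placeholders, not real statistics.
rows = [
    {"country": "Albania", "pop": 2_800_000},
    {"country": "Algeria", "pop": 44_000_000},
    {"country": "Belgium", "pop": 11_600_000},
]
matches = filter_rows(rows, "{country} contains Al && {pop} > 10000000")
print([row["country"] for row in matches])  # ['Algeria']
```

Today, logic like this lives in a server-side callback that takes the table's filter_query as an input and returns the filtered rows. The idea floated above is that the same Python could one day be compiled to WASM and run inside the component in the browser.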

Exciting stuff!
When we started Plotly almost 10 years ago we had a vision that data science would move from desktop platforms to the browser and it’s really exciting to see the community’s effort in pushing this boundary.

If any of this sounds interesting to you and you’d like to contribute to this effort, let us know by responding to this email. We have a Dash Ambassadors program with availability to mentor community contributions to Dash. In fact, Dash 2.5’s latest feature, pages, was built by our community member @AnnMarieW under this program.

App of the month

@jhupiterz's Dash app enables users to search for a research topic, just like you would do on your favorite academic search engine. We love the interactive network graph for citations. Gorgeous design, too!

See more Dash apps or share your own in the Community Forum’s Show and Tell tag. If you would like your app to be considered for the August edition of the Dash Club newsletter, please message Adam on the Forum.

Component of the month

We were delighted to see @muntakim publish Dash Cute-Charts. This component brings xkcd-style charts to Dash. See Demo, PyPI, GitHub.

Visit our community components index to see more components! Or share your own by creating a new topic with the Community Components tag.

Coming soon in Dash

We’re busy removing various limitations of app.long_callback by re-engineering the underlying polling mechanism. Long callbacks provide an API for long-running computations that would otherwise exceed most web servers’ 30-second timeout. Dash’s long callback API abstracts away all of the complexities in making long computations in data apps work at scale into a simple decorator so that you don’t need to worry about implementing polling, progress bars, background job queues, intermediate data caching, and more.
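For readers curious about what that decorator hides, here is a minimal sketch of the general pattern: start the slow job in the background, then check on it with short, repeated polls. This is not how Dash implements long callbacks. A thread stands in for a real job queue, a loop stands in for the browser's poll requests, and all names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def long_computation(progress):
    """Stand-in for slow work; reports progress by updating a shared dict."""
    total = 5
    progress["total"] = total
    for step in range(1, total + 1):
        time.sleep(0.01)  # pretend each step is expensive
        progress["done"] = step
    return "result ready"


def run_with_polling(job, poll_interval=0.005):
    """Run job in the background and poll until it finishes.

    Each loop iteration plays the role of one short request that returns
    quickly, so no single request comes near a web server's timeout.
    """
    progress = {"done": 0, "total": None}
    snapshots = []  # what a progress bar would have displayed over time
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(job, progress)
        while not future.done():
            snapshots.append(progress["done"])
            time.sleep(poll_interval)
        return future.result(), snapshots


result, snapshots = run_with_polling(long_computation)
print(result, snapshots)
```

The long computation never blocks a request for more than one poll interval, which is the property that lets it outlive a 30-second timeout.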

June Community App Challenge

We challenge the community to build a Dash app that will support people working in animal shelters in understanding their data better and allow for higher rates of animal adoption.

This data set was provided to Kaggle by the Austin Animal Center, the largest no-kill animal shelter in the United States. The data set contains statistics and outcomes of cats entering the Austin Animal Services system.

To enter the Community Challenge, please download the dataset, create an app, and share it by replying to this Forum topic. The goal is to raise awareness around the data and predict methods to increase adoption rates. The winners will be highlighted in the next Dash Club Dispatch and will receive a package of this season’s Plotly swag.

Feedback

We’re working with UserEvidence, a survey research firm, to collect feedback from Dash users. If you have a few minutes, we would greatly appreciate you filling out the survey.

Things happen

:football: Kansas City Chiefs share how they use Dash Enterprise
:100: Plotly makes it on the DBTA 100
:gift: New community forum emojis
:briefcase: Dozens of Dash job postings (:wave: we’re hiring too!)
:keyboard: TypeScript comes to Dash
:pie: Community member, Tolga, publishes Raspberry Pi x MQTT tutorial
:es: Staff member @Celia presents at PyData Madrid
:atom_symbol: @Alex Johnson presents The Physics of Data at Boston Data Summit
:chart_with_upwards_trend: Adam and @Kathryn Hurchla present Dash community development at PyCon US
:date: Bruno Rodrigues shares a calendar heatmaps recipe
:globe_with_meridians: IQT publishes a general purpose network graph visualizer called Atlas
:writing_hand: Kathy gives Dash a shout-out on Anaconda’s blog
:computer: ECOSTORE builds an awesome Dash app for battery storage in Germany
:construction_worker_man: @Muntakim is working on Dash-ThreeJS for interactive 3D visualizations
:stars: @Leonardo contributes a double-side sidebar template to the community

All the best,

@chriddyp & @adamschroeder


Thank you for documenting and sharing your position on the WASM matter. To me, this post conveys that ~“Dash aims to serve the enterprise user with data-intensive applications.” If that is the case, then I’d argue that there are better ways of doing so, and that there are easier/ better aims in the first place.

Dashboards shouldn’t be data intensive
Sure, data-intensive computation should take place on a server. However, dashboards (any interface really) should fundamentally be serving lightweight data (summary data/ results/ 100-row previews) to end users while offloading heavy work to remote systems wherever possible via callbacks. That 1GB example you mention should be pointing to an S3 bucket via a pre-signed URL, not passing through a Dash app. Which brings me to my next point.

Plotly doesn’t have a competitive advantage in data capabilities
Even if data-intensive analysis is the correct strategy, where are Dash Enterprise’s cloud service/ datastore connectors and batch processing capabilities? If you Google “dash enterprise aws batch” or “dash plotly aws lambda” - nothing comes up. The celery-based (very webapp-centric way of thinking) k8s jobs implementation just reiterates the VPC/ on-prem enterprise theme.

Embrace your competitive advantage
Plotly has a unique core competence in that it deeply understands front-end development and is able to deliver intuitive APIs that enable developers to build things they wouldn’t otherwise be capable of. Plotly is better positioned than anyone else in the world to attack the WASM space. To me, you guys are front-end people, not data people. Why? (a) you chose celery, (b) you’re developing low-code features instead of data-focused features, (c) the data table components would be way better if you were, (d) there would be way more data-focused components, (e) you made Plotly+Dash and they are amazing!

Serve & monetize the community you’ve built
People (the massive, non-enterprise Plotly community) want Plotly to provide an easy, reasonably priced way to serve their dashboards into production. Why the hesitance to build a platform to enable the community you have worked so hard to build? Don’t let it go to waste! Streamlit’s acquisition means they will be dragged into enterprise work, which will open up the SaaS dashboard space again. Marketplace; let developers sell subscriptions to apps that are served on your platform and take a cut. Tap into assets of production that you don’t own and stop getting bogged down in 1-1 linear growth enterprise engagements that must be staffed.

WASM should be a way to get Dash apps in the hands of everyone for every use case (without running up Plotly’s cloud costs). Charge app developers based on the amount of static content served or some other API metering. It should be Plotly’s chance to definitively codify/ monopolize what the next 2 decades of realtime app development and the future of the internet look like.
