All data science products I’ve built were for internal use, so my perspective is biased.
Our data-apps typically have:
- few concurrent users (a popular data product will get 10 users per day, with 2-3 concurrent)
- slow initialisation or slow computations:
  - UI elements (forms) and a calculate/run button
  - loading the data from a data lake
  - processing the data (which could be as heavy as fitting an MCMC model)
  - data exploration, once the data is computed
- written by data scientists, who are not web developers
Therefore, other Shiny-inspired Python solutions work quite well: Jupyter dashboards and bokeh-server. There is a server session for each client, and writing the server is very easy for the data scientists on my team: you can just copy-paste code from a notebook, then add boilerplate and UI callbacks.
After the initial excitement with Dash (it is super cool to build a web app 100% in Python!), I am concerned that the session-less architecture in Dash is not well suited for data science products that have a heavy computation layer (fitting a model, running simulations, accessing data lakes with Spark/MapReduce jobs) and several outputs (charts).
If the goal of Dash was to make a Shiny for Python, I am not sure the underlying architecture is suitable. While caching in Redis/RDS/disk is possible, I am concerned that it would be too much for regular data scientists, who just want to copy-paste code from a Jupyter notebook.
Are there any plans to add server sessions, to mimic a workflow that is well understood by data scientists and to speed up interactions? What might be a workaround for adding a "server session" in the current architecture?
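For what it's worth, the workaround I have been considering (this is my own sketch, not an official Dash pattern; `SESSION_STORE`, `new_session`, and `get_or_compute` are hypothetical names) is to keep a server-side dict keyed by a per-client session id, so that several callbacks can share one heavy computation instead of redoing it:

```python
import uuid

# Hypothetical server-side session store: maps a per-client session id
# to that client's cached results, emulating a "server session".
SESSION_STORE = {}

def new_session():
    """Create a session id; in Dash this id could be kept client-side
    (e.g. in a hidden component) and passed into every callback."""
    session_id = str(uuid.uuid4())
    SESSION_STORE[session_id] = {}
    return session_id

def get_or_compute(session_id, key, compute):
    """Return a cached result for this session, computing it only once."""
    cache = SESSION_STORE[session_id]
    if key not in cache:
        # First request in this session pays the cost
        # (e.g. load from the data lake, fit the model).
        cache[key] = compute()
    return cache[key]

# Usage: two callbacks in the same session share one heavy computation.
sid = new_session()
data = get_or_compute(sid, "dataset", lambda: list(range(5)))
same = get_or_compute(sid, "dataset", lambda: [])  # cache hit, no recompute
```

This obviously only works with a single server process; with multiple workers the store would have to move to Redis or similar, which brings back exactly the complexity I am worried about.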