This is awesome, @emil! Really psyched you’re exploring this.
OK here are lots of scattered thoughts. Sorry for the length… if I had more time I would’ve written a shorter letter
Functional Data Transformations
One API idea I was excited about at one point was having inputs & outputs be functional data transformations. So you could write stuff like:
@callback(Output('figure', extend('figure', 'data.0.y')), Input('interval', 'n_intervals'))
def update(_):
    df = get_latest_data()
    return df['sales']
which would extend the figure.data[0].y property of the figure.
So then the question becomes: what grammar do we support beyond extend?
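To make the idea concrete, here is a minimal Python sketch of what the renderer might do with an extend on a dotted path. The helper name extend_path and the rule that all-digit segments are list indexes are assumptions for illustration, not an existing Dash API.

```python
import copy


def extend_path(obj, path, new_items):
    """Return a copy of obj with new_items appended to the list found at path.

    path is a dotted string such as 'data.0.y'. Segments made only of digits
    are treated as list indexes (an assumption of this sketch).
    """
    result = copy.deepcopy(obj)
    target = result
    for segment in path.split('.'):
        key = int(segment) if segment.isdigit() else segment
        target = target[key]
    target.extend(new_items)
    return result


figure = {'data': [{'x': [1, 2], 'y': [10, 20]}]}
updated = extend_path(figure, 'data.0.y', [30, 40])
print(updated['data'][0]['y'])  # [10, 20, 30, 40]
print(figure['data'][0]['y'])   # [10, 20] (the original is left untouched)
```

Only the new points would need to travel over the network; the existing figure stays in the browser.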
With the nested property, you could also have this as inputs, so:
@callback(Output('figure', 'data.*.marker.color'),
          Input('darken', 'n_clicks'),
          State('figure', 'data.*.marker.color'))
def update(_, existing_colors):
    # One argument per Input/State: n_clicks (unused), then the current colors
    new_colors = []
    for color in existing_colors:
        new_colors.append(darken(color))
    return new_colors
* might not be generic enough. We might consider looking at the jq syntax for a more generic data-accessing grammar.
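A rough sketch of how a wildcard accessor could read and write nested values. The names get_path and set_path are hypothetical, and the semantics of * (fan out over every item of a list) are just one possible interpretation; a jq-style grammar would be richer.

```python
def _segments(path):
    return path.split('.') if isinstance(path, str) else path


def _key(segment):
    return int(segment) if segment.isdigit() else segment


def get_path(obj, path):
    """Read the value(s) at a dotted path; '*' fans out over every list item."""
    segments = _segments(path)
    if not segments:
        return obj
    head, rest = segments[0], segments[1:]
    if head == '*':
        return [get_path(item, rest) for item in obj]
    return get_path(obj[_key(head)], rest)


def set_path(obj, path, value):
    """Write in place; under '*', value must hold one entry per list item."""
    segments = _segments(path)
    head, rest = segments[0], segments[1:]
    if head == '*':
        if not rest:
            obj[:] = value
            return
        for item, item_value in zip(obj, value):
            set_path(item, rest, item_value)
    elif rest:
        set_path(obj[_key(head)], rest, value)
    else:
        obj[_key(head)] = value


figure = {'data': [{'marker': {'color': 'red'}}, {'marker': {'color': 'blue'}}]}
colors = get_path(figure, 'data.*.marker.color')
set_path(figure, 'data.*.marker.color', ['dark' + c for c in colors])
print(colors)                                     # ['red', 'blue']
print(get_path(figure, 'data.*.marker.color'))    # ['darkred', 'darkblue']
```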
Functional Grammar - Inspired by Ramda
At one point, I was thinking we could expose/adopt the Ramda API, which basically allows you to do any sort of functional transformations to data in a single expression: Ramda Documentation
The cool thing about ramda, and other functional paradigms, is that you construct the data transformation expression in a way that you can always “call”/“apply” with a value. So instead of having
my_list.append(5)
you write:
append(my_list)(5)
or in a more complex scenario:
extend(figure['data'][0]['x'], [1, 2, 3])
you write something like
concat(lensPath('data', 0, 'x'))(figure)([1, 2, 3])
(See in Rambda sandbox)
which lends itself well to what we’re doing, where you write the data transformation expression in the callback and then “call”/“apply” it with two things: (1) the component’s property and (2) the callback’s return value. So generically, Dash’s front-end “simply” runs:
new_component_value = expression(property_value)(callback_output)
expression = extend(lensPath('data', 0, 'x'))
property_value = figure
callback_output = [1, 2, 3]
new_component_value = expression(property_value)(callback_output)
And the simple case of Output('my-graph', 'figure') becomes “shorthand” for some simple operation like Output('my-graph', 'set(figure)').
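Here is a small Python sketch of that curried shape, expression(property_value)(callback_output). The functions lens_path, extend and set_ are stand-ins written for this post (loosely modelled on the Ramda names), not Ramda's or Dash's actual API.

```python
import copy


def lens_path(*keys):
    """Describe a location in a nested structure as a tuple of keys/indexes."""
    return tuple(keys)


def _update_at(obj, path, fn):
    """Return a copy of obj where the value at path is replaced by fn(old)."""
    if not path:
        return fn(obj)
    result = copy.copy(obj)  # shallow copy of this level only
    result[path[0]] = _update_at(obj[path[0]], path[1:], fn)
    return result


def extend(path):
    return lambda property_value: lambda callback_output: _update_at(
        property_value, path, lambda old: list(old) + list(callback_output))


def set_(path=()):
    # With an empty path this is the plain "replace the whole property" case.
    return lambda property_value: lambda callback_output: _update_at(
        property_value, path, lambda old: callback_output)


figure = {'data': [{'x': [0], 'y': [5]}]}
expression = extend(lens_path('data', 0, 'x'))
new_component_value = expression(figure)([1, 2, 3])
print(new_component_value)  # {'data': [{'x': [0, 1, 2, 3], 'y': [5]}]}

replaced = set_()(figure)({'data': []})
print(replaced)             # {'data': []}
```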
Someone even wrote a Python Ramda, which could be a nice tool for debugging these expressions.
Ramda, and other functional data transformation systems like this, are nice because you can easily serialize them (it’s just one big expression) and execute them in JS. And enough thought has been put into their grammar to basically allow for any possible data expression within a single, perhaps very nested, command.
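To illustrate the serialization point, here is a sketch where the expression is plain JSON and a tiny interpreter applies it. The wire format (an array of operation name plus a lensPath array) and the names serialize, run and OPERATIONS are made up for this example; in practice the interpreter would live in the JS renderer.

```python
import copy
import json

# Hypothetical operation table: each entry maps (old value, callback output)
# to the new value stored at the targeted path.
OPERATIONS = {
    'set': lambda old, new: new,
    'extend': lambda old, new: old + new,
    'append': lambda old, new: old + [new],
}


def serialize(op_name, *path):
    """Encode an expression such as extend(lensPath('data', 0, 'x')) as JSON."""
    return json.dumps([op_name, ['lensPath', *path]])


def run(payload, property_value, callback_output):
    """Decode the expression and apply it to a copy of the property value."""
    op_name, (lens_name, *path) = json.loads(payload)
    if lens_name != 'lensPath' or not path:
        raise ValueError('this sketch only understands a non-empty lensPath')
    result = copy.deepcopy(property_value)
    parent = result
    for key in path[:-1]:
        parent = parent[key]
    parent[path[-1]] = OPERATIONS[op_name](parent[path[-1]], callback_output)
    return result


payload = serialize('extend', 'data', 0, 'x')
print(payload)  # ["extend", ["lensPath", "data", 0, "x"]]

figure = {'data': [{'x': [0]}]}
print(run(payload, figure, [1, 2, 3]))  # {'data': [{'x': [0, 1, 2, 3]}]}
```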
(Here’s a great read on these functional paradigms: Mostly adequate guide to functional programming)
Which Transformations?
Ramda may be too abstract. And we probably want some shorthands, especially for simple accessors like data.0.x (lensPath will probably scare people away!).
The main things I can think of folks needing would be:
- Single value accessors: Being able to target any single part of a data structure
- Wildcard accessors: Being able to target patterns of a data structure, like figure.data.*.marker.color.
- Lists: append, extend, replace slice, access slice
- Strings: set, regex replace, concatenate, suffix, prefix
- Dictionaries: set single value, merge, replace dictionary
- Numbers: set, math operations?
- List of dictionaries (e.g. list of records in data in datatable): Extract a column from a list of dictionaries, aka [row['x'] for row in data]. This can be nicely expressed in Ramda via pluck: Ramda Documentation
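A few of the operations in this list, sketched as plain Python functions to show how small each one is. The function names and argument orders are illustrative choices, not a proposed final API.

```python
import re


def pluck(key, records):
    """List of dictionaries: extract one column."""
    return [row[key] for row in records]


def merge(old, new):
    """Dictionaries: shallow merge, values in new win."""
    return {**old, **new}


def replace_slice(items, start, stop, new_items):
    """Lists: replace items[start:stop] without mutating the input."""
    return items[:start] + list(new_items) + items[stop:]


def regex_replace(pattern, replacement, text):
    """Strings: regex replace."""
    return re.sub(pattern, replacement, text)


rows = [{'x': 1, 'y': 'a'}, {'x': 2, 'y': 'b'}]
print(pluck('x', rows))                                # [1, 2]
print(merge({'a': 1, 'b': 2}, {'b': 3}))               # {'a': 1, 'b': 3}
print(replace_slice([1, 2, 3, 4, 5], 1, 3, ['a']))     # [1, 'a', 4, 5]
print(regex_replace(r'\d+', '#', 'order 42 of 7'))     # order # of #
```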
Ah, the state machine
The beauty of most Dash apps is that they are completely defined by the current set of inputs on the page. If you opened up the DAG and fired all of the callbacks with the current set of inputs you’d get the same outputs every single time.
When we start introducing appending and extending transformations, the output can depend on the number of times the transformation has been called. The current output is sort of implicitly the “State” that the transformation is applied to. This isn’t a bad thing, but it’s just kinda an interesting framework to think about things.
Now we already have this model in Dash with State and applying the data transformations in a callback. All we’re doing is basically providing some formalism (and way better performance!) around things like this:
@callback(Output('mygraph', 'figure'), Input('button', 'n_clicks'), State('mygraph', 'figure'))
def update(_, figure):
    # figure arrives via State, so it has to be a parameter of the callback
    figure['data'][0]['y'].extend(get_data())
    return figure
which could be written now as e.g.
@callback(Output('mygraph', extend('figure.data.0.y')), Input('button', 'n_clicks'))
def update(_):
    return get_data()
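A quick simulation of the two styles, without Dash, to show they land on the same figure and that the result depends on how many times the callback has fired. All function names here are hypothetical, and apply_extend stands in for whatever the front end would do.

```python
import copy


def update_with_state(figure, new_points):
    """Today's pattern: the whole figure goes to the server and comes back."""
    figure = copy.deepcopy(figure)
    figure['data'][0]['y'].extend(new_points)
    return figure


def update_with_transformation(new_points):
    """Proposed pattern: the callback returns only the new points."""
    return new_points


def apply_extend(figure, new_points):
    """Stand-in for the front end applying extend('figure.data.0.y')."""
    figure = copy.deepcopy(figure)
    figure['data'][0]['y'].extend(new_points)
    return figure


initial = {'data': [{'y': [1, 2]}]}
via_state = via_transformation = initial
for points in ([3], [4, 5]):  # two button clicks
    via_state = update_with_state(via_state, points)
    via_transformation = apply_extend(
        via_transformation, update_with_transformation(points))

print(via_state == via_transformation)      # True
print(via_transformation['data'][0]['y'])   # [1, 2, 3, 4, 5]
```

Firing a third time would change the figure again, which is the "state machine" flavor described above.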
Initial State
Related to above, one complexity is around initial state. If we have callbacks that are focussed on transforming things, then how does a user set the original state of the property? Seems like setting it as part of the layout is the way to go, either in app.layout or in whatever callback returned the component in the first place. So:
app.layout = html.Div([
    html.Button(id='button', n_clicks=0),
    dcc.Graph(id='mygraph', figure=px.line(df, x='time', y='price', color='stock'))
])

@callback(Output('mygraph', extend('figure.data.0.y')), Input('button', 'n_clicks'))
def update(_):
    return get_data()
Or even:
app.layout = html.Div([
    html.Button(id='display', n_clicks=0),
    html.Div(id='content')
])

@callback(Output('content', 'children'), Input('display', 'n_clicks'))
def update(_):
    return html.Div([
        html.Button(id='button', n_clicks=0),
        dcc.Graph(id='mygraph', figure=px.line(df, x='time', y='price', color='stock'))
    ])

@callback(Output('mygraph', extend('figure.data.0.y')), Input('button', 'n_clicks'))
def update(_):
    return get_data()
Resetting
Similar to above, what about resetting & transforming the property? Two callbacks?
app.layout = html.Div([
    html.Button('Refresh data', id='refresh-button', n_clicks=0),
    dcc.Dropdown(['a', 'b', 'c'], id='dropdown'),
    dcc.Graph(id='mygraph')
])

@callback(
    Output('mygraph', 'figure'),
    Input('dropdown', 'value')
)
def update(value):
    return px.scatter(get_latest_data(value), x='time', y='price')

@callback(
    Output('mygraph', extend('figure', 'data.0.x')),
    Output('mygraph', extend('figure', 'data.0.y')),
    Input('refresh-button', 'n_clicks'),
    State('dropdown', 'value')
)
def update(_, value):
    df = get_latest_data(value)
    # Same columns that px.scatter plots above
    return [df['time'], df['price']]
If it’s multiple callbacks, then what’s the resolution order for the DAG? Can we do some rule like “first call the callback without the transformations, and then call the callback with the transformations?” Feels a little ugly.
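One way to read that rule, as a sketch: among the callbacks queued for the same property, run the plain "set" ones before the transformation ones. The PendingCallback class and resolution_order function are invented for this example, and a real DAG would still have to respect input dependencies on top of this.

```python
from dataclasses import dataclass


@dataclass
class PendingCallback:
    name: str
    target: str              # e.g. 'mygraph.figure'
    is_transformation: bool  # True for extend(...)-style outputs


def resolution_order(pending):
    """Stable sort: plain sets first, transformations after, ties keep order."""
    return sorted(pending, key=lambda cb: cb.is_transformation)


queue = [
    PendingCallback('extend_xy', 'mygraph.figure', True),
    PendingCallback('reset_figure', 'mygraph.figure', False),
]
print([cb.name for cb in resolution_order(queue)])
# ['reset_figure', 'extend_xy']
```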
An alternative syntax would be combining into a single callback, either with the data transformations in the Output or returning a transformation.
Transformations in the Output:
app.layout = html.Div([
    html.Button('Refresh data', id='refresh-button', n_clicks=0),
    dcc.Dropdown(['a', 'b', 'c'], id='dropdown'),
    dcc.Graph(id='mygraph')
])

@callback(
    Output('mygraph', 'figure'),
    Output('mygraph', extend('figure', 'data.0.x')),
    Output('mygraph', extend('figure', 'data.0.y')),
    Input('refresh-button', 'n_clicks'),
    # The dropdown has to be an Input (not State) to be able to trigger this callback
    Input('dropdown', 'value')
)
def update(_, value):
    df = get_latest_data(value)
    if ctx.triggered_id == 'dropdown':
        return [px.scatter(df, x='time', y='price'), no_update, no_update]
    else:
        return [no_update, df['time'], df['price']]
Or, returning transformations:
app.layout = html.Div([
    html.Button('Refresh data', id='refresh-button', n_clicks=0),
    dcc.Dropdown(['a', 'b', 'c'], id='dropdown'),
    dcc.Graph(id='mygraph')
])

@callback(
    Output('mygraph', 'figure'),
    Input('refresh-button', 'n_clicks'),
    # The dropdown has to be an Input (not State) to be able to trigger this callback
    Input('dropdown', 'value')
)
def update(_, value):
    df = get_latest_data(value)
    if ctx.triggered_id == 'dropdown':
        return px.scatter(df, x='time', y='price')
    else:
        return [
            extend('data.0.x', df['time']),
            extend('data.0.y', df['price'])
        ]
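For the "returning transformations" flavor, the renderer would need to tell a plain value from a list of transformations. Here is a sketch of that dispatch; the Extend class and apply_output function are hypothetical names, and treating "a non-empty list made only of Extend objects" as the signal is just one possible convention.

```python
import copy
from dataclasses import dataclass


@dataclass
class Extend:
    path: str     # dotted path inside the property, e.g. 'data.0.x'
    values: list  # items to append at that path


def apply_output(current_value, returned):
    """Replace the property, or patch it if transformations were returned."""
    transformations = returned if isinstance(returned, list) else [returned]
    if not transformations or not all(
            isinstance(t, Extend) for t in transformations):
        return returned  # plain value: behaves like today's Output
    result = copy.deepcopy(current_value)
    for t in transformations:
        target = result
        for segment in t.path.split('.'):
            target = target[int(segment) if segment.isdigit() else segment]
        target.extend(t.values)
    return result


figure = {'data': [{'x': [1], 'y': [2]}]}
patched = apply_output(figure, [Extend('data.0.x', [3]), Extend('data.0.y', [4])])
print(patched)                                # {'data': [{'x': [1, 3], 'y': [2, 4]}]}
print(apply_output(figure, {'data': []}))     # {'data': []}
```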
Transformations in layout
In the Dash model, anything that you can return in an output is something that you can set in the layout. So if we allow returning transformations in a callback, then you should be able to set them in the layout.
But this is a little weird because transformations depend on a value already being defined (you’re transforming something!). In the previous example, that order-dependent logic is defined within the callback. But I think this can be defined functionally too with like a “default” or “if is none” transformation.
app.layout = html.Div([
    dcc.Graph(figure=
        ifNone(
            default=px.scatter(df, x='time', y='price'),
            # "else" is a reserved word in Python, so the keyword needs another name
            otherwise=extend('data.0.y', df['price'])
        )
    )
])
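A sketch of how an "if is none" transformation could resolve. Since else is a reserved word in Python, this sketch calls the second argument otherwise; if_none, the otherwise parameter and the extend_y helper are all made-up names.

```python
def if_none(default, otherwise):
    """Use default when the property has no value yet, else transform it."""
    def resolve(current_value):
        if current_value is None:
            return default
        return otherwise(current_value)
    return resolve


def extend_y(figure):
    """Hypothetical transformation: append one point to the first trace."""
    trace = figure['data'][0]
    return {'data': [{**trace, 'y': trace['y'] + [3]}]}


initial_figure = {'data': [{'y': [1, 2]}]}
figure_rule = if_none(default=initial_figure, otherwise=extend_y)

first = figure_rule(None)    # nothing on the page yet: use the default
second = figure_rule(first)  # a value exists: apply the transformation
print(first)   # {'data': [{'y': [1, 2]}]}
print(second)  # {'data': [{'y': [1, 2, 3]}]}
```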
This isn’t that useful on its own, but imagine if you could also reference other properties on the page within these transformations. Then you could define clientside data transformations between components without callbacks, which is pretty neat.
app.layout = html.Div([
    dcc.Store(id='store', data=df.to_dict('records')),
    dcc.Graph(figure={
        'data': [{
            'x': get('store.*.date'),
            'y': get('store.*.price'),
        }]
    })
])
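A sketch of how the renderer could resolve those references before handing the figure to the graph. It assumes the first path segment is a component id that points at that component's data, and that * fans out over a list; Get, read and resolve are hypothetical names and the sample records are invented.

```python
from dataclasses import dataclass


@dataclass
class Get:
    path: str  # '<component id>.<path inside its data>', e.g. 'store.*.date'


def read(obj, segments):
    """Walk segments through obj; '*' maps over every item of a list."""
    if not segments:
        return obj
    head, rest = segments[0], segments[1:]
    if head == '*':
        return [read(item, rest) for item in obj]
    return read(obj[int(head) if head.isdigit() else head], rest)


def resolve(value, page_state):
    """Replace every Get found inside value with the data it points to."""
    if isinstance(value, Get):
        component_id, *segments = value.path.split('.')
        return read(page_state[component_id], segments)
    if isinstance(value, dict):
        return {k: resolve(v, page_state) for k, v in value.items()}
    if isinstance(value, list):
        return [resolve(v, page_state) for v in value]
    return value


page_state = {'store': [{'date': 'day 1', 'price': 10},
                        {'date': 'day 2', 'price': 12}]}
figure = {'data': [{'x': Get('store.*.date'), 'y': Get('store.*.price')}]}
print(resolve(figure, page_state))
# {'data': [{'x': ['day 1', 'day 2'], 'y': [10, 12]}]}
```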
This is similar to a prototype I wrote a few years ago in Dash Clientside Transformations by chriddyp · Pull Request #142 · plotly/dash-renderer · GitHub (note the ramda example!) and discussed recently in Improving on All-in-One (AIO) components - #9 by chriddyp