I’ve searched far and wide for a solution to this and have found a few close answers, but have not been able to apply them to this situation successfully.
I saw a presentation of Dash that allows users to select what axes they want to supply to a plotly_express graph at a meetup (https://www.youtube.com/watch?v=5Cw4JumJTwo) and wanted to take that idea a step further. Instead of predefining the data set being explored with this tool, what if I allowed the user to select what data to explore via SQL? (yes they can load in a csv file, but for this use case I need a direct SQL pull into Dash)
How can I allow users to pull in a data set from a database and have the dropdown options dynamically change to be populated with the new column names of the new data set pulled? As far as I know there are 2 issues with the code below:
- The dcc.Dropdown I’m using now is incorrectly being supplied the col_options from the data pull and it’s not clear why.
dash.exceptions.InvalidCallbackReturnValue: Invalid number of output values for …x.options…y.options…color.options… Expected 3 got 47 - I believe the dataset is needed globally in order to know what data to supply the dcc.Graph and the plotly_epress call. Do I make the SQL pull data globally accessible? I know this is bad in a multi-user context as shown in https://dash.plot.ly/sharing-data-between-callbacks under " Why
global
variables will break your app".
What are the alternatives?
import dash
import dash_html_components as html
import dash_core_components as dcc
from dash.dependencies import Input, Output
import plotly_express as px
import pandas as pd
import pyodbc
# Try to see if data is already globally defined. Once db query is run I'm hoping `data` will be replaced
try:
print(data.columns)
# It won't be defined on first run, so we'll load the data of the last query ran
except:
print('Pull new data')
# List of user inputs that col_options will populate
dimensions = ['x', 'y', 'color']
app = dash.Dash(__name__)
app.layout = html.Div([
html.Div(
# Supply any query using any tables in the db
[dcc.Input(
id='query',
placeholder='SQL Query',
value="""SELECT TOP 100 * FROM base.Visit"""
)] +
# Dropdowns with dynamic column options generated by most recent data pull
[html.P([d + ":", dcc.Dropdown(id=d, options=[])]) for d in dimensions]
),
dcc.Graph(id='main_graph')
])
@app.callback([Output(d, 'options') for d in dimensions],
[Input('query', 'value')])
def refresh_db_data(query):
connection = pyodbc.connect('Driver={ODBC Driver for SQL Server};'
'Server=my-sqlserver;'
'Database=my-db;'
'uid=username;' # Dont actually do this! Just for demonstration
'pwd=password;' # Dont actually do this! Just for demonstration
'Authentication=ActiveDirectoryPassword'
)
# Pull data into a pandas DataFrame using user inputs (db credentials and query) and save
db_data = pd.read_sql(query, connection)
# Close connection
connection.close()
# Make the data accessable
global data
data = db_data
# Dynamically define column options for `dcc.Dropdown` from most recent data pulled
col_options = [dict(label=x, value=x) for x in db_data.columns]
# Send back new col_options to `Dropdown` options
return col_options
@app.callback(Output('main_graph', 'figure'),
[Input(d, 'value') for d in dimensions])
def update_plot(x, y, color):
return px.scatter(data, x=x, y=y, color=color)
app.run_server()
Edit: Sorry for all the edits, it’s my first post and I’m continuing to try solutions to the problem above…learning as I go