I’m trying to connect my Dash app to a Databricks SQL Warehouse, but I’m facing issues when executing SQL queries dynamically through callbacks. Whenever I input a new value to filter the data, the page either crashes or takes too long to load. I’ve tried optimizing the SQL query, but it’s still not working. How can I efficiently handle dynamic SQL queries in Dash without overloading the system or causing timeouts?
For reference, I’ve looked into (web.dash.plotly.com/databricks-integration/sql-connector), but the issue persists. Any advice on improving performance or optimizing queries would be really helpful!
Here’s my current code:
import dash
from dash import Dash, html, dcc, Output, Input, callback
from databricks import sql
import plotly.express as px
import os
app = Dash()
server = app.server
# Set these as environment variables in Dash Enterprise or locally
SERVER_HOSTNAME = os.getenv("SERVER_HOSTNAME")
HTTP_PATH = os.getenv("HTTP_PATH")
ACCESS_TOKEN = os.getenv("ACCESS_TOKEN")
DB_NAME = "plotly_iot_dashboard"
TABLE_NAME = "silver_users"
app.layout = html.Div(
[
dcc.Loading(dcc.Graph(id="sample-chart")),
dcc.Input(id="val-selector", type="number", min=20, max=80, step=5, value=40),
],
style={"background-color": "white", "height": "100vh"},
)
@callback(Output("sample-chart", "figure"), Input("val-selector", "value"))
def create_chart(selected_val):
connection = sql.connect(
server_hostname=SERVER_HOSTNAME, http_path=HTTP_PATH, access_token=ACCESS_TOKEN
)
cursor = connection.cursor()
cursor.execute(
f"SELECT * FROM {DB_NAME}.{TABLE_NAME} WHERE age > {selected_val} LIMIT 100"
)
df = cursor.fetchall_arrow()
df = df.to_pandas()
cursor.close()
connection.close()
return px.scatter(df, x="height", y="weight")
if __name__ == "__main__":
app.run(debug=True)
Problem: The app works fine when first loaded, but as I adjust the filter (the input value), the queries seem to take forever to execute or the app crashes. What can I do to improve performance and avoid timeouts?