Handling Slow or Crashing Dash Apps with Dynamic SQL Queries in Databricks

I’m trying to connect my Dash app to a Databricks SQL Warehouse, but I’m facing issues when executing SQL queries dynamically through callbacks. Whenever I input a new value to filter the data, the page either crashes or takes too long to load. I’ve tried optimizing the SQL query, but it’s still not working. How can I efficiently handle dynamic SQL queries in Dash without overloading the system or causing timeouts?

For reference, I’ve looked into (web.dash.plotly.com/databricks-integration/sql-connector), but the issue persists. Any advice on improving performance or optimizing queries would be really helpful!

Here’s my current code:

import dash
from dash import Dash, html, dcc, Output, Input, callback
from databricks import sql
import plotly.express as px
import os

app = Dash()
server = app.server

# Set these as environment variables in Dash Enterprise or locally
SERVER_HOSTNAME = os.getenv("SERVER_HOSTNAME")
HTTP_PATH = os.getenv("HTTP_PATH")
ACCESS_TOKEN = os.getenv("ACCESS_TOKEN")

DB_NAME = "plotly_iot_dashboard"
TABLE_NAME = "silver_users"

app.layout = html.Div(
    [
        dcc.Loading(dcc.Graph(id="sample-chart")),
        dcc.Input(id="val-selector", type="number", min=20, max=80, step=5, value=40),
    ],
    style={"background-color": "white", "height": "100vh"},
)

@callback(Output("sample-chart", "figure"), Input("val-selector", "value"))
def create_chart(selected_val):
    connection = sql.connect(
        server_hostname=SERVER_HOSTNAME, http_path=HTTP_PATH, access_token=ACCESS_TOKEN
    )
    cursor = connection.cursor()
    cursor.execute(
        f"SELECT * FROM {DB_NAME}.{TABLE_NAME} WHERE age > {selected_val} LIMIT 100"
    )
    df = cursor.fetchall_arrow()
    df = df.to_pandas()

    cursor.close()
    connection.close()

    return px.scatter(df, x="height", y="weight")

if __name__ == "__main__":
    app.run(debug=True)

Problem: The app works fine when first loaded, but as I adjust the filter (the input value), the queries seem to take forever to execute or the app crashes. What can I do to improve performance and avoid timeouts?

Hi @JackShaw,

Do you know how long your queries take on average? Most web servers have a timeout of 30 seconds, so if your query takes longer than that, the server will terminate the connection. Does your app already crash in your local dev environment, or only when deployed? You can set the timeout in Gunicorn, for example.
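If you do deploy with Gunicorn, the worker timeout can be raised in a gunicorn.conf.py. This is only a sketch with example values (bind address, worker count, 120 seconds), not something taken from your setup:

# gunicorn.conf.py -- example values, adjust to your deployment
bind = "0.0.0.0:8050"  # address/port the app is served on
workers = 2            # number of worker processes
timeout = 120          # seconds a worker may spend on one request before being killed

The same setting is also available on the command line, e.g. gunicorn app:server --timeout 120.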

If your queries finish in under 30 seconds, you may want to check your memory usage; maybe the query result exceeds what your server can hold? Hard to say without some stats.

What I do in dashboards to improve performance is to cache the query results for the most common filter permutations. That won't directly solve your problem on a cache miss, though.
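As a rough sketch of that idea, assuming you install flask-caching (the Cache object and the get_filtered_df helper below are made up for illustration, they are not part of your app), you could memoize the query per filter value:

from flask_caching import Cache

# SimpleCache stores results in process memory; use a Redis or filesystem backend in production
cache = Cache(app.server, config={"CACHE_TYPE": "SimpleCache", "CACHE_DEFAULT_TIMEOUT": 300})

@cache.memoize()
def get_filtered_df(selected_val):
    # Runs the query once per distinct selected_val; repeated calls with the
    # same value are served from the cache instead of hitting the warehouse.
    with sql.connect(
        server_hostname=SERVER_HOSTNAME, http_path=HTTP_PATH, access_token=ACCESS_TOKEN
    ) as connection:
        with connection.cursor() as cursor:
            cursor.execute(
                f"SELECT height, weight FROM {DB_NAME}.{TABLE_NAME} WHERE age > {selected_val} LIMIT 100"
            )
            return cursor.fetchall_arrow().to_pandas()

Your callback would then call get_filtered_df(selected_val) instead of opening a new connection and querying on every input change.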

Hope this helps. If not, try to provide some information like memory size, DataFrame dimensions, and execution time.

Here is a simple wrapper to track execution times:

import functools
import time
from datetime import datetime

def exec_timer(f):
    @functools.wraps(f)
    def wrapper(*args, **kwargs):
        start_time = time.time()
        result = f(*args, **kwargs)
        end_time = time.time()
        # functools.wraps copies f.__name__ onto wrapper, so this logs the decorated function's name
        print(wrapper.__name__, datetime.now(), flush=True)
        print(f"Execution of function {wrapper.__name__} took {end_time - start_time:.2f} seconds.")
        return result
    return wrapper

@exec_timer 
def expensive_function():
    pass

You could also add a print to display the result's memory size.
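For a pandas DataFrame, a minimal sketch of that (log_df_size is just an illustrative name) could look like:

import pandas as pd

def log_df_size(df: pd.DataFrame) -> None:
    # deep=True also counts string/object columns, not just the numeric blocks
    size_mb = df.memory_usage(deep=True).sum() / 1024 ** 2
    print(f"DataFrame shape: {df.shape}, approx. memory: {size_mb:.1f} MB", flush=True)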

Kind regards,
Christian