Dash + AWS API Gateway + AWS Lambda

I am using AWS SAM here.
Please help me solve my problem.

A request will come from the UI and first hit the API Gateway; that event is sent to the AWS Lambda function, which extracts the query param group and gives it to build_app(group). build_app queries the data from Redshift via create_df(group), which contains the code that queries the data based on the group and returns a DataFrame; based on that, build_app() plots the graph.
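For reference, the proxy event that API Gateway hands to the Lambda looks roughly like this (a minimal sketch with made-up values such as the "Prod" stage; the real event carries many more fields):

# Minimal sketch of the API Gateway proxy event the Lambda receives.
# The stage name "Prod" and the group value are illustrative only.
sample_event = {
    "path": "/",
    "httpMethod": "GET",
    "queryStringParameters": {"group": "Intuceo"},
    "requestContext": {
        "stage": "Prod",       # API Gateway stage name
        "path": "/Prod/",      # raw path including the stage prefix
    },
}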

The problem I am facing

I have a problem here. When I give the event to the Lambda, it should go to build_app, query the data from Redshift, get the df, and plot the graph based on it. Now build_app(group) returns app. How can I start the Dash server here? I have tried all the approaches. If it is build_app() with no parameter and I hardcode the query in create_df(), it works, as it starts the Dash server directly. But when it takes a parameter, build_app(group_filter), the webpage just keeps loading, and when I inspect the page I see a 403 error: failed to load resource. I hope the Plotly team can help me solve this.

Please go through the code below. I have explained where the problem comes up.

This is app.py code

from functools import lru_cache

from apig_wsgi import make_lambda_handler

from dash_app import build_app
from dbcon import create_df


@lru_cache(maxsize=5)
def build_handler(url_prefix: str, group_filter: str):

    print("URL prefix:", url_prefix)

    # If there's no prefix, it's a custom domain
    if url_prefix is None or url_prefix == "":
        print("Inside IF block:")
        return make_lambda_handler(
            wsgi_app=build_app(group_filter=group_filter).server,
            binary_support=True,
        )

    # If there's a prefix we're dealing with an API gateway stage
    # and need to return the appropriate urls.
    print("Outside IF block:")
    return make_lambda_handler(
        wsgi_app=build_app(group_filter).server,
        binary_support=True,
    )


def get_raw_path(apigw_event: dict) -> str:
    """
    The "raw" path that was requested (i.e. including the stage prefix) is hidden
    under the requestContext object.
    """

    return apigw_event.get("requestContext", {}).get("path", apigw_event["path"])


def get_url_prefix(apigw_event: dict) -> str:
    """
    Returns the stage url prefix if the request arrives from the API Gateway and
    an empty string if it arrives via a custom domain.
    """

    apigw_stage_name = apigw_event["requestContext"]["stage"]
    prefix = f"/{apigw_stage_name}/"
    raw_path = get_raw_path(apigw_event)

    if raw_path.startswith(prefix):
        return prefix

    return ""


def lambda_handler(
    event: dict[str, "Any"], context: dict[str, "Any"]
) -> dict[str, "Any"]:

    print("Event:", event)

    # We need the path with the stage prefix, which the API gateway hides a bit.
    event["path"] = get_raw_path(event)

    query_params = event.get("queryStringParameters", {})
    if query_params is None:  # API Gateway sends None when there are no query params
        query_params = {}
    print("query_params:", query_params)

    # Extract the group parameter, defaulting to 'group' if not provided
    group_filter = query_params.get("group", "group")
    print("group_filter:", group_filter)

    handle_event = build_handler(get_url_prefix(event), group_filter=group_filter)
    response = handle_event(event, context)

    print(response)
    return response

This is dash_app.py

import plotly.express as px
import pandas as pd
import plotly.graph_objs as go
from plotly.subplots import make_subplots
from datetime import date, datetime

import dash_bootstrap_components as dbc
from dash import Dash, dcc, html, Input, Output, callback

from dbcon import redshift_con, create_df

def build_app(group_filter: str = "Intuceo", dash_kwargs: dict = None) -> Dash:

    dash_kwargs = dash_kwargs or {}

    app = Dash(
        name=__name__,
        **dash_kwargs,
    )

    # fig, df = scatterplot(group_filter=group_filter)
    df = create_df(group_filter=group_filter)

    app.layout = dbc.Container([
        html.A('Open Graph in New Tab', id='graph-link', target='_blank', href='graph.html'),
        dbc.Card([
            dbc.Button('🡠', id='back-button', outline=True, size="sm",
                       className='mt-2 ml-2 col-1', style={'display': 'none'}),
            dbc.Row(
                dcc.Graph(
                    id='graph',
                    figure=scatterplot(group_filter=group_filter)
                ), justify='center'
            )
        ], className='mt-3')
    ])

    # --- graph code omitted ---

    return app


def scatterplot(group_filter: str):
    # --- scatter plot function omitted ---
    return fig

if __name__ == "__main__":
    build_app().run(debug=True)

----------------------------> This is where the problem described above occurs.

This is dbcon.py

import psycopg2
import pandas as pd

def redshift_con():

    conn = psycopg2.connect(
        dbname="dev",
        user="awsuser",
        password="Intuceo345",
        host="redshift-cluster-1.c5fy6xm1hxjl.ap-south-1.redshift.amazonaws.com",
        port=5439,
    )
    return conn

def create_df(group_filter: str):
    #get_data_command = "SELECT * FROM city_day_data_group_level WHERE group = 'intuceo'"
    #get_data_command = "SELECT * FROM city_day_con"
    # Use a parameterized query so the filter value is escaped safely
    get_data_command = "SELECT * FROM city_day_data_group_level_dash WHERE group_level = %s"

    conn = redshift_con()
    cursor = conn.cursor()
    cursor.execute(get_data_command, (group_filter,))

    print("Query executed successfully:", get_data_command)

    rows = cursor.fetchall()
    column_names = [desc[0] for desc in cursor.description]

    # Create DataFrame from fetched rows with column names
    rows = [list(map(str, row)) for row in rows]
    df = pd.DataFrame(rows, columns=column_names)

    float_conv = ['pm2.5', 'pm10', 'no', 'no2', 'nox', 'nh3', 'co', 'so2', 'o3',
                  'benzene', 'toluene', 'xylene', 'aqi']
    string_conv = ['city', 'aqi_bucket', 'group_level']

    # Convert columns to numeric, coercing errors to NaN
    df[float_conv] = df[float_conv].apply(pd.to_numeric, errors='coerce')
    df[string_conv] = df[string_conv].astype(str)

    df['date'] = pd.to_datetime(df['date'])

    df.info()  # prints a summary of the DataFrame

    conn.close()  # nothing to commit for a SELECT; just release the connection

    return df

Hey @Saketh welcome to the forums!

That's a lot of code to read. Chances are quite high that you won't get a lot of responses. Keep in mind that time is limited.

Try breaking your code down to a minimal example which reproduces your issue by deleting the irrelevant stuff.

Finally, please avoid tagging people directly. This usually discourages others from helping you. :hugs:

Are there situations where your code is working, e.g. when group_level is the default value of "group"?

HTTP error 403 is a Forbidden error, which indicates that you're accessing a resource to which you don't have access. Are your credentials correct? Is the query you execute correct, or does it somehow try to contact parts of the database that are off-limits for you?

P.S. You included a username and password in your post, are those real? If so, change your password right now.

Some other notes on your code:

  1. context in lambda_handler is not a dict, it is a Python object. Doesn't matter much, but still good to know
  2. The get_url_prefix function seems a bit redundant. You already extract the path in lambda_handler, so you could pass it as an input argument to the function directly. Then you don't need to save the path in the event like you do in lambda_handler (see the sketch after this list)
  3. build_handler has the exact same response irrespective of what the url_prefix is. Is that intended?
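
To illustrate note 2, here is a minimal sketch of that refactor (names mirror the code above; the group extraction is shortened):

# Hedged sketch of note 2: extract the raw path once in lambda_handler and
# pass it straight into the prefix helper, instead of re-reading the event.
def get_url_prefix(raw_path: str, stage_name: str) -> str:
    prefix = f"/{stage_name}/"
    return prefix if raw_path.startswith(prefix) else ""

def lambda_handler(event, context):
    raw_path = event.get("requestContext", {}).get("path", event["path"])
    event["path"] = raw_path  # apig_wsgi still routes on event["path"]
    url_prefix = get_url_prefix(raw_path, event["requestContext"]["stage"])
    group = (event.get("queryStringParameters") or {}).get("group", "group")
    return build_handler(url_prefix, group_filter=group)(event, context)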

Lastly, I am unfamiliar with the apig_wsgi package. To create the Dash response I use the aws-wsgi package:

  • install with
    pip install aws-wsgi
    
  • import as:
    import awsgi
    
  • use with
    awsgi.response(app.server, event, context)
    

The server is never started; awsgi handles everything needed to generate the proper Dash response.
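
Put together, a minimal handler along those lines (a sketch under the assumptions above; build_app is the function from your post):

import awsgi

from dash_app import build_app

def lambda_handler(event, context):
    group = (event.get("queryStringParameters") or {}).get("group", "group")
    app = build_app(group_filter=group)
    # awsgi translates the API Gateway event into a WSGI request for the
    # Flask server backing the Dash app, and converts the response back.
    return awsgi.response(app.server, event, context)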

When creating the app, I do:

app = Dash(
    __name__,
    serve_locally=False,
    compress=False,
    requests_pathname_prefix=pathname_prefix,
    routes_pathname_prefix=pathname_prefix,
)
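
Here pathname_prefix is whatever prefix your app is served under. For a setup like yours it could plausibly be derived from the stage (an assumption, not something your code does today):

# Assumption: the app lives under the stage prefix, e.g. /Prod/dash/.
# Dash requires both pathname prefixes to start and end with "/".
stage = event["requestContext"]["stage"]
pathname_prefix = f"/{stage}/dash/"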

My apps are located at URL: <base_url>/pathname_prefix/dash

Whenever I play around with custom URLs I use the URL <base_url>/pathname_prefix/<custom_path>.
You can contact this URL, for example, from clientside callbacks in your app (see the sketch below).
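
A hedged example of what that could look like; the component ids and the path segment are made up:

# Hypothetical clientside callback that contacts a custom path.
app.clientside_callback(
    """
    function(n_clicks) {
        if (!n_clicks) { return window.dash_clientside.no_update; }
        fetch("my-custom-path").then(r => r.text()).then(console.log);
        return "";
    }
    """,
    Output("status-div", "children"),
    Input("trigger-btn", "n_clicks"),
)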

The custom path is extracted with

path_proxy = event.get("pathParameters", {}).get("proxy", "unknown")

Then you can implement whatever logic you like based on what you get for the path_proxy.
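
For instance (a sketch; the path value and the JSON body are placeholders, and app is the Dash app built earlier):

import json

def lambda_handler(event, context):
    # Route on the proxy path segment; everything else gets the Dash response.
    path_proxy = (event.get("pathParameters") or {}).get("proxy", "unknown")
    if path_proxy == "health":
        return {"statusCode": 200, "body": json.dumps({"status": "ok"})}
    return awsgi.response(app.server, event, context)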

I hope this is helpful for you. It is unclear to me what the problem is that you're facing. Maybe it's that you didn't specify enough parameters when creating app = Dash(...), maybe it's because the database is off-limits to you. Hopefully you can try out other things to fix your issue.