Dash Upload to AWS S3 Bucket

I saw a few posts asking how to upload files from a Dash app to an AWS S3 bucket that didn’t get answered, so I created a working example of one way to do it. AWS S3 has a feature called presigned URLs that lets you securely and quickly upload any kind of file directly to your S3 bucket while keeping the bucket private.

In order to do this, you will need an AWS account, a beginner’s knowledge of how IAM works in AWS, and the boto3 Python package.
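
For reference, the IAM user whose credentials sign the presigned POST only needs permission to put objects into the bucket. Below is a minimal sketch of such a policy, attached as an inline policy to a hypothetical user called dash-s3-uploader (the user name, policy name and bucket name are placeholders; adjust everything to your own setup and security requirements):

import json
import boto3

# Minimal sketch of the permissions the signing user needs: PutObject on the
# bucket used in the example below. Bucket and user names are placeholders.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": ["s3:PutObject"],
            "Resource": "arn:aws:s3:::whatever-you-called-your-bucket/*"
        }
    ]
}

iam = boto3.client('iam')
iam.put_user_policy(
    UserName='dash-s3-uploader',
    PolicyName='dash-s3-upload',
    PolicyDocument=json.dumps(policy)
)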

app.py

from dash import Dash, html, dcc, Output, Input, callback, State
import base64
import boto3
import logging
from botocore.exceptions import ClientError
import requests

app = Dash(__name__)

app.layout = html.Div([dcc.Upload(
                id='upload',
                children=html.Div([
                    'Drag and Drop'
                ]),
                style={
                    'lineHeight': '60px',
                    'borderWidth': '1px',
                    'borderStyle': 'dashed',
                    'borderRadius': '5px',
                    'textAlign': 'center'
                },
            ),
            html.Div(id='output-upload')
])


@callback(Output('output-upload', 'children'),
          Input('upload', 'contents'),
          State('upload', 'filename'),
          State('upload', 'last_modified'),
          prevent_initial_call=True)
def update_output(content, name, date):

    # dcc.Upload delivers the contents as a data URL: a content-type prefix,
    # a comma, then the base64-encoded file data
    content_type, content_string = content.split(',')
    # Decode the base64 string
    content_decoded = base64.b64decode(content_string)

    message = upload_file(content_decoded, name, date)

    if message is not None:
        return f"{message.status_code} - {message.reason}"
    return "Upload failed - could not generate a presigned URL (check the server logs)"


def create_presigned_post(bucket_name, object_name, expiration=3600):
    """Generate a presigned URL S3 POST request to upload a file

    :param bucket_name: string
    :param object_name: string
    :param expiration: Time in seconds for the presigned URL to remain valid
    :return: Dictionary with the following keys:
        url: URL to post to
        fields: Dictionary of form fields and values to submit with the POST
    :return: None if error.
    """

    # Generate a presigned S3 POST URL
    s3_client = boto3.client('s3')

    try:
        response = s3_client.generate_presigned_post(bucket_name,
                                                     object_name,
                                                     ExpiresIn=expiration)
    except ClientError as e:
        logging.error(e)
        return None

    # The response contains the presigned URL and required fields
    return response


def upload_file(contents, filename, date):

    result = create_presigned_post("whatever-you-called-your-bucket", filename)

    if result is not None:
        #Upload file to S3 using presigned URL
        files = {'file': contents}
        r = requests.post(result['url'], data=result['fields'], files=files)

        return r


if __name__ == '__main__':
    app.run_server(
        debug=True
    )

If you are using Render to deploy your app, you can add the access keys of your AWS user (the one this app uses to work with S3) as environment variables on your Render web service.
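
boto3 picks those up automatically as long as they use the standard names (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY and, optionally, AWS_DEFAULT_REGION). As a quick sanity check, you can run something like the following on the server to confirm the app actually sees them (this is not part of the app itself):

import os
import boto3

# Confirm the credential variables Render injects are visible to the process.
for var in ('AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY', 'AWS_DEFAULT_REGION'):
    print(var, 'set' if os.environ.get(var) else 'MISSING')

# boto3 resolves the credentials itself, so nothing is passed in code;
# this prints the ARN of the IAM user the app is running as.
print(boto3.client('sts').get_caller_identity()['Arn'])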

Nice suggestion, though this solution is not enough if your app is running on a remote server. Because you execute the upload from a callback, the uploaded data still has to reach the server first.

For example, AWS Lambda imposes a 6 MB size limit on each individual HTTP request. So when your app runs on AWS Lambda, you have to make sure that the data transferred between the browser and the server stays within that limit, which means that whatever you upload via dcc.Upload has to fit inside it.
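
To put a number on it: dcc.Upload base64-encodes the file in the browser, which inflates the payload by roughly a third, so the effective ceiling for the file itself is even lower than 6 MB:

# Rough arithmetic: base64 turns every 3 bytes into 4 characters, so under a
# 6 MB request limit the largest file that fits is about 3/4 of that
# (ignoring headers and form overhead).
limit_bytes = 6 * 1024 * 1024
max_file_bytes = limit_bytes * 3 // 4
print(max_file_bytes / 1024 / 1024)  # ~4.5 MB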

It would be great if we could use S3 to circumvent this size limit and send large data files directly from the browser to S3. Unfortunately, I don’t have a solution for this yet.

What if we converted your example to a clientside callback? I guess that should work. Would it be secure to do that, though, or would it leave your S3 bucket vulnerable to attacks?
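
Roughly, I imagine it would look something like the untested sketch below: a normal serverside callback still calls create_presigned_post() from the example above, but only the small presigned-POST dictionary travels through the server (via a dcc.Store), and a clientside callback then POSTs the file straight from the browser to S3. The store id and the wiring are made up, it relies on a Dash version that lets clientside callbacks return a promise, it assumes the bucket’s CORS configuration allows POSTs from the app’s origin, and the security question stands: the presigned POST expires and is tied to one key, but anyone who obtains it can upload to that key until then.

from dash import Dash, dcc, html, Output, Input, State, callback

app = Dash(__name__)

app.layout = html.Div([
    dcc.Upload(id='upload', children=html.Div(['Drag and Drop'])),
    dcc.Store(id='presigned-post'),   # hypothetical store for the presigned POST
    html.Div(id='output-upload')
])


@callback(Output('presigned-post', 'data'),
          Input('upload', 'filename'),
          prevent_initial_call=True)
def get_presigned_post(filename):
    # Only the presigned-POST dictionary goes through the server,
    # not the file contents.
    return create_presigned_post("whatever-you-called-your-bucket", filename)


# Clientside: build a FormData from the presigned fields plus the file
# (still a base64 data URL in the browser) and POST it directly to S3.
app.clientside_callback(
    """
    async function(post, contents) {
        if (!post || !contents) { return window.dash_clientside.no_update; }
        const blob = await (await fetch(contents)).blob();  // data URL -> Blob
        const form = new FormData();
        Object.entries(post.fields).forEach(([k, v]) => form.append(k, v));
        form.append('file', blob);  // the file must be the last form field
        const resp = await fetch(post.url, {method: 'POST', body: form});
        return resp.status + ' - ' + resp.statusText;
    }
    """,
    Output('output-upload', 'children'),
    Input('presigned-post', 'data'),
    State('upload', 'contents'),
)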