Show and Tell - dash-uploader (Upload large files)

Hi,
Firstly, thanks for this amazing alternative to dcc.Upload. I would like to understand a problem I am facing.
I have created a multi-tab app using Dash, and in the first tab I upload data using dash-uploader. This data is then used in the other tabs via different callbacks and dcc.Store. The problem I am facing is that when I upload data with dash-uploader in Tab 1, switch to Tab 2, and then switch back to Tab 1, the data is no longer there in the dash-uploader button (evident because the "Upload complete" message disappears). With normal Dash core components, using persistence together with persistence_type solves this, but I wanted to know what the solution is when using dash-uploader.

Thanks and regards!

I'm not sure what you mean, but if you mean that you would like to change the appearance of the component, all you can do for now is try to use CSS to make it look the way you want (or make changes to the source code).


Happy to hear you like it!

Yeah, you might be right that the dash-uploader Upload component is reloaded when the tab is changed. I haven't tested it in an app with tabs myself.

The data uploaded by dash-uploader sits on the hard disk and does not move anywhere, even if you switch tabs in the UI.
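For illustration, a minimal sketch of that idea: rebuilding the dataframe from files already on disk in the configured upload folder. The folder path and the .xlsx pattern are assumptions taken from snippets in this thread, not part of dash-uploader's API.

from pathlib import Path

import pandas as pd

UPLOAD_FOLDER = Path("/tmp/Uploads")  # same folder passed to du.configure_upload

def load_latest_uploaded_excel():
    # dash-uploader writes each upload under the configured folder, so even
    # when the UI component is re-rendered (e.g. after a tab switch), the
    # files themselves are still there and can simply be read again.
    candidates = sorted(
        UPLOAD_FOLDER.rglob("*.xlsx"),
        key=lambda p: p.stat().st_mtime,
        reverse=True,
    )
    if not candidates:
        return None
    return pd.read_excel(candidates[0])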

Currently I am storing the uploaded data in a dcc.Store component using the following code

import dash_uploader as du
import pandas as pd
from dash.dependencies import Output

# Fires once dash-uploader has finished writing the file(s) to disk.
@du.callback(
    output=Output('store-data', 'data'),
    id='upload-data',
)
def store_data(filenames):
    # filenames is a list of paths to the uploaded files on the server
    data1 = pd.read_excel(filenames[0])
    return data1.to_dict('records')

and then I am using this data in other callbacks by simply reading it back with pandas -

@app.callback(
    Output('output-datatable', 'children'),
    Input('store-data', 'data'),
    prevent_initial_call=True,
)
def mwe(data):
    df = pd.DataFrame(data)

and then perform whatever I want to do on this df dataframe and return a final result to be stored in 'output-datatable'. Similarly, I am updating the dropdowns based on which the graphs will be plotted, using the following code snippets for reference -

@app.callback(
    Output("xaxis-data", "options"),
    Input("store-data", "data"),
)
def update_xaxis_dropdown(data):
    df = pd.DataFrame(data)
    return [{'label': x, 'value': x} for x in df.columns.unique()]


@app.callback(
    Output("yaxis-data", "options"),
    Input("store-data", "data"),
)
def update_yaxis_dropdown(data):
    df = pd.DataFrame(data)
    return [{'label': x, 'value': x} for x in df.columns.unique()]
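As a side note (not from the original post): since Dash callbacks can declare multiple Outputs, these two callbacks could probably be collapsed into one. A minimal sketch using the same component ids:

@app.callback(
    Output("xaxis-data", "options"),
    Output("yaxis-data", "options"),
    Input("store-data", "data"),
)
def update_axis_dropdowns(data):
    # One pass over the stored data populates both dropdowns at once.
    df = pd.DataFrame(data)
    options = [{'label': x, 'value': x} for x in df.columns.unique()]
    return options, options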


@app.callback(Output('output-graph', 'children'),
              Input('submit-button','n_clicks'),
              State('store-data','data'),
              State('xaxis-data','value'),
              State('yaxis-data', 'value'))
def make_graphs(n, data, x_data, y_data):
    df = pd.DataFrame(data)
    if n is None:
        return no_update
    else:
        bar_fig = px.bar(df, x=x_data, y=y_data)
        # print(data)
        return dcc.Graph(figure=bar_fig)

The problem is that it works the first time, but after I switch tabs and come back to the tab the dash-uploader is in, the data is gone, as shown by the progress bar, which now reads "Upload data" instead of the "Upload completed" it was showing earlier. All the dropdowns are empty as well. I have tried using persistence in the dropdowns and data tables, but it still happens, and I think that is because the original source of the data is no longer there after switching tabs. It would be great if you could provide your inputs on this.

After uploading the data with dash-uploader, I store it in dcc.Store components, from which I access the data using pandas. Based on these Store components I update dropdowns, graphs, and data tables. The problem is that it works the first time, but once I start switching between tabs, the data disappears, as can be seen from the dash-uploader progress bar, which now reads "Upload data" instead of the "Upload complete" shown earlier. Similarly, the dropdowns based on the dcc.Store become empty. Any inputs?

Thanks and regards!
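One possible direction, offered only as a hedged sketch rather than a confirmed fix: keep the dcc.Store outside the per-tab content so a tab switch never re-creates it, and give it storage_type='session' so its data also survives a page reload. The tab layout below is hypothetical; only the 'store-data' id is taken from the snippets above.

from dash import Dash, dcc, html

app = Dash(__name__)

app.layout = html.Div([
    # Lives at the top level, so switching tabs never re-creates it.
    # storage_type='session' keeps the data in the browser's sessionStorage.
    dcc.Store(id='store-data', storage_type='session'),
    dcc.Tabs(id='tabs', value='tab-1', children=[
        dcc.Tab(label='Upload', value='tab-1'),
        dcc.Tab(label='Plots', value='tab-2'),
    ]),
    # Per-tab content is rendered into this container by a callback,
    # but the Store above stays put.
    html.Div(id='tab-content'),
])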

After uploading a zip file with dash-uploader, it sometimes happens that the upload stops and nothing more happens. No logs are displayed, and this issue occurs with both small and larger zip files (max 100 MB). Is it a problem with my server, or is it an issue with dash-uploader?

I'm using version 0.6.0.

How to upload files from a dash app to an AWS S3 bucket using dash-uploader? Does anyone have a small working example?

Thanks!
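Not an answer from the thread, just a hedged sketch of one common pattern, assuming dash-uploader 0.6+ (where the du.callback receives a du.UploadStatus, as in a snippet further down): dash-uploader first saves the file to the server's disk, and the callback then pushes it to S3 with boto3. The bucket name and component ids are made up for illustration.

import os

import boto3
import dash_uploader as du
from dash.dependencies import Output

s3 = boto3.client("s3")
BUCKET = "my-example-bucket"  # hypothetical bucket name

@du.callback(
    output=Output("s3-key", "data"),  # hypothetical dcc.Store id
    id="upload-data",
)
def push_to_s3(status: du.UploadStatus):
    # dash-uploader has already written the file to local disk;
    # forward it to S3 and return the object key for later callbacks.
    local_path = str(status.uploaded_files[0])
    key = "uploads/" + os.path.basename(local_path)
    s3.upload_file(local_path, BUCKET, key)
    return key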

Hi @fohrloop
Can I upload/send the files to an external API instead of saving them to the server's hard disk?

Hello, I have a question about using dash-uploader.

I am currently using dash-uploader to upload files and convert them to pandas dataframes for data processing.

My question is whether there is a way to convert the uploaded files to pandas dataframes without copying them to a specific folder.

Since my data is large, on the order of gigabytes, dash-uploader takes a long time to copy the files during the upload process. To solve this problem, I would need to convert the files to pandas dataframes without copying them to the given folder.

If it is possible, please let me know how to set it up in detail.

Some parts of my code are shown below.

Thank you.

import os

import dash_uploader as du
import pandas as pd
from dash import exceptions
from dash.dependencies import Output

du.configure_upload(app, "/tmp/Uploads")

layout = du.Upload(
    id="upload-data",
    max_file_size=5000,  # in MB, i.e. 5 GB
    filetypes=["csv"],
    max_files=1,
)


@du.callback(
    [
        Output("modal-header", "children", allow_duplicate=True),
        Output("modal-upload-summary", "children", allow_duplicate=True),
    ],
    id="upload-data",
)
def data_upload_completion(status: du.UploadStatus):
    if len(status.uploaded_files) != 1:  # no file selected
        raise exceptions.PreventUpdate
    df = pd.read_csv(status.uploaded_files[0])
    modal_header = f"{os.path.basename(status.uploaded_files[0])}"
    # df.info() prints to stdout and returns None, so return a short text
    # summary instead of the original `return df_name, df.info()`
    return modal_header, f"{len(df)} rows, {len(df.columns)} columns"


I don't suppose there are any plans to support the current version of Dash? I am too invested in the current version, but I still need an upload solution for large files.

I'm not currently using dash-uploader (or Dash), and I have no short-term plans to develop it further. The project has been looking for maintainer(s) for some months. There was a set of people interested in helping, but there has been no activity recently.


Fair enough. I have started using Uppy.js and it's working, apart from restricting file types by setting the Dashboard component's 'allowedFileTypes' parameter.