Parse a user-uploaded HDF5 file?

I’m building a Dash web app that needs to let users upload their own data. For compatibility with previous versions of this web app (not built with Dash), we require users to upload “.hdf5” binary files, each containing a pandas DataFrame stored under an agreed-upon key.

I’m looking at the tutorials for how to use the Upload component, and I want to store the dataframe as a DataTable in a hidden Div component (so that the data can be shared with other callbacks).
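For context, one common way to share a DataFrame between callbacks is to serialize it to a JSON string and keep that string as the children of a hidden Div. The snippet below is only a minimal sketch of that round trip, not the DataTable approach described above; the helper names `df_to_hidden_json` and `df_from_hidden_json` are made up for illustration.

```python
import io

import pandas as pd


def df_to_hidden_json(df):
    """Serialize a DataFrame to a JSON string (hypothetical helper)."""
    # orient="split" keeps the column order and the index explicitly.
    return df.to_json(orient="split")


def df_from_hidden_json(json_string):
    """Rebuild the DataFrame from the stored JSON string (hypothetical helper)."""
    # Wrapping in StringIO avoids passing a literal JSON string to read_json.
    return pd.read_json(io.StringIO(json_string), orient="split")
```

The first helper would be returned from the upload callback into the hidden Div, and the second called at the top of any callback that reads it.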

However, I’m not sure how to decode the hdf5 file. I keep encountering this message from pandas.read_hdf:

“Support for generic buffers has not been implemented.”

Below is the relevant code:

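import base64
import io

import pandas as pd

# app, Input and Output are defined elsewhere in the Dash app.
@app.callback(Output('table', 'rows'),
              [Input('upload-data', 'contents'),
               Input('upload-data', 'filename')])
def store_user_data(contents, filename):
    # The callback also fires before any file has been uploaded.
    if contents is None:
        return []
    content_type, content_string = contents.split(',')
    print("content_type = {}".format(content_type))
    decoded = base64.b64decode(content_string)
    print("decoded...")
    try:
        if filename.endswith(('.hdf5', '.h5')):
            # This is the call that raises the error quoted above.
            df = pd.read_hdf(io.BytesIO(decoded), key='rpkm')
            print("df.shape = {}".format(df.shape))
        else:
            return []
    except Exception as e:
        print(e)
        # 'rows' expects a list of records, so return an empty table on failure.
        return []

    return df.to_dict('records')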

In case anyone else has this issue, here is how to decode an HDF5 file that you get from a dcc.Upload component:
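The code from the original reply is not preserved here, so what follows is only a sketch of one approach that should work, not necessarily the one the poster used. It assumes PyTables is installed (pandas needs it for HDF5) and works around the "generic buffers" error by writing the decoded bytes to a temporary file and passing the file path to `pd.read_hdf`. The helper names `decode_upload` and `read_hdf_bytes` are made up for illustration.

```python
import base64
import os
import tempfile

import pandas as pd


def decode_upload(contents):
    """Return the raw bytes from a dcc.Upload 'contents' string."""
    # contents looks like "data:<mime type>;base64,<payload>".
    _, content_string = contents.split(",", 1)
    return base64.b64decode(content_string)


def read_hdf_bytes(raw, key):
    """Read the DataFrame stored under `key` from in-memory HDF5 bytes."""
    # pd.read_hdf wants a path (or an HDFStore), not a BytesIO buffer,
    # so the bytes go to a short-lived file on disk first.
    with tempfile.TemporaryDirectory() as tmpdir:
        path = os.path.join(tmpdir, "upload.h5")
        with open(path, "wb") as fh:
            fh.write(raw)
        return pd.read_hdf(path, key=key)
```

Inside the callback from the question, the failing line would then become something like `df = read_hdf_bytes(decode_upload(contents), key='rpkm')`.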
