
Uploading a CSV file and rendering plots (global scope for Pandas data frame)

Hello Dash Community,

I’m new to the Plotly and Dash frameworks. I originally asked how to upload a CSV file and render a bar plot (see How to upload a csv file and render a bar plot). I subsequently answered my own question.

Now, I am trying to render multiple plots and have become stuck on the correct design pattern.

Here is my data:

df = pd.DataFrame({'Make':['Ford', 'Ford', 'Ford', 'Buick', 'Buick', 'Buick', 'Mercedes', 'Mercedes', 'Mercedes'],
                   'Score':['88.6', '76.6', '86.2', '79.1', '86.8', '96.4', '97.3', '98.7', '98.5'],
                   'Dimension':['Speed', 'MPG', 'Styling', 'Speed', 'MPG', 'Styling', 'Speed', 'MPG', 'Styling'],
                   'Month':['Apr-19', 'Apr-19', 'Apr-19', 'Apr-19', 'Apr-19', 'Apr-19', 'Apr-19', 'Apr-19', 'Apr-19']})

Here is my code:

import base64
import datetime
import io
import dash
from dash.dependencies import Input, Output, State
import dash_core_components as dcc
import dash_html_components as html
import plotly.express as px
import plotly.graph_objects as go
import dash_table
import pandas as pd


app = dash.Dash()

app.layout = html.Div([
    dcc.Upload(
        id='upload-data',
        children=html.Div([
            'Drag and Drop or ',
            html.A('Select Files')
        ]),
        style={
            'width': '100%',
            'height': '60px',
            'lineHeight': '60px',
            'borderWidth': '1px',
            'borderStyle': 'dashed',
            'borderRadius': '5px',
            'textAlign': 'center',
            'margin': '10px'
        },
        # Allow multiple files to be uploaded
        multiple=True
    ),

    html.Div(id='output-data-upload'),
])

def parse_contents(contents, filename, date):
    content_type, content_string = contents.split(',')

    decoded = base64.b64decode(content_string)
    try:
        if 'csv' in filename:
            # Assume that the user uploaded a CSV file
            df = pd.read_csv(
                io.StringIO(decoded.decode('utf-8')))
        elif 'xls' in filename:
            # Assume that the user uploaded an Excel file
            df = pd.read_excel(io.BytesIO(decoded))
        else:
            # Any other file type would otherwise leave df undefined below
            raise ValueError('Unsupported file type: ' + filename)
    except Exception as e:
        print(e)
        return html.Div([
            'There was an error processing this file.'
        ])

    return html.Div([
        dcc.Graph(
            figure=go.Figure(data=[
                go.Bar(name=df.columns.values[0], x=pd.unique(df['Make']), y=df['Score'],
                       text=df['Score'], textposition='auto'),
            ])
        ),
    ])

@app.callback(Output('output-data-upload', 'children'),
              [Input('upload-data', 'contents')],
              [State('upload-data', 'filename'),
               State('upload-data', 'last_modified')])
def update_output(list_of_contents, list_of_names, list_of_dates):
    if list_of_contents is not None:
        children = [
            parse_contents(c, n, d) for c, n, d in
            zip(list_of_contents, list_of_names, list_of_dates)]
        return children

if __name__ == '__main__':
    app.run_server(debug=True)

The parse_contents function both parses the uploaded contents and returns the bar plot. It works, but it won’t scale well if I want to render 5-10 plots, because every plot would have to be defined inside that one function.

I’ve tried having the parse_contents function return only the Pandas data frame and then referencing that data frame when creating several bar plots. This approach didn’t work because the data frame wasn’t in the global scope. The same problem is described in the post Callback confusion, which reads:

“It looks like you’re defining df in the scope of your parse_contents function, not globally, so when you instantiate your Dash components you’re trying to reference a variable that doesn’t exist in its scope. (Actually when Python gets to that point, parse_contents won’t have been executed yet either, just defined, so it doesn’t exist yet in any scope.)”
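
To check my understanding of that explanation, here is a minimal reproduction of the scoping behaviour without Dash. The function name parse_contents_demo and the variable rows are made up for illustration only; they are not part of my app.

```python
def parse_contents_demo():
    # 'rows' is local to this function and is gone once the function returns.
    rows = [{'Make': 'Ford', 'Score': 88.6}]
    return rows


# Defining the function does not run it, so no name 'rows' exists out here.
try:
    rows
    scope_error = None
except NameError as exc:
    scope_error = str(exc)

print(scope_error)

# Capturing the return value is what makes the data available to the caller.
result = parse_contents_demo()
print(result)
```

So the data is only reachable outside the function if the caller keeps the returned value, which is what I would like to do with the data frame.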

What I would like to do is as follows:

  1. use the parse_contents function to create the Pandas data frame, then;
  2. outside the parse_contents function, reference the data frame to create multiple plots (so that I don’t have to define all of the plots inside the parse_contents function, which will get messy!)
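
To make those two steps concrete, here is a rough sketch of the separation I have in mind. It is not a working Dash app: make_bar_figure and build_figures are hypothetical helper names, the figures are plain dicts rather than go.Figure objects, and the upload is simulated with a hand-built base64 string in the "content type,base64 data" shape that my parse_contents already splits on.

```python
import base64
import io

import pandas as pd


def parse_contents(contents, filename):
    """Decode one uploaded 'contents' string and return only the data frame."""
    content_type, content_string = contents.split(',')
    decoded = base64.b64decode(content_string)
    if 'csv' in filename:
        return pd.read_csv(io.StringIO(decoded.decode('utf-8')))
    if 'xls' in filename:
        return pd.read_excel(io.BytesIO(decoded))
    raise ValueError('Unsupported file type: ' + filename)


def make_bar_figure(df, dimension):
    """Hypothetical helper: bar chart of Score by Make for one Dimension."""
    subset = df[df['Dimension'] == dimension]
    return {
        'data': [{
            'type': 'bar',
            'x': subset['Make'].tolist(),
            'y': subset['Score'].tolist(),
        }],
        'layout': {'title': dimension},
    }


def build_figures(df):
    """Hypothetical helper: one figure per distinct Dimension value."""
    return [make_bar_figure(df, d) for d in df['Dimension'].unique()]


# Simulated upload (assumption): a small CSV encoded the way the callback receives it.
csv_text = (
    'Make,Score,Dimension,Month\n'
    'Ford,88.6,Speed,Apr-19\n'
    'Ford,76.6,MPG,Apr-19\n'
    'Buick,79.1,Speed,Apr-19\n'
    'Buick,86.8,MPG,Apr-19\n'
)
encoded = base64.b64encode(csv_text.encode('utf-8')).decode('ascii')
contents = 'data:text/csv;base64,' + encoded

df = parse_contents(contents, 'cars.csv')
figures = build_figures(df)
print(len(figures))
```

Inside the callback, I imagine each entry of figures would be wrapped in a dcc.Graph, so that adding another plot only means adding another small helper rather than growing parse_contents. I don’t know whether this is the recommended pattern in Dash, which is why I’m asking.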

What is the best way to do this?

Thanks in advance!