Upload CSV with dcc.Upload and use it as a df

Hey, I'm banging my head against this and cannot find any reference code to help me.
I already wrote code that takes a CSV manually and creates several DataFrames from it that are used for plotting.
Now I'm trying to redo it all in Dash.
The first step is to take a CSV from the user → set it as df → perform the functions on it → use the new dfs for plotting.

Is it possible?
Can someone help with the uploading and setting as df part?

Hi Dyms617,

Yes, this is possible with Dash. I advise you to check out this page. It explains how to upload a CSV and turn it into a pandas DataFrame. In that tutorial the uploaded file is displayed below the upload box in a DataTable. You can change the code to perform your functions on the DataFrame, and return your plot from the return statement.
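As a rough sketch of the "make it a pandas DataFrame" step: dcc.Upload passes each file to the callback as a string of the form `data:<mime type>;base64,<payload>`, so the part after the comma has to be base64-decoded before pandas can read it. The helper name `contents_to_df` and the tiny fake upload are made up for illustration, and the sketch assumes a UTF-8 encoded CSV.

```python
import base64
import io

import pandas as pd


def contents_to_df(contents):
    """Turn the `contents` string from dcc.Upload into a DataFrame.

    Assumption: the uploaded file is a UTF-8 encoded CSV.
    """
    # Split "data:text/csv;base64,<payload>" into header and payload.
    _content_type, content_string = contents.split(",", 1)
    decoded = base64.b64decode(content_string)
    return pd.read_csv(io.StringIO(decoded.decode("utf-8")))


# Simulate what the browser would send for a small CSV file.
raw_csv = "x,y\n1,2\n3,4\n"
payload = base64.b64encode(raw_csv.encode("utf-8")).decode("ascii")
fake_contents = "data:text/csv;base64," + payload

df = contents_to_df(fake_contents)
print(df)
```

Inside a callback, the same helper would be called on the `contents` property of the upload component (or on each item of the list when `multiple=True`).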

Good luck!

Greetings, Sempah

Hey Sempah,
thank you for your reply. I have read the link you shared several times and was not able to recreate the code.
Also, what do you mean by return a plot?
I meant through a callback, so that it is shown in the dashboard.

Hi,

First try to copy the sample code (from the link I shared) into your own script. Don't add any code of your own yet. Does the sample script work? If it does, you can continue and edit the code for your own purposes.

What I mean by 'return a plot':
In this example you upload a csv or xlsx file and the DataTable then shows the table. This means that the callback returns a DataTable. Now remember that if it can return a DataTable, it can also return anything else you want. For example, a plot of your own. So if you change the part that returns a DataTable so that it returns a graph (or whatever you like), it should work.

Perhaps you can post some short sample code so we can figure out what's wrong.

Greetings,

Sempah

Hey Sempah,
still no luck.

I will explain my issue again:

I need to upload a CSV,
perform operations to create data sets,
and visualise the data in a way that lets the user toggle through it.

When I did the upload manually
everything worked, and I connected a dcc.Dropdown (that got its values from the CSV)
to several histogram and scatter plots.

Therefore I need the update_output function to return data structures as variables, so that I can reach my data from the other functions and callbacks.

Is this possible?

As a reference, this was my previous code with manual upload:

import plotly.graph_objs as go
import base64
import datetime
import io
import dash
from dash.dependencies import Input, Output, State
import dash_core_components as dcc
import dash_html_components as html
import dash_table

# All the data tables are created here by multiple functions, after the manual CSV upload
dm_table, header_table, data_table, dm_data_set = create_all_tables(df)

app = dash.Dash()

# Dropdown options: one label/value entry per DM name

dm_names = []
for dm in list(dm_table['Concatenated Measurement']):
    dm_names.append({'label': dm, 'value': dm})

#%% Html layout1
colors = dict(bg='#000000', text='#7FDBFF')

app.layout = html.Div([  # Overall division
    html.Div([dcc.Dropdown(id='DM-picker',  # Dropdown for DMs
                           options=dm_names,
                           value=list(dm_table['Concatenated Measurement'])[0])
              ]),
    html.Div([dcc.Graph(id='scatter')],  # Scatter plot definition and style
             style=dict(width='48%',
                        display='inline-block')),
    html.Div([dcc.Graph(id='histogram')],  # Histogram plot definition and style
             style=dict(width='48%',
                        display='inline-block')),
]
    # , style=dict(backgroundColor=colors['bg'])
)

#%% Scatter plot call

@app.callback(Output('scatter', 'figure'),
              [Input('DM-picker', 'value')])
def update_scatter(selected_dm):
    trace = go.Scatter(x=data_table['WAFERXREL_mm'],
                       y=data_table['WAFERYREL_mm'],
                       mode='markers',
                       text=data_table[selected_dm],
                       marker=dict(color=data_table[selected_dm],
                                   showscale=True))

    # All layout settings go into one Layout object. Nesting a Layout under a
    # 'layout' key inside the layout dict would most likely drop the axis settings.
    layout = go.Layout(xaxis=dict(title='WAFERXREL_mm', color=colors['text']),  # x-axis label
                       yaxis=dict(title='WAFERYREL_mm', color=colors['text']),  # y-axis label
                       hovermode='closest',
                       plot_bgcolor=colors['bg'],
                       paper_bgcolor=colors['bg'],
                       font=dict(color=colors['text']),
                       title=selected_dm)

    return dict(data=[trace], layout=layout)

#%% Histogram plot call

@app.callback(Output('histogram', 'figure'),
              [Input('DM-picker', 'value')])
def update_hist(selected_dm):
    trace = go.Histogram(x=data_table[selected_dm],
                         xbins=dict(size=0.5),
                         name=selected_dm)

    # Colours and title are set directly on the Layout instead of nesting it
    # under a 'layout' key inside another dict.
    layout = go.Layout(title=selected_dm,
                       plot_bgcolor=colors['bg'],
                       paper_bgcolor=colors['bg'],
                       font=dict(color=colors['text']))

    return dict(data=[trace], layout=layout)

#%%call for execution

if __name__ == '__main__':
    app.run_server()

Hey Sempah,
I made it work!!!
It was a pain but it works; I can share my code if needed.
Unfortunately it is very slow, because for every graph I basically need to decode my CSV and then apply all my functions to construct my data sets.

Is there an option to save my dcc.Upload as a global variable?

Great to hear.

To answer your question ‘is there an option to save my dcc.Upload as a global variable?’.

Yes. You can work with dcc.Store (so Input dcc.Upload and Output dcc.Store). Then create callbacks with Output Graph and Input Store. Check out this link.
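A minimal sketch of the serialization side of that idea: dcc.Store holds JSON-serializable data, so a DataFrame is usually converted to a JSON string on the way in and rebuilt on the way out. The helper names `df_to_store` and `store_to_df` and the sample data are hypothetical, and `orient="split"` is just one reasonable choice.

```python
import io

import pandas as pd


def df_to_store(df):
    """Serialize a DataFrame to a JSON string that dcc.Store can hold."""
    return df.to_json(orient="split")


def store_to_df(data):
    """Rebuild the DataFrame from the JSON string kept in dcc.Store."""
    return pd.read_json(io.StringIO(data), orient="split")


# Hypothetical data standing in for a table built from the uploaded CSV.
original = pd.DataFrame({"site": ["a", "b", "c"],
                         "value": [1.5, 2.5, 3.5]})

stored = df_to_store(original)   # what the upload callback would return
restored = store_to_df(stored)   # what each graph callback would start with
print(restored)
```

In the app, the callback with Output on the Store's `data` and Input on the Upload's `contents` would return `df_to_store(df)` after all the processing, and each graph callback would take the Store's `data` as Input and begin with `store_to_df(data)`. The base64 decoding and table construction then run once per upload instead of once per graph, although each callback still pays for parsing the JSON.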

For now, the easier solution:
Another option is to make the return statement return multiple plots at once. See the description below, based on the user guide. In this example it returns a DataTable built from the DataFrame.

import base64
import datetime
import io
import dash
from dash.dependencies import Input, Output, State
import dash_core_components as dcc
import dash_html_components as html
import dash_table
import pandas as pd

external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']

app = dash.Dash(__name__, external_stylesheets=external_stylesheets)

app.layout = html.Div([
    dcc.Upload(
        id='upload-data',
        children=html.Div([
            'Drag and Drop or ',
            html.A('Select Files')
        ]),
        style={
            'width': '100%',
            'height': '60px',
            'lineHeight': '60px',
            'borderWidth': '1px',
            'borderStyle': 'dashed',
            'borderRadius': '5px',
            'textAlign': 'center',
            'margin': '10px'
        },
        # Allow multiple files to be uploaded
        multiple=True
    ),
    html.Div(id='output-data-upload'),
])

def parse_contents(contents, filename, date):
    content_type, content_string = contents.split(',')

    decoded = base64.b64decode(content_string)
    try:
        if 'csv' in filename:
            # Assume that the user uploaded a CSV file
            df = pd.read_csv(
                io.StringIO(decoded.decode('utf-8')))
        elif 'xls' in filename:
            # Assume that the user uploaded an excel file
            df = pd.read_excel(io.BytesIO(decoded))
    except Exception as e:
        print(e)
        return html.Div([
            'There was an error processing this file.'
        ])

    return html.Div([
        html.H5(filename),
        html.H6(datetime.datetime.fromtimestamp(date)),

        dash_table.DataTable(
            data=df.to_dict('records'),
            columns=[{'name': i, 'id': i} for i in df.columns]
        ),

        html.Hr(),  # horizontal line

        # For debugging, display the raw contents provided by the web browser
        html.Div('Raw Content'),
        html.Pre(contents[0:200] + '...', style={
            'whiteSpace': 'pre-wrap',
            'wordBreak': 'break-all'
        })
    ])

@app.callback(Output('output-data-upload', 'children'),
              [Input('upload-data', 'contents')],
              [State('upload-data', 'filename'),
               State('upload-data', 'last_modified')])
def update_output(list_of_contents, list_of_names, list_of_dates):
    if list_of_contents is not None:
        children = [
            parse_contents(c, n, d) for c, n, d in
            zip(list_of_contents, list_of_names, list_of_dates)]
        return children

if __name__ == '__main__':
    app.run_server(debug=True)

You can change the following part to display your graphs.

return html.Div([
    html.H5(filename),
    html.H6(datetime.datetime.fromtimestamp(date)),

    dash_table.DataTable(
        data=df.to_dict('records'),
        columns=[{'name': i, 'id': i} for i in df.columns]
    ),

    html.Hr(),  # horizontal line

    # For debugging, display the raw contents provided by the web browser
    html.Div('Raw Content'),
    html.Pre(contents[0:200] + '...', style={
        'whiteSpace': 'pre-wrap',
        'wordBreak': 'break-all'
    })
])

For instance like this (but then with your own graphs):

return html.Div([
    html.Div([
        html.Div([
            html.H3('Column 1'),
            dcc.Graph(id='g1', figure={'data': [{'y': [1, 2, 3]}]})
        ], className="six columns"),

        html.Div([
            html.H3('Column 2'),
            dcc.Graph(id='g2', figure={'data': [{'y': [1, 2, 3]}]})
        ], className="six columns"),
    ], className="row")
])

Of course you need to place your current functions in the function above.
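As a sketch of what "your own graphs" could look like, the helper below builds a scatter and a histogram figure as plain dicts, in the same dict style as the `figure={'data': ...}` arguments above. The function name `build_figures`, the column names and the demo data are all made up for illustration.

```python
import pandas as pd


def build_figures(df, x_col, y_col, value_col):
    """Build scatter and histogram figures as plain dicts for dcc.Graph."""
    values = df[value_col].tolist()
    scatter = {
        "data": [{
            "type": "scatter",
            "mode": "markers",
            "x": df[x_col].tolist(),
            "y": df[y_col].tolist(),
            # Colour each point by the selected measurement.
            "marker": {"color": values, "showscale": True},
        }],
        "layout": {"title": {"text": value_col}, "hovermode": "closest"},
    }
    histogram = {
        "data": [{"type": "histogram", "x": values}],
        "layout": {"title": {"text": value_col}},
    }
    return scatter, histogram


# Hypothetical data standing in for the DataFrame parsed from the upload.
demo = pd.DataFrame({"x_mm": [0.0, 1.0, 2.0],
                     "y_mm": [0.0, 0.5, 1.0],
                     "thickness": [10.1, 10.4, 9.8]})

scatter_fig, hist_fig = build_figures(demo, "x_mm", "y_mm", "thickness")
print(scatter_fig["data"][0]["mode"], len(hist_fig["data"][0]["x"]))
```

Inside parse_contents, the two dicts would then be passed as `figure=scatter_fig` and `figure=hist_fig` to the two dcc.Graph components in the returned html.Div.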

Hello Sempah,
thank you so much for the help.
I was able to fix the issues and am now trying to divide my app into tabs.
I ran into this error:
"callback() takes from 2 to 4 positional arguments but 5 were given"
even though I have only 3 arguments in the callback function.
And why can I only pass 2 to 4 anyway?

My code:

#DASH creation

#Style component(select file as link)

# =============================================================================
# external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']
#
# app = dash.Dash(__name__, external_stylesheets=external_stylesheets)
# =============================================================================

app = dash.Dash()
colors = dict(bg='#000000', text='#7FDBFF')

#%%call for execution
external_stylesheets = ['https://codepen.io/chriddyp/pen/bWLwgP.css']

app = dash.Dash(__name__, external_stylesheets=external_stylesheets)

app.layout = html.Div([
    dcc.Store(id='dfs'),
    dcc.Upload(id='upload-data',
               children=html.Div(['Drag and Drop or click to ',
                                  html.A('Select Files')]),
               style=dict(width='100%',
                          height='60px',
                          lineHeight='60px',
                          borderWidth='1px',
                          borderStyle='dashed',
                          borderRadius='5px',
                          textAlign='center'),
               # Allow multiple files to be uploaded
               multiple=True
               ),
    html.Div([dcc.Dropdown(id='DM-picker')]),
    html.H1('Dash Tabs component demo'),
    dcc.Tabs(id="tabs-example", value='tab-1-example', children=[
        dcc.Tab(label='Tab One', value='tab-1-example'),
        dcc.Tab(label='Tab Two', value='tab-2-example'),
        dcc.Tab(label='Tab Three', value='tab-3-example'),
    ]),
    html.Div(id='tabs-content-example')
])

#%%Store data as JSON string for future use
@app.callback(Output('dfs', 'data'),
              [Input('upload-data', 'contents')])
def update_output(list_of_contents):
    dfs = parse_contents(list_of_contents)
    dfs_json = []
    for item in dfs:
        dfs_json.append(df2json(item))

    return dfs_json

#%% DM picker to use as dropdown filter

@app.callback(Output('DM-picker', 'options'),
              [Input('dfs', 'data')])
def dm_pick(data):
    dfs = []
    for item in data:
        dfs.append(json2df(item))

    dm_table = dfs[0]
    header_table = dfs[1]
    data_table = dfs[2]
    dm_data_set = dfs[3]

    dm_names = []
    for dm in list(dm_table['Concatenated Measurement']):
        dm_names.append({'label': dm, 'value': dm})
    return dm_names

#%%Tabs structure
@app.callback(Output('tabs-content-example', 'children'),
              [Input('tabs-example', 'value')],
              [Input('DM-picker', 'value')],
              [State('dfs', 'data')])
def render_content(tab, selected_dm, data):
    scatter = update_scatter(selected_dm, data)
    hist = update_hist(selected_dm, data)
    if tab == 'tab-1-example':
        return html.Div([
            html.H3('Tab content 1'),
            html.Div([dcc.Graph(id='graph-1-tabs', figure=scatter)],
                     style=dict(width='48%', display='inline-block')),
            html.Div([dcc.Graph(id='graph-2-tabs', figure=hist)],
                     style=dict(width='48%', display='inline-block'))
        ])
    elif tab == 'tab-2-example':
        return html.Div([
            html.H3('Tab content 2'),
            html.Div([dcc.Graph(id='graph-1-tabs', figure=hist)],
                     style=dict(width='48%', display='inline-block')),
            html.Div([dcc.Graph(id='graph-2-tabs', figure=scatter)],
                     style=dict(width='48%', display='inline-block'))
        ])
    elif tab == 'tab-3-example':
        return html.Div([
            html.H3('Tab content 3'),
            html.Div([dcc.Graph(id='graph-1-tabs', figure=hist)],
                     style=dict(width='48%', display='inline-block')),
            html.Div([dcc.Graph(id='graph-2-tabs', figure=scatter)],
                     style=dict(width='48%', display='inline-block'))
        ])

if __name__ == '__main__':
    app.run_server()

Hi,

I think that the syntax of your callback is wrong. It should be one list containing all the inputs instead of a separate list for each input. Each extra list counts as another positional argument, which is why callback() complains that it received too many.

You use:

@app.callback(Output('tabs-content-example', 'children'), [Input('tabs-example', 'value')], [Input('DM-picker', 'value')], [State('dfs', 'data')])

And it should be:

@app.callback(Output('tabs-content-example', 'children'), [Input('tabs-example', 'value'), Input('DM-picker', 'value')], [State('dfs', 'data')])

The posts here are all pretty badly edited.

Could you share the code?
