Hi there! I'm struggling with the following situation:
I have a dcc.Upload that is meant to receive a PDF file, parse it with the camelot Python module to extract the tables in the PDF, and then convert those tables into pandas DataFrames to create graphs.
This function is part of the Dash layout:
def GetInput3():
    # UI text: "Drag and drop or select the credit bureau report in PDF format"
    return html.Div([
        dcc.Upload(
            id='upload-data3',
            children=html.Div(id='drag_drop', children=[
                'Arrastra y suelta o ',
                html.A(
                    'selecciona el buró de crédito en formato PDF')
            ], style={'color': 'white'}),
            multiple=True
        ), ])
The drag and drop is meant to produce the tables extracted by the camelot module, so I defined a function that tries to get the main table and transform it into a DataFrame. I guess the problem is here, mainly in the way I decode the PDF. The same approach works for me in another process that handles an Excel or CSV file, but not a PDF file.
def parse_contents3(contents, filename, date):
    content_type, content_string = contents.split(',')
    decoded = base64.b64decode(content_string)
    try:
        if 'pdf' in filename:
            # Assume that the user uploaded a PDF file
            try:
                tables = camelot.read_pdf(io.StringIO(
                    decoded.decode('utf-8')), pages='all')
            except Exception as e:
                print(e)
                return html.Div([
                    'There was an error processing this file.'])
        else:
            return html.Div([
                'Try to set a PDF file.'
            ])
    except Exception as e:
        print(e)
        return html.Div([
            'There was an error processing this file.'
        ])
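One direction that might be worth trying, sketched under assumptions: a PDF is binary, so decoding it as UTF-8 text will usually raise an error, and as far as I know camelot.read_pdf expects a file path (or URL) rather than a file-like object. The sketch below writes the decoded bytes to a temporary .pdf file and passes that path to camelot. The helper names save_upload_as_pdf and read_tables_from_upload are hypothetical and not part of Dash or camelot.

```python
import base64
import os
import tempfile


def save_upload_as_pdf(contents):
    """Decode a dcc.Upload data URL and write the raw bytes to a temporary .pdf file."""
    # dcc.Upload delivers "data:<mime>;base64,<payload>"; keep only the payload.
    _content_type, content_string = contents.split(",", 1)
    decoded = base64.b64decode(content_string)
    # Write the bytes as-is; no text decoding is applied to the binary PDF.
    with tempfile.NamedTemporaryFile(suffix=".pdf", delete=False) as tmp:
        tmp.write(decoded)
    return tmp.name


def read_tables_from_upload(contents):
    """Hypothetical helper: parse the uploaded PDF with camelot through a temp file."""
    import camelot  # imported lazily so the decoding step can be tried without camelot

    path = save_upload_as_pdf(contents)
    try:
        # Assumption: read_pdf takes a file path and pages="all" scans every page.
        return camelot.read_pdf(path, pages="all")
    finally:
        os.remove(path)  # remove the temporary file once parsing is done
```

Inside parse_contents3, the io.StringIO call could then be swapped for read_tables_from_upload(contents). Note also that parse_contents3 as written never returns tables on the success path, so the callback would receive None for a valid PDF.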
parse_contents3 is called by the callback that stores the main DataFrame, which is going to be the source for several graphs in the layout:
@app.callback(Output('c-store3', 'data'),
              [Input('upload-data3', 'contents')],
              [State('upload-data3', 'filename'),
               State('upload-data3', 'last_modified')])
def update_output(list_of_contents, list_of_names, list_of_dates):
    if list_of_contents is not None:
        children = [
            parse_contents3(c, n, d) for c, n, d in
            zip(list_of_contents, list_of_names, list_of_dates)]
        return children
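Since the callback output is the data property of a store, whatever it returns has to be JSON-serializable. A minimal sketch of one way to prepare extracted tables for that, assuming each camelot table exposes a pandas DataFrame through its .df attribute; tables_to_store_data is a hypothetical helper name:

```python
def tables_to_store_data(tables):
    """Convert camelot-style tables into JSON-serializable data for a store."""
    # DataFrame.to_dict("records") gives a list of plain dicts, one per row,
    # so the result is a list (one entry per table) of row lists.
    return [table.df.to_dict("records") for table in tables]
```

A graph callback could rebuild a DataFrame from one entry with pd.DataFrame(records). Camelot labels columns with integers by default, so those keys would come back as strings after the JSON round trip.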
Where do you see the issue?
Thanks in advance. To repeat, this process works for me when processing a CSV or XLSX file, but it doesn't work with a PDF file.