Displaying generated PDF (not local or external)

I have a PDF that is generated by an API that I would like to display in my app:

asset_report_pdf = client.AssetReport.get_pdf(asset_report_token)

Since the PDF is neither served locally nor hosted externally, I’m not sure what should go in the src property of the Iframe or Embed component.

UPDATE:
Trying something along these lines:

encoded_image = base64.b64encode(open(asset_report_pdf, "rb").read())
PDF = html.Img(src=encoded_image.decode())

but getting:

encoded_image = base64.b64encode(open(asset_report_pdf, "rb").read()) ValueError: embedded null byte

Any ideas?


Maybe I’m missing something, but PDFs are not images, so I don’t see how converting one to base64 and trying to display it as an image would work.

If you wanted to do that, I’d imagine these are the steps:

  1. Convert the PDF to a PIL object: https://github.com/Belval/pdf2image
  2. Export the PIL object to a PNG: https://stackoverflow.com/a/19651233/2958068
  3. Display in base64 like this: PNG image not showing

However, this may not look good, and you won’t get PDF features like text selection and copying. The only way I know to display a real PDF on a web page without relying on external technologies like Adobe Acrobat is to use pdf.js.

It should be possible to take a React version of pdf.js and create a Dash component out of it:

  1. https://www.npmjs.com/package/react-pdf-js
  2. https://dash.plot.ly/react-for-python-developers
  3. https://dash.plot.ly/plugins

But I don’t think it’s a small undertaking.


@Damian Same result:

from pdf2image import convert_from_bytes
...
encoded_image = convert_from_bytes(open(asset_report_pdf, "rb").read())
asset_report_png = encoded_image.save("asset_report.png", "PNG")
report_file = 'asset_report.png'
file_base64 = base64.b64encode(open(report_file, 'rb').read()).decode('ascii')
PDF = html.Img(src='data:image/png;base64,{}'.format(file_base64)), 

-> encoded_image = convert_from_bytes(open(asset_report_pdf, "rb").read()) ValueError: embedded null byte

Split up your code so it’s clear what is throwing the exception. To me it looks like your file is invalid somehow, but it’s hard to tell. For example:

asset_report_fp = open(asset_report_pdf, "rb")
asset_report_bytes = asset_report_fp.read()
encoded_image = convert_from_bytes(asset_report_bytes)

Once you know what is throwing the exception you can investigate further.

Input:

print(asset_report_pdf)
asset_report_fp = open(asset_report_pdf, "rb")
asset_report_bytes = asset_report_fp.read()
encoded_image = convert_from_bytes(asset_report_bytes)

Output from print (I get the same error on the open call, “line 2”):

b'%PDF-1.4\n3 0 obj\n<</Type /Page\n/Parent 1 0 R\n/Resources 2 0 R\n/Group <</Type /Group /S /Transparency /CS /DeviceRGB>>\n/Contents 4 0 R>>\nendobj\n4 0 obj\n<</Filter /FlateDecode /Length 1481>>\nstream\nx\x01\xa4\x98\xdbn\xdbL\x0e\xc7\xef\xf5\x14\xbcH\x81-PL\xe7\xa8\xc3e\x9c\xa4E\x16\x9b\xad\x1b\xfb\x05\xd4XM\xdcZ\xd2V\x96Sd\x9f\xfe\xc3\x88\x948\x13\xc9\xedwh\x81\x00T\xa4\xdf\x9f\xe4\x9f\xd4!\x12\xfe\x9dH\xf8\x96(!%\xfcLV[x\xffA\x82\xd2>\xdc~\x85\x9bm"E\x9ek\x90\xa2P\xd6\xff\xb4\n\xee?\xfa\x83N\xf9\xd0\x19\xff\xb3H\xa1{L~\x80\x95B\xfa\x7f\xe0\xff[%\xa4\xd3\xa9\x01\x9de"\xb5RJ\xc82)r\x93\x99\x0c\x1ejx\x7f\xab\xe0\xba\x85\xcf\x89\x91^-+\n!3\xa8\x13\x9d\xfa\xd3\xc7\xf8\x90\xac\x12\xa3\xc3#u\xe2R\'t\x1e\x9eAy\x17\x1e\x84i\xff\x80!\x17x\x84\xd5\x16t\xea\x84\x93\x909+T\x01\xdb\x1d\xfc\xebr\xb3\xb9\xd9\xc2\xfd\xcd\xfa\xd3\xfd\xf6-...

Looks like it is already bytes. There is no error when passing it straight to convert_from_bytes:

encoded_image = convert_from_bytes(asset_report_pdf)

However, I haven’t figured out how to save the PIL result yet.

Ahhh, convert_from_bytes returns a list. Let me check some things.
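
For anyone hitting the same ValueError: it is consistent with open() being handed the PDF’s contents instead of a file path, since open() treats a bytes argument as a path and paths cannot contain null bytes. A minimal sketch of that check follows. The helper name looks_like_pdf_bytes and the sample value are hypothetical and made up for illustration; they are not part of any library or of the API in the question.

```python
def looks_like_pdf_bytes(value):
    """Return True if value is bytes-like and starts with the PDF magic header."""
    return isinstance(value, (bytes, bytearray)) and bytes(value[:5]) == b"%PDF-"


# Illustrative stand-in for the API response, not a real PDF.
sample = b"%PDF-1.4\n\x00 stand-in content"

error_message = None
try:
    # open() interprets a bytes argument as a file path,
    # and a path may not contain a null byte.
    open(sample, "rb")
except ValueError as exc:
    error_message = str(exc)

print(looks_like_pdf_bytes(sample))  # the value is file content, not a path
print(error_message)
```

If the check passes, the value can go straight to convert_from_bytes or base64.b64encode, with no open() call in between.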

UPDATE:
Able to save the first page of the PDF as a PNG in the project dir:

    PIL = convert_from_bytes(asset_report_pdf)
    filename = '/DashFinance/static/asset_report.png'
    PIL[0].save(filename, "PNG")

However, I’m sure there is something easy I am missing on serving the local file:

    report = html.Img(src='{}'.format(filename)),
    return html.Div([report])

because all I get is this little guy: [screenshot]

Alternatively and preferably (if possible), I would like to not have to worry about the existence of an intermediate file:

asset_report_png = PIL[0]
report = html.Img(src=asset_report_png),
return html.Div([report])

Output:

dash.exceptions.InvalidCallbackReturnValue: 
The callback for property `children` of component `transaction-table` returned a value which is not JSON serializable.
In general, Dash properties can only be dash components, strings, dictionaries, numbers, None, or lists of those.
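
One way to avoid the intermediate file, sketched under the assumption that PIL[0] is a Pillow image: save it into an in-memory buffer and give Dash a base64 data URI string, which is JSON serializable. The helper name pil_image_to_data_uri is hypothetical.

```python
import base64
import io


def pil_image_to_data_uri(image):
    """Encode a Pillow image as a PNG data URI without writing to disk.

    Assumes `image` exposes Pillow's Image.save(fp, format=...) method.
    """
    buffer = io.BytesIO()
    image.save(buffer, format="PNG")  # write the PNG bytes into memory
    encoded = base64.b64encode(buffer.getvalue()).decode("ascii")
    return "data:image/png;base64," + encoded
```

In the callback this would be used as html.Img(src=pil_image_to_data_uri(PIL[0])), written without a trailing comma so that report is a component and not a one-element tuple.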

If you have a PDF’s bytes, e.g. because you downloaded it with requests or generated it and saved it to Python’s BytesIO, you can simply convert them to base64 and pass the result to an iframe as a data URI with the application/pdf media type. For example:

import base64
import requests
from dash import html
r = requests.get(url)  # url points at the PDF to download
encoded_string = base64.b64encode(r.content).decode('ascii')
html.Iframe(id="embedded-pdf", src=f"data:application/pdf;base64,{encoded_string}", width="100%", height="650px")

I only tested on Chrome, but googling indicates this works on many contemporary desktop browsers.
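
Applied to the original question, where the API already returns bytes, the same idea might look like the sketch below. The helper name pdf_bytes_to_data_uri is hypothetical, and the sample bytes are a stand-in for asset_report_pdf, not a real PDF.

```python
import base64


def pdf_bytes_to_data_uri(pdf_bytes):
    """Build a data URI from raw PDF bytes, suitable for an iframe src."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return "data:application/pdf;base64," + encoded


# Stand-in for asset_report_pdf from the question; only the header bytes.
uri = pdf_bytes_to_data_uri(b"%PDF-")
print(uri)
```

The returned string would then be passed as src to html.Iframe, with no open() call and no file on disk.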

I built a new component to render PDFs in Dash apps: ploomber/dash-pdf on GitHub (“Display PDFs in your Dash apps”).