px.imshow(img=np.array(), binary_string=True): how to extract the array from the figure in a callback?

Hi there,

I use px.imshow() with binary_string=True in my app, and in a callback I need to recover the array that was used to create the figure. I know I can do this with PIL, but I would like to use numpy only. What I don’t understand is why the array created with np.frombuffer() has a different length from the flattened original array.

I guess _content holds more than just the array, and PIL somehow understands the extra information?

What am I missing?

import base64
from PIL import Image
import io
import dash
from dash import html, dcc, Input, Output, State
import plotly.express as px
import numpy as np

np.random.seed(42)

DIMS = (20, 30)


def parse_binary_string(contents: str) -> list[str]:
    return contents.split(',')


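def str_to_b64(content_string: str) -> bytes:
    # decode the base64 payload into raw bytes (here: the bytes of a PNG file)
    return base64.b64decode(content_string)


def b64_to_array(content: bytes) -> np.ndarray:
    # let PIL decode the PNG, then convert the image into a float array
    img = Image.open(io.BytesIO(content))
    arr = np.array(img, dtype=float)
    return arr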


app = dash.Dash(__name__)
app.layout = html.Div([
    html.Button(id='btn', children='DO'),
    dcc.Graph(
        id='graph',
        figure=px.imshow(
            np.random.randint(0, 255, size=DIMS),
            binary_string=True
        )
    ),
    html.Pre(id='out')
])


@app.callback(
    Output('out', 'children'),
    State('graph', 'figure'),
    Input('btn', 'n_clicks'),
    prevent_initial_call=True
)
def ha(fig, _):
    # extract source of figure dictionary
    source = fig['data'][0]['source']

    # extract the content string
    _type, _content = parse_binary_string(source)

    # decode the base64 string into bytes
    byte_content = str_to_b64(_content)

    # using PIL
    img = b64_to_array(byte_content)

    # using numpy only
    # img = np.frombuffer(byte_content, dtype=np.uint8)
    # img = img.reshape(DIMS)
    # ^^ does not work: img has 688 elements instead of the expected 600 (20 x 30)

    return dcc.Graph(figure=px.imshow(img, color_continuous_scale='gray'))


if __name__ == '__main__':
    app.run(debug=True)

EDIT: What I was missing is that _content does not consist of the pixel values only: it is a complete PNG file, so it also contains the PNG signature and chunk metadata.
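
To see where the extra bytes come from, here is a minimal sketch that walks the chunks of a PNG byte string using only the standard library. A PNG is an 8-byte signature followed by chunks, each made of a 4-byte length, a 4-byte type, the data, and a 4-byte CRC. The names `list_png_chunks` and `make_chunk` and the tiny 3x2 demo image are my own, for illustration only; they are not part of plotly or Dash.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def list_png_chunks(png_bytes: bytes) -> list[tuple[str, int]]:
    """Return (chunk_type, data_length) for every chunk in a PNG byte string."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    chunks = []
    pos = 8  # skip the 8-byte signature
    while pos < len(png_bytes):
        # chunk layout: 4-byte length, 4-byte type, data, 4-byte CRC
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        chunks.append((ctype.decode("ascii"), length))
        pos += 12 + length
    return chunks


def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """Hypothetical helper that assembles one PNG chunk for the demo below."""
    crc = zlib.crc32(ctype + data)
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


# Demo: a hand-built 3x2 8-bit grayscale PNG (filter byte 0 in front of each row)
rows = bytes([0, 10, 20, 30, 0, 40, 50, 60])
demo_png = (PNG_SIGNATURE
            + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 3, 2, 8, 0, 0, 0, 0))
            + make_chunk(b"IDAT", zlib.compress(rows))
            + make_chunk(b"IEND", b""))

print(list_png_chunks(demo_png))
```

For the PNG in this post, the IDAT chunk appears to declare 631 bytes of data, which would account for the 688 bytes: 8 (signature) + 25 (IHDR) + 643 (IDAT) + 12 (IEND). The IDAT data is a zlib stream, so the pixel values still have to be inflated and unfiltered before they can be reshaped.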

I was able to extract the dimensions like so:

import base64
import struct

# Base64 string
base64_string = "data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAB4AAAAUCAAAAAC/wNIYAAACd0lEQVR4XgFsApP9AWZNqbJc3XVYUhNZBHSAjR3vBDDrE5/NVpSyiFw+/AAUoMs5FfzrWDDaOv+p27vPDr29rr0yazbzP/iC5DIAhhRIphGDWDsN8fkIWTSBU1tuu8ar/AeuIs1QozFnBP3ttYgk5qsyz83T0VJ2KDwmKvDR3xzGCdn9bWqt+wDsUW40F5nY+7t77CicDixAWEYIV4Drh9c+ivJQh6IAoiB6BOnm+SgbhshHC6EgL/aWPdckYqtn1doiwOJkAgytCPwb8/3XcuBSQcNtOfqFNnVnO4RFzCeFYR0CMgDm7I6qHCMMn0a68lUbQaksPbj0hRsbaytTHb1Kf/kB9mV9DtcjoPiio7v7WRwaZF8C0+61ZzxSt8LsN2w7BKkE/wL2kM2zQXbY8ilbgO6fDEe2fbZdgPKKIU2lAQGAEusEs3cuxibaV01zFO3DFm8m8i0v7mQQtXXwoNECSOk9xJuxKfQ2G61Po/hIGANgoRykLUa6UcdokXt7BPBd2VHPgc3PC/WUyhY29Kk1oiH8Fes5fXtb/BLgegGSnaCzEQSeQin33NcmV9wCwuyq5Y2O24o2xDXu7uQCLwbTTSkRrimjiRrKv8xU2iAzxa4QI9dpIhZE96icAJ0l5WwytQcaGuEUHWAbbr/gxPs8L5IDIr8wEKvbnQI/CI+ZMMbhCv173BBUQ/T8AK/DwKUN036DTwFtDkEEWQzOYkZq9vuXp6aUvTxPmB6d9OLoosvZ5F0qTSNJATFn/3Uvvo2y64lnLcfmov4+p2S4zQbg3rfol91mIQINJeWJ/kDP+JD9Ms4u/SzXFZVerZAKw0ftRxet26TffS3aEod0MgAAAABJRU5ErkJggg=="

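# Decode only the start of the base64 string:
# skip the 22-character prefix "data:image/png;base64," and take the next
# 32 characters (24 bytes), which cover the signature, width and height
decoded_bytes = base64.b64decode(base64_string[22:54])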

# Check if the first 4 bytes represent "PNG"
if decoded_bytes[:4] == b'\x89PNG':
    # Read width and height from positions 16-19 and 20-23 (using 'struct' module)
    width = struct.unpack('>I', decoded_bytes[16:20])[0]
    height = struct.unpack('>I', decoded_bytes[20:24])[0]
    print("Width:", width)
    print("Height:", height)
else:
    print("The decoded data is not in PNG format.")
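
The same header holds more than the dimensions. As a small sketch, the 13 bytes of IHDR data (bytes 16 to 28 of the file) can be unpacked in one go to get the bit depth and colour type as well, which tells you how the pixel data is laid out. The field names in `fields` are my own labels; the prefix is simply the first 44 base64 characters of the string above.

```python
import base64
import struct

# First 44 base64 characters of the PNG above = 33 bytes:
# 8-byte signature + 4-byte length + b"IHDR" + 13 bytes of header data + 4-byte CRC
prefix = "iVBORw0KGgoAAAANSUhEUgAAAB4AAAAUCAAAAAC/wNIY"
head = base64.b64decode(prefix)

# IHDR data: width and height (4 bytes each), then five single-byte fields
fields = ("width", "height", "bit_depth", "color_type",
          "compression", "filter_method", "interlace")
ihdr = dict(zip(fields, struct.unpack(">IIBBBBB", head[16:29])))
print(ihdr)
# {'width': 30, 'height': 20, 'bit_depth': 8, 'color_type': 0,
#  'compression': 0, 'filter_method': 0, 'interlace': 0}
```

Bit depth 8 with colour type 0 means one byte per pixel, grayscale, and interlace 0 means the rows are stored top to bottom in order.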

But I’m still searching for the pixel values.
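
One possible way to get at the pixel values without PIL, sketched under the assumption that the image is an 8-bit grayscale, non-interlaced PNG (which is what the header of the string above indicates): concatenate the IDAT chunks, inflate them with zlib, and undo the per-row PNG filter. Each inflated row starts with one filter-type byte followed by the pixel bytes. The names `decode_gray8_png`, `predict` and `make_chunk` are hypothetical, CRCs are not checked, and other colour types or bit depths are not handled.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def predict(ftype: int, a: int, b: int, c: int) -> int:
    """PNG filter predictor from the left (a), up (b) and upper-left (c) bytes."""
    if ftype == 0:      # None
        return 0
    if ftype == 1:      # Sub
        return a
    if ftype == 2:      # Up
        return b
    if ftype == 3:      # Average
        return (a + b) // 2
    if ftype == 4:      # Paeth
        p = a + b - c
        pa, pb, pc = abs(p - a), abs(p - b), abs(p - c)
        if pa <= pb and pa <= pc:
            return a
        return b if pb <= pc else c
    raise ValueError(f"unknown PNG filter type {ftype}")


def decode_gray8_png(png_bytes: bytes) -> tuple[int, int, bytes]:
    """Return (width, height, pixels) for a non-interlaced 8-bit grayscale PNG."""
    if png_bytes[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    header, idat, pos = None, b"", 8
    while pos < len(png_bytes):
        # chunk layout: 4-byte length, 4-byte type, data, 4-byte CRC (not checked)
        length, ctype = struct.unpack(">I4s", png_bytes[pos:pos + 8])
        data = png_bytes[pos + 8:pos + 8 + length]
        if ctype == b"IHDR":
            header = struct.unpack(">IIBBBBB", data)
        elif ctype == b"IDAT":
            idat += data  # the stream may be split over several IDAT chunks
        pos += 12 + length
    if header is None:
        raise ValueError("missing IHDR chunk")
    width, height, depth, color, _, _, interlace = header
    if (depth, color, interlace) != (8, 0, 0):
        raise ValueError("this sketch only handles 8-bit grayscale, non-interlaced")

    raw = zlib.decompress(idat)   # height rows of: 1 filter byte + width pixels
    stride = width + 1
    prev = bytearray(width)       # the row above the first row counts as zeros
    pixels = bytearray()
    for y in range(height):
        ftype = raw[y * stride]
        row = bytearray(raw[y * stride + 1:(y + 1) * stride])
        for x in range(width):
            a = row[x - 1] if x else 0    # left neighbour, already unfiltered
            c = prev[x - 1] if x else 0   # upper-left neighbour
            row[x] = (row[x] + predict(ftype, a, prev[x], c)) & 0xFF
        pixels += row
        prev = row
    return width, height, bytes(pixels)


def make_chunk(ctype: bytes, data: bytes) -> bytes:
    """Hypothetical helper that assembles one PNG chunk for the demo below."""
    crc = zlib.crc32(ctype + data)
    return struct.pack(">I", len(data)) + ctype + data + struct.pack(">I", crc)


# Demo: 3x2 image [[10, 20, 30], [40, 50, 60]], row 0 Sub-filtered, row 1 Up-filtered
filtered = bytes([1, 10, 10, 10, 2, 30, 30, 30])
demo_png = (PNG_SIGNATURE
            + make_chunk(b"IHDR", struct.pack(">IIBBBBB", 3, 2, 8, 0, 0, 0, 0))
            + make_chunk(b"IDAT", zlib.compress(filtered))
            + make_chunk(b"IEND", b""))

width, height, pixels = decode_gray8_png(demo_png)
print(width, height, list(pixels))  # 3 2 [10, 20, 30, 40, 50, 60]
```

In the callback this would be called on byte_content, and np.frombuffer(pixels, dtype=np.uint8).reshape(height, width) should then give a 2D array. Note that the result is whatever px.imshow encoded into the PNG; as far as I can tell the values may have been rescaled to the 0 to 255 range first, so they are not guaranteed to equal the original array.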