Large pixels in heatmap

Hi.

I’m trying to implement a go.Heatmap. I generate the data for the heatmap using a numpy mesh grid.

When I use a 118 x 118 mesh grid I get the following nice heatmap.

However, the pixels are very large and the heatmap looks ugly. I was aiming for something smooth like the following.

But when I try ANY larger grid size, even as low as 148 x 148, then the plotly function returns nothing. No plot is generated. The issue is not that the loading is slow, but that no image is generated whatsoever. I reviewed the other posts on heatmaps but I don’t think they apply because they have grid sizes far greater than mine.

Or could the issue be that my Z values are too small? My Z values are very, very small, i.e. on the order of 0.000000000001.

I am running this in Google Colab.

Thank you for your help.

Hi @joemac1985,

go.Heatmap has an attribute, zsmooth, which can be False, 'best', or 'fast'. The default is False. Try setting it to 'best'. If your heatmap is still pixelated in that case, please paste your heatmap definition for x, y, and z here, e.g.
x = np.linspace(a, b, n)

The small z values do not affect the heatmap's appearance, because plotly.js maps the interval [z.min(), z.max()] to [0, 1] before color-mapping.
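To illustrate that rescaling, here is a rough sketch in plain Python. It is not the plotly.js code, only the linear map described above; the function name rescale_to_unit and the sample values are made up for the example.

```python
def rescale_to_unit(z_values):
    """Linearly map [min(z), max(z)] onto [0, 1]."""
    z_min, z_max = min(z_values), max(z_values)
    span = z_max - z_min
    if span == 0:
        # All values equal: no spread to map, so return zeros
        return [0.0 for _ in z_values]
    return [(z - z_min) / span for z in z_values]


tiny = [1e-12, 2e-12, 4e-12]   # very small z values
large = [1.0, 2.0, 4.0]        # the same pattern at a much larger scale

print(rescale_to_unit(tiny))
print(rescale_to_unit(large))
```

Both lists rescale to essentially the same values in [0, 1], so the absolute magnitude of z should not change which colors are drawn.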

I believe that parameter solved the problem for pixelation. Here is the result.

However, if I increase the size of the mesh beyond a certain point, plotly still generates no plot.

Here is my heatmap definition

# Minimum and maximum values of x and y coordinates in mesh grid
x_min, x_max = X.iloc[:, attributes[0]].min() - 1, X.iloc[:, attributes[0]].max() + 1
y_min, y_max = X.iloc[:, attributes[1]].min() - 1, X.iloc[:, attributes[1]].max() + 1

# Range of values for x and y coordinates in mesh grid
x_range = np.arange(x_min, x_max, h)
y_range = np.arange(y_min, y_max, h)

# Set of x and y coordinates in mesh grid
x_coord, y_coord = np.meshgrid(x_range, y_range)
coords = np.c_[x_coord.ravel(), y_coord.ravel()]

# Same values as y_range; used below as the heatmap's y axis
y_ = np.arange(y_min, y_max, h)

Each coord in coords is then plugged into an algorithm which outputs predictions, and these become my Z values.

Z = np.array(predictions).reshape(x_coord.shape)

Then…

trace1 = go.Heatmap(x=x_coord[0],
                    y=y_,
                    z=Z,
                    colorscale='Jet',
                    showscale=False,
                    zsmooth='best')

@joemac1985

To find out why your heatmap is not displayed I need to understand your data. Could you please explain how your dataframe, X, is defined, and whether the list(?) predictions is a column in that dataframe or not? If it isn’t, how is its length related to len(X)? If possible, please create some synthetic data and paste it here so I can test your code. I also don't understand why you subtract/add 1 to define x_min, x_max.

My data is the famous “iris” dataset. The first four columns are floats describing attributes, and the last column holds the labels, “Species”.

X is a DataFrame of the first 4 columns.
Y is a Series of the last column “Species”.

I have a loop which feeds each row of X (i.e. the four attributes) into an algorithm. In each iteration, the algorithm takes one row of X and outputs a float “prediction” for each of the 3 possible values of Species.

Observe above (1) model.predict() takes a row of X and returns one of the 3 labels in “Species” column, which is the prediction, and (2) the prediction also returns probabilities for each of the 3 possible Species, whose total adds up to 1.0, but in practice 1 species has a probability of something like 0.97 and the other two are virtually zero. I thought perhaps such low values might be an issue.

I construct a mesh grid for the heatmap using numpy. The -1 and +1 used to define x_min/x_max are not essential for the heatmap. I only use them because, when I combine this heatmap with a second scatterplot trace, their widths/heights don’t seem to align well otherwise. But I can remove the -1 and +1 and nothing changes in this issue.

When I construct the mesh grid, I select two attributes from the above DataFrame, for example, SepalLengthCm and SepalWidthCm. The ranges of these two attributes form the x_min/x_max and y_min/y_max of the heatmap, respectively. The +1 and -1 (as described above) merely expand those ranges. Each “coordinate” in the mesh grid represents an X (dataframe) value which is plugged into the above model, which then outputs a probability prediction. The coordinates are then the x/y values in the heatmap, and the probability values (between 0 and 1.0) are the Z values in the heatmap.

The “h” value is the step size used to calculate x_range and y_range. A lower h value means more finely spaced coordinates to be passed to the model (i.e. 0.01, 0.02, 0.03, … versus 0.1, 0.2, 0.3, …), so a lower h value means a much denser heatmap with a much larger number of x/y/z values for plotly to plot.
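As a rough sketch of how the grid size depends on h: np.arange(start, stop, step) produces about ceil((stop - start) / step) values (floating-point rounding can shift this by one), so the number of mesh points grows roughly as 1/h². The helper names and the 0 to 10 ranges below are assumptions for illustration, not the iris ranges.

```python
import math


def axis_length(start, stop, step):
    """Approximate number of values in np.arange(start, stop, step)."""
    return max(0, math.ceil((stop - start) / step))


def grid_points(x_min, x_max, y_min, y_max, h):
    """Approximate number of (x, y) coordinates in the mesh grid for step h."""
    return axis_length(x_min, x_max, h) * axis_length(y_min, y_max, h)


# Halving h roughly quadruples the number of points the model must score
for h in (0.5, 0.25, 0.125):
    print(h, grid_points(0.0, 10.0, 0.0, 10.0, h))
```

Printing x_coord.shape for each h in the real code would confirm how many z values are actually being handed to plotly.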

When I use an h of 0.05 I get maybe 4000 coordinates to calculate for the heatmap, and plotly successfully generates a nice plot. But if I lower h from 0.05 to 0.02 I get maybe 5000 coordinates, and plotly does not raise an error but fails to display any plot in Google Colab.

All I can think is that maybe it's about memory?