# Colorscale inaccurate

Hi,

I'm using plotly with Python, and am having some trouble with the `colorscale` property - I don't think it's mapping the values to the colours correctly:

The input is a list of 5000 values from a Normal distribution with mean 0 and variance 1.
The colorscale should be blue for values < -0.25, white for values between -0.25 and 0.25, and red for values greater than 0.25. Note, this is for the values.

Of course, with colorscale we have to map values onto the interval [0,1], so I've done this. This input

``````
print(
    value_to_percentile(fake_values, -0.25),
    value_to_percentile(fake_values, 0),
    value_to_percentile(fake_values, 0.25)
)
``````

produces this output: `0.394 0.4912 0.5884`, which looks pretty close to what we'd expect from a standard normal distribution.
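The thread never shows how `value_to_percentile` is defined. As a minimal sketch, assuming it returns the empirical fraction of samples at or below a given value (the function body and the seeded `fake_values` below are guesses for illustration, not the original poster's code), it might look like this:

```python
import random

def value_to_percentile(values, v):
    """Hypothetical helper: fraction of samples less than or equal to v."""
    return sum(1 for x in values if x <= v) / len(values)

# Assumed stand-in for the poster's data: 5000 draws from N(0, 1).
random.seed(0)
fake_values = [random.gauss(0, 1) for _ in range(5000)]

print(
    value_to_percentile(fake_values, -0.25),
    value_to_percentile(fake_values, 0),
    value_to_percentile(fake_values, 0.25),
)
```

A helper like this answers "what share of the samples lies below v", which is a rank-based position in the data rather than a position along the value axis.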

My colorscale is as follows:

``````
colorscale=[
    [0, 'rgb(0,0,255)'],
    [value_to_percentile(fake_values, -0.25), 'rgb(0,0,255)'],
    [value_to_percentile(fake_values, -0.25), 'rgb(255,255,255)'],
    [value_to_percentile(fake_values, 0), 'rgb(255,255,255)'],
    [value_to_percentile(fake_values, 0.25), 'rgb(255,255,255)'],
    [value_to_percentile(fake_values, 0.25), 'rgb(255,0,0)'],
    [1, 'rgb(255,0,0)']
]
``````

which I believe accurately represents what I outlined above for the colours I want.

However, my colorscale is coming out like this:

Where the thresholds are more like +/- 0.7 than +/- 0.25. Any idea what's going wrong?

Thanks

Hey @taimur,
Something went wrong in your approach, but I cannot figure out what's wrong, because I don't know your data or what type of trace you plotted.

I defined the colorscale as follows:

``````
import scipy.stats as st

# Step colorscale with breakpoints at the standard normal CDF of -0.25 and 0.25
colorscale = [[0, 'blue'],
              [st.norm.cdf(-0.25), 'blue'],
              [st.norm.cdf(-0.25), 'white'],
              [st.norm.cdf(0.25), 'white'],
              [st.norm.cdf(0.25), 'red'],
              [1, 'red']]
``````

and got this plot:

Here is the corresponding Jupyter notebook: https://plot.ly/~empet/14587

Hey @empet, thanks a lot for taking a look at this! I checked out your notebook and it seems to me like the problem is still there (but perhaps I'm misinterpreting the graph - let me know!) -

I agree with your colorscale - if I read it right, it should make point (x,y) white if x is in [-0.25,0.25]. However, from looking at the plot, it looks like white points are actually more like x in [-0.8, 0.3] - do you agree? And the colour bar itself also seems to show this.

I've also reproduced your notebook and done the same plot but with 10k points and it shows a similar thing - the points in white seem to go significantly outside of the x in [-0.25, 0.25] constraint. However,

`sorted(x)[int(round(st.norm.cdf(-0.25) * len(x)))]` outputs `-0.26114270136600848`, which is about right...

Am I looking at it wrong somehow?

Thanks, really appreciate your help!

The colorbar does not show the subintervals of values you expected, because the colorscale was defined by mapping the
range of x-values, `[min(x), max(x)]`, to [0,1] via the CDF of the standard normal distribution, which is a nonlinear function.
Plotly, however, first normalizes the x-values linearly, i.e. maps them to [0,1] by the linear
function `x --> (x-min(x))/(max(x)-min(x))`, and then maps these normalized values to the corresponding colors in the colorscale.
To be more precise, suppose that
min(x)=-3, max(x)=3. Then x=0.3 is mapped linearly to (0.3+3)/6=3.3/6=0.55 in [0,1].
Our colorscale being:

``````
[[0, 'blue'],
 [0.4012936743170763, 'blue'],
 [0.4012936743170763, 'white'],
 [0.5987063256829237, 'white'],
 [0.5987063256829237, 'red'],
 [1, 'red']]
``````

the color associated to the normalized value, 0.55, is white (not red), because 0.55 belongs to the interval [0.4012936743170763, 0.5987063256829237].
If Plotly's normalization function were the normal distribution CDF, the colorbar would illustrate the right values.
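Following that explanation, one way to get breakpoints at the intended data values is to compute the colorscale positions with the same linear min-max formula that Plotly applies. This is only a sketch: `linear_position` is a hypothetical helper name, the seeded sample stands in for the real data, and it assumes the trace uses its default color range (the data minimum and maximum, with no explicit `cmin`/`cmax`).

```python
import random

def linear_position(values, v):
    """Position of v in [0, 1] under min-max scaling, clipped to the range."""
    lo, hi = min(values), max(values)
    t = (v - lo) / (hi - lo)
    return min(1.0, max(0.0, t))

# Assumed stand-in for the plotted data: 5000 draws from N(0, 1).
random.seed(0)
x = [random.gauss(0, 1) for _ in range(5000)]

low = linear_position(x, -0.25)   # where -0.25 falls between min(x) and max(x)
high = linear_position(x, 0.25)   # where 0.25 falls between min(x) and max(x)

colorscale = [
    [0, 'blue'],
    [low, 'blue'],
    [low, 'white'],
    [high, 'white'],
    [high, 'red'],
    [1, 'red'],
]
```

With min(x)=-3 and max(x)=3, as in the example above, the breakpoints would be 2.75/6 ≈ 0.458 and 3.25/6 ≈ 0.542, rather than 0.401 and 0.599.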

Thanks for the reply @empet.

Still a bit puzzled by this - I agree that the normal CDF is non-linear, but it's still a strictly increasing function and so should preserve the ordering of the original values, which is what we're really concerned about. Likewise, the Plotly normalisation also preserves the ordering.

If the closest value to `-0.25` in the original list is `0.401` of the way through the list (by the normal CDF evaluated at -0.25), then in the Plotly-normalised list, the value that's `0.401` of the way through should still correspond to the original value that was close to `-0.25`. Do you agree?

I'm obviously looking at this wrong somehow because what you said in your previous post is definitely what's going on in the graph, but I'm struggling to get my head around why exactly my interpretation is wrong.

Thanks!
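The distinction at stake here is between "0.401 of the way through the sorted list" (a rank) and "0.401 of the way from min to max" (a position on the value axis). Both mappings preserve order, but they land on different values. A small sketch, using a seeded sample as an assumed stand-in for the data and two hypothetical helper names:

```python
import random

def value_at_rank(sorted_values, p):
    """Sample found a fraction p of the way through the sorted list."""
    return sorted_values[int(round(p * len(sorted_values)))]

def value_at_linear(sorted_values, p):
    """Value found a fraction p of the way from the minimum to the maximum."""
    lo, hi = sorted_values[0], sorted_values[-1]
    return lo + p * (hi - lo)

random.seed(0)
x = sorted(random.gauss(0, 1) for _ in range(10000))
p = 0.4013  # approximately the standard normal CDF at -0.25

print(value_at_rank(x, p), value_at_linear(x, p))
```

The first number should come out close to -0.25. The second depends only on the extremes of the sample: for min(x)=-3 and max(x)=3 it would be -3 + 0.4013*6 ≈ -0.59, and wider extremes push it further from zero. A colorscale stop at 0.4013 is interpreted in the second sense, which is consistent with breakpoints appearing well outside +/- 0.25.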

@taimur To understand why the distribution of colors is not symmetric when the colors are computed by Plotly, I illustrated in this notebook https://plot.ly/~empet/14596 how the colors should be assigned from your chosen colorscale, which Plotly doesn't do (because it works exclusively with linearly defined colorscales).
This is the plot with colors computed by a custom function:
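The notebook's code is not reproduced in this thread. As a rough sketch of what such a custom function could look like (the names `normal_cdf` and `color_for` are illustrative assumptions, not taken from the notebook), each value can be passed through the standard normal CDF and then looked up in the step colorscale:

```python
import math

def normal_cdf(v):
    """Standard normal CDF, written with math.erf to avoid extra dependencies."""
    return 0.5 * (1.0 + math.erf(v / math.sqrt(2.0)))

# Breakpoints of the step colorscale, in CDF units.
LOW, HIGH = normal_cdf(-0.25), normal_cdf(0.25)

def color_for(v):
    """Color for a data value under the blue/white/red step colorscale."""
    t = normal_cdf(v)
    if t < LOW:
        return 'blue'
    if t <= HIGH:
        return 'white'
    return 'red'

colors = [color_for(v) for v in (-1.0, -0.1, 0.0, 0.3)]
print(colors)
```

A list of color strings like this can be given directly as the marker color of a scatter trace, which bypasses the colorscale normalization altogether. Because the CDF is strictly increasing, this is equivalent to thresholding the raw values at -0.25 and 0.25.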