
Trendlines with px.scatter: how to explain and fix strange behaviours

Hi,

This is the first time I have used this great feature, and I am seeing some weird behaviour. Here are two examples with Plotly version 4.14.3:

Working case with generated data

Code

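import numpy as np
import plotly.express as px
import plotly.io as pio

# viz.blue and viz.pink are colours defined elsewhere (not shown)
s_1 = np.arange(1, 2, 0.01)
noise = np.random.normal(0, .1, s_1.shape)
s_2 = (s_1 * 3) + noise  # linear relation with slope 3, plus noise

fig = px.scatter(y=s_2, x=s_1, 
                 trendline='ols', 
                 color_discrete_sequence=[viz.blue],
                 trendline_color_override=viz.pink)
pio.show(fig)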

Output as expected
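
For anyone who wants to double-check what the trendline should look like without going through Plotly, here is a minimal sketch that fits the same kind of generated data with NumPy's least-squares polynomial fit. The fixed seed is my own assumption, added only so the numbers are reproducible; it is not part of the code above.

```python
import numpy as np

# Same synthetic data as in the working case; the seed is an assumption
# added for reproducibility.
rng = np.random.default_rng(0)
s_1 = np.arange(1, 2, 0.01)
noise = rng.normal(0, 0.1, s_1.shape)
s_2 = (s_1 * 3) + noise

# Degree-1 least-squares fit; np.polyfit returns [slope, intercept].
slope, intercept = np.polyfit(s_1, s_2, 1)
print(f"slope={slope:.3f}, intercept={intercept:.3f}")
```

The slope should come out close to 3 and the intercept close to 0. If the line drawn by px.scatter disagrees strongly with an independent fit like this one, the problem is more likely in the plotting session than in the data.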

Real data

Code (no change)

fig = px.scatter(data_frame=df, 
                 y='pH', x='fixed_acidity', 
                 trendline='ols', 
                 color_discrete_sequence=[viz.blue],
                 trendline_color_override=viz.pink)
pio.show(fig)

Output without the trendline

We can see from the data points that OLS should find a best-fit line without any issue.

Output with the weird trendline

Here is the issue:

Again, this is the first time I have used this feature, so I don’t know if there is a procedure to follow: the tutorial in the documentation doesn’t present any kind of refinement other than the “lowess” argument for non-linear relationships.

Here is a dropbox export of the numpy array if needed.

Thank you !

Replying to my own question: restarting the kernel fixed the issue. I am leaving the message up in case it happens to someone else.
Thanks!