Error when I add a trendline to a scatter plot in Plotly

Hi, I’m working with the Diamonds dataset and trying to add a trendline to a scatter plot.

This code works absolutely fine:
fig = px.scatter(df, x='carat', y='price', color='cut')

When I add the trendline argument, I get an error:
fig = px.scatter(df, x='carat', y='price', color='cut', trendline='ols')

Can anybody help me? I would be really thankful.

Hi @sobia

Welcome to the community!

It seems that the trendline option in Plotly requires the ‘statsmodels’ package to be installed.

Simply install the package into your current environment using pip install statsmodels, or conda install anaconda::statsmodels if you’re using Anaconda, and rerun your code.

The code is working absolutely fine. What is the reason for installing this library with pip install statsmodels?

When you use the ‘trendline’ argument in px.scatter(), Plotly uses this library in the background to compute the trend line, so it must be available in your environment for Plotly to call it.
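If you are not sure whether the package landed in the same environment your script runs in, a quick check like the one below may help. It is only a sketch using the standard library; the helper name is_installed is made up for this example and is not part of Plotly.

```python
import importlib.util
import sys


def is_installed(package_name: str) -> bool:
    """Return True if the current interpreter can locate the named top-level package."""
    # find_spec returns None when no importable module with that name is found
    return importlib.util.find_spec(package_name) is not None


# Show which Python is running, since pip or conda may have installed
# the package into a different environment than this one.
print("Interpreter:", sys.executable)
print("statsmodels available:", is_installed("statsmodels"))
```

If this prints False right after installing, the install most likely went into a different environment than the one running your code.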

This is the approach I use for trendlines or best fit:

  1. Calculate the y-intercept and slope using scipy.stats.linregress
  2. Get the minimum and maximum x-coordinate values from your data
  3. Calculate the y-values associated with the min and max x
  4. Do a 2-point scatter plot with mode = 'lines' to connect [xmin, (slope * xmin) + y-intercept] and [xmax, (slope * xmax) + y-intercept]

This approach is very simple, hope it helps you.
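The four steps above can be sketched roughly as follows. This is a minimal example, assuming SciPy is installed; the helper name trendline_trace and its name parameter are invented for illustration. It returns the trace as a plain dict so the fitting part does not depend on Plotly.

```python
from scipy import stats


def trendline_trace(x, y, name="trendline"):
    """Fit y = slope * x + intercept and describe a 2-point line trace as a dict."""
    x = list(x)
    y = list(y)

    # Step 1: slope and y-intercept from a least-squares fit
    fit = stats.linregress(x, y)

    # Step 2: smallest and largest x in the data
    x_min, x_max = min(x), max(x)

    # Step 3: y-values on the fitted line at those two x positions
    y_at_min = fit.slope * x_min + fit.intercept
    y_at_max = fit.slope * x_max + fit.intercept

    # Step 4: two points joined by a line (mode='lines')
    return {
        "type": "scatter",
        "mode": "lines",
        "x": [x_min, x_max],
        "y": [float(y_at_min), float(y_at_max)],
        "name": name,
    }


# Small made-up dataset that lies exactly on y = 2x + 1
trace = trendline_trace([1, 2, 3, 4], [3, 5, 7, 9])
print(trace["x"], trace["y"])
```

With the figure from the question, something like fig.add_trace(trendline_trace(df['carat'], df['price'])) should draw the line, since Plotly accepts trace dicts that carry a 'type' key. Note that this fits one line over all the data; for a separate line per cut you would call it once per group.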