My dataframe contains np.nan values, and because of this the trendlines turn out wacky.
For example, the red scatterplot shows the sentiment of a character who first appears later in the TV show, so the values in the dataframe are np.nan for the first few episodes. If you look at just the red trendline,
you can see that it doesn’t align with the red scatterplot. If I remove the np.nan values, the problem disappears, but that’s not what I want, since the sentiments need to line up with each episode on the x-axis.
How can I fix this? Here’s my code:
import plotly.express as px
fig = px.scatter(df, x='episode', y='sentiment', color='character', trendline="lowess")
fig.show()
Hi @marichevy, welcome to the forum! Thank you for the bug report. There is indeed a bug: the x and y values of the trendline are not properly aligned. I have opened a pull request to fix the issue (https://github.com/plotly/plotly.py/pull/2357), so this should be fixed in the next version of plotly.py. In the meantime, you can manually correct the position of the x points as follows:
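import plotly.express as px
import numpy as np
df = px.data.gapminder().query("continent == 'Oceania'").copy()
# Blank out the early years to reproduce the NaN problem
# (.loc avoids chained assignment; cast to float so the column can hold NaN)
df['pop'] = df['pop'].astype(float)
df.loc[df['year'] < 1970, 'pop'] = np.nan
fig = px.scatter(df, x='year', y='pop', color='country', trendline='lowess')
# Correct position of x points (traces alternate: scatter, trendline, scatter, ...)
for scatter, trendline in zip(fig.data[::2], fig.data[1::2]):
    x = np.asarray(scatter['x'])
    y = np.asarray(scatter['y'], dtype=float)
    # Keep only the x values whose y is not NaN, so they line up with the trendline's y
    trendline['x'] = x[~np.isnan(y)]
fig.show()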
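To see what the correction in the loop does, here is a minimal dependency-free sketch of the same masking idea on toy data. The function name `align_trendline_x` and the sample values are made up for illustration and are not part of plotly; the sketch only assumes that the trendline's y values correspond to the non-NaN points, as described above.

```python
import math

def align_trendline_x(x_values, y_values):
    """Return the x values whose matching y value is not NaN (hypothetical helper)."""
    return [
        x for x, y in zip(x_values, y_values)
        if not (isinstance(y, float) and math.isnan(y))
    ]

# Toy data: the character only appears from episode 3 onwards
episodes = [1, 2, 3, 4, 5]
sentiment = [float("nan"), float("nan"), 0.2, 0.5, 0.4]

trend_x = align_trendline_x(episodes, sentiment)
print(trend_x)  # [3, 4, 5]
```

The trendline then starts at episode 3 instead of episode 1, which is the same effect as the boolean mask applied to `scatter['x']` in the workaround.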