Unwanted interpolation and points ignored in px.scatter

Hi all,

I’ve spent almost a week trying to understand why plotly is doing this with my data. Now I’m giving up and asking for your kind help.

I have two frame columns that I want to graph using a scatter plot. Both of them correspond to measurements at various time instants.

Above you can see the resulting scatter plot as well as some values of the X column. My problem is easy to see there: the highest point in the plot corresponds to the 1.61E+12 marked in red on the right, while the preceding one is the 8.06E+11, also marked in red on the right. It means that plotly is ignoring the points in between (positions 43 to 46) and making a kind of interpolation. Besides, it cuts the trace because the subsequent values in the X column are lower than 1.61E+12. This “interpolation” issue happens along the whole trace every time that X(t)>X(t+1).

I already checked the resulting fig object and all the data is passed correctly. I only want to see my measurement results over time, even if there are oscillations, which is why sorting my columns is not an option.

Basically, the code I’m using for the scatter plot is:

fig = px.scatter(x=frame['a'], y=frame['b'])
fig.update_traces(mode='lines+markers', connectgaps=False, showlegend=True)

Then I update the layout to get a log scale on the axis as well as scientific exponent notation.

Thank you in advance for your answer and for your help.

Hi @dsalfaror,

Plotly is not ignoring the points between your marked positions 1 and 2 (index 47 and 42); the point with index=45 is present where it should be on the plot. The other values are to the lower left, outside the visible frame. Are you zoomed in?
If you provide the actual data, I could try it on my own.

Alex-

Hi @Alexboiboi,

Thank you so much for your answer.

You are right, the values are there. I applied a sort to my X array and all the (x,y) pairs are on the line. But that is not what I want. I’m not interested in px.scatter applying a sort to my data. I’d prefer something like this:

Even if it is only aesthetic.

Do you know which option I need to modify to do so? I tried this: https://stackoverflow.com/questions/47489554/plotly-deactivate-x-axis-sorting

But it is not what I’m interested in.

I also don’t understand why the figure object presents the arrays in the “right” order.

Once again, thank you.

Diego

Is frame a pandas DataFrame object?
If yes, the x and y arguments are supposed to be column names for px.scatter. And by the way, the px.scatter function does not sort your data.

Try this:

fig = px.scatter(frame, x='a', y='b')

Also, it looks like the a and b columns have a perfect linear correlation. I don’t know why you would expect such randomness.
Like I said, please share a short sample of your data with both columns.
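
To illustrate the correlation point: when b is an exact linear function of a, every (a, b) pair lies on the same straight line, so a trace that doubles back in x simply retraces that line instead of looking scattered. A rough check in plain Python, using made-up numbers and an assumed relation b = 5e-9 * a:

```python
import math

# Made-up measurements in time order; x goes up and down.
a = [2.0e11, 8.06e11, 3.0e11, 1.61e12, 9.0e11]
b = [5e-9 * v for v in a]  # assumed exact linear relation

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

r = pearson(a, b)

# Indices t where x(t) > x(t+1), i.e. where the drawn line turns back.
turn_backs = [t for t in range(len(a) - 1) if a[t] > a[t + 1]]
print(r, turn_backs)
```

A coefficient very close to 1 together with a non-empty turn_backs list would match what the plot shows: a single straight line that is traversed back and forth.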

Yes, you are right, they are perfectly linearly correlated, and that gives me a new idea for my analysis. Thank you for your help, Alex, and sorry for the inconvenience.

Diego