Plotly Express scatter with date as x-axis

When I try to render a scatter plot with dates on the x-axis using Plotly Express, the x values get rounded.

fig = px.scatter(df, x="time", y="value", title="Durations", color='metric')

gives me a plot that looks like this

All the values end up in “columns”. When I instead use

fig = go.Figure()
for metric, group in df.groupby("metric"):
    fig.add_trace(go.Scatter(x=group['time'], y=group['value'], mode='markers', name=metric))

I get the plot that I am aiming for, with correct x values

What am I doing wrong here or is it a bug?

Info of my data

#   Column         Non-Null Count  Dtype                                
---  ------         --------------  -----                                
 0   metric         1550 non-null   object                               
 1   time           1550 non-null   datetime64[ns, pytz.FixedOffset(120)]
 2   value          1550 non-null   float64                             

One more relevant detail: the hover data seems to be located at the correct x value, so I get hover info when hovering over empty space. In case it matters, I am using a Jupyter notebook, and the same issue occurs in both VS Code and JupyterLab.

Can you share your sample df? I think maybe fig.update_xaxes(type='category') may help.

Sorry, the x-axis is datetime. If I do fig.update_xaxes(type="category") it works, but then it is no longer a datetime axis, so it looks like this

Here is a sample from my data

Index Metric Time Value
19830 metrica 2023-04-06 15:45:43.488911+02:00 3.234
19572 metricb 2023-04-06 15:45:36.926151+02:00 0.346
4444 metricb 2023-04-06 15:38:32.102785+02:00 0.281
10500 metricc 2023-04-06 15:41:22.098974+02:00 1.432
45774 metricb 2023-04-06 15:58:27.443395+02:00 0.308
27192 metrica 2023-04-06 15:49:30.714646+02:00 2.101
9487 metrica 2023-04-06 15:40:54.835025+02:00 2.765
15035 metricb 2023-04-06 15:43:28.260221+02:00 0.643
44649 metricd 2023-04-06 15:58:00.867717+02:00 0.835
15461 metricd 2023-04-06 15:43:41.859967+02:00 3.702

with columns datatypes described above

Can you try fig.update_xaxes(type='-')? With your sample data I’m not sure whether it will work the way you expect.

I tried to get better test data for you and found something interesting

dfsmall = df.head(1000)
fig = px.scatter(dfsmall, x="time", y="value", title="Durations", color='metric')

gave the correct output.

I tried 2000 and it looks wrong. I stepped down and found the breakpoint:

dfsmall = df.head(1001)
fig = px.scatter(dfsmall, x="time", y="value", title="Durations", color='metric')

So anything above 1000 values seems to go wrong. Updating the x-axis does not affect this in any way.

Could you upload sample data somewhere for download? I want to try running it on my PC.

I can recreate it with:

import datetime
import random
import pandas as pd
import plotly.express as px
df = pd.DataFrame(columns=['metric', 'value', 'time'])
for i in range(1000):
    df = pd.concat([df, pd.DataFrame({'metric': random.choice(['a', 'b', 'c', 'd', 'e']), 'value': random.random()*20, 'time': datetime.datetime.now() + datetime.timedelta(seconds=random.randint(1, 300))}, index=[0])], ignore_index=True)

px.scatter(df, x="time", y="value", title="Durations", color='metric')

1000 values

1001 values

Quite strange, it worked for me with data over 1000. Which version of plotly are you using?

import plotly
plotly.__version__
'5.12.0'

Can you try updating to the newest version? I’m using 5.13 and it works.

Tried 5.14.1 and pandas 2.0.0.

Must be something wrong with my setup, I guess.

I created an issue on github. Thanks for the help, I appreciate it!

Hope you find a solution. I am using a notebook on a Windows laptop and it works fine.

I will try the same thing on Windows!

Hi there, I answered a topic with what seems to be a similar issue with plotly.express. Somehow plotly.express groups the data.

In addition to that, I recall a topic where plotly.express switches the way it plots the data internally once a certain number of data points is exceeded, but I can’t find it right now.

The same code worked on Windows, btw.

I can’t reproduce this on Linux either:

plotly: 5.13.1
pandas: 1.5.1

Just tried Linux myself and it works there as well. I also reinstalled the library on the Mac and changed my kernel to the newest Python version, but I still get the same issue.

It is not a really pressing issue, since I can do it with go figures, but it is slightly annoying because it takes more code to add margins etc. Also, a line plot gives the correct results, so it only seems tied to scatter. Or I could change computer =)

I just tried upgrading to the newest versions of plotly and pandas (5.14.1, 2.0.0) on linux and it works.

So it seems to be an issue on Mac. Did you add the information to your GitHub issue?

Sure did. I added a comment about it working on Windows. I should probably add that it only seems to affect scatter, since line works. Perhaps I should dig up an Intel Mac and see if it has to do with ARM.

Helpful community btw. I am not a Python developer and just tried to make some quick and dirty reports and try out pandas and plotly and I am pleasantly surprised.
