Olov
April 6, 2023, 3:54pm
1
When I try to render a scatter plot with dates on the x-axis, the x values get rounded when using Plotly Express.
fig = px.scatter(df, x="time", y="value", title="Durations", color='metric')
gives me a plot that looks like this
all values come in “columns”. When I instead use
fig = go.Figure()
for metric, group in df.groupby("metric"):
    fig.add_trace(go.Scatter(x=group['time'], y=group['value'], mode='markers', name=metric))
I get the plot that I am aiming for, with correct x values
What am I doing wrong here or is it a bug?
Info of my data
 #   Column  Non-Null Count  Dtype
---  ------  --------------  -----
 0   metric  1550 non-null   object
 1   time    1550 non-null   datetime64[ns, pytz.FixedOffset(120)]
 2   value   1550 non-null   float64
More relevant info: the hover data seems to be located at the correct x value, so I get info while hovering over empty space. In case it matters, I am using a Jupyter notebook, and the same issue occurs in both VS Code and JupyterLab.
Can you share your sample df? I think maybe fig.update_xaxes(type='category') may help.
Olov
April 7, 2023, 5:38am
3
Sorry. The x-axis is datetime, and if I do fig.update_xaxes(type="category") it works, but then it is no longer a datetime axis, so it looks like this
Here is a sample from my data
Index  Metric   Time                              Value
19830  metrica  2023-04-06 15:45:43.488911+02:00  3.234
19572  metricb  2023-04-06 15:45:36.926151+02:00  0.346
4444   metricb  2023-04-06 15:38:32.102785+02:00  0.281
10500  metricc  2023-04-06 15:41:22.098974+02:00  1.432
45774  metricb  2023-04-06 15:58:27.443395+02:00  0.308
27192  metrica  2023-04-06 15:49:30.714646+02:00  2.101
9487   metrica  2023-04-06 15:40:54.835025+02:00  2.765
15035  metricb  2023-04-06 15:43:28.260221+02:00  0.643
44649  metricd  2023-04-06 15:58:00.867717+02:00  0.835
15461  metricd  2023-04-06 15:43:41.859967+02:00  3.702
with the column data types described above
Can you try fig.update_xaxes(type='-')? With your sample data I’m not sure whether it will work as you expect.
Olov
April 7, 2023, 6:21am
5
I tried to get better test data for you and found something interesting
dfsmall = df.head(1000)
fig = px.scatter(dfsmall, x="time", y="value", title="Durations", color='metric')
gave the correct output
I tried 2000 and it looks wrong. I stepped down and found the breakpoint:
dfsmall = df.head(1001)
fig = px.scatter(dfsmall, x="time", y="value", title="Durations", color='metric')
So anything above 1000 values gives the wrong plot. Updating the x-axis does not affect this in any way.
Could you upload the sample data somewhere for download? I’d like to try running it on my PC.
Olov
April 7, 2023, 6:30am
7
I can reproduce it with:
import datetime
import random
import pandas as pd
import plotly.express as px
df = pd.DataFrame(columns=['metric', 'value', 'time'])
for i in range(1000):  # 1000 rows render correctly; 1001 rows do not
    df = pd.concat([df, pd.DataFrame({'metric': random.choice(['a', 'b', 'c', 'd', 'e']), 'value': random.random()*20, 'time': datetime.datetime.now() + datetime.timedelta(seconds=random.randint(1, 300))}, index=[0])], ignore_index=True)
px.scatter(df, x="time", y="value", title="Durations", color='metric')
1000 values
1001 values
Quite strange, it worked for me with data over 1000. Which version of plotly are you using?
Olov
April 7, 2023, 6:38am
9
import plotly
plotly.__version__
'5.12.0'
Can you try updating to the newest version? I’m using 5.13 and it worked.
Olov
April 7, 2023, 6:43am
11
Tried plotly 5.14.1 and pandas 2.0.0, same result.
There must be something wrong with my setup, I guess.
Olov
April 7, 2023, 7:02am
12
I created an issue on GitHub. Thanks for the help, I appreciate it!
I hope you find a solution. I am using a notebook on a Windows laptop and it works fine.
Olov
April 7, 2023, 7:11am
14
I will try the same thing on Windows!
AIMPED
April 7, 2023, 9:56am
15
Hi there, I answered a topic with what seems to be a similar issue with plotly.express. Somehow plotly.express groups the data.
I am working on creating a visualization where each data point in the bar chart is a duration in seconds (x-axis) and the color represents an event type such as create, update, delete etc. I need all the events to be in a single bar, and the events should appear in the chart in the order they occur in the data, with no grouping or sorting. Currently this seems to be grouping the events by type. I’m not sure if there is a better way to do this or not. I am fairly new to plotly’s library.
I’ve…
In addition, I recall a topic about plotly.express internally switching the way it plots the data once a certain number of data points is exceeded, but I can’t find it right now.
Olov
April 7, 2023, 10:15am
16
The same code worked on Windows, btw.
AIMPED
April 7, 2023, 10:17am
17
I can’t reproduce this on Linux either,
plotly: 5.13.1
pandas: 1.5.1
Olov
April 7, 2023, 10:27am
18
I just tried Linux myself and it works there as well. I also reinstalled the library on Mac and switched my kernel to the newest Python version, but the issue remains.
It is not a pressing issue since I can do it with go figures, but it is slightly annoying since it takes more code to add margins etc. Also, a line plot gives the correct result, so it seems tied to scatter only. Or I could change computer =)
AIMPED
April 7, 2023, 10:29am
19
I just tried upgrading to the newest versions of plotly and pandas (5.14.1, 2.0.0) on Linux and it works.
So it seems to be an issue on Mac. Did you add this information to your GitHub issue?
Olov
April 7, 2023, 10:30am
20
Sure did. I added a comment about it working on Windows. I should probably add that it only seems to affect scatter, since line works. Perhaps I should dig up an Intel Mac and see if it has to do with ARM.
Helpful community, btw. I am not a Python developer; I just wanted to make some quick and dirty reports and try out pandas and plotly, and I am pleasantly surprised.