
Scattergl not working with 700,000 sensor readings when x = datetime

I have sensor readings that I plot with the reading on y and the datetime on x. When I use arange(len(readings)) for x, Scattergl works. When I use the datetime, I get a runtime disconnect. I am running in Google Colab.

import numpy as np
import plotly.graph_objects as go

x_axis = np.arange(df['Pa'].count())

# Create figure
fig = go.Figure()

fig.add_trace(
    go.Scattergl(
        x = x_axis,
        # x = df['Pa'].index,
        y = df['Pa']
     )
)
fig.update_layout(title_text='title',
                      xaxis_rangeslider_visible=True)
fig.show()

The above works when x runs from 0 to the number of readings minus 1. It does not work when x is the datetime index.
Note: Readings are one second apart:

--------------------------
Start index: 2020-01-12 13:56:10.411604643-08:00
--------------------------
End index: 2020-01-21 03:00:43.241627693-08:00
--------------------------

It is not clear to me why plotting succeeds with integer increments on x but fails with datetimes. Any advice appreciated.

Thank you.

Hi @happyday

For me it works with even more data (len(df) = 777600):

import pandas as pd
import numpy as np
from datetime import datetime
import plotly.graph_objects as go

dt = []
for k in range(1, 10):
    dt += [datetime(2020, 1, k, h, m, s) for h in range(0, 24) for m in range(0,60) for s in range(0,60)]

d ={'date': dt,
   'y': np.random.randint(2,15, len(dt))}

df = pd.DataFrame(d, index=d['date'])

fig = go.Figure()

fig.add_trace(
    go.Scattergl(
        x = df.index,
        y = df['y']
     )
)
fig.update_layout(title_text='title',
                      xaxis_rangeslider_visible=True)
fig.show("browser")
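As a side note, the same kind of test data can probably be built more quickly with pandas' date_range instead of the nested list comprehension. This is only a sketch: n_points and rng are names introduced here for illustration, and the separate 'date' column from the example above is left out.

```python
import numpy as np
import pandas as pd

# Nine full days at one-second spacing: 9 * 24 * 60 * 60 = 777600 timestamps.
n_points = 9 * 24 * 60 * 60
index = pd.date_range(start="2020-01-01", periods=n_points, freq="s")

# Random integer readings in [2, 15), as in the example above.
rng = np.random.default_rng(0)
df = pd.DataFrame({"y": rng.integers(2, 15, size=n_points)}, index=index)
```

The resulting df.index is timezone-naive with whole-second resolution, so it can be passed as x to go.Scattergl in the same way.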


Hi @empet,
THANK YOU. Your example helped me resolve my challenge. It turns out I needed to remove the timezone. Here is an example of my data:

2020-01-12 13:56:10.411604643-08:00  1264.03520  

This does not work. I then applied the following to the DatetimeIndex:

df.index = df.index.round('s')  
df.index = df.index.tz_localize(None)  

Now I have:

DatetimeIndex(['2020-01-12 13:56:10', '2020-01-12 13:56:11',....  

i.e.: Same format as yours.

This leads me to conclude that Scattergl does not work when the DatetimeIndex is timezone-aware (and perhaps when the time resolution is finer than a second).
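For anyone who wants to try the workaround without the original data, here is a minimal sketch on synthetic readings. The start timestamp is taken from this thread; the five-row frame and the 'Pa' values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the sensor data: timezone-aware, sub-second timestamps.
start = pd.Timestamp("2020-01-12 13:56:10.411604643", tz="America/Los_Angeles")
index = pd.date_range(start=start, periods=5, freq="s")
df = pd.DataFrame({"Pa": np.linspace(1264.0, 1264.4, 5)}, index=index)

# Round to whole seconds, then drop the timezone while keeping the local wall time.
df.index = df.index.round("s").tz_localize(None)
```

After these two steps the index is timezone-naive and starts at 2020-01-12 13:56:10, matching the format shown above. Note that tz_localize(None) keeps the local wall-clock time rather than converting to UTC.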

I very much appreciate your help. The example was SUPER HELPFUL.

Plotly.js can handle fractional seconds and some timezone info (-08:00, also -0800 and Z), though currently it just ignores the timezone part: https://codepen.io/alexcjohnson/pen/abzPgJK?editors=0010
So these are all valid once they make it to JavaScript:

'2019-01-01 12:34:56.123456789-08:00',
'2019-01-02 12:34:56.123456789+0400',
'2019-01-03 12:34:56.123456789Z',
'2019-01-04 12:34:56+03:00',
'2019-01-05 12:34+1100',
'2019-01-06 12:34Z'

The runtime disconnect makes me think there’s another issue at play, and these values aren’t even making it to the graph. I’m not sure what the problem is though, AFAICT our JSON serializer has no trouble with timezones:

>>> import json
>>> from datetime import datetime
>>> import pytz
>>> from plotly import utils
>>> d1=datetime(2019,1,2,3,4,5)
>>> json.dumps(d1, cls=utils.PlotlyJSONEncoder)
'"2019-01-02T03:04:05"'
>>> timezone = pytz.timezone("America/Los_Angeles")
>>> d2=timezone.localize(d1)
>>> json.dumps(d2, cls=utils.PlotlyJSONEncoder)
'"2019-01-02T03:04:05-08:00"'

@happyday if you’d like to help us debug, I’d be curious to know what happens if you call json.dumps(your_date, cls=utils.PlotlyJSONEncoder)

cell in colab:

import json
from plotly import utils
display(df.index[0])
json.dumps(df.index[0], cls=utils.PlotlyJSONEncoder)

output:

Timestamp('2020-01-12 13:56:10.411604643-0800', tz='US/Pacific')
'"2020-01-12T13:56:10.411604643-08:00"'

Huh ok, thanks for checking. So the serializer works, and the resulting string works on the javascript side… I’m puzzled where the problem could be.

I found that I can plot when I don’t have “big data” sized readings. To reproduce the problem, try using the index format I used with 800,000 samples on Colab.
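A possible way to generate data for that reproduction is sketched below. It assumes one-second spacing and reuses the first timestamp from this thread; the readings themselves are random placeholders, and n_points, aware_index and naive_index are names introduced here.

```python
import numpy as np
import pandas as pd

# 800,000 timezone-aware timestamps with nanosecond detail, one second apart.
n_points = 800_000
start = pd.Timestamp("2020-01-12 13:56:10.411604643", tz="America/Los_Angeles")
aware_index = pd.date_range(start=start, periods=n_points, freq="s")

# Placeholder readings; the real values came from a pressure sensor.
rng = np.random.default_rng(0)
df = pd.DataFrame({"Pa": rng.normal(1264.0, 1.0, size=n_points)}, index=aware_index)

# The same index after the workaround, for a side-by-side comparison.
naive_index = aware_index.round("s").tz_localize(None)

# Each timestamp is much longer as text in the timezone-aware form.
aware_text = aware_index[0].isoformat()
naive_text = naive_index[0].isoformat()
```

Passing df.index as x to go.Scattergl, as in the first post, should exercise the failing case, and passing naive_index should exercise the working one. Whether the longer per-point strings play any part in the disconnect is not established in this thread.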