
Datetime as color variable for scatter

Does anyone have a nice example of a scatter plot colored by a continuous DateTime variable? (similar to https://plot.ly/python/line-and-scatter/#scatter-with-a-color-dimension but with a date for the color).

It seems like my best option right now is to convert it to a timestamp and then maybe manually adjust the colorbar in the legend, which is a bit tricky and laborious.

It doesn’t look like Plotly Express handles this either: it turns the date into a categorical variable.

Thanks.

Hi @chubukov,

Welcome to Plotly forum!

Here is an example of how datetime colormapping can be performed:

Define a dataframe with 'date' as one of the columns and, if you are not sure it is ordered, sort it by date.
If the datetimes are uniformly spaced (by 1 day or another time unit), map each date to an int in the range [0, len(df)-1]:

df['date'].map(pd.Series(data=np.arange(len(df)), index=df['date'].values).to_dict())

Otherwise you have to do some preliminary computations to map all datetimes to floats in an interval [a, b]. In that case the computed values replace np.arange(len(df)) in the line above.
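For the non-uniform case, one possible way to get floats in an interval [a, b] is the elapsed time since the earliest datetime. This is only a sketch: the dates below are made up for illustration, and hours are an assumed unit (any unit works, since only the relative spacing matters for the colorscale).

```python
import pandas as pd

# Hypothetical, irregularly spaced datetimes (illustration only)
dates = pd.Series(pd.to_datetime([
    "2020-01-15 08:00", "2020-01-15 09:30",
    "2020-01-17 16:00", "2020-01-20 12:00",
]))

# Elapsed hours since the earliest datetime: floats in [0, b]
elapsed_hours = (dates - dates.min()).dt.total_seconds() / 3600.0
print(elapsed_hours.tolist())
```

These floats would then be the values that get colormapped, in place of np.arange(len(df)).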

First show the figure without setting the colorbar tickvals and ticktext, and
save in tickvals the values displayed as tick labels on the colorbar
(in this example, 0, 20, … 140). Note that plotly.js treats tickvals as positions on the colormapped value scale, while ticktext holds the dates displayed at those positions on the colorbar.

Then these lines of code:

tickvals = [20*k for k in range(8)]
dlist = list(date_to_val)
index_tickvals = [dlist.index(tv) for tv in tickvals]

play the role of the inverse map: for each value in tickvals on the colorbar,
they compute the index in df of the corresponding date.
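Note that list.index only finds exact matches, so it raises ValueError for a tick value that is not among the mapped values, which is likely when the values are floats. A rough sketch of one workaround, assuming made-up dates and an elapsed-hours mapping (both hypothetical, not part of the example here): snap each desired tick to the nearest data value and label it with that value's date.

```python
import numpy as np
import pandas as pd

# Hypothetical, irregularly spaced datetimes (illustration only)
dates = pd.Series(pd.to_datetime([
    "2020-01-15 08:00", "2020-01-15 09:30",
    "2020-01-17 16:00", "2020-01-20 12:00",
]))
# Assumed mapping: elapsed hours since the earliest datetime
vals = ((dates - dates.min()).dt.total_seconds() / 3600.0).to_numpy()

# Tick positions we would like; they need not occur in vals
desired_ticks = [0.0, 60.0, 120.0]

# Index of the data value nearest to each desired tick
index_tickvals = [int(np.abs(vals - tv).argmin()) for tv in desired_ticks]

# Snap the ticks to real data values so each label is the exact date of its tick
tickvals = [float(vals[i]) for i in index_tickvals]
ticktext = [dates.iloc[i].strftime("%m-%d  %H:%M") for i in index_tickvals]
```

The trade-off is that the ticks are no longer evenly spaced; they sit wherever the nearest data points are.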

Here is the complete code:

import pandas as pd
import numpy as np
import plotly.graph_objects as go
from datetime import datetime

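# 17 days x 9 hourly readings = 153 datetimes, already sorted
df = pd.DataFrame({'date': [datetime(2020, 1, k, j) for k in range(15, 32) for j in range(8, 17)],
                   'x': 10 + 3*np.random.random(153),
                   'y': 21 + 3*np.random.rand(153)})

# Map each datetime to its position 0, 1, ..., len(df)-1; these ints are the values that get colormapped
date_to_val = df['date'].map(pd.Series(data=np.arange(len(df)), index=df['date'].values).to_dict())

# Colorbar tick positions, and the index in df of the date matching each one
tickvals = [20*k for k in range(8)]
dlist = list(date_to_val)
index_tickvals = [dlist.index(tv) for tv in tickvals]

# Dates shown as the colorbar tick labels
ticktext = [df['date'].iloc[i].strftime("%m-%d  %H:%M") for i in index_tickvals]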


# Formatted dates, passed as customdata so they can be shown on hover
customdata = [each_date.strftime("%m-%d  %H:%M") for each_date in df['date']]
fig = go.Figure(go.Scatter(x=df['x'], y=df['y'], mode='markers',
                           marker_color=date_to_val,
                           marker_colorscale='Plasma',
                           marker_showscale=True,
                           marker_size=8,
                           marker_colorbar=dict(tickvals=tickvals,
                                                ticktext=ticktext,
                                                title_text='2020'),
                           customdata=customdata,
                           hovertemplate="%{customdata}<br>x: %{x}<br>y: %{y}"))
fig.update_layout(width=700, title_text='Datetime colormapping', title_x=0.5)
fig.show()

Note that in this example the inverse map was not strictly needed: each datetime is associated with an int in np.arange(len(df)), so the index of a datetime in df coincides with its associated value.

Thank you @empet. This is more or less what I imagined would be necessary. I think the solution could be generalized a little bit by converting to a timestamp (with just .astype(int)) and then generating the ticks with linspace or similar.
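A rough sketch of that idea, for anyone who lands here. It is a variant rather than a literal .astype(int): it uses seconds since the Unix epoch, which is an assumption made so the result does not depend on the resolution pandas stores the datetimes in. The number of ticks (5) is also arbitrary.

```python
from datetime import datetime

import numpy as np
import pandas as pd

# Same dates as in the example above: 17 days x 9 hourly readings
dates = pd.Series([datetime(2020, 1, k, j) for k in range(15, 32) for j in range(8, 17)])

# Seconds since the Unix epoch, as floats: these are the values to colormap
color_vals = (dates - pd.Timestamp("1970-01-01")).dt.total_seconds()

# Evenly spaced tick positions across the full time range
tickvals = np.linspace(color_vals.min(), color_vals.max(), 5)

# Convert the tick positions back to datetimes to build the labels
ticktext = [t.strftime("%m-%d  %H:%M") for t in pd.to_datetime(tickvals, unit="s")]
```

color_vals would then go to marker_color, and tickvals and ticktext to marker_colorbar, as in the complete code above. No sorting or inverse lookup is needed, because the ticks are converted back to dates directly.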

Would be awesome if just passing the date variable worked out of the box.