Adding event annotations to time series plots as vertical bars

Hello,

I have a time series plot which plots multiple lines over time - example code:

fig = px.line(df_viz.query("publisher=='NYT'"), x="publication_date", y="proportion", color="query", title="Publisher: NYT")

Every aspect of this plot except the x value (date) gets manipulated via ipywidgets.

I would like to add some annotations for specific events. So I have a dataframe with 3 columns: date, description, and type. I would like to have a line spanning the full height of the plot for each event at the correct date, color-coded by the event type, and with the description displaying as hovertext/tooltip.

I have tried a few approaches. Drawing a rectangle for each event is extremely verbose and tedious, and it will not scale if I want to add many new events. Besides, shapes do not have hovertext. I also tried to add the events as a bar chart using

fig_bar = px.bar(events, x="date", color="type", hover_data=["date", "event"])
fig.add_trace(fig_bar.data[0])

However, I cannot figure out how to make the y axis be the maximum y value (so each bar extends to the full height of the chart) and how to increase the width of each bar (they are essentially invisible in the full view of the chart, as each line represents one day, and the timeline is multiple decades long).

In case it is useful, I tried the shapes approach with two test events (using code from the documentation):

fig.update_layout(
    shapes=[
        
        go.layout.Shape(
            type="rect",
            # x-reference is assigned to the x-values
            xref="x",
            # y-reference is assigned to the plot paper [0,1]
            yref="paper",
            x0="2007-03-01",
            y0=0,
            x1="2007-03-30",
            y1=1,
            fillcolor="LightSalmon",
            opacity=0.5,
            layer="below",
            line_width=0,
        ),
        
        go.layout.Shape(
            type="rect",
            xref="x",
            yref="paper",
            x0="2017-04-01",
            y0=0,
            x1="2017-04-30",
            y1=1,
            fillcolor="LightSalmon",
            opacity=0.5,
            layer="below",
            line_width=0,
        )
    ]
)

I want to be able to visually see how my lines change before/after these events, and I want to be able to see the full description of the events on hover (so the text doesn’t crowd the plot). How can I solve this? Thanks in advance.

Hi @jenna, welcome to the forum! Several approaches are possible here, as you already discovered.

If you prefer to have a bar chart, you can make very long bars and restrict the yaxis range (https://plot.ly/python/axes/#setting-the-range-of-axes-manually) so that it’s clipped to the range of your lines (if you know it). The width of bars can be set as described in https://plot.ly/python/bar-charts/#customizing-individual-bar-widths. If you create the trace with px.bar, you can do

fig_bar = px.bar(events, x="date", color="type", hover_data=["date", "event"])
# On a date axis the bar width is given in milliseconds; this is 20 days.
fig_bar.update_traces(width=20 * 24 * 60 * 60 * 1000)

As for shapes, the nice thing is that you can make them extend over the whole height of the plot, as you already did with yref="paper" (nicely done!). You can write a for loop to add all your shapes without too much code. As for the hover, you could add an invisible trace (scatter or bar, with transparent markers) at the same locations. But it’s probably more work than the first solution.
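
The loop-over-events idea could be sketched roughly as follows. This is only an illustration under assumptions: the event records, the `TYPE_COLORS` mapping, and the helper names `event_shapes` and `hover_points` are made up for the example, and the events are given as plain dicts rather than a dataframe.

```python
# Hypothetical event records; in practice these would come from the
# events dataframe (e.g. events.to_dict("records")).
events = [
    {"date": "2007-03-15", "description": "First test event", "type": "policy"},
    {"date": "2017-04-15", "description": "Second test event", "type": "election"},
]

# Assumed colour per event type (not from the original post).
TYPE_COLORS = {"policy": "LightSalmon", "election": "LightSkyBlue"}


def event_shapes(events, colors):
    """Build one full-height vertical line shape per event."""
    return [
        dict(
            type="line",
            xref="x",        # x position in data (date) coordinates
            yref="paper",    # y in plot-paper coordinates: 0 = bottom, 1 = top
            x0=ev["date"],
            x1=ev["date"],
            y0=0,
            y1=1,
            line=dict(color=colors[ev["type"]], width=2),
            layer="below",
        )
        for ev in events
    ]


def hover_points(events, y_value):
    """Arguments for a scatter trace with transparent markers carrying the hover text."""
    return dict(
        x=[ev["date"] for ev in events],
        y=[y_value] * len(events),
        mode="markers",
        marker=dict(opacity=0),   # transparent markers, as suggested above
        customdata=[ev["description"] for ev in events],
        hovertemplate="%{x}<br>%{customdata}<extra></extra>",
        showlegend=False,
    )


shapes = event_shapes(events, TYPE_COLORS)
hover_kwargs = hover_points(events, y_value=0.5)
```

The results would then be passed on with something like fig.update_layout(shapes=shapes) and fig.add_trace(go.Scatter(**hover_kwargs)). The y_value of 0.5 is a placeholder; it should be a value inside the range of your lines so that the hover points sit within the visible area.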

Hi @jenna,

To avoid defining shapes you can use a trick: plot a vertical bar at each date/event, with the same height as the plot window.
The main trace(s) are referenced to xaxis and yaxis, while the vertical bars are referenced to xaxis and yaxis2 (located on the right side).
yaxis2 is then set to a fixed range [0, 1] and made invisible.

Here is an example that you can adapt to your data:

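from datetime import datetime

import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots

fig = make_subplots(specs=[[{"secondary_y": True}]])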
y = [0.96, 1.5,  1.31,  1.84, 2.05, 0.87]
dates = [datetime(2020, 1, k).strftime("%m.%d.%Y") for k in range(5, 31, 5)]                                 

d = {'date': dates,
     'description': [f'event_{k}' for k in range(6)],
     'type': ['A', 'B', 'C', 'D', 'E', 'F']}
df = pd.DataFrame(d)


event_color = ['aqua', 'brown', 'darkseagreen', 'firebrick', 'magenta', 'darkturquoise']



fig.add_trace(go.Bar(x= df['date'],
                             y= [1]*len(df), 
                             name='',
                             showlegend=False,
                             marker_color=event_color,
                             width=0.2, #bar width
                             customdata=df['description'],
                             hovertemplate='date: %{x}<br>event: %{customdata}',
                             opacity=0.65
                              ), secondary_y=True);
#The main plot
fig.add_trace(go.Scatter(x=df['date'], 
                         y=y, 
                         line_color='red', 
                         line_width=2),  secondary_y=False);


fig.update_layout(width =800, height=450,
                  yaxis2=dict(fixedrange= True,
                              range= [0, 1],
                              visible= False ))  

Great trick to add a secondary axis!

@empet beautiful example. Is there a way to do this with dates/events that are within the range of the main trace, but not actual main trace dates?
(I tried creating a full dataframe with both the event-dates and the value-dates, most of which do not correspond to each other, but I only get a graph of the values)

@Hillel

Yes, you may draw the bars at other dates within the actual xaxis range.
In this trace definition:


fig.add_trace(go.Bar(x= df['date'],
                             y= [1]*len(df), 
                             name='',
                             showlegend=False,
                             marker_color=event_color,
                             width=0.2, #bar width
                             customdata=df['description'],
                             hovertemplate='date: %{x}<br>event: %{customdata}',
                             opacity=0.65
                              ), secondary_y=True);

replace x=df['date'] with a list of datetime(s), at your convenience.

Thanks @empet! No luck though. I tried following the example to a tee, but it’s actually not graphing the events at all…
I have tried various time types (datetime, datetime64), thinking that might be the issue, but to no avail so far. Any other ideas?
Just in case, here is some of the data…

events:
dates = [Timestamp('2018-01-14 00:00:00'), Timestamp('2018-10-24 00:00:00'), Timestamp('2018-11-07 00:00:00')]
descriptions = ['Kytruda', 'Video assisted thoracoscopy lobectomy', 'Kytruda']
types = ['Drug Therapy', 'Surgery', 'Drug Therapy']

and the main trace:
dates = [Timestamp('2020-01-19 00:00:00'), Timestamp('2019-12-09 00:00:00'), Timestamp('2019-11-20 00:00:00'), Timestamp('2019-10-30 00:00:00'), Timestamp('2019-10-10 00:00:00'), Timestamp('2019-09-18 00:00:00'), Timestamp('2019-08-28 00:00:00'), Timestamp('2019-08-06 00:00:00'), Timestamp('2019-07-10 00:00:00'), Timestamp('2019-06-19 00:00:00'), Timestamp('2019-05-29 00:00:00'), Timestamp('2019-05-07 00:00:00'), Timestamp('2019-04-17 00:00:00'), Timestamp('2019-03-27 00:00:00'), Timestamp('2019-03-06 00:00:00'), Timestamp('2019-02-13 00:00:00'), Timestamp('2019-01-23 00:00:00'), Timestamp('2019-01-06 00:00:00'), Timestamp('2018-12-26 00:00:00'), Timestamp('2018-12-05 00:00:00'), Timestamp('2018-11-25 00:00:00'), Timestamp('2018-11-14 00:00:00'), Timestamp('2018-10-26 00:00:00'), Timestamp('2018-10-25 00:00:00'), Timestamp('2018-10-24 00:00:00'), Timestamp('2018-10-23 00:00:00'), Timestamp('2018-10-21 00:00:00'), Timestamp('2018-10-03 00:00:00'), Timestamp('2018-09-20 00:00:00'), Timestamp('2018-09-12 00:00:00'), Timestamp('2018-09-03 00:00:00'), Timestamp('2018-08-27 00:00:00'), Timestamp('2018-08-19 00:00:00'), Timestamp('2018-08-12 00:00:00'), Timestamp('2018-08-05 00:00:00'), Timestamp('2018-07-29 00:00:00'), Timestamp('2018-07-23 00:00:00'), Timestamp('2018-07-16 00:00:00'), Timestamp('2018-07-09 00:00:00'), Timestamp('2018-07-04 00:00:00'), Timestamp('2018-07-01 00:00:00'), Timestamp('2018-06-25 00:00:00'), Timestamp('2018-06-24 00:00:00'), Timestamp('2018-06-10 00:00:00'), Timestamp('2018-05-06 00:00:00'), Timestamp('2018-04-29 00:00:00'), Timestamp('2018-03-25 00:00:00'), Timestamp('2018-03-18 00:00:00'), Timestamp('2018-03-04 00:00:00'), Timestamp('2018-02-25 00:00:00'), Timestamp('2018-02-04 00:00:00'), Timestamp('2018-01-21 00:00:00'), Timestamp('2018-01-14 00:00:00'), Timestamp('2017-12-24 00:00:00'), Timestamp('2017-12-10 00:00:00'),
Timestamp('2017-12-03 00:00:00'), Timestamp('2017-11-24 00:00:00'), Timestamp('2017-09-13 00:00:00')]
values = [ 2.6 3.5 3.9 3.8 3.5 3.5 4.6 4.7 3. 3.3 3.7 4.8
3.8 3.1 3.4 3.1 5.8 4.4 3.8 5.4 3.8 4.5 3.81 5.74
13.1 4.81 4.1 3.4 3.1 3.8 3.1 3.3 3.5 3.8 3.8 3.9
3.7 4. 3.3 2.5 1.4 1. 0.9 6.8 0.6 2.9 1.9 3.8
2.2 3.6 4.3 2.5 2.1 10.9 2.8 10.7 10.9 10.6 ]

@Hillel

It works with distinct x lists in the two trace definitions when the x values are plain numbers:

from plotly.subplots import make_subplots
from datetime import datetime
import pandas as pd
import plotly.graph_objects as go
import numpy as np

fig = make_subplots(specs=[[{"secondary_y": True}]])
y = [0.96, 1.5,  1.31,  1.84, 2.05, 0.87]
dates = [datetime(2020, 1, k).strftime("%m.%d.%Y") for k in range(5, 31, 5)]                                 

d = {'date': dates,
     'description': [f'event_{k}' for k in range(6)],
     'type': ['A', 'B', 'C', 'D', 'E', 'F']}
df = pd.DataFrame(d)


event_color = ['aqua', 'brown', 'darkseagreen', 'firebrick', 'magenta', 'darkturquoise']



fig.add_trace(go.Bar(x= np.arange(1, 7), #df['date'],
                             y= [1]*len(df), 
                             name='',
                             showlegend=False,
                             marker_color=event_color,
                             width=0.2, #bar width
                             customdata=df['description'],
                             hovertemplate='date: %{x}<br>event: %{customdata}',
                             opacity=0.65
                              ), secondary_y=True);
#The main plot
fig.add_trace(go.Scatter(x=np.arange(1,7)+0.25, #df['date'], 
                         y=y, 
                         line_color='red', 
                         line_width=2),  secondary_y=False);


fig.update_layout(width =800, height=450,
                  yaxis2=dict(fixedrange= True,
                              range= [0, 1],
                              visible= False ))  

This suggests that when the x values are dates converted to strings, Plotly cannot sort them in increasing order.
However, I checked, and even with pure datetime values the bars are not plotted. Please open an issue on plotly.js, because this looks like a plotly.js bug.
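
Before filing an issue, one thing that may be worth ruling out: as far as I know, on a date axis Plotly measures the bar width in milliseconds, so width=0.2 would give bars far too thin to see. A minimal sketch of the conversion, where `bar_width_ms` and `event_bar_kwargs` are hypothetical helper names and the three-day width is an arbitrary choice:

```python
from datetime import datetime, timedelta


def bar_width_ms(days):
    """Convert a width in days to milliseconds (the unit assumed for date axes)."""
    return timedelta(days=days).total_seconds() * 1000


def event_bar_kwargs(dates, descriptions, width_days=3):
    """Keyword arguments for a go.Bar trace with full-height event bars."""
    return dict(
        x=dates,
        y=[1] * len(dates),              # full height of a [0, 1] secondary axis
        width=bar_width_ms(width_days),  # width in milliseconds, not days
        customdata=descriptions,
        hovertemplate="date: %{x}<br>event: %{customdata}<extra></extra>",
        showlegend=False,
        opacity=0.65,
    )


# Event dates and descriptions taken from the data posted above.
event_dates = [datetime(2018, 1, 14), datetime(2018, 10, 24), datetime(2018, 11, 7)]
event_names = ["Kytruda", "Video assisted thoracoscopy lobectomy", "Kytruda"]
kwargs = event_bar_kwargs(event_dates, event_names)
```

The resulting dictionary could then be used as fig.add_trace(go.Bar(**kwargs), secondary_y=True). If the bars show up with a width of a few days, the problem was the unit rather than a rendering bug.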
