I would like to use rangebreaks for data at a finer resolution than hours (30-minute, 15-minute, etc.).
The docs are very clear about how to weed out certain days and hours, like so:
dict(bounds=[17, 9], pattern="hour"), #hide hours outside of 9am-5pm
But what about higher resolutions?
The most obvious use-case is for a candlestick chart for the NYSE operating hours which are from 9:30 to 16:00. Using the above code, we can only weed out times up to 9:00, leaving a nasty little 30 minute gap between 9:00 and 9:30. I’ve tried entering a float value of 9.5, but it does not work.
So how can we use rangebreaks to skip gaps in, for example, 30-minute or 15-minute candlesticks, or even finer resolutions?
Note: I thought we might need to use values/dvalue, but the documentation for dvalue is difficult to understand, and there are no examples in the documentation showing how it could be used.
I managed to find an article showing how to use dvalue to remove gaps… but the article is in Japanese.
By chance, I can read Japanese, so I’m posting this to hopefully show any future readers how to remove gaps of any size, theoretically down to 1 millisecond resolution data.
It turns out to be very simple using Pandas:
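# Note: Example is for 30-minute bars
import pandas as pd

bars = ...  # your candlestick data in a pandas DataFrame with a DatetimeIndex
df_resample = bars.resample("30T").max()  # "30T" = 30 minutes; newer pandas prefers "30min"
merged_index = bars.index.append(df_resample.index)
# Timestamps that appear only once exist only in the resampled index, i.e. the gaps
timegap = merged_index[~merged_index.duplicated(keep=False)]
dvalue = 30 * 60 * 1000  # 30min * 60sec/min * 1000msec/sec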
- Resample your bars using the interval of the bars. In my example I’m using 30-minute bars, so I resampled using ‘30T’ which means 30 minutes to Pandas. Reference: https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#dateoffset-objects
- Merge the indexes of your original data with the new resampled data, then take a difference to find only values that do not exist in your original data (referred to as ‘timegap’ in the code). These will be the values you do not want to show (you want to skip) in the graph.
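- Set ‘dvalue’ to match your bar interval. dvalue is specified in milliseconds, so my example sets it to 30 minutes in milliseconds (30 * 60 * 1000). Pass timegap as ‘values’ and this number as ‘dvalue’ in a single rangebreaks entry, so that each listed timestamp hides exactly one bar-width of the axis.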
Now the gaps are gone!
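For anyone who wants to try the steps above without their own data, here is a minimal sketch on made-up bars. The two synthetic sessions, the "close" column and the dates are assumptions for illustration only. It uses Index.difference instead of the append-and-deduplicate step, which should give the same timestamps as long as the original index has no duplicates.

```python
import pandas as pd

# Synthetic 30-minute bars for two sessions (bar starts 09:30 to 15:30).
# The dates and the "close" column are made up for this sketch.
session1 = pd.date_range("2024-01-02 09:30", "2024-01-02 15:30", freq="30min")
session2 = pd.date_range("2024-01-03 09:30", "2024-01-03 15:30", freq="30min")
index = session1.append(session2)
bars = pd.DataFrame({"close": range(len(index))}, index=index)

# Resampling at the bar interval yields every 30-minute slot between the
# first and last bar, including the slots where no bar exists.
full_index = bars.resample("30min").max().index

# Slots present after resampling but absent from the data are the gaps.
timegap = full_index.difference(bars.index)

# Width of each skipped slot, in milliseconds.
dvalue = 30 * 60 * 1000

# One rangebreaks entry: hide one bar-width starting at each gap timestamp.
rangebreaks = [dict(values=timegap, dvalue=dvalue)]
```

With an existing Plotly figure (called fig here as a placeholder), passing the list via fig.update_xaxes(rangebreaks=rangebreaks) should then hide those slots.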
I had a similar problem using S&P 500 data. I wanted to skip weekends and US holidays, so I found this workaround. Note that you need to install the holidays library first (pip install holidays).
import holidays
import pandas as pd

# 'start' & 'end' are datetime objects
us_holidays = pd.to_datetime(list(holidays.US(range(start.year, end.year + 1)).keys()))
us_holidays += pd.offsets.Hour(9) + pd.offsets.Minute(30)  # shift each holiday from midnight to 09:30
rangebreaks = [dict(bounds=[16, 9.5], pattern="hour"), dict(bounds=["sat", "mon"]), dict(values=us_holidays)]
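If you would rather not install an extra package, a rough variant of the same idea can be sketched with the holiday calendar that ships with pandas. This is a substitution, not the approach above: USFederalHolidayCalendar lists US federal holidays, which are not guaranteed to match exchange closures, and the start and end dates are placeholders.

```python
import pandas as pd
from pandas.tseries.holiday import USFederalHolidayCalendar

# Placeholder date range for this sketch.
start = pd.Timestamp("2024-01-01")
end = pd.Timestamp("2024-12-31")

# Federal holidays in the range, as a DatetimeIndex at midnight.
holiday_dates = USFederalHolidayCalendar().holidays(start=start, end=end)

# Shift each holiday to 09:30, mirroring the snippet above.
session_open = holiday_dates + pd.Timedelta(hours=9, minutes=30)

rangebreaks = [
    dict(bounds=[16, 9.5], pattern="hour"),  # hide hours outside 9:30-16:00
    dict(bounds=["sat", "mon"]),             # hide weekends
    dict(values=session_open),               # hide the listed holidays
]
```

The resulting list is used the same way as before, as the rangebreaks argument of the x-axis.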