Background:
I’m converting a bunch of plots from seaborn/matplotlib to plotly. The old static plots were designed with specific requirements, most of which can’t be changed now, so I’m basically recreating them one-for-one as much as possible and adding some plotly-specific features on top (mostly interaction controls).
Problem:
One of the plots is a grid of histograms that all look sort of like this:
I’ve overlaid a rugplot (orange) over the histogram to ensure even single data points are obviously visible.
From what I’ve gathered in the docs, rug+hist combos are only possible in a margin plot, which essentially creates a Figure with a Histogram and Scatter. These figures can’t be combined in the grid created using make_subplots since they’re Figure objects and not individual Histogram or Scatter traces.
Is there a way to recreate the hist+rug visual such that I can still create the required grid of subplots?
My current attempt:
I found this in the docs but I don’t think this method is suitable for my use case since I would have to manage 2x the number of figures and ensure the axis mappings are correct.
So I tried to combine the Histogram and Scatter traces into a single overlaid plot which I can add to the grid:
import plotly.figure_factory as ff
from plotly.subplots import make_subplots

# plot_df and subplot_titles are defined elsewhere (not shown)

# create subplot grid
ncols = 2
nrows = 1
fig = make_subplots(
    rows=nrows, cols=ncols,
    subplot_titles=subplot_titles,
    vertical_spacing=0.10,
    horizontal_spacing=0.03,
)

for row in range(1, nrows + 1):
    for col in range(1, ncols + 1):
        # create FF hist+rug (data[0] is the histogram, data[1] is the rug)
        ff_data = ff.create_distplot(
            [plot_df["column"]],
            ["distplot"],
            bin_size=0.01,
            show_curve=False,
        )

        # modify FF data: move the rug onto the histogram's y axis, just below zero
        ff_data.data[1]["y"] = [-1] * len(ff_data.data[1]["x"])
        ff_data.data[1]["yaxis"] = "y"
        ff_data.data[1]["marker_color"] = "orange"

        # add modified FF data to existing subplot
        fig.add_trace(
            ff_data.data[0],
            row=row, col=col,
        )
        fig.add_trace(
            ff_data.data[1],
            row=row, col=col,
        )

fig.update_layout(
    showlegend=False,
    title_text="Specs with Subplot Title",
    width=1700,
    height=800,
)
fig
This creates the desired output, but the Jupyter notebook crashes if I try this on a grid larger than 1x2 with my dataframe (~100k rows).
In the actual system I’ll be working with possibly larger dataframes, and the subplot grid is usually set to 5x10, so this hack is not going to work out.