I have a number of issues with ff.create_distplot():
1. Consider the following data:
import numpy as np
import plotly.graph_objs as go
import plotly.figure_factory as ff
m = np.random.normal(loc=0.08, scale=0.0008, size=5000)
Histogram of the data:
fig = go.FigureWidget()
fig.add_histogram(x=m)
fig
However, when I try to produce a density plot using the figure factory, it does not produce what I want:
hist_data = [m]
group_labels = ['m1']
colors = ['#333F44']
# Create distplot
fig = go.FigureWidget(ff.create_distplot(hist_data, group_labels, show_hist=True, colors=colors))
fig.layout.update(title='Density curve')
fig
I can perhaps tinker with it until it gives me the right plot, but I think there is an issue there.
If I set show_hist=False, the plot looks much better:
The problem seems to be with the bins of the histogram. If we set scale=0.08, we can see that the histogram is displayed in only one bin:
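One possible explanation, sketched under assumptions: create_distplot accepts a bin_size argument, and as far as I can tell its default is 1.0 in data units, so data spanning far less than one unit would land in a single bin. The helper suggest_bin_size below is hypothetical (not part of plotly), and the choice of 50 bins is arbitrary. plotly is imported lazily so the numeric part runs on its own.

```python
import numpy as np

def suggest_bin_size(x, n_bins=50):
    """Hypothetical helper (not part of plotly): width of n_bins equal bins over the data range."""
    x = np.asarray(x, dtype=float)
    return float(x.max() - x.min()) / n_bins

def make_distplot(data, label, bin_size):
    # Imported lazily so the numeric part runs without plotly installed.
    import plotly.figure_factory as ff
    # bin_size is passed explicitly instead of relying on the default.
    return ff.create_distplot([data], [label], bin_size=bin_size, colors=['#333F44'])

rng = np.random.default_rng(0)
m = rng.normal(loc=0.08, scale=0.0008, size=5000)

data_range = float(m.max() - m.min())  # well under 1.0 for this data
bin_size = suggest_bin_size(m)         # a width on the scale of the data

# Uncomment to build the figure (requires plotly and scipy):
# fig = make_distplot(m, 'm1', bin_size)
```

If this is indeed the cause, passing a bin_size on the scale of the data should give a histogram comparable to the one from fig.add_histogram(x=m).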
2. Even though histnorm is set to probability density by default, I did not manage to make it look like a probability density. It looks more like a frequency “distplot”.
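As a sanity check on what a probability density should look like for this data, here is a NumPy-only sketch (it does not call plotly, and the 50 bins are an arbitrary choice). With such a small scale, density values in the hundreds are expected, which can easily be mistaken for counts; what identifies a density is that the bar areas sum to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 0.0008
m = rng.normal(loc=0.08, scale=sigma, size=5000)

# density=True rescales the counts so that sum(height * width) == 1.
heights, edges = np.histogram(m, bins=50, density=True)
widths = np.diff(edges)
area = float(np.sum(heights * widths))

# Theoretical peak of a normal density: 1 / (sigma * sqrt(2 * pi)).
expected_peak = 1.0 / (sigma * np.sqrt(2.0 * np.pi))

print(f"area under histogram: {area:.6f}")
print(f"tallest bar: {heights.max():.1f}, theoretical peak: {expected_peak:.1f}")
```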
3. The curve_type is set to kde. What kind of KDE is being used? I would like to try the epanechnikov kernel, for instance.
Is a kde curve type meant to produce something like the density function in R?
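As far as I can tell, the kde curve comes from scipy.stats.gaussian_kde, i.e. a Gaussian kernel, and I have not found an option to switch kernels. A minimal workaround sketch, assuming a hand-rolled estimator is acceptable: the function epanechnikov_kde below is hypothetical (not part of plotly or SciPy), and the bandwidth uses a rule of thumb designed for Gaussian kernels, so treat it only as a starting point.

```python
import numpy as np

def epanechnikov_kde(data, grid, bandwidth):
    """Hypothetical helper: KDE with kernel K(u) = 0.75 * (1 - u**2) for |u| <= 1, else 0."""
    data = np.asarray(data, dtype=float)
    grid = np.asarray(grid, dtype=float)
    # Scaled distance between every grid point and every sample.
    u = (grid[:, None] - data[None, :]) / bandwidth
    kernel = np.where(np.abs(u) <= 1.0, 0.75 * (1.0 - u**2), 0.0)
    return kernel.sum(axis=1) / (data.size * bandwidth)

rng = np.random.default_rng(0)
m = rng.normal(loc=0.08, scale=0.0008, size=5000)

# Assumption: rule-of-thumb bandwidth 1.06 * std * n**(-1/5), meant for Gaussian kernels.
h = 1.06 * m.std(ddof=1) * m.size ** (-1 / 5)

# Extend the grid by one bandwidth on each side so no kernel mass is cut off.
grid = np.linspace(m.min() - h, m.max() + h, 500)
density = epanechnikov_kde(m, grid, h)

# Riemann-sum check that the estimate integrates to roughly 1.
integral = float(np.sum(density) * (grid[1] - grid[0]))
```

The resulting curve could then be drawn on top of an existing figure with something like fig.add_scatter(x=grid, y=density, mode='lines').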
4. When several distplots are combined, e.g.:
hist_data = [m, m+0.001]
group_labels = ['m1', 'm2']
colors = ['#333F44', '#37AA9C']
# Create distplot
fig = go.FigureWidget(ff.create_distplot(hist_data, group_labels, show_hist=False, colors=colors))
fig.layout.update(title='Density curve')
fig
The rug plots and the legend entries do not appear in a logical order.
Sure, we can set
fig.layout.update(legend=dict(traceorder='normal'))
but I think the default should be the order in which they were added.
I also think that the distance between the rug plots is disproportionately big.