Best plotly equivalent to sns.distplot(data, fit=norm)

Hello there.

I’m trying to find the best, quickest equivalent possible to the following seaborn snippet:

import seaborn as sns
from scipy.stats import norm

sns.distplot(data, fit=norm)

This lets me fit a normal distribution to an existing distplot in seaborn very conveniently. What would be the best equivalent in plotly?

Thanks for your time

Hi @jralfonsog,

Welcome to Plotly forum!!
Plotly provides the function plotly.figure_factory.create_distplot() to generate a distplot that can display the histogram, the pdf estimate, and the rug plot:

create_distplot(hist_data, group_labels, bin_size=1.0, curve_type='kde', colors=None, rug_text=None, histnorm='probability density', show_hist=True, show_curve=True, show_rug=True)

This function works with multiple data sets. If you want to plot the distplot for a single sample,
x = [n values], then pass [x] to hist_data, i.e. a list containing one list, not just x.

Example:

import plotly.figure_factory as ff
import numpy as np
np.random.seed(123)
x = np.random.normal(loc=2.5, scale=0.85, size=300) 
group_labels = 'My sample'

# Create distplot with custom bin_size, and without rug plot
fig = ff.create_distplot([x], [group_labels], bin_size=.2, show_rug=False)
fig.update_layout(width=600, 
                  height=400,
                  bargap=0.01)

[Figure: distplot1, the distplot without the rug plot]

If we set show_rug=True above, we get:

[Figure: distplotRug, the same distplot with the rug plot]

For more information on this function, type:

help(ff.create_distplot)

You can find more examples at https://plot.ly/python/distplot/, but they have no settings to ensure plot aesthetics (i.e. they are plotted with the default layout.width and layout.height, and the bargap is not set, as I did above). That’s why the histograms there look like a continuum, unlike these seaborn examples: https://seaborn.pydata.org/generated/seaborn.distplot.html.
Hence you should customize the figure appearance.

(Late) thanks for your response, @empet

I’m familiar with ff.create_distplot(). What I would like to do is similar to plotting the “kde” and “normal” curve types at the same time. I guess I could do something like:

import plotly.figure_factory as ff
from scipy.stats import norm
import numpy as np
np.random.seed(123)

data = np.random.noncentral_chisquare(3, 20, 1000)

m, s = norm.fit(data)
gaussian_data = np.random.normal(m, s, 10000)

fig = ff.create_distplot(
    [data, gaussian_data],
    group_labels=["plot", "gaussian"],
    curve_type="kde"
)

fig.show()

But, in this case:

  • I would have to draw the gaussian histogram, and I would rather not (AFAIK, you can’t hide the histogram for just one element of hist_data).
  • I would have to draw a large number of random gaussian samples to get a smooth gaussian curve.

On the other hand, ff.create_distplot says it’s deprecated in favor of px.histogram, but I can’t find an easy way of plotting a kde with px.histogram.

Thanks a lot for your time.

@jralfonsog

When I answered your question, ff.create_distplot() hadn’t been declared “deprecated”. Reading the attribute descriptions in help(px.histogram), it doesn’t seem that px.histogram has an option for density estimation via kde. @nicolaskruchten could you please give more details to @jralfonsog on this aspect?

You’re right, we don’t have KDE functionality within px.histogram yet. For those uses I would recommend either computing the KDE line outside of plotly and using px.histogram().add_trace(), or continuing to use ff.create_distplot(), keeping in mind that we’re not maintaining it much any more.

At some point in the hopefully-near future we will add the KDE functionality to px.histogram() … assistance is very welcome if someone wants to pitch in with a PR! Happy to discuss the design I have in mind!

Thanks both @empet and @nicolaskruchten for your answers. I’ll give the add_trace() way a look, and I’ll let you know.

Maybe not strictly a plotly question, but I’ve not been able to properly compute the KDE and add it as a plotly trace to px.histogram(). Could either of you please give me some help? (An example would probably work perfectly.)

Thanks