I’m just discovering distplot and love it; it’s making awesome plots for my kind of data! However, there is still something missing for me, and I can’t really figure out how to do it…
My data looks like this, with the different neuron types as column names and the activity times in each column. As a result, the columns don’t all have the same length.
Distplot is doing an awesome job making bins for each neuron, but the bins of different neurons have different starts and ends. What I’m interested in, though, is the number of activity timepoints (lower plot) for each bin (upper plot). I guess having it as the Y values of the upper plot would work, but since the bins are so different I’m getting stuck…
@lguerard I didn’t understand how the plot in the posted image is created, because the bars in the associated histogram have a unique color, whereas the legend shows many colors. Is it the histogram and probability density function (curve) associated to just a column, and you defined a subplot with many rows, such that on each row you plotted the distplot of a column?
If this is not the case, please write down here your call for ff.create_distplot.
If you want to illustrate in distplot the count for each bin, then follow this example:
import plotly.figure_factory as ff
import numpy as np
np.random.seed(2020)
group_labels = ['distplot'] # name of the dataset
fig = ff.create_distplot([np.random.randn(1000)], group_labels, histnorm= '', bin_size=0.5)
fig.update_layout(width=700, bargap=0.01)
By default the histnorm in ff.create_distplot is set to 'probability density', i.e. each histogram bar has an area equal to the probability of data falling in the corresponding bin, so its height is that probability divided by the bin width. Setting histnorm='' gives a histogram in which the height of each bar equals the number of data points in that bin.
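To make the difference between the two normalizations concrete, here is a minimal NumPy-only sketch. It does not call plotly; it assumes that np.histogram with density=True mirrors 'probability density' and that plain counts mirror histnorm=''. The bin edges are a hypothetical choice and need not match the bins plotly picks.

```python
import numpy as np

np.random.seed(2020)
data = np.random.randn(1000)
bin_size = 0.5
# Hypothetical bin edges, wide enough to cover every sample.
edges = np.arange(-6.0, 6.0 + bin_size, bin_size)

# Plain counts per bin (the analogue of histnorm='').
counts, _ = np.histogram(data, bins=edges)
# Density heights (the analogue of histnorm='probability density').
density, _ = np.histogram(data, bins=edges, density=True)

# A density height is the bin count divided by (number of points * bin width).
manual_density = counts / (data.size * bin_size)

print(np.allclose(density, manual_density))
# The bar areas (height * width) of the density histogram sum to 1.
print(np.isclose((density * bin_size).sum(), 1.0))
```

The count bars and the density bars therefore have the same shape; only the scale of the y-axis differs, by the factor (number of points * bin width).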
Indeed, I just selected a few populations to display on the graph since you can just click on the pop name to show/hide it. I thought this would make visibility easier !
Super cool! This worked flawlessly, thank you very much! However, as you can see below, the curve is not displayed anymore. Is there a way to still show it?
@lguerard When histnorm='', the graphs of the estimated pdfs cannot be seen because their max values are very small compared to the max number of counts in a bin, i.e. their graphs are very close to the x-axis.
For example, if the estimated pdf is almost equal to the normal pdf:
f(x) = 1/(sigma*sqrt(2 pi)) * exp(-(x - mean)^2 / (2 sigma^2))
then the max value of this pdf is f(mean) = 1/(sigma*sqrt(2 pi)). For sigma >= 1 this max value is less than 1. If the max count is 200, the graph of the pdf has all its y-coordinates < 1, which is very, very small compared to 200.
The graph of the estimated pdf is included in the plot generated by ff.create_distplot so that it can be compared with the histogram corresponding to histnorm='probability density'.
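The scale mismatch can be checked numerically, and it also suggests one possible workaround: multiply the pdf by (number of points * bin size) so that it lives on the count scale. The sketch below is NumPy-only and uses a fitted normal pdf as a stand-in for the curve; the bin edges and variable names are assumptions for illustration, not what ff.create_distplot does internally.

```python
import numpy as np

np.random.seed(2020)
data = np.random.randn(1000)
bin_size = 0.5
# Hypothetical bin edges, wide enough to cover every sample.
edges = np.arange(-6.0, 6.0 + bin_size, bin_size)
counts, _ = np.histogram(data, bins=edges)

# Normal pdf fitted to the sample mean and standard deviation.
mu, sigma = data.mean(), data.std()
x = np.linspace(edges[0], edges[-1], 500)
pdf = np.exp(-((x - mu) ** 2) / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

# The expected count in a bin is roughly N * bin_size * pdf(x),
# so this rescaled curve is comparable to the count histogram.
scaled_pdf = data.size * bin_size * pdf

print(pdf.max(), scaled_pdf.max(), counts.max())
```

If this approach suits your data, the rescaled curve could then be drawn on top of the count histogram as an extra line trace, for instance with fig.add_scatter(x=x, y=scaled_pdf, mode='lines').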
I have another question, not linked directly to distplot but to plotly in general. Is it possible to interactively extract the displayed values when zoomed in?
For example, if I zoom in on a specific time window, can I then use all the values in that window to make a different plot? I know I can just filter the dataframe using the values, but some of our users are not very familiar with python and plotly…
If you define your figure as a go.FigureWidget then you can perform some interaction with your plot, but you cannot generate a new plot from the data displayed in the window after a zoom-in.
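For the manual route mentioned in the question (filtering the values by the zoomed range), a minimal sketch could look like the following. The helper values_in_range and the example data are hypothetical; the bounds are passed in by hand here, and reading them from a figure's x-axis range is left as an assumption.

```python
import numpy as np

def values_in_range(columns, start, end):
    """Return, per column, the non-NaN values with start <= value <= end."""
    selected = {}
    for name, values in columns.items():
        arr = np.asarray(values, dtype=float)
        arr = arr[~np.isnan(arr)]  # drop the NaN padding of shorter columns
        selected[name] = arr[(arr >= start) & (arr <= end)]
    return selected

# Hypothetical stand-in for the neuron data: NaN pads the shorter column.
example = {
    "type_a": [1.0, 4.0, 9.0, 15.0],
    "type_b": [2.0, 12.0, np.nan, np.nan],
}
zoomed = values_in_range(example, 3.0, 13.0)
print(zoomed)
```

The resulting per-column arrays can then be passed to whatever second plot is needed.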
Also, I tried using FigureWidget to have some interaction, but this code gives me a weird result…
import plotly.figure_factory as ff
import plotly.graph_objects as go
from ipywidgets import VBox, interactive

# neuron_data: DataFrame with one column per neuron type (shorter columns hold nulls)
bin_size_var = 5
fig = ff.create_distplot([neuron_data[c][neuron_data[c].notnull()] for c in neuron_data.columns],
                         neuron_data.columns, histnorm='', bin_size=bin_size_var)
# fig.show()
# find the range of the slider.
xmin = neuron_data.min().min()
xmax = neuron_data.max().max()
# create FigureWidget from fig
f = go.FigureWidget(fig)
# our function that will modify the xaxis range
def update_range(start, end):
    f.layout.xaxis.range = [start, end]
# display the FigureWidget and slider with center justification
vb = VBox((f, interactive(update_range,
                          start=(xmin, xmax, (xmax - xmin) / 1000.0),
                          end=(xmin, xmax, (xmax - xmin) / 1000.0))))
vb.layout.align_items = 'center'
vb
So fig works fine and shows the plots from before, but vb just prints VBox(children=(FigureWidget({ 'data': [{'autobinx': False, 'histnorm': '', 'le… somehow…
Any ideas? Is it not possible to display figure_factory figures in a FigureWidget?