Hello, I am using go.Box to generate box plots for some data. In addition to the figures, I would like to save the statistical information that shows up on hover, including the quartiles, median, mean, number of outliers, etc. I’ve looked but have been unable to locate this information. Any help in resolving this issue would be greatly appreciated.
Welcome to the forum @arajan
What do you mean you would like to SAVE the statistical information that shows up on the hover? Where do you want to save it?
Hi @arajan, you cannot have JavaScript communicate the quantile values back to Python (this computation is done in JavaScript), but you can precompute them using the various functions of scipy.stats (scoreatpercentile, iqr, etc.) and then pass them to go.Box, as in this example: https://plotly.com/python/box-plots/#box-plot-with-precomputed-quartiles.
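A minimal sketch of that approach, for a single group of hypothetical sample values. To stay dependency-free it uses the standard-library `statistics.quantiles` in place of `scipy.stats` (my substitution, not part of the suggestion above), and its "inclusive" method will not necessarily match the quartiles plotly computes by default. The helper names are made up for illustration; the `q1`, `median`, `q3`, `lowerfence`, `upperfence`, and `mean` arguments are the ones used in the linked example.

```python
import statistics


def precompute_box_stats(values):
    """Return the box-plot statistics for one group as a plain dict."""
    # "inclusive" interpolates linearly between the min and max of the data
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    # Whiskers end at the most extreme points within 1.5 * IQR of the box
    lower_fence = min(v for v in values if v >= q1 - 1.5 * iqr)
    upper_fence = max(v for v in values if v <= q3 + 1.5 * iqr)
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lowerfence": lower_fence,
        "upperfence": upper_fence,
        "mean": statistics.mean(values),
    }


def make_precomputed_box(stats, name):
    """Build a go.Box trace for one box from precomputed statistics."""
    # Imported here so precompute_box_stats works without plotly installed
    import plotly.graph_objects as go

    # go.Box takes one list entry per box for each statistic
    return go.Box(name=name, **{key: [value] for key, value in stats.items()})


sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]  # hypothetical data
stats = precompute_box_stats(sample)
```

Since `stats` is an ordinary dict, it can be written to a file or a table as-is, and `go.Figure(make_precomputed_box(stats, "sample"))` should draw the matching box.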
@adamschroeder the idea was to save the statistics and display them in a table underneath the plots themselves. In addition to displaying the interactive plots, the workflow I have designed involves printing the plots to PDF.
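One way to lay that out might be a two-row figure with the box plots on top and a `go.Table` underneath, built with `make_subplots`. This is only a sketch of how such a workflow could look: the `stats_by_group` dictionary, its keys, and both helper names are hypothetical.

```python
def table_columns(stats_by_group, fields):
    """Turn {group: {field: value}} into a header row and column lists."""
    header = ["group"] + list(fields)
    columns = [list(stats_by_group)]  # first column holds the group names
    for field in fields:
        columns.append([stats_by_group[g][field] for g in stats_by_group])
    return header, columns


def box_with_table(data_by_group, stats_by_group, fields):
    """Stack box plots (row 1) above a table of their statistics (row 2)."""
    # Imported here so table_columns can be used without plotly installed
    import plotly.graph_objects as go
    from plotly.subplots import make_subplots

    fig = make_subplots(
        rows=2, cols=1,
        specs=[[{"type": "xy"}], [{"type": "table"}]],
    )
    for name, values in data_by_group.items():
        fig.add_trace(go.Box(y=values, name=name), row=1, col=1)
    header, columns = table_columns(stats_by_group, fields)
    fig.add_trace(
        go.Table(header=dict(values=header), cells=dict(values=columns)),
        row=2, col=1,
    )
    return fig


# Hypothetical precomputed statistics for two groups
stats_by_group = {
    "A": {"q1": 2.0, "median": 3.0, "q3": 4.0},
    "B": {"q1": 5.0, "median": 6.5, "q3": 8.0},
}
header, columns = table_columns(stats_by_group, ["q1", "median", "q3"])
```

The returned figure can be shown interactively, and calling `fig.write_image("boxplots.pdf")` on it should produce the PDF, provided a static image export backend such as kaleido is installed.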
@Emmanuelle thanks for this suggestion. I had seen this option, but wasn’t sure if there was an in-built solution. Thanks, I was able to resolve the issue using your solution.
import math

import numpy as np

## Calculate quartiles as outlined in the plotly documentation
## (method #10 in paper https://jse.amstat.org/v14n3/langford.html)
def get_percentile(data, p):
    # Work on a sorted copy so the caller's data is not reordered in place
    data = np.sort(data)
    n = len(data)
    x = n*p + 0.5
    # If integer, return the value at that (1-based) position
    if x.is_integer():
        return round(float(data[int(x-1)]), 2)  # account for zero-indexing
    # If not an integer, interpolate between the values at the floor and ceiling positions
    # (clamped to 1..n so that very small samples do not index out of range)
    x1, x2 = max(math.floor(x), 1), min(math.ceil(x), n)
    y1, y2 = data[x1-1], data[x2-1]  # account for zero-indexing
    return round(float(np.interp(x=x, xp=[x1, x2], fp=[y1, y2])), 2)
## calculate all boxplot statistics
q1, median, q3 = get_percentile(data, 0.25), get_percentile(data, 0.50), get_percentile(data, 0.75)
iqr = q3 - q1
# Lower fence is the smallest data value at or above the calculated lower limit
lower_limit = q1 - 1.5 * iqr
lower_fence = round(min([i for i in data.tolist() if i >= lower_limit]), 2)
# Upper fence is the largest data value at or below the calculated upper limit
upper_limit = q3 + 1.5 * iqr
upper_fence = round(max([i for i in data.tolist() if i <= upper_limit]), 2)
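The original question also asked for the mean and the number of outliers, which the snippet above does not compute. A possible continuation, assuming outliers are the points outside the 1.5 * IQR limits and that the values are in a plain Python list (the helper name `summarize_outliers` and the sample values are hypothetical):

```python
import statistics


def summarize_outliers(values, q1, q3):
    """Report the mean and the points outside the 1.5 * IQR limits."""
    iqr = q3 - q1
    lower_limit = q1 - 1.5 * iqr
    upper_limit = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_limit or v > upper_limit]
    return {
        "mean": round(statistics.mean(values), 2),
        "n": len(values),
        "n_outliers": len(outliers),
        "outliers": sorted(outliers),
    }


# Hypothetical data; q1 and q3 are what get_percentile gives for it
values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]
summary = summarize_outliers(values, q1=3.0, q3=8.0)
```

The resulting dict can be merged with the quartiles and fences to get one row per box for a table or a CSV file.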