
More of a pandas question: how do I limit the data so that Plotly displays only the top 5 stacks of bars?

I think this is more of a pandas dataframe question: when Plotly is given a dataframe it plots all the data in it, so I have to filter the dataframe first.

The end result should be a bar chart showing only the top 5 categories in ‘total descending’ order, regardless of how many bars are stacked in each category. I need this for non-interactive reports.

Any ideas how to achieve this?

Code sample below:

import pandas as pd
import plotly.graph_objects as go

df = pd.DataFrame(data={
    'Department': ['Sales', 'Sales', 'HR', 'HR', 'IT', 'IT', 'Security', 'Security', 'Cybersecurity', 'Management', 'Support', 'Support', 'Fulfillment', 'Store', 'Manufacturing'], 
    'City' : ['NYC', 'LA', 'NYC', 'LA', 'NYC', 'LA', 'NYC', 'LA', 'NYC', 'LA', 'NYC', 'LA', 'NYC', 'LA', 'LA'],
    'CoffeeCups': [15, 35, 54, 67, 32, 12, 56, 97, 45, 10, 54, 23, 66, 10, 31],
})

figure = go.Figure()

for city, group in df.groupby('City'):
    figure.add_trace(go.Bar(x=group['Department'], y=group['CoffeeCups'], name=city))
    
figure.update_layout(barmode='stack')
figure.update_xaxes(type='category', categoryorder='total descending')
figure.show()

This is the graph displayed initially:

But I only want Plotly to show the top 5 departments, which comes to 9 bars once stacking is taken into account:

I may have found my own solution, but I’m curious whether there are better ways:

I build a temporary dataframe that groups the rows by department (discarding the per-city detail), sums each group, and sorts the totals in descending order. I then take the first five departments as a ‘top_five’ array and use it to filter the main dataframe.

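# Sum only the numeric column so the 'City' strings are not aggregated
df_tmp = df.groupby('Department', as_index=False)['CoffeeCups'].sum()
df_tmp = df_tmp.sort_values('CoffeeCups', ascending=False)
# Names of the five departments with the largest totals
top_five = df_tmp['Department'].head(5).unique()
# Keep only the rows for those departments (apply this before building the figure)
df = df.loc[df['Department'].isin(top_five)]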
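
For comparison, here is a slightly shorter sketch of the same idea that skips the temporary dataframe by using `Series.nlargest` on the grouped totals. The helper name `keep_top_groups` and its parameters are made up for this example, and it returns a new dataframe rather than overwriting `df`. It is one possible approach, not necessarily a better one.

```python
import pandas as pd

df = pd.DataFrame(data={
    'Department': ['Sales', 'Sales', 'HR', 'HR', 'IT', 'IT', 'Security', 'Security',
                   'Cybersecurity', 'Management', 'Support', 'Support',
                   'Fulfillment', 'Store', 'Manufacturing'],
    'City': ['NYC', 'LA', 'NYC', 'LA', 'NYC', 'LA', 'NYC', 'LA',
             'NYC', 'LA', 'NYC', 'LA', 'NYC', 'LA', 'LA'],
    'CoffeeCups': [15, 35, 54, 67, 32, 12, 56, 97, 45, 10, 54, 23, 66, 10, 31],
})


def keep_top_groups(frame, group_col, value_col, n=5):
    """Hypothetical helper: keep only rows whose group is among the n largest totals."""
    # Total of value_col per group, indexed by the group label
    totals = frame.groupby(group_col)[value_col].sum()
    # Labels of the n groups with the largest totals
    top_labels = totals.nlargest(n).index
    # Filter the original rows; the per-city detail is preserved
    return frame[frame[group_col].isin(top_labels)]


top_df = keep_top_groups(df, 'Department', 'CoffeeCups', n=5)
```

Passing `top_df` to the same plotting loop should then draw only the five largest stacks, with `categoryorder='total descending'` still handling the ordering on the x axis.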