How to create a grouped bar chart with a date selector and multiple columns for y-axis

I am pretty new to plotly, and am amazed at the graphs/charts that can be produced. However, I am having an issue with generating a bar chart that has a time-based date selector and multiple columns for values. I believe I am just missing something pretty basic. Any advice/suggestions are greatly appreciated.

My goal is to have a grouped bar chart (stacked will work as well, but grouped chart is preferred) that allows the user to select specific months/years on a selector under the chart. When they select a different month/year, the updated chart is displayed.

I pull the data from a DB into a dataframe, then segment that dataframe by date (month/year) to create the trace for each graph.

The data is in wide-format (though I’ve also converted it into long-format - listed below) and consists of the date information, communities (that will represent the x-axis) and about 6 categories of counts that need to be graphed. Those 6 categories are needed to make up each bar in the grouped chart.
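For reference, the wide-to-long conversion mentioned here can be sketched with pandas.melt. This is only a minimal sketch: it assumes the column names that appear in the printouts further down in this post, and the helper name to_long is made up for illustration.

```python
import pandas as pd

# Column names assumed from the printouts in this post
ID_COLS = ['yearmons', 'event_dt', 'community_name']
VALUE_COLS = ['bm_cnt', 'cp_cnt', 'fm_cnt', 'res_cnt', 'spu_cnt', 'sr_cnt']


def to_long(df_wide):
    """Stack the six count columns into 'variable' / 'role_counts' pairs.

    to_long is a hypothetical helper name, not part of pandas or plotly.
    """
    return pd.melt(
        df_wide,
        id_vars=ID_COLS,          # columns repeated on every output row
        value_vars=VALUE_COLS,    # columns that get stacked
        var_name='variable',      # holds the original column name
        value_name='role_counts', # holds the count
    )
```

Each wide row becomes six long rows, one per count column, which matches the layout of the converted dataframe shown later.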

Please see some sample data:

I have been able to create a sample chart, but I have only managed to plot the first column (bm_cnt) on the y-axis. (The chart shows the 12/2020 data, but I can correctly select 10/2020 and 11/2020 as well.)

I believe that I would need to make multiple traces for the graph to retrieve the other columns for the y-axis. However, I’m not sure how to do that within a single list item to pass to go.Figure.

Here is some of the code that I am using to generate the graph:

import plotly.graph_objects as go

# df_users_community is the wide-format dataframe loaded from the DB

# Build one bar trace per month; only bm_cnt is plotted so far
data_trace = []
for yearmon in df_users_community['yearmons'].unique():
    df_segmented = df_users_community[df_users_community['yearmons'] == yearmon]

#    trace = []
    data_each_month = dict(
        type='bar',
        x=df_segmented['community_name'],
        y=df_segmented['bm_cnt'],
#        y1=df_segmented['cp_cnt'],
        name='BM',
        visible=False,
        marker=dict(color='blue')
    )

    data_trace.append(data_each_month)

#    data_trace.append(trace)

# Show the most recent month by default
idx = len(data_trace) - 1
data_trace[idx]['visible'] = True

# Create and add the slider: one step per month, each showing a single trace
event_dt_list = df_users_community['event_dt'].unique()
steps = []
for i in range(len(data_trace)):
    step = dict(method='update',
                args=[{'visible': [False] * len(data_trace)}],
                label=event_dt_list[i])
    step['args'][0]['visible'][i] = True
    steps.append(step)

sliders = [dict(active=idx, pad={"t": 1}, steps=steps)]

layout = dict(title='Unique Users by Community by Month',
              sliders=sliders,
              barmode='group',
              height=650)

fig = go.Figure(data=data_trace, layout=layout)

fig.show()

Here is some of the information that gets printed out for the segmented df and fig.data:

#############################################################################
yearmon is: [2020/12]
   yearmons event_dt community_name  bm_cnt  cp_cnt  fm_cnt  res_cnt  spu_cnt  sr_cnt
8   2020/12  12/2020              A       3       1       2        1        3       1
9   2020/12  12/2020              B       0       1       1        1        1       1
10  2020/12  12/2020              C       1       1       1        0        2       5
11  2020/12  12/2020              D       1       0       0        0        2       0
(Bar({
    'marker': {'color': 'blue'},
    'name': 'BM',
    'visible': False,
    'x': array(['A', 'B', 'C', 'D'], dtype=object),
    'y': array([0, 1, 0, 1], dtype=int64)
}), Bar({
    'marker': {'color': 'blue'},
    'name': 'BM',
    'visible': False,
    'x': array(['A', 'B', 'C', 'D'], dtype=object),
    'y': array([0, 1, 0, 0], dtype=int64)
}), Bar({
    'marker': {'color': 'blue'},
    'name': 'BM',
    'visible': True,
    'x': array(['A', 'B', 'C', 'D'], dtype=object),
    'y': array([3, 0, 1, 1], dtype=int64)
}))

What I have tried:

  • Setting y=df_segmented.columns (and also a list of the specific columns) - the resulting graph does not display the desired results
  • Creating a list of lists (the nested list contains multiple traces for each instance of the graph) - but this fails on the call to go.Figure since a list of lists is not expected
  • Adding y0=, y1=,… for each column - but that also failed on the call to go.Figure

As mentioned above, I changed the data in the pandas df from wide-format to long format, and that generated a graph that was close to what I need. The resulting graph is a stacked bar chart containing all of the data counts, but they are all the same color and have the same information listed when hovering over the column in the chart.
Here is some data from the converted df:

yearmon is: [2020/12]
   yearmons event_dt community_name variable  role_counts
8   2020/12  12/2020              A   bm_cnt            3
9   2020/12  12/2020              B   bm_cnt            0
10  2020/12  12/2020              C   bm_cnt            1
11  2020/12  12/2020              D   bm_cnt            1
20  2020/12  12/2020              A   cp_cnt            1
21  2020/12  12/2020              B   cp_cnt            1
22  2020/12  12/2020              C   cp_cnt            1
23  2020/12  12/2020              D   cp_cnt            0
32  2020/12  12/2020              A   fm_cnt            2
33  2020/12  12/2020              B   fm_cnt            1
34  2020/12  12/2020              C   fm_cnt            1
35  2020/12  12/2020              D   fm_cnt            0
44  2020/12  12/2020              A  res_cnt            1
45  2020/12  12/2020              B  res_cnt            1
46  2020/12  12/2020              C  res_cnt            0
47  2020/12  12/2020              D  res_cnt            0
56  2020/12  12/2020              A  spu_cnt            3
57  2020/12  12/2020              B  spu_cnt            1
58  2020/12  12/2020              C  spu_cnt            2
59  2020/12  12/2020              D  spu_cnt            2
68  2020/12  12/2020              A   sr_cnt            1
69  2020/12  12/2020              B   sr_cnt            1
70  2020/12  12/2020              C   sr_cnt            5
71  2020/12  12/2020              D   sr_cnt            0
(Bar({
    'marker': {'color': 'blue'},
    'name': 'BM',
    'visible': False,
    'x': array(['A', 'B', 'C', 'D', 'A', 'B', 'C', 'D', 'A', 'B', 'C', 'D', 'A', 'B',
                'C', 'D', 'A', 'B', 'C', 'D', 'A', 'B', 'C', 'D'], dtype=object),
    'y': array([0, 1, 0, 1, 1, 2, 3, 1, 2, 0, 1, 1, 1, 0, 0, 0, 0, 0, 0, 6, 0, 0, 0, 0],
               dtype=int64)
}), Bar({

And the resulting chart:

While the graph is close to what is needed, I cannot find a way to identify the different values for each community (i.e., have the hover display the “variable” column listed in the data) without creating multiple traces. As mentioned before, calling go.Figure with multiple traces (a list of lists) is not working for me.
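For what it's worth, go.Figure takes one flat list of traces rather than a list of lists, so one possible way to get the multi-trace version working is to append every (month, column) trace to the same list and have each slider step switch on a block of six traces at a time. Below is a minimal sketch, assuming the wide-format column names shown in the printout above. The helper names build_traces, build_steps and make_figure, and the color choices, are made up for illustration.

```python
import pandas as pd

# Column names assumed from the wide-format printout in this post
VALUE_COLS = ['bm_cnt', 'cp_cnt', 'fm_cnt', 'res_cnt', 'spu_cnt', 'sr_cnt']
# Arbitrary colors, one per count column, reused for every month
COLORS = dict(zip(VALUE_COLS,
                  ['blue', 'orange', 'green', 'red', 'purple', 'brown']))


def build_traces(df):
    """Return (traces, labels): a flat list with one bar trace per
    (month, count column), and one slider label per month."""
    traces, labels = [], []
    months = list(df['yearmons'].unique())
    for m_idx, yearmon in enumerate(months):
        seg = df[df['yearmons'] == yearmon]
        labels.append(str(seg['event_dt'].iloc[0]))
        for col in VALUE_COLS:
            traces.append(dict(
                type='bar',
                x=seg['community_name'].tolist(),
                y=seg[col].tolist(),
                name=col,
                marker=dict(color=COLORS[col]),
                # only the last month is shown initially
                visible=(m_idx == len(months) - 1),
            ))
    return traces, labels


def build_steps(labels, n_cols):
    """One slider step per month; each step shows that month's block of traces."""
    total = len(labels) * n_cols
    steps = []
    for i, label in enumerate(labels):
        visible = [False] * total
        visible[i * n_cols:(i + 1) * n_cols] = [True] * n_cols
        steps.append(dict(method='update',
                          args=[{'visible': visible}],
                          label=label))
    return steps


def make_figure(df):
    # imported here so the helpers above can be used without plotly installed
    import plotly.graph_objects as go

    traces, labels = build_traces(df)
    steps = build_steps(labels, len(VALUE_COLS))
    layout = dict(title='Unique Users by Community by Month',
                  barmode='group',
                  height=650,
                  sliders=[dict(active=len(labels) - 1, pad={'t': 1}, steps=steps)])
    return go.Figure(data=traces, layout=layout)
```

With the dataframe from this post it would be called as make_figure(df_users_community).show(). The colors are set explicitly per column so that a category keeps the same color in every month, regardless of how default colors would be assigned.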

Again, any suggestions/thoughts are greatly appreciated. Thank you for reading down to this point.

Thank you for your help.

Hi @sdmcs,

Here you go:

import plotly.express as px
import pandas as pd

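# df_users_community: the wide-format dataframe from the question (e.g. df_users_community = pd.read_excel() ??)

# the count columns to plot on the y-axis
communities = ['bm_cnt', 'cp_cnt', 'fm_cnt', 'res_cnt', 'spu_cnt', 'sr_cnt']  # or df_users_community.columns[3:]

# wide-form input: one bar per count column, grouped by community, one animation frame per month
fig = px.bar(df_users_community, x='community_name', y=communities, animation_frame='event_dt',
             labels={'variable': 'category, whatever', 'value': 'count, value, whatever'},
             barmode='group', title='your title')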
fig.show()

Have a look at Plotly Express. It simplifies plotting data held in dataframe structures A LOT.

Merry Christmas!

Alex-

Thank you for the information, Alex. I will definitely check this out and follow up. Nice Christmas present for me! Thank you!

Thank you again, Alex. This was a really big help!

Hi @Alexboiboi, how can I group the x-axis by communities instead of community_name?
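One possible approach, offered as a sketch rather than a confirmed answer from this thread: reshape the dataframe to long format first, then put the category column on the x-axis and color the bars by community_name. This assumes the same column names as in the question. The helper names reshape_for_category_axis and plot_by_category are made up for illustration.

```python
import pandas as pd

# the same list that is called `communities` in the answer above
VALUE_COLS = ['bm_cnt', 'cp_cnt', 'fm_cnt', 'res_cnt', 'spu_cnt', 'sr_cnt']


def reshape_for_category_axis(df_wide):
    """Wide -> long: one row per (month, community, category)."""
    return df_wide.melt(
        id_vars=['event_dt', 'community_name'],
        value_vars=VALUE_COLS,
        var_name='variable',
        value_name='value',
    )


def plot_by_category(df_wide):
    # imported here so the reshaping helper can be used without plotly installed
    import plotly.express as px

    long_df = reshape_for_category_axis(df_wide)
    # categories on the x-axis, one colored bar per community inside each group
    return px.bar(long_df, x='variable', y='value', color='community_name',
                  animation_frame='event_dt', barmode='group')
```

It would be called as plot_by_category(df_users_community).show().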