Multi-categorical Axes: How to order the categories?

Hi there,

In my dataset, I have quarterly observations, and after rebuilding the index I usually get something like this:

          Value_A  Value_B  Year  Quarter
NaT           NaN        1    -1       -1
2007Q1          …        …     …        …
2007Q2          …        …     …        …
…               …        …     …        …
2015Q4        NaN        1  2015        4
2016Q1        NaN        1  2016        1
2016Q2        NaN        1  2016        2
2016Q3         36        1  2016        3

The NaT rows are there on purpose: they reflect inconsistencies and I want to keep them. Therefore, my xaxis is a multicategorical axis, not a datetime axis.

I get the expected output, which is:
StartFromQ12007

It’s worth mentioning that:
a) my data are already properly sorted in the dataframe,
b) when plotting my traces, they are plotted by Quarter, like this:

for i in [-1, 1, 2, 3, 4]:
   trace=go.Bar(...)

Problems arise if the first observation for a specific year is for Q2, Q3, or Q4.
For example, say we have:
          Value_A  Value_B  Year  Quarter
NaT           NaN        1    -1       -1
2015Q4        NaN        1  2015        4
2016Q1        NaN        1  2016        1
2016Q2        NaN        1  2016        2
2016Q3         36        1  2016        3

In such a case, I get this output:

So, either
a) I find a way to force the multicategorical axis to be ordered by year
(I tried categoryorder="array" with categoryarray=["-1", "2007", "2008", …, "2021"], but it didn't work),
or
b) I apply a workaround: if the first observation in my dataset is not for Q1, I insert placeholder data for Q1.

Solution a) would be better, but I can’t find the way. Does anyone know a way to sort a multicategorical axis? The references give an example for a categorical array, not a multicategorical one (Categorical Axes | Python | Plotly)

Edit: In the meantime, I found this discussion: Extend `categoryorder` to multicategory axes · Issue #3908 · plotly/plotly.js · GitHub
and also this one: Multi-category sorting bug with missing categories · Issue #3723 · plotly/plotly.js · GitHub

While I was looking for a way to sort a multicategorical axis, I found my own post, written two years earlier…

I have an xaxis whose external categories are [2017, 2017, 2017, 2017, 2018, 2018, 2018, 2018]
and inner categories are [Q1, Q2, Q3, Q4].

Because some traces don’t always have data for each date, it starts with Q2…
image

My xaxis already has categoryorder="category ascending" set, but it looks like it is only applied to the external level (?)

Is there any workaround to fix that?

It happens because the first external category does not contain all the inner categories:
image

So, I assume the starting point is making sure that I do have data for Q1 if the dataset starts at Q2, for Q1 and Q2 if it starts at Q3, etc., filling the blanks with 0 or NA.
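The padding idea above can be sketched with pandas alone. This is a minimal sketch under stated assumptions: the frame has a quarterly PeriodIndex, any NaT rows have been set aside beforehand and are re-attached afterwards, and pad_to_first_quarter is a hypothetical helper name, not part of any library.

```python
import pandas as pd


def pad_to_first_quarter(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df whose quarterly index starts at Q1 of its first year.

    Hypothetical helper: assumes df has a PeriodIndex with quarterly frequency
    and contains no NaT entries.
    """
    first = df.index.min()
    start = pd.Period(f"{first.year}Q1", freq="Q")
    full_index = pd.period_range(start=start, end=df.index.max(), freq="Q")
    padded = df.reindex(full_index)  # new rows are filled with NaN
    # Recompute the helper columns so the padded rows carry a year and quarter.
    padded["Year"] = padded.index.year
    padded["Quarter"] = padded.index.quarter
    return padded


idx = pd.PeriodIndex(["2015Q4", "2016Q1", "2016Q2", "2016Q3"], freq="Q")
df = pd.DataFrame(
    {
        "Value_A": [float("nan"), float("nan"), float("nan"), 36.0],
        "Value_B": [1, 1, 1, 1],
        "Year": [2015, 2016, 2016, 2016],
        "Quarter": [4, 1, 2, 3],
    },
    index=idx,
)

padded = pad_to_first_quarter(df)
print(padded)
```

The padded rows hold NaN in the value columns; whether to keep them as NA or replace them with 0 (for example with fillna) depends on how the bars should render.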

For other readers who might have the same issue, it seems some work has been done to fix it here: #2175 & #3723 - Multicategory Multilevel 2+ & Sorting Multicategory by richardnm-2 · Pull Request #6327 · plotly/plotly.js · GitHub


Hi @David22, thanks for the update. It saved me time on research. Plotly’s multi-categorical axes won’t order months/quarters in ascending order in corner cases where at least one year doesn’t have a full 12 months of data.

Since the bug does not seem likely to be fixed soon, Plotly’s multi-categorical axes are not usable when a time series on the xaxis contains year(s) with fewer than 12 months of data, so I searched for a workaround. Inspired by this annotation approach and matplotlib’s secondary xaxis approach, and with help from AI, I came up with one: place the years below the months/quarters as text annotations (x in axis coordinates, y in “paper coordinates” (docs)), then draw the grouping lines using minor ticks. It’s not perfect, but it is close enough for my needs, so I thought I’d share it for anyone who finds themselves in a similar situation.

Drawback: when panning or zooming in, the years might disappear, because their x-coordinates are tied to the xaxis scale when using xref="x". I think this can be fixed by using FigureWidget’s layout.on_change or Dash’s relayoutData to recalculate the xaxis range. The reproducible code below doesn’t address this drawback, as the potential fix seems slightly too complicated for my use case.

Output:

Reproducible example:

import numpy as np
import pandas as pd
import plotly.graph_objects as go

# Generate sample data:
np.random.seed(42)  # for reproducibility
num_months = 25
dates = pd.date_range("2020-05-01", periods=num_months, freq="MS")
revenue = np.random.randint(50000, 150000, size=num_months)
df = pd.DataFrame({"month": dates, "revenue": revenue})
# Plot:
fig = go.Figure()
fig.add_trace(go.Bar(x=df["month"], y=df["revenue"]))

# Calculate the OUTER minor ticks' x-coordinates for year grouping lines:
offset = pd.DateOffset(days=17)  # 17 days: prevents lines from overlapping with major tick labels
x_min = min([min(trace_data.x) for trace_data in fig.data]) - offset
x_max = max([max(trace_data.x) for trace_data in fig.data]) + offset
minor_ticks = [x_min, x_max]

# Calculate the years' x-coordinates:
years = df["month"].dt.year.unique()
for year in years:
    df_year = df[df["month"].dt.year == year]
    
    # Calculate the years' x-coordinates at the center points:
    midpoint = (
        df_year["month"].min() + (df_year["month"].max() - df_year["month"].min()) / 2
    )

    # Calculate the INNER minor ticks' x-coordinates for year grouping lines.
    # The last year is excluded to avoid a duplicate line overlapping the OUTER minor tick:
    if year != years.max():
        minor_ticks.append(df_year["month"].max() + offset)

    # Add the year labels as annotations:
    fig.add_annotation(
        xref="x",  # Relatively-positioned annotation: x-coordinates are with respect to xaxis
        yref="paper",  # Absolutely-positioned annotation
        x=midpoint,
        y=-0.15,
        text=str(year),
        showarrow=False,
    )

fig.update_xaxes(
    dtick="M1",
    tickformat="%b",
    tickangle=0,
    minor={"tickvals": minor_ticks, "ticklen": 50, "tickcolor": "gray"},  # draw lines
    range=[x_min, x_max],  # extend x-axis limits to match minor ticks range
    showline=True,
    linecolor="gray",
)
fig.update_layout(
    yaxis={"showgrid": False, "tickprefix": "$", "ticksuffix": " "},
    plot_bgcolor="rgba(0,0,0,0)",
)
fig.show()