How to plot multiple vertical lines based on the category filter selected by the user

Hello everyone,

We are trying to mark the changes that we have observed for each category as vertical lines on a line plot.

Note this is the link to the sample data that we were using: sample_dataset/sample.json at main · vasu228114/sample_dataset · GitHub

  1. We were able to get the correct graph when the category is static (the vertical lines appear exactly at the changes that we have identified). Code:
import pandas as pd
import plotly.graph_objects as go

# reading the dataframe from github, link: https://github.com/vasu228114/sample_dataset/blob/main/sample.json
df = pd.read_json('sample.json')

# group in categories
gb = df.groupby('category')

df2 = df.groupby('category')['changes_observed'].apply(lambda x: x.value_counts().index[0]).reset_index(name='changes')

                                                                                                               
# base figure
fig = go.Figure()


for cat in ['B']:
    fig.add_trace(go.Scatter(x=gb.get_group(cat)['time_period'], 
                    y=gb.get_group(cat)['scores'], 
                    meta=cat, 
                    name='scores', 
                    mode='lines'))
    
    for change in list(df2[df2['category'] == cat]['changes'])[0]:
        fig.add_vline(x=change)

fig.update_layout(
    xaxis_title='time period',
    yaxis_title="Scores",)
fig.show()

  2. But when we tried to replicate the above plot with a dropdown button (user input), we got the changes identified in all the categories instead of only the changes identified for the selected category.

Code is as follows:

import pandas as pd
import plotly.graph_objects as go

# reading the dataframe from github, link: https://github.com/vasu228114/sample_dataset/blob/main/sample.json
df = pd.read_json('sample.json')

# group in categories
gb = df.groupby('category')

df2 = df.groupby('category')['changes_observed'].apply(lambda x: x.value_counts().index[0]).reset_index(name='changes')

# base figure
fig = go.Figure()

for cat in df["category"].unique().tolist():
    fig.add_trace(go.Scatter(x=gb.get_group(cat)['time_period'], 
                    y=gb.get_group(cat)['scores'], 
                    meta=cat, 
                    name='scores', 
                    mode='lines', 
                    visible = False))
    
    for change in list(df2[df2['category'] == cat]['changes'])[0]:
        fig.add_vline(x=change)
        

    
# create buttons
fig.update_layout(
    xaxis_title='time period',
    yaxis_title="Scores",
    updatemenus=[
        {
            "buttons": [
                {
                    "label": c,
                    "method": "update",
                    "args": [{ "visible": [z.meta == c for z in fig.data]}],
                }
                for c in df["category"].unique().tolist()
            ]
        }
    ]
)
fig.show()

  3. We have also noticed that the plots get extremely slow to load when dealing with a lot of data, and sometimes this kills the JupyterLab kernel. Are we doing something wrong, or is there a way to speed things up? (We can't use Dash for our current use case.)

It would be great to get some answers and thanks in advance.

Hi @Vasu,

this looks quite similar to what I observed in my solution to your other topic. Did you manage to correct the graph behavior?

Try plotting the traces independently from each other and test your process of creating the vlines.
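
A note on why the dropdown does not affect the vertical lines: `fig.add_vline` adds a shape to the figure layout rather than a trace, so the `visible` list in the button `args` only hides and shows the scatter traces, and every shape stays on screen. One possible approach, sketched here under assumptions, is to skip `add_vline` inside the loop and instead give each button its own list of shapes as the second (layout) element of `args`. The helper names `vline_shape` and `build_buttons` and the example data are hypothetical, and the sketch assumes one trace per category, added in the same order as `categories`.

```python
def vline_shape(x):
    """Return a layout-shape dict for a full-height vertical line at x."""
    return {
        "type": "line",
        "xref": "x",      # x is given in data coordinates
        "yref": "paper",  # y spans the whole plotting area (0 to 1)
        "x0": x, "x1": x,
        "y0": 0, "y1": 1,
    }


def build_buttons(categories, changes_by_category):
    """Build one 'update' button per category.

    Assumes the figure holds exactly one trace per category,
    in the same order as `categories`.
    """
    buttons = []
    for cat in categories:
        buttons.append({
            "label": cat,
            "method": "update",
            "args": [
                # data update: show only the trace of this category
                {"visible": [c == cat for c in categories]},
                # layout update: replace all shapes with this category's lines
                {"shapes": [vline_shape(x) for x in changes_by_category.get(cat, [])]},
            ],
        })
    return buttons


# Hypothetical example data, not taken from sample.json
categories = ["A", "B"]
changes_by_category = {"A": [3, 7], "B": [5]}
buttons = build_buttons(categories, changes_by_category)
```

The result could then be passed as `fig.update_layout(updatemenus=[{"buttons": buttons}])`. Because each button replaces the whole `shapes` list, only the lines of the selected category should remain.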

Hi @AIMPED,

Yes, your answer to that question (How to get proper legend on a filtered line graph) was working fine in the case of legend and I marked that one as the solution as well, thanks for that.

This was the solution to the question that you answered and it was working perfectly fine:

Right now we are trying to apply the category filter to plot multiple vertical lines on top of the line chart and are facing this issue (a slightly different use case compared to the legend one).

Hi @Vasu,

I was referring to the differences between “my solution” and your initial chart using plotly.express.

If you compare these two side by side, you will see a difference for category ‘B’:

initial:

my solution:

As I said, something gets mixed up and by the looks of it, you have something similar with the vlines now.
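
If the `add_vline` shapes turn out to be the part that gets mixed up, another option is to draw the vertical lines of each category as one extra scatter trace, with `None` entries breaking the line between segments. Giving that trace the same `meta` as the score trace means the existing `z.meta == c` check in the buttons would toggle both together. Drawing one trace per category instead of one shape per change may also lighten the figure when there are many changes, though that is untested here. The helper name `vlines_as_trace_coords` and the sample values are made up for illustration.

```python
def vlines_as_trace_coords(changes, y_min, y_max):
    """Return x and y lists that draw one vertical segment per change.

    Each segment runs from y_min to y_max; the trailing None
    breaks the line so that segments are not joined to each other.
    """
    x, y = [], []
    for c in changes:
        x.extend([c, c, None])
        y.extend([y_min, y_max, None])
    return x, y


# Hypothetical change positions and y-range
x, y = vlines_as_trace_coords([5, 9], 0.0, 1.0)
```

Inside the category loop this could be used as `fig.add_trace(go.Scatter(x=x, y=y, mode="lines", meta=cat, visible=False, showlegend=False))`, with `y_min` and `y_max` taken from that category's `scores`.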