Hello everyone,
We are trying to mark the changes we have identified for each category as vertical lines on a plot of that category's scores.
Note: the sample data we are using is at https://github.com/vasu228114/sample_dataset/blob/main/sample.json
- We get the correct graph when the category is hard-coded (the vertical lines appear exactly at the changes we have identified). Code:
import pandas as pd
import plotly.graph_objects as go
# reading the dataframe from github, link: https://github.com/vasu228114/sample_dataset/blob/main/sample.json
df = pd.read_json('sample.json')
# group in categories
gb = df.groupby('category')
df2 = df.groupby('category')['changes_observed'].apply(lambda x: x.value_counts().index[0]).reset_index(name='changes')
# base figure
fig = go.Figure()
for cat in ['B']:
    fig.add_trace(go.Scatter(x=gb.get_group(cat)['time_period'],
                             y=gb.get_group(cat)['scores'],
                             meta=cat,
                             name='scores',
                             mode='lines'))
    # one vertical line per change identified for this category
    for change in list(df2[df2['category'] == cat]['changes'])[0]:
        fig.add_vline(x=change)
fig.update_layout(
    xaxis_title='time period',
    yaxis_title="Scores",)
fig.show()
- But when we try to replicate the above plot with a dropdown button (user input), the plot shows the changes identified for all categories instead of only the changes for the selected category.
Code is as follows:
import pandas as pd
import plotly.graph_objects as go
# reading the dataframe from github, link: https://github.com/vasu228114/sample_dataset/blob/main/sample.json
df = pd.read_json('sample.json')
# group in categories
gb = df.groupby('category')
df2 = df.groupby('category')['changes_observed'].apply(lambda x: x.value_counts().index[0]).reset_index(name='changes')
# base figure
fig = go.Figure()
for cat in df["category"].unique().tolist():
    fig.add_trace(go.Scatter(x=gb.get_group(cat)['time_period'],
                             y=gb.get_group(cat)['scores'],
                             meta=cat,
                             name='scores',
                             mode='lines',
                             visible=False))
    # one vertical line per change identified for this category
    for change in list(df2[df2['category'] == cat]['changes'])[0]:
        fig.add_vline(x=change)
# create buttons
fig.update_layout(
    xaxis_title='time period',
    yaxis_title="Scores",
    updatemenus=[
        {
            "buttons": [
                {
                    "label": c,
                    "method": "update",
                    "args": [{"visible": [z.meta == c for z in fig.data]}],
                }
                for c in df["category"].unique().tolist()
            ]
        }
    ]
)
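Our understanding is that `fig.add_vline` stores each line as a layout shape rather than attaching it to a trace, so the buttons above, which only toggle trace visibility, leave every line on screen. Below is a minimal sketch of what we are aiming for, assuming the second element of an `update` button's `args` can replace `layout.shapes`. The helper names `vline_shape` and `build_buttons` and the example values are made up for illustration and are not part of our real code.

```python
def vline_shape(x):
    """Describe a vertical line at x that spans the full plot height."""
    return {"type": "line", "xref": "x", "yref": "paper",
            "x0": x, "x1": x, "y0": 0, "y1": 1}


def build_buttons(changes_by_category, trace_categories):
    """Build one dropdown button per category (hypothetical helper).

    changes_by_category: dict mapping category -> list of x positions.
    trace_categories: category of each trace, in the order of fig.data.
    """
    buttons = []
    for cat, changes in changes_by_category.items():
        buttons.append({
            "label": cat,
            "method": "update",
            "args": [
                # data update: show only the traces of this category
                {"visible": [tc == cat for tc in trace_categories]},
                # layout update: replace all shapes with this category's lines
                {"shapes": [vline_shape(x) for x in changes]},
            ],
        })
    return buttons


# Made-up example values, not taken from sample.json
changes_by_category = {"A": [3, 7], "B": [5]}
trace_categories = ["A", "B"]
buttons = build_buttons(changes_by_category, trace_categories)
```

With the figure above, `trace_categories` would be `[z.meta for z in fig.data]`, `changes_by_category` could be built with `dict(zip(df2['category'], df2['changes']))`, the `fig.add_vline` calls would be dropped, and the result passed as `fig.update_layout(updatemenus=[{"buttons": buttons}])`. Building the shapes as plain dicts might also be lighter than calling `fig.add_vline` once per change, though we have not measured that.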
- We have also noticed that the plots become extremely slow to load with a lot of data, and sometimes the JupyterLab kernel dies. Are we doing something wrong, or is there a way to speed things up? (We can't use Dash for our current use case.)
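For context, one mitigation that might apply is thinning each category's series before adding it as a trace. The sketch below assumes that evenly spaced subsampling is acceptable for our data, which may not hold if short spikes in the scores matter. `thin_series` and `max_points` are hypothetical names, and the inputs are plain lists (for example from `.tolist()`).

```python
def thin_series(xs, ys, max_points=2000):
    """Return at most max_points evenly spaced points, keeping both ends."""
    if max_points < 2:
        raise ValueError("max_points must be at least 2")
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    n = len(xs)
    if n <= max_points:
        return list(xs), list(ys)
    # ceiling division so the stride is large enough to respect max_points
    step = -(-(n - 1) // (max_points - 1))
    idx = list(range(0, n, step))
    if idx[-1] != n - 1:
        idx.append(n - 1)  # always keep the final point
    return [xs[i] for i in idx], [ys[i] for i in idx]


# Made-up example: 11 points reduced to at most 4
x_thin, y_thin = thin_series(list(range(11)), list(range(100, 111)), max_points=4)
```

The thinned lists would then be passed as `x=` and `y=` to `go.Scatter` in place of the full columns.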
Any answers would be greatly appreciated. Thanks in advance.