Line Chart not displaying Lines for multiple Years

mrel · April 28, 2023, 8:24pm

Hi everyone!

I am trying to create a WebApp that would display some charts based on data downloaded from an Oracle DB.

I have a separate Python file where I connect to the server=>download the data into Pandas dataframe=>save the data as Parquet file. Then in a separate Python file where I am creating a Dash app. I import the Parquet file saved by the first file, connect some callbacks and display the line chart.

Data

contains about 50k rows
is monthly (Jan-Dec) product data for last 3 years of the business
looks something like this

This code is the graph part of my WebApp

 @app.callback(
    Output('our_graph','figure'),
    [Input('product','value'),
    Input('type','value'),
    Input('category','value')])

def build_graph(first, second, third):
    dff=df[(df['product']==first)&(df['type']==second)&(df['category']==third)]
    fig = px.line(dff, x="month", y="sum", color="year",height=600)
    fig.update_layout(xaxis=dict(tickmode='linear'),yaxis={'title':'Amount'}, title 
{'text':'Product Analysis','font':{'size':28},'x':0.5,'xanchor':'center'})
    
    return fig

I am basically trying to filter data by a product/category/type and then see how they were doing over last 3 years. Each line on the chart should represent a year (2021, 2022, 2023).

What I am expecting to get is something like this (made in excel): https://i.stack.imgur.com/OpWxV.png

But what I am getting is this

I have been trying to figure out what the problem but can’t find any solution. Lines are overlapping and chart is useless.

I’d appreciate any recommendations and help. Thanks in advance.

AIMPED · April 28, 2023, 10:04pm

Hi, check the dtype of dff["month"]

mrel · April 29, 2023, 7:30am

Hi, @AIMPED !

dff[“month”] is int64.

Here are data types of other columns:
year int64
month int64
company object
product object
type object
category object
count int64
sum float64
currency object
customer object

Am I having problem because Plotly is not able to group/aggregate data and it is simply stacking/overlapping multiple values for years/months ?

AIMPED · April 29, 2023, 8:23am

Try changing the month column to datetime.

mrel · April 29, 2023, 2:47pm

I did it as you suggested df[‘month’]= pd.to_datetime(df[‘month’])
To be sure that it worked, I checked dtypes and it is showing “datetime64[ns]” format for the “month” column.
But it didn’t work. Data in “month” column is in digits, meaning: 1,2,3,4,5,6,7,8,9,10,11,12. I guess, that is why applying the date format is not helping.
This is how graph looks like after thanging the format.

AIMPED · April 29, 2023, 3:37pm

Hi @mrel could you share an extract of your data? It’s pretty difficult to help you without being able to reproduce this.

mrel · April 29, 2023, 3:53pm

Hi, @AIMPED .
Please find the edited version (in csv) here.

I hope it works and you can see the csv file.

AIMPED · April 29, 2023, 4:03pm

mrel:

    dff=df[(df['product']==first)&(df['type']==second)&(df['category']==third)]
    fig = px.line(dff, x="month", y="sum", color="year",height=600)
    fig.update_layout(xaxis=dict(tickmode='linear'),yaxis={'title':'Amount'}, title 
{'text':'Product Analysis','font':{'size':28},'x':0.5,'xanchor':'center'})

This seems to be something different. What do you want to plot? Are you using dash?

	year	month	billed_product	type	category	sum	currency
0	2021	1	A	K	M	51.890	EUR
1	2021	1	A	K	M	208.290	EUR
2	2021	1	A	K	M	0.210	EUR
3	2021	1	A	K	M	47.336	EUR
4	2021	1	A	K	M	7.422	EUR

mrel · April 29, 2023, 4:25pm

Yes, I am using Dash to make a WebApp with three callbacks (for three filters: product filter, type filter and category filter).
X-axis should show months (Jan-Dec, 1-12).
Y-axis should represent total sums.
The graph itself should consist of 3 lines (one line for 2021, second line for 2022, third line for 2023).
Depending on filters chosen (products/types/categories), lines should show monthly totals for per year.
This (made in Excel) is what I am trying to do.

AIMPED · April 29, 2023, 5:44pm

I understand. You will have to prepare your data, you can’t use the dataframe as it is if you want to achieve your result.

You are mentioning the problem right here:

AIMPED · April 29, 2023, 6:54pm

I played around a bit, maybe that helps you to get started:

import dash
from dash import Input, Output, State, html, dcc
import plotly.graph_objects as go
import pandas as pd

df = pd.read_csv('data_test.csv')
gb = df.groupby(['year', 'month'])

app = dash.Dash(__name__)
app.layout = html.Div(
    [
        html.Div(
            [
                dcc.Dropdown(id='prod', options=df.billed_product.unique()),
                dcc.Dropdown(id='type', options=df.type.unique()),
                dcc.Dropdown(id='cat', options=df.category.unique()),
                html.Button(id='btn', children='submit')
            ],
            style={'width': '10%'}
        ),
        dcc.Graph(id='graph')
    ]
)


@app.callback(
    Output('graph', 'figure'),
    State('prod', 'value'),
    State('type', 'value'),
    State('cat', 'value'),
    Input('btn', 'n_clicks'),
    prevent_initial_call=True
)
def update(prod, tpe, cat, _):
    years = df.year.unique()
    months = df.month.unique()

    d = {y: [] for y in years}

    for y in years:
        monthly_total = []
        for m in months:
            try:
                s = gb.get_group((y, m))
            except KeyError:
                break
            monthly_value = s.query(f"billed_product == '{prod}' & type == '{tpe}' & category == '{cat}'")['sum'].sum()
            monthly_total.append(monthly_value)
        d[y] = monthly_total

    fig = go.Figure(
        layout={
            'title': f'billed_product:{prod}, type: {tpe}, category: {cat}',
            'xaxis_title': 'months',
            'yaxis_title': 'sum'
        }
    )
    for y in years:
        fig.add_scatter(
            x=months,
            y=d[y],
            name=str(y),
            showlegend=True
    )
    return fig


if __name__ == '__main__':
    app.run(debug=True, port=8051)

app

Note: The try/except block was necessary because you don’t have data for the full year of 2023 (which is understandable). So when looping over all months, you get a key error when you try to access the group of month 5 (may).

mred

mrel · April 29, 2023, 8:04pm

Thanks a lot for all your time!

It seems like I solved the issue.
All I needed is to add this line of code:

dff=dff.groupby(["year","month","product","type","category"])["sum"].sum().reset_index()

This allowed the data to be grouped/aggregated first and only then passed down to Plotly to visualize it.

Thank you!

AIMPED · April 29, 2023, 8:13pm

Glad you figured it out!

Could you post the working code of your callback here?

mrel · April 29, 2023, 8:23pm

Sure!
Here it is:

@app.callback(
    Output('our_graph','figure'),
    [Input('product','value'),
    Input('type','value'),
    Input('category','value')])

def build_graph(first, second, third):
    dff=df[(df['product']==first)&(df['type']==second)&(df['category']==third)]
    dff=dff.groupby(["year","month","product","type","category"])["sum"].sum().reset_index()
    #need to "import calendar" for the following line
    dff['month_name'] = dff['month'].apply(lambda x: calendar.month_abbr[x])
    fig = px.line(dff, x="month", y="sum", color="year",height=500, labels={
                     "month_name": "Months",
                     "sum": "Amounts (€)",
                     "year": "Years"
                     })
    fig.update_layout(xaxis=dict(tickmode='linear'), title {'text':'Product Analysis','font':{'size':28},'x':0.5,'xanchor':'center'})
    
    return fig

EmadElewa · December 25, 2023, 9:26am

I have the same issue with my data set. but I am getting the data from MS SQL. and it is grouped and aggregated already but still displaying with this useless form.

mrel · December 25, 2023, 4:39pm

Hi @EmadElewa ,

My data is also coming from SQL. I pull data, group and aggregate it and pass it to the graph. putting sort=False, as_index = False is what solved most of issues for me.

This is what I am using in my code:

df.groupby(["period", "bill"], sort=False, as_index = False)["amount"].aggregate(np.sum)

Replace df name and column names and see if it works.

EmadElewa · December 30, 2023, 6:17pm

Thanks @mrel. it works thanks. I just added sort function the the SQL query and it worked.

Topic		Replies	Views
Line graph with monthly data points over multiple years 📊 Plotly Python	2	11728	August 23, 2018
Multiple Line Charts Dash Python	8	16519	March 2, 2020
Making a line chart using 1 DataFrame Dash Python	1	1412	July 15, 2019
Plotly Dash Line Graph Using Dates and price doesn't work Dash Python	1	1094	August 4, 2021
Dash Doesn't Show Plotly Line graph Dash Python	3	1078	April 9, 2020

Line Chart not displaying Lines for multiple Years

Related topics