Trouble with dataframe slicing (using iloc and loc) in Dash

darkreqnx · September 30, 2019, 2:23pm

I have been trying to build a multi-page Dash app based on how nedned had mentioned in Splitting callback definitions in multiple files. My directory structure pretty much follows the same blueprint:

run.py
project/
	__init__.py
	app.py
	server.py
	callbacks1.py
	tab1_layout.py
	tab1_callbacks.py
	wsgi.py

app.py is like an index file that has an overarching layout for the dashboard. The dashboard’s contents are to be split across tabs, for which I decided to split the code across files dedicated to those tabs’ contents, and corresponding files for dedicated callbacks (like tab1_layout.py and tab1_callbacks.py).

The problem started when I tried to write a callback with some pandas dataframe manipulation/calculation in tab1_layout.py. And it was simple slicing using iloc/loc. For some reason, it is throwing me a positional-index-out-of-bounds error when I am facing no such problem with the same operations in a Jupyter notebook or even a standalone vanilla python file.

Other than listing through iloc/loc, I tried out two other methods to calculate dff which worked:

########### different ways to calculate dff ###########

# direct listing using iloc
dff = dff_buff.iloc[[0, 1, 2, 4, 5, 25, 56]].reset_index(drop=True)

# chaining using append() from Pandas
chain_1 = dff_buff.iloc[0:3].append(dff_buff.iloc[4:6]) 
chain_2 = chain_1.append(dff_buff.iloc[25:26])
dff = chain_2.append(dff_buff.iloc[56:57])

# reset_index() without dropping and filtering based on old index
dff_buff_1 = dff_buff.reset_index()
dff = dff_buff_1[(dff_buff_1['index'] == 0) 
   | (dff_buff_1['index'] == 1) 
   | (dff_buff_1['index'] == 2)
   | (dff_buff_1['index'] == 4) 
   | (dff_buff_1['index'] == 5) 
   | (dff_buff_1['index'] == 25)
   | (dff_buff_1['index'] == 56)]

#######################################################

However, even after this, Dash throws a callback error (KeyError: None) at the sum(axis=1) function, which led me to think that perhaps that iloc/loc mess-up is something I cannot ignore.

For reference (data at: https://github.com/darkreqnx/sales_file):

import numpy as np
import pandas as pd

xlsx = pd.ExcelFile('./data/sales_data.xlsx')
df_oper = pd.read_excel(xlsx, 'oper')

states_list = list(df_oper)[2:]
months_list = list(df_oper.month.value_counts(dropna=False).reset_index()['index'])

month_selected = 'Sep-15'
districts_selected = ['district_1', 'district_6']

op_tb1 = pd.DataFrame([], columns=['Quick figures'])
op_tb2 = pd.DataFrame([], columns=['No of units sold'])

dff_buff = df_oper[df_oper.month == month_selected].reset_index(drop=True)

#######################################################

# calculate dff

#######################################################

df1 = dff.copy()
df1['sales_metric'][0] = 'Primary Sales'
df1['sales_metric'][1] = 'Returned Sales'
df1['sales_metric'][2] = 'Accepted Sales'
df1['sales_metric'][3] = 'Net Sales'
df1['sales_metric'][4] = 'Invalid Sales'

df2 = df1.iloc[0:5]
df2['district_6'][2] = df1['district_6'].iloc[6]
df2['district_6'][3] = df1['district_6'].iloc[5]
df2['district_6'][4] = df1['district_6'].iloc[6] - df1['district_6'].iloc[5]

op_tb1['Quick figures'] = df2['sales_metric']
op_tb2['No of units sold'] = df2[districts_selected].sum(axis=1)


print(op_tb1)
print(op_tb2)


tr_breaker = input("\nend of code block check")

Another interesting thing I found was that the positional index error wasn’t thrown when I used iloc in an iloc[start:stop] manner, instead of an iloc[i1, i2, i3, …]. Can somebody please help me understand where I am going wrong with iloc/loc, and also help resolve the issue with the sum() function?

Error due to iloc/iloc: error_1
and error at sum(axis=1): error_2

nedned · October 1, 2019, 1:58am

It’s odd that you’re not getting the error in a Jupyter notebook but are in your Dash app. Although it still sounds like a Pandas issue rather than a Dash one. Perhaps try to verify that you have the exact same environment in your Dash app and in your notebook (eg same versions of all relevant libraries.) and your dataframe at the start of the callback is exactly the same as being used in the notebook. My guess is that there could be a subtle difference between the two states.

Topic		Replies	Views
Please help me breakdown and understand this code Dash Python question	2	606	September 14, 2022
Callback confusion Dash Python	11	3814	January 18, 2020
How does one modify a dataframe with a callback? [former: circular import errors] Dash Python	42	188	October 28, 2024
Rangeslider callback with df.column slicing Dash Python	1	419	January 10, 2020
Handle global dataframe Dash Python	3	2269	April 12, 2021

Trouble with dataframe slicing (using iloc and loc) in Dash

Related topics