Hi!
I am new to Dash and am trying to implement an application that scrapes data online and creates graphs from the scraped data.
The basic layout looks like this:
It takes in the date as input and scrapes data online. The callback part looks like this:
@app.callback(
    [Output('overall-barchart-by-brand', 'figure'),
     Output('overall-piechart-by-brand', 'figure')],
    [Input('submit-date', 'n_clicks')],
    [State('start-year', 'value'),
     State('start-month', 'value'),
     State('end-year', 'value'),
     State('end-month', 'value')]
)
def update_main_charts_by_brand(n_clicks, start_year, start_month, end_year, end_month):
    # scrape data online:
    df = pd.DataFrame()
    url = 'https://xl.16888.com/brand-0-' + start_year + start_month + '-' + end_year + end_month + '-1.html'
    current_page = 1
    page = requests.get(url)
    soup = bs4.BeautifulSoup(page.content, 'lxml')
    num_of_pages = 2
    while current_page <= num_of_pages:
        page = requests.get(url)
        soup = bs4.BeautifulSoup(page.content, 'lxml')
        table = soup.find(name='table', attrs={'class':'xl-table-def'})
        df = df.append(table_to_df(table))
        if current_page != num_of_pages:
            url = 'https://xl.16888.com/brand-0-' + start_year + start_month + '-' + end_year + end_month + '-' + str(current_page+1) + '.html'
        current_page += 1
    df.columns = ['Rank', 'drop', 'Brand', 'Country', 'Sales', 'Percentage', 'Other_info']
    df = df.dropna()
    df.index = df.Rank
    bar_chart = go.Figure([go.Bar(x=df['Brand'].values, y=df['Sales'].values)])
    # clean data for pie chart
    df_percent = df[df['Percentage'] != '-']
    pie_chart = go.Figure([go.Pie(labels=df_percent['Brand'], values=[float(percent.strip('%'))/100 for percent in df_percent['Percentage']])])
    return bar_chart, pie_chart
I tried this function in a Jupyter notebook and it works fine, but I get this error when running the Dash application:
ValueError: Length mismatch: Expected axis has 1 elements, new values have 7 elements
I think the data is not being scraped correctly, which causes the problem when the column names are assigned. Any idea how I can solve this?
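For reference, here is a minimal sketch of how this kind of error arises and how it could be caught earlier. It assumes the scraped frame ends up with a single column (for example because the page returned inside Dash is not the one expected); the helper `name_columns`, the constant `EXPECTED_COLUMNS` and the sample frames are hypothetical and not part of the app above.

```python
import pandas as pd

# Column names taken from the callback above.
EXPECTED_COLUMNS = ['Rank', 'drop', 'Brand', 'Country', 'Sales', 'Percentage', 'Other_info']


def name_columns(df, names=EXPECTED_COLUMNS):
    """Assign column names only if the frame has the expected width.

    Hypothetical helper for illustration, not part of the original callback.
    """
    if df.shape[1] != len(names):
        raise ValueError(
            f"scraped table has {df.shape[1]} column(s), expected {len(names)}; "
            "check the URL and the HTML that was returned"
        )
    out = df.copy()
    out.columns = names
    return out


# A one-column frame stands in for a page that was not parsed as expected.
bad = pd.DataFrame({0: ['unexpected page content']})
try:
    bad.columns = EXPECTED_COLUMNS  # same kind of assignment as in the callback
except ValueError as err:
    print(err)  # pandas reports a length mismatch here

# A seven-column frame (made-up values) passes the check.
good = pd.DataFrame([[1, None, 'A', 'X', 100, '5%', '']])
print(name_columns(good).columns.tolist())
```

Checking the width before the assignment does not fix the scraping itself, but it would show whether the problem lies in the values arriving from the `State` inputs (and therefore in the URL) rather than in the column naming.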