I am attempting to create a choropleth map in Python. The map shows the number of babies given a specific name in each state. A slider changes the year and, therefore, the number of babies shown for each state. I have successfully pulled the data, placed it in a pandas DataFrame, and initialized the map with the data. I have two issues, best described with screenshots.
1) [screenshot]
2) [screenshot]
1) When the map first loads, the legend on the right seems to show numbers for all the years combined. The correct legend/numbers appear once the slider is moved (as seen in the second image).
2) The data for each year is not matching up with the year on the slider. See the hover box over California compared to the year selected on the slider. I have verified that the year data is accurate and the same for each state. The year on the map simply doesn’t match the year on the slider.
Below is my code. Any insights would be greatly appreciated.
# These are database tables for each state.
state_tables = ['BabyNameAlabama', 'BabyNameAlaska', 'BabyNameArizona', 'BabyNameArkansas', 'BabyNameCalifornia', 'BabyNameColorado', 'BabyNameConnecticut',
                'BabyNameDelaware', 'BabyNameDistrictColumbia', 'BabyNameFlorida', 'BabyNameGeorgia', 'BabyNameHawaii', 'BabyNameIdaho', 'BabyNameIllinois',
                'BabyNameIndiana', 'BabyNameIowa', 'BabyNameKansas', 'BabyNameKentucky', 'BabyNameLousiana', 'BabyNameMaine', 'BabyNameMaryland',
                'BabyNameMassachusetts', 'BabyNameMichigan', 'BabyNameMinnesota', 'BabyNameMissippi', 'BabyNameMissouri', 'BabyNameMontana', 'BabyNameNebraska',
                'BabyNameNevada', 'BabyNameNewHampshire', 'BabyNameNewJersey', 'BabyNameNewMexico', 'BabyNameNewYork', 'BabyNameNorthCarolina', 'BabyNameNorthDakota',
                'BabyNameOhio', 'BabyNameOklahoma', 'BabyNameOregon', 'BabyNamePennsylvania', 'BabyNameRhodeIsland', 'BabyNameSouthCarolina', 'BabyNameSouthDakota', 'BabyNameTennessee',
                'BabyNameTexas', 'BabyNameUtah', 'BabyNameVermont', 'BabyNameVirginia', 'BabyNameWashington', 'BabyNameWestVirginia', 'BabyNameWisconsin', 'BabyNameWyoming']
frames = []
# This for loop pulls rows based on the name and gender selected and adds them to the frames list.
for state in state_tables:
    table = getattr(models, state)
    state_high = table.query.filter_by(baby_name=selected_name, gender=selected_gender).all()
    df = pd.DataFrame([(d.state, d.year, d.baby_name, d.count, d.rank) for d in state_high],
                      columns=['state', 'year', 'baby_name', 'count', 'rank'])
    frames.append(df)
# This combines all the DataFrames into one.
new_df = pd.concat(frames)
year = '1910'
scl = [
    [0, 'rgb(242,240,247)'],
    [0.2, 'rgb(218,218,235)'],
    [0.4, 'rgb(188,189,220)'],
    [0.6, 'rgb(158,154,200)'],
    [0.8, 'rgb(117,107,177)'],
    [1.0, 'rgb(84,39,143)']
]
data_slider = []
# Build one choropleth trace per year.
for year in new_df['year'].unique():
    # .copy() so the string conversion below does not write into a view of new_df.
    df_segmented = new_df[new_df['year'] == year].copy()
    for col in df_segmented.columns:
        df_segmented[col] = df_segmented[col].astype(str)
    data_each_yr = dict(
        type='choropleth',
        autocolorscale=False,
        locations=df_segmented['state'],
        z=df_segmented['count'].astype(float),
        locationmode='USA-states',
        colorscale=scl,
        text="Rank: " + df_segmented['rank'] + " Year: " + df_segmented['year'],
        marker=dict(  # for the lines separating states
            line=dict(
                color='rgb(0,0,0)',
                width=2)),
        colorbar={'title': '# of Babies'})
    data_slider.append(data_each_yr)
steps = []
for i in range(len(data_slider)):
    step = dict(method='restyle',
                args=['visible', [False] * len(data_slider)],
                label='Year {}'.format(i + 1910))
    step['args'][1][i] = True
    steps.append(step)
sliders = [dict(active=0, pad={"t": 1}, steps=steps)]
new_layout = go.Layout(
    autosize=False, title='Number of Babies by State', width=1000, height=600,
    geo=go.layout.Geo(
        scope='usa',
        projection=go.layout.geo.Projection(type='albers usa'),
        showlakes=True,
        lakecolor='rgb(255, 255, 255)'),
    sliders=sliders,
)
fig_three = go.Figure(data=data_slider, layout=new_layout)
graphJSON_three = json.dumps(fig_three, cls=plotly.utils.PlotlyJSONEncoder)
return render_template('culture/baby_state_map.html', wins_graph_three=graphJSON_three)
If it helps, the top half of this image is the output for the combined DataFrame. The bottom half is the output for one state’s (California) DataFrame.
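To help narrow things down, here is a minimal sketch of the trace/slider logic with the database and Plotly figure removed, so it can be run on its own. It is not my actual code: the helper name `build_traces_and_steps` and the sample rows are made up for illustration. It tests two guesses that I have not confirmed. The first is that every trace is visible on the first render unless `visible` is set per trace, which might explain the combined legend. The second is that `unique()` returns years in order of appearance, so labelling steps with `i + 1910` could drift from the real year.

```python
from collections import defaultdict


def build_traces_and_steps(rows):
    """Group rows by year and build choropleth trace dicts plus slider steps.

    rows: iterable of dicts with 'state', 'year', 'count' and 'rank' keys
    (hypothetical stand-ins for the rows of the combined DataFrame).
    """
    by_year = defaultdict(list)
    for row in rows:
        by_year[int(row['year'])].append(row)

    # Sort explicitly rather than relying on the order the years appear in.
    years = sorted(by_year)

    traces = []
    for idx, year in enumerate(years):
        group = by_year[year]
        traces.append(dict(
            type='choropleth',
            locationmode='USA-states',
            locations=[r['state'] for r in group],
            z=[float(r['count']) for r in group],
            text=['Rank: {} Year: {}'.format(r['rank'], year) for r in group],
            # Assumption being tested: show only the first year's trace on load.
            visible=(idx == 0),
        ))

    steps = []
    for idx, year in enumerate(years):
        flags = [False] * len(traces)
        flags[idx] = True
        steps.append(dict(
            method='restyle',
            args=['visible', flags],
            # Label comes from the data, not from the loop index.
            label='Year {}'.format(year),
        ))
    return traces, steps


# Made-up sample rows, deliberately out of year order.
sample_rows = [
    {'state': 'CA', 'year': '1912', 'count': 10, 'rank': 5},
    {'state': 'CA', 'year': '1910', 'count': 7, 'rank': 9},
    {'state': 'NY', 'year': '1910', 'count': 4, 'rank': 12},
]
traces, steps = build_traces_and_steps(sample_rows)
for trace, step in zip(traces, steps):
    print(step['label'], trace['locations'], trace['visible'])
```

If these guesses are right, the returned lists would replace `data_slider` and `steps` in my code above, but I would appreciate confirmation that this is the actual cause.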