Hi everyone,
As part of my Python for Nonprofits project (which remains a work in progress), I’ve written a notebook that shows how to create choropleth maps with percentile-based bins. This approach helps prevent outliers from skewing your map’s color scale. The notebook isn’t finished yet, but I thought I’d share what I’ve written so far in case it helps others who are trying to create percentile-based maps.
Here’s the original map, which uses a linear colorscale and which my code shows how to update. (Note that a few outlier counties with particularly large population changes cause the other counties to appear relatively homogeneous.)
And here’s the updated map. Note that the colorscale has been updated to show percentile-based increments (~0th percentile, ~10th percentile, ~20th percentile, and so on) rather than linear increments. As a result, the map is much more colorful and easier to interpret.
To make these updates, I first changed the color argument of my map to a column that stores the percentile rank of each value rather than the value itself. (Such a column can be created via Series.rank(pct=True).) I then modified two properties of the map’s colorbar:
- colorbar_tickvals, which specifies where along the colorbar (e.g. from the lowest percentile to the highest) text labels should be added. (I set this property to specific percentile ranks within my dataset.)
- colorbar_ticktext, which specifies what text should be placed at those values. (I set this property to the actual values within my dataset that corresponded to the percentile ranks passed to colorbar_tickvals.)
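For anyone who wants a rough idea of these steps before opening the notebook, here is a minimal sketch. It is not the notebook's exact code. The function names (percentile_ticks, build_map) and the column names (pop_change, fips, pct_rank) are hypothetical, and I'm assuming a Plotly Express choropleth whose colorbar sits on the figure's shared color axis. If your map is built from a graph_objects trace instead, the same two properties would be set on the trace.

```python
import pandas as pd


def percentile_ticks(values: pd.Series, n_bins: int = 10):
    """Return (percentile ranks, tickvals, ticktext) for a percentile-based colorbar.

    Assumes `values` is numeric and has no missing entries.
    """
    # Percentile rank of each row, in (0, 1]; this becomes the map's color column.
    pct = values.rank(pct=True)

    tickvals, ticktext, seen = [], [], set()
    for i in range(n_bins + 1):
        target = i / n_bins  # roughly the 0th, 10th, 20th, ... percentile
        # Position of the row whose percentile rank is closest to the target.
        pos = int((pct - target).abs().to_numpy().argmin())
        if pos in seen:  # skip duplicate ticks in small datasets
            continue
        seen.add(pos)
        tickvals.append(float(pct.iloc[pos]))  # where the label goes
        ticktext.append(f"{float(values.iloc[pos]):,.0f}")  # what the label says
    return pct, tickvals, ticktext


def build_map(df: pd.DataFrame, geojson: dict,
              value_col: str = "pop_change", id_col: str = "fips"):
    """Hypothetical wrapper: color by percentile rank, label with actual values."""
    # Imported here so percentile_ticks can be used without Plotly installed.
    import plotly.express as px

    pct, tickvals, ticktext = percentile_ticks(df[value_col])
    fig = px.choropleth(
        df.assign(pct_rank=pct),
        geojson=geojson,       # features are matched on their "id" by default
        locations=id_col,
        color="pct_rank",      # percentile ranks drive the color scale
        hover_data=[value_col],
        scope="usa",
    )
    # Replace the 0-to-1 percentile labels with the underlying data values.
    fig.update_coloraxes(colorbar_tickvals=tickvals, colorbar_ticktext=ticktext)
    return fig
```

Because the ticks are taken from actual rows in the dataset, the labels land at approximately (not exactly) the 0th, 10th, 20th, etc. percentiles, which is why I described them with a "~" above.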
The code is released under the MIT license, so feel free to repurpose it for your own projects (including commercial ones).
As time allows over the next week or so, I’ll consolidate this code into a function so that it can be applied more easily to different projects. I plan to add that function to the same Mapping section of Python for Nonprofits that contains the notebook I linked to above.