A map of Virginia with color coded districts

Data Science for Social Good Team Builds Tools to Support Fairness in Computational Redistricting

An example of the use of computational redistricting tools to sample potential plans in Virginia.

By: Emily Keller-O’Donnell

This fall, Congressional leaders will begin the state redistricting process that takes place every ten years to reflect population changes captured by the U.S. Census. As the process gets underway, a team in the University of Washington Data Science for Social Good (DSSG) program is creating a resource guide to help non-partisan groups that want to evaluate the fairness of proposed maps, or draw their own, using a computational redistricting tool called GerryChain.

The project, “Developing Ensemble Methods for Initial Districting Plan Evaluation,” is one of two projects hosted this year at the UW’s eScience Institute. The DSSG summer program brings together students, stakeholders, data scientists and domain researchers to work on project teams for a 10-week period.

The interdisciplinary project team consists of four student fellows from universities around the country, working with eScience data scientists Bernease Herman and Vaughn Iverson, and project lead Daryl DeFord, an Assistant Professor of Data Analytics in the Department of Mathematics and Statistics at Washington State University.

A range of computational and statistical methods are applied to the redistricting process, because there is an astronomically large number of ways to draw geographic district boundaries for each state. The team is using a computational method called Markov chain Monte Carlo (MCMC) sampling, which attempts to generate non-partisan redistricting plans by randomly sampling, under criteria determined by each state, to create a distribution of possible plans. This yields a collection (or “ensemble”) of districting plans that modelers can use to produce statistics for evaluating proposed plans. GerryChain is a Python library for building ensembles of plans using MCMC methods, developed by the Metric Geometry and Gerrymandering Group (MGGG) Redistricting Lab at Tufts University with extensive work by DeFord. These tools are aimed at detecting gerrymandering, the practice of manipulating district boundaries to favor certain political parties or demographic groups.
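For readers who want to see the ensemble idea in concrete terms, here is a minimal, self-contained sketch in plain Python. It does not use GerryChain's API and is not the team's method: the 4x4 grid, the constants and every function name are hypothetical, and the single-unit "flip" walk is a simplified stand-in for the proposals used in real analyses. It also makes no claim to sample plans uniformly. It only shows the basic loop: propose a small random change, keep it if the plan still satisfies the rules (here, contiguity and rough population balance), and record each plan along the way.

```python
import random
from collections import deque

SIZE = 4            # hypothetical 4x4 grid of equal-population units
NUM_DISTRICTS = 4
MAX_DEVIATION = 1   # assumed tolerance: district size may differ from ideal by 1 unit


def neighbors(node):
    """Yield the grid cells that share an edge with `node`."""
    r, c = node
    for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
        nr, nc = r + dr, c + dc
        if 0 <= nr < SIZE and 0 <= nc < SIZE:
            yield (nr, nc)


def is_contiguous(plan, district):
    """Breadth-first search: can every unit in the district reach every other?"""
    members = {n for n, d in plan.items() if d == district}
    if not members:
        return False
    start = next(iter(members))
    seen, queue = {start}, deque([start])
    while queue:
        for nb in neighbors(queue.popleft()):
            if nb in members and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return seen == members


def is_valid(plan):
    """Check the two toy rules: population balance and contiguity."""
    ideal = SIZE * SIZE // NUM_DISTRICTS
    for d in range(NUM_DISTRICTS):
        size = sum(1 for v in plan.values() if v == d)
        if abs(size - ideal) > MAX_DEVIATION or not is_contiguous(plan, d):
            return False
    return True


def cut_edges(plan):
    """Count adjacent unit pairs split between districts (a compactness proxy)."""
    return sum(1 for n in plan for nb in neighbors(n)
               if n < nb and plan[n] != plan[nb])


def flip_step(plan, rng):
    """Propose moving one boundary unit into a neighboring district."""
    boundary = [(n, plan[nb]) for n in plan for nb in neighbors(n)
                if plan[n] != plan[nb]]
    node, new_district = rng.choice(boundary)
    proposal = dict(plan)
    proposal[node] = new_district
    # Keep the proposal only if it still satisfies the rules; otherwise stay put.
    return proposal if is_valid(proposal) else plan


def build_ensemble(steps, seed=0):
    """Run the random walk and collect the plan visited at every step."""
    rng = random.Random(seed)
    # Starting plan: each column of the grid is one district.
    plan = {(r, c): c for r in range(SIZE) for c in range(SIZE)}
    ensemble = []
    for _ in range(steps):
        plan = flip_step(plan, rng)
        ensemble.append(plan)
    return ensemble


ensemble = build_ensemble(steps=1000)
scores = [cut_edges(p) for p in ensemble]
print("cut edges across the ensemble: min", min(scores), "max", max(scores))
```

The list of scores is the kind of baseline distribution described above: a statistic computed on a proposed map can then be compared against the range the sampled plans produce.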

Stakeholder engagement

Early in the program, the team members reviewed relevant literature and met with a range of stakeholders and community groups to establish a contextualized approach to their work. Rowana Ahmed, a master’s student in the Health Data Science program at Harvard University, said, “Engaging with stakeholders has helped ground our work in reality. By talking with experts from a variety of backgrounds, ranging from advocacy lawyers in the Texas Civil Rights Project, data experts in the Redistricting Data Hub, to members of an independent election commission in Chicago, we have gained insight on the practical considerations involved with redistricting.”

Ryan Goehrung, a Ph.D. candidate in the UW Department of Political Science, said the meetings helped the team to refine and sharpen their project goals. “Meeting with people who are engaged in redistricting has given us invaluable insights into this process and helped us to develop outputs with these stakeholders in mind, so hopefully the tools and approaches we develop this summer can be taken up and utilized by community groups and others who wish to contribute to their state’s redistricting plans or challenge unfairly drawn maps more effectively,” he said.

The team is writing a user’s guide that outlines good practices, such as sequential steps and decision trees, for deploying GerryChain. It is intended as a resource for advocacy groups, independent consultants, non-partisan commissions and others with limited technical knowledge in any state. The guide covers how to generate maps that account for state-level redistricting rules, evaluate tradeoffs in selecting map criteria, outline key decision-making points, test whether existing maps are fair and representative, and identify outliers in human-drawn maps. The team is also conducting case study analyses of Colorado, Georgia and Texas, which will be incorporated into the guide as examples of some of the ways that GerryChain can be applied at different stages of the redistricting process.
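As a rough illustration of what identifying an outlier can mean, the sketch below compares one statistic from a human-drawn map with the same statistic computed across an ensemble. The function name and all the numbers are hypothetical and made up for this example; they are not results from the team's case studies, and the guide may use different measures.

```python
def outlier_summary(ensemble_values, observed):
    """Report what share of sampled plans fall below and above the observed value."""
    n = len(ensemble_values)
    below = sum(1 for v in ensemble_values if v < observed)
    above = sum(1 for v in ensemble_values if v > observed)
    return {"share_below": below / n, "share_above": above / n}


# Made-up numbers: seats won by one party under each sampled plan.
ensemble_seats = [4, 5, 5, 6, 5, 4, 6, 5, 5, 4]
# Made-up number: seats won by the same party under a proposed map.
proposed_seats = 8

summary = outlier_summary(ensemble_seats, proposed_seats)
print(summary)
```

If nearly every sampled plan falls on one side of the proposed map, as in this invented example, the map sits in the tail of the distribution and may warrant a closer look. That alone does not establish that a map was drawn unfairly.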

Interdisciplinary approach

Katherine Chang, a Ph.D. candidate in Education Policy, Organizations, and Leadership at the UW College of Education and an MPA student in Social Policy at the Evans School of Public Policy and Governance at UW, said the interdisciplinary nature of the project provides a unique experience for integrating different types of knowledge. “The issue of political redistricting travels across mathematics, political science and legal studies, and our DSSG project requires strong technical, analytical and communication skills. Our team has seen that interdisciplinary collaboration requires the need to read deeply and widely; to lean on each other’s background skills and knowledge; and to be comfortable working in a dynamic environment that is continually evolving,” she said.

Daryl DeFord said working with an interdisciplinary group of fellows has helped to unearth potential use cases for various groups that are interested in the project results. “Our group discussions have been very lively and have interrogated the redistricting problem from all sorts of interesting angles, reflecting their diverse backgrounds and interests. This is also very important for the project itself, since although we are applying tools from mathematics and data science, political redistricting is a fundamentally human endeavor that necessarily incorporates a broad range of perspectives and stakeholders,” he said.

Wide-ranging applications

The outputs of the team’s work will have long-lasting benefits, as the redistricting process can take years to complete, given potential court challenges to proposed plans. The team’s methods and approach are also applicable beyond the redistricting process, to other complex issues.

Vaughn Iverson, a research scientist at the eScience Institute, said that participating in the project has informed his ongoing work. “My research involves designing very efficient methods to compare, search and cluster huge amounts of biological data using concepts from ‘Information Theory’, which is a branch of mathematics upon which much of modern digital communications is built. Thinking about the enormous number of possible congressional redistricting plans for a given state, and the relationships among them, has provided an interesting opportunity to evaluate and generalize some of the approaches I’ve been working on for use in a wider set of applications than they were originally intended to tackle,” he said.

Michael Souffrant, a Ph.D. candidate in Computational Biophysical Chemistry at Georgia State University, said, “As a fellow of the DSSG program, I now see data science as a tool that can be implemented in many aspects of society to provide solutions and suggestions to computational problems. Meeting with stakeholders regarding the focus of my project has offered perspectives and significance to the potential data generated and its application in local communities.”

The final DSSG presentations will take place on Wednesday, August 18th via Zoom from 1:00 to 2:30 p.m. The event is open to the public.