By: Emily F. Keller

Research on the relationship between poverty and the minimum wage in Seattle has often focused on the economic circumstances of individual workers. A project in the Data Science for Social Good (DSSG) summer program at the University of Washington is widening that lens by building data that treats workers as members of households, giving greater context on the role individual wages play in reducing poverty in the region.

The project, “Tracking family and intergenerational poverty using administrative data,” is one of four projects hosted this year by the eScience Institute. The DSSG program brings together students, stakeholders, data scientists, and domain researchers to work on project teams for 10 weeks each summer. The project team consists of four student fellows from universities around the country, a data scientist at the eScience Institute, and the project lead, Jennie Romich, a Professor of Social Welfare at the UW School of Social Work and Faculty Director of the West Coast Poverty Center.

The project builds on Romich’s prior work on the UW Minimum Wage Study, which examines how Seattle’s $15 minimum wage policy affected employment and earnings after it took effect in 2015. It uses the Washington Merged Longitudinal Administrative Data (WMLAD), which Romich began developing in 2014 with seven other faculty members, through partnerships with seven state agencies, to examine employment and earnings outcomes. Romich also led a DSSG project in 2021. The goal this summer is to capture the household status of workers in the data.

Ethical Considerations

Working with state administrative data brings unique ethical responsibilities. The WMLAD database contains data on 10 million people from 2010 to 2016. The data are housed by the UW Data Collaborative at the Center for Studies in Demography & Ecology and accessed through a secure remote enclave. They are derived from unemployment insurance records, birth certificates, voter registrations, driver’s licenses, arrests, and benefit receipts from the Washington State Department of Social & Health Services. To protect privacy, residential addresses are identified only by their associated Census blocks.

Data Scientist Jessica Godwin, Statistical Demographer and Training Director for the UW Center for Studies in Demography & Ecology, said, “Working with administrative data poses many challenges: it’s messy, it’s huge, and it contains highly sensitive information. The biggest challenge, however, is ensuring that any aggregation or statistical analysis is done with the humanity of those that comprise the data in mind, as well as their needs and privacy.”

The team consulted multiple stakeholders to help guide its approach. Team members met with a senior researcher in the Department of Social & Health Services (DSHS) unit that administers anti-poverty programs to understand how the department analyzes data, how its data was assembled for WMLAD, and how different programs define households as opposed to families. They met with a policy analyst at a nonprofit to learn more about the data analysis that was used to advocate for the $15 minimum wage. They also spoke to a Ph.D. candidate in public policy who studies poverty issues to learn more about household-level geocoding for the region and the importance of having multiple definitions for households.

Fellow Zhaowen Guo, a Ph.D. candidate in Political Science at UW, said, “Meetings with stakeholders allowed me to better understand the data-generating process and how to analyze the data while navigating ethical considerations.” This experience, along with her prior work with large and complex data, has informed her research as part of the DSSG team. “I’ve learned how we can lean on each other’s expertise to achieve our project goals,” she said.

Fellow Betelhem Aklilu Muno, a Master of Public Health student in Epidemiology in the UW School of Public Health, said, “One thing I’ve been thinking about is the need to continuously evaluate how we approach data, data analysis and how we are representing people in our society. This has been a big part of the goals of the Tracking Poverty team as it requires us not just question our own approach but think about the factors that went into creating the administrative datasets we are using to consider households. I plan on taking this form of questioning and evaluation in my own work and the work spaces I enter while I do epidemiological work in public health.”

From Individuals to Households

Constructing household data from administrative records is complicated for many reasons. State programs often provide benefits on an individual rather than household basis, and those that focus on households may define them differently. Washington birth certificate data excludes residents who were born elsewhere. Voting activity is only recorded in election months. Licensed drivers don’t always update their addresses promptly when they move. The information available on individuals has gaps and varies depending on whether they are workers, drivers, or participants in state-run programs. Some data sources do not include children, making it harder to determine if a worker is supporting others. When children are present, they may be part of either one or two separate households. Households are also fluid over time.

Multiple data sets are needed to better assess whether an individual worker is supporting others or being supported by others. For example, a minimum wage worker may be the sole earner for themselves and their dependents, or they may be a secondary earner in a non-poverty household, such as a high school student or a spouse.
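To make the sole-earner versus secondary-earner distinction concrete, here is a minimal Python sketch. It is an illustration only, not the team's method: the record fields, the `address_id` key, the sample values, and both function names are hypothetical, and WMLAD itself identifies residences only by Census block. The sketch groups person records that share an address key in the same quarter into a candidate household, then labels each worker by their share of that household's earnings.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonQuarter:
    """One person observed in one quarter (all fields are hypothetical)."""
    person_id: str
    address_id: str   # assumed de-identified address key, not a real WMLAD field
    quarter: str      # e.g. "2015Q2"
    age: int
    earnings: float   # quarterly earnings in dollars


def build_households(records):
    """Group records that share an address key within the same quarter."""
    households = defaultdict(list)
    for rec in records:
        households[(rec.address_id, rec.quarter)].append(rec)
    return dict(households)


def earner_role(person_id, members):
    """Label a person's earning role within one candidate household."""
    earners = [m for m in members if m.earnings > 0]
    if not any(m.person_id == person_id for m in earners):
        return "non-earner"
    if len(earners) == 1:
        return "sole earner"
    top = max(earners, key=lambda m: m.earnings)
    return "primary earner" if top.person_id == person_id else "secondary earner"


# Made-up sample data: a parent with a young child, and a teenager
# working part time while living with a higher-earning adult.
RECORDS = [
    PersonQuarter("A", "H1", "2015Q2", 34, 5500.0),
    PersonQuarter("B", "H1", "2015Q2", 6, 0.0),
    PersonQuarter("C", "H2", "2015Q2", 17, 1800.0),
    PersonQuarter("D", "H2", "2015Q2", 45, 21000.0),
]

if __name__ == "__main__":
    for key, members in build_households(RECORDS).items():
        for m in members:
            print(key, m.person_id, earner_role(m.person_id, members))
```

A single grouping rule like this will miss children who split time between two homes and people whose addresses are out of date, which is one reason several household definitions may be needed side by side.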

Project lead Jennie Romich said, “The team is tackling a deceivingly complex question – what is a household? And how do we operationalize a household based on sporadic interactions with public institutions? It’s really fun to see how the different perspectives and strengths of the data fellows each contribute to the whole – and it’s really important to have a mix of perspectives as we’re working simultaneously on parsing social meaning, questioning dominant assumptions about human life, thinking backwards and forwards in time, and designing efficient-enough code that we can make sense of our massive data.”

Fellow Ihsan Kahveci, a Ph.D. student in Sociology at UW, said, “When the challenges of data science and the complexity of human behavior are combined, even the seemingly trivial task of creating households requires meticulous work and informed decisions.”

Fellow Eliot Stanton, a recent graduate in Data Science and Analytics at Simmons University, said their prior work for the U.S. Census and their academic work interrogating the harm that administrative systems can cause marginalized people have helped guide their work at the DSSG. “As a team, with help from the researchers and civil servants we’ve met as stakeholders, we have spent a substantial amount of time understanding the many different ways people show up in our data. While we cannot feasibly account for every household scenario, we do our best to avoid systemic oversight of certain populations and we are clear about the limitations of our methods,” they said.

The final DSSG presentations will take place via Zoom on Wednesday, August 17, from 1:00 to 3:00 p.m. Pacific. The event is open to the public, and registration is required.