Data Science for Social Good team analyzes equity of congestion pricing on Interstate 405

By Emily Keller-O’Donnell

August 19, 2019

Cars driving up a hill with the title "I-405 High Occupancy Toll Lanes: Usage, Benefits, and Equity"

A team in the eScience Institute’s Data Science for Social Good (DSSG) program has partnered with the Washington State Department of Transportation (WSDOT) to study the usage patterns, price sensitivities and equity impacts of congestion pricing on Interstate 405.

The project utilizes data on the more than 16 million trips taken in the high-occupancy toll (HOT) lanes of I-405 in 2018, including 56 million toll transactions, entry and exit points, overall speed and volume in each lane, and a categorization of high-occupancy vehicles (HOVs) versus single-occupancy vehicles (SOVs). WSDOT also provided the U.S. Census block-groups associated with registration addresses for a large fraction of the HOV and SOV accounts, which allowed the extraction of census data describing the socio-economic attributes of users of the facility. Finally, detailed facility performance data describing traffic volumes and speeds was obtained for both the HOT lanes and the parallel general purpose (GP) lanes. The tolling data was provided by WSDOT and is not publicly available. License plate and vehicle account information is fully anonymized.

The Interstate 405 HOT Lanes run from Bellevue to Lynnwood, Washington. On weekdays between 5 a.m. and 7 p.m., HOVs can use the HOT lanes for free, and SOVs can pay a toll to use the lanes. All vehicles can use the general purpose lanes immediately adjacent to the HOT lanes for free. Prices on the HOT lanes vary based on fluctuating congestion levels and distance, with the total price capped at $10. The rates are set with a goal of maintaining a 45 mile per hour traffic flow in the HOT lanes.

Project lead Mark Hallenbeck, director of the Washington State Transportation Center at the UW, said the inclusion of anonymized (or hashed) account and license plate data, along with the census block-group information, enabled the team to expand the scope of the project beyond performance and optimization issues and look at the social equity impacts of HOT lane use from a variety of angles. However, considerable work was needed to derive conclusions, or “find the signal in the noise,” using multiple data sets without being misled by theories that emerged early in the research process. “As soon as you get an answer you can rationalize that answer in transportation, which is very scary because almost any answer can sound plausible. You can make cool stories, but sometimes early summary results lead to misleading [final] results,” said Mark.

Differentiating individual patterns from aggregate data trends was an important component of the team’s work, as these two approaches can often tell very different stories. Cory McCartan, a Ph.D. student in the Department of Statistics at Harvard University and one of four student fellows working on the project, used methodological validation through simulation to test different hypotheses. “Individual patterns can reverse themselves when you aggregate. We don’t want to conclude that these lanes have a usage pattern that’s not true on the individual level,” Cory said. Speaking about the inherent limitations of working with data sets that illuminate only a segment of a larger system, such as one interstate within a vast regional transportation network, he said, “We’d like to have more data, but it’s also been a cool thing: here’s a limitation, how are you going to get around it, can you ask a different question that gets at the same thing, and that’s been really valuable, I think.”
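The reversal Cory describes can be illustrated with a small, made-up example. The sketch below is not the team's analysis and does not use WSDOT data: the two driver groups, the toll values and the trip counts are all hypothetical, chosen only to show how a trend that holds within each group can flip once the groups are pooled.

```python
def slope(points):
    """Least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    return sxy / sxx


# Hypothetical (toll in dollars, trips per week) observations.
# Occasional off-peak drivers: low tolls, few trips.
occasional = [(1, 4), (2, 3), (3, 2)]
# Peak-hour commuters: high tolls, many trips.
commuters = [(6, 10), (7, 9), (8, 8)]

slope_occasional = slope(occasional)        # trend within the first group
slope_commuters = slope(commuters)          # trend within the second group
slope_pooled = slope(occasional + commuters)  # trend when groups are pooled

print("occasional drivers:", slope_occasional)
print("commuters:", slope_commuters)
print("all drivers pooled:", slope_pooled)
```

In this toy data, trips fall as the toll rises within each group, yet the pooled slope is positive because the commuters both pay more and travel more. An analysis that looked only at the pooled numbers would draw the opposite conclusion from one that followed individual groups.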

For their equity analysis, the team set out to understand the distribution of costs and benefits among different groups of users, based on factors such as location, income, demographics, travel frequency, time of day, and trip direction. “It feels like it has a clear social good angle rather than just the efficiency of the operation of the system,” said Vaughn Iverson, research scientist at the eScience Institute, and the project’s data science lead. “This is truly a remarkable data set to have, transaction individual-level data,” he added.

The team combined data from different sources to generate a fuller picture of equity issues. For example, WSDOT does not collect income data from drivers, but the tolling data does include the U.S. Census block group of the registration address for 430,000 vehicles, and census data gives the median income of each block group. Those vehicles are registered to addresses in 3,100 different block groups (which contain 3,000 to 5,000 people each). The team cross-referenced these data to identify the income distribution in the block groups where the vehicles are registered, in order to better understand the relationship between income and HOT lane usage patterns.
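A cross-reference of this kind can be sketched as a simple lookup. The example below is an assumption-laden illustration, not the team's code: the vehicle IDs, block-group IDs, incomes, trip counts and the single $75,000 bracket boundary are all hypothetical. It also shows why the result is only a proxy, since each vehicle inherits the median income of its block group rather than the income of its owner.

```python
from collections import defaultdict

# Hypothetical anonymized vehicle ID -> census block-group ID.
vehicle_block_group = {
    "veh_a": "bg_001",
    "veh_b": "bg_002",
    "veh_c": "bg_001",
    "veh_d": "bg_003",  # block group missing from the income table below
}

# Hypothetical block-group ID -> median household income in dollars.
block_group_income = {"bg_001": 62000, "bg_002": 118000}

# Hypothetical HOT lane trip counts per vehicle.
vehicle_trips = {"veh_a": 12, "veh_b": 40, "veh_c": 3, "veh_d": 7}


def income_bracket(median_income):
    """Assign a block-group median income to an illustrative bracket."""
    return "under_75k" if median_income < 75000 else "75k_plus"


def trips_by_bracket(block_groups, incomes, trips):
    """Total trips per income bracket, using block-group median as a proxy."""
    totals = defaultdict(int)
    for vehicle, block_group in block_groups.items():
        median_income = incomes.get(block_group)
        if median_income is None:
            bracket = "unknown"  # keep unmatched vehicles visible
        else:
            bracket = income_bracket(median_income)
        totals[bracket] += trips.get(vehicle, 0)
    return dict(totals)


summary = trips_by_bracket(vehicle_block_group, block_group_income, vehicle_trips)
print(summary)
```

Keeping an explicit "unknown" bracket, rather than dropping unmatched vehicles, makes it easier to see how much of the usage the income proxy actually covers.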

To get a hands-on understanding of their subject, the team visited the WSDOT Northwest region headquarters in Shoreline and viewed the office with more than 100 video screens showing current traffic conditions throughout the region, along with diagrams of roads and current control patterns. They also spoke to the WSDOT staff operating the regional freeway system and drove the corridor to observe the I-405 HOT lanes in action. 

“When we went to the headquarters, we got to meet with the people who work there and are in charge of things. We got to learn about how they set up the tolling algorithm, which was a really important part of understanding what we were doing,” said fellow Shirley Leung, a Ph.D. student in the UW School of Oceanography. “Driving the corridor helped me imagine what it was like being a user, places where it would be easy or hard to enter or exit, and how that drives people’s decisions,” she added.

Fellow C.J. Robinson, an undergraduate student in the UW Department of Economics and Department of Political Science, said, “I feel like this project is a really nice intersection of asking those larger questions about equity in the theoretical and political sense, while also getting into the granularity of identifying behavior and seeing how that behavior actually shapes equitable outcomes.” Regarding the role of field trips in the project, C.J. said, “It helped with the part of data science of trying to tell a story because it tests the assumptions you had beforehand and the narratives you were creating before actually putting yourself in that situation.”

Fellow Kiana Roshan Zamir, a Ph.D. student in Operations Research/Transportation Engineering at the University of Maryland, said she hopes to apply what she learned about equity analysis as a participant in the DSSG program in her ongoing work studying dock-less and station-based bike-share systems. Working on a project about traffic also adds a new level of experience to her existing transportation research. “It encouraged me to use the concepts from this study for other transportation modes. I can also use it in the future for other kinds of data and other kinds of modes,” she said.

One of the remaining tasks before the program ends is to document the summer’s work for a hand-off to WSDOT and to any future researchers, who can continue the team’s research or use it to inform policy decisions. For example, a possible subsidy for low-income drivers could be considered as a result. Documentation has been a core part of the ten-week program, and the team will give an hour-long presentation on the project results, followed by a Q&A session, at WSDOT on August 22nd.

The team presented the results of their analysis at the DSSG final presentation as one of four project teams on Wednesday, Aug. 21st.

Learn more about this project in a segment on TVW’s “Washington to Washington” program on YouTube (the segment starts at minute 23).

Data Science for Social Good team members write on the writeable wall