Project leads: Stephen Barham, Data Scientist, and Alex Hagenah, Data Librarian, Seattle Department of Transportation
Data scientist leads: Joseph Hellerstein and Ryan Maas
DSSG fellows: Rebeca de Buen Kalman, Darius Irani, Hyeon Jeong Kim, Amandalynne Paullada, Woosub Shin
Project Summary: The Seattle Mobility Index Project measures transportation mode choice, affordability, and reliability at 450 Census Block Groups in Seattle and predicts mode share (the percentage of travelers using each transportation option) based on their mobility indices. The project represents a low-cost, granular approach to measuring and communicating mobility that can be replicated anywhere, similar to Redfin’s “Walk Score” and “Transit Score”, which measure walkability and transit options in proximity to any location. The Seattle Mobility Indices, however, are based on the ability to reach a “market basket” of destinations, or common travel points, derived from actual travel patterns, not solely based on locations nearby. Our indices vary with time of day and are sensitive to near- and long-term changes in the transportation system.
Using the Google Distance Matrix API, we will consume millions of distance and travel time estimates for driving, transit, walking, and bike travel. We will also access aggregated travel pattern information from the Puget Sound Regional Council Household Travel survey (see links below) to validate and tune our approach. We expect to complete the project in three distinct steps:
- Market Basket of Destinations. We will refine an algorithm that identifies a “market basket” of destinations relevant to people who travel in Seattle. The basket may include collections of trips to nearby points of interest and activity centers that are specific to each origin, and a collection of trips to citywide destinations that are the same for all starting points. The basket algorithm is a low-cost approach to creating a transportation origin-destination model.
- Mobility Indices. We will analyze travel from each Census Block Group to the Block Group’s basket of destinations and develop scalable algorithms that return the following indices:
  - Mode Choice: the number of modes available to reach the basket of travel destinations, within designed parameters.
  - Affordability: the relative cost to reach the basket of travel destinations, based on the costs of the least expensive modes and the costs of the fastest modes.
  - Reliability: measurements of actual travel times versus optimal times and the amount of travel that exceeds percentile thresholds. Travel time reliability algorithms will be applied to data collected over a period of time.
- Mode Share Predictions. We will attempt to model and predict the probability that a traveler will use a single occupancy vehicle and other modes given the Mode Choice, Affordability, and Reliability scores for their location.
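As a rough illustration of the three indices described above, the following Python sketch computes simplified versions for a single Block Group. It is not the project's implementation: the basket contents, the time cutoff, the function names, and the exact formulas are all assumptions chosen to keep the example small.

```python
from statistics import mean

# Hypothetical basket for one Block Group (invented numbers):
# destination -> mode -> (travel minutes, cost in dollars).
BASKET = {
    "downtown": {"drive": (15, 9.0), "transit": (28, 2.75),
                 "bike": (35, 0.0), "walk": (90, 0.0)},
    "grocery": {"drive": (5, 1.0), "transit": (12, 2.75),
                "bike": (8, 0.0), "walk": (20, 0.0)},
}


def mode_choice(basket, max_minutes):
    """Average number of modes that reach each destination within max_minutes."""
    counts = [
        sum(1 for minutes, _ in modes.values() if minutes <= max_minutes)
        for modes in basket.values()
    ]
    return mean(counts)


def affordability(basket, max_minutes):
    """Return (mean cost of the cheapest feasible mode, mean cost of the fastest mode).

    Destinations with no mode inside the cutoff are left out of the first average.
    """
    cheapest, fastest = [], []
    for modes in basket.values():
        feasible = [cost for minutes, cost in modes.values() if minutes <= max_minutes]
        if feasible:
            cheapest.append(min(feasible))
        # The fastest mode is the one with the smallest travel time.
        fastest.append(min(modes.values(), key=lambda mc: mc[0])[1])
    return mean(cheapest), mean(fastest)


def reliability(observed_minutes):
    """Ratio of mean observed travel time to the best (minimum) observed time."""
    return mean(observed_minutes) / min(observed_minutes)


print(mode_choice(BASKET, max_minutes=30))
print(affordability(BASKET, max_minutes=30))
print(reliability([20, 22, 25, 33]))
```

A ratio near 1 in `reliability` means observed trips stay close to the best observed time; larger values indicate more delay. A percentile-based threshold, as mentioned in the list above, could be added in the same way.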
Seattle is entering an extended period of intense public and private construction that transportation planners have called the “Period of Maximum Constraint.” For the next 5 to 10 years, measuring the ability to drive, walk, bike, and use transit will be critical to mitigating the impacts. This research is particularly important to the City’s race and social justice equity programs because it will enable us to identify where geographic and time-of-day disparities in mobility exist and to quantify how changes in the transportation system affect them.
The mobility indices are a key component of the Seattle Department of Transportation’s Strategic Data Initiative and performance metrics that enable the City to drive outcomes, make decisions, and move our work from being project driven to outcome driven. The indicators will be baselined, tracked, and used to communicate the status and health of the transportation system.
Project Outcomes: The project delivered a software package that computes transportation indices for mode choice, affordability, and reliability. These mobility indices differ from current solutions because they are based on where and when people travel, not just what is located in close proximity. The indices will be used by the Seattle DOT and the community to understand the transportation system, support collaboration, and identify mobility equity challenges. The Block Group resolution of the indices lets the City create an analytical baseline, analyze performance at a granular level, and understand how small and large changes to the transportation system impact mobility.
The project used machine learning techniques to develop traveler personas that shed light on the needs, experiences, and travel patterns of different groups of people. The personas methodology is used to reflect household characteristics in the mobility measurements, and also supports broader transportation planning efforts. Additionally, the project modeled drive-alone rates using only the new indices as machine learning features. This simple predictive model performed comparably to a similar approach that incorporates dozens of travel and household attributes.
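To make the idea of predicting drive-alone probability from the three indices concrete, here is a minimal sketch. The source does not say which model the project used; logistic regression fitted by gradient descent is an assumption, and the training data below is invented purely for illustration.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.5, epochs=2000):
    """Fit a logistic regression by full-batch gradient descent (illustrative only)."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def predict_drive_alone(w, b, indices):
    """Probability of driving alone given (mode choice, affordability, reliability)."""
    return sigmoid(sum(wj * xj for wj, xj in zip(w, indices)) + b)


# Invented, pre-scaled index scores in [0, 1]; label 1 means the traveler drove alone.
X = [[0.1, 0.2, 0.3], [0.2, 0.1, 0.4], [0.3, 0.3, 0.2],
     [0.8, 0.7, 0.9], [0.9, 0.8, 0.7], [0.7, 0.9, 0.8]]
y = [1, 1, 1, 0, 0, 0]

w, b = fit_logistic(X, y)
p_low = predict_drive_alone(w, b, [0.15, 0.2, 0.3])   # poorly served location
p_high = predict_drive_alone(w, b, [0.85, 0.8, 0.8])  # well served location
print(round(p_low, 2), round(p_high, 2))
```

In this toy data, locations with low index scores are labeled as drive-alone, so the fitted model assigns them a higher probability than well-served locations.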
To support collaboration and equity analysis, the project developed a stand-alone universal geocoding tool that can batch process geography information such as Block Group, neighborhood, Council District, and zip code from point coordinates. The Python package can encode 100,000 locations in approximately one minute.
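The core operation behind this kind of geocoding is a point-in-polygon lookup. The sketch below shows one common way to do it, the ray-casting test, in plain Python. It is not the project's package: the function names and the two square regions are hypothetical, and a production tool would typically read real boundary files and use a spatial index for speed.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: count how many polygon edges a ray to the right of (x, y) crosses."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def batch_geocode(points, regions):
    """Return the name of the first region containing each point, or None."""
    return [
        next((name for name, poly in regions.items()
              if point_in_polygon(x, y, poly)), None)
        for x, y in points
    ]


# Hypothetical regions: two adjacent squares standing in for real boundaries.
REGIONS = {
    "A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}

print(batch_geocode([(1, 1), (3, 1), (5, 5)], REGIONS))  # → ['A', 'B', None]
```

The same lookup can be repeated against each boundary layer (Block Group, neighborhood, Council District, zip code) to attach several geographies to one coordinate.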
The methods developed are reproducible and scalable, and can be applied at low cost by the City or other entities seeking similar results. Coding standards and design values of simplicity and modularity are built into the project so that the City’s Data Science Team can integrate it with their internal workflow, modify parameters, and add features.
View the final presentation slide deck (PDF). Watch the video on YouTube. View the poster that fellows Darius Irani and Woosub Shin presented at the West Big Data Innovation Hub’s All Hands Meeting in Boise, Idaho in September 2018.
Parallel worlds of pangolin conservation and Data Science for Social Good by Hyeon Jeong Kim
Putting perspective into practice by Amandalynne Paullada
Why data scientists should care about the social good by Darius Irani
Learning to code and coding to learn by Rebeca de Buen Kalman