Use of ORCA data for improved transit system planning and operation
Map of average Seattle ORCA transfers per weekday
Project Leads: Mark Hallenbeck & Anat Caspi – CEE (Civil & Environmental Engineering), Taskar Center
Data Scientist Leads: Bernease Herman (primary) and Anthony Arendt (secondary)
DSSG Fellows: Carolina Johnson, Victoria Sass, Yiqin Shen & Sean Wang
Project Summary: Seven regional transportation agencies use a common electronic fare payment system called ORCA (One Regional Card for All). When ORCA was conceived and adopted (it has been in use since June 2009), the regional expectation was that one advantage of moving from a simple visual pass (a paper monthly pass) to electronic media would be that the resulting data would provide travel behavior information that could be used to improve regional transportation system planning and decision making. To date, that secondary purpose for ORCA data has not been routinely realized.
The UW has been granted access to nine weeks of ORCA data. Those nine weeks of data correspond to ~21,000,000 transit boardings, or roughly 15,500,000 transit trips with ~5,500,000 transfers. These ORCA transaction records have already been linked to automatic vehicle location (AVL) data to determine where those boardings took place. In addition, for about half of those trips we have estimated where the traveler exited the bus and, if they transferred, how long that transfer took. We have requested a second nine weeks of data, which will describe travel after the new rail stations opened and with the low-income fare card in full operation. That second data set has not yet been received or processed.
We, along with the transit and planning agencies of the region, are interested in a variety of computer science activities, social science analyses, and transportation analyses. For the analyses within each of these fields we have to be extremely conscious of the privacy of the individuals who use ORCA cards, as well as the rights of the employers that often subsidize those cards.
For CS analyses, we are interested in better ways to process, store, and handle the very large data sets involved in these analyses. For example, to estimate boarding and alighting locations we have to search multi-gigabyte AVL files to find specific bus locations at specific times and dates, often without being able to process those lookups in a time-sequenced fashion, and often switching between the AVL files of different transit agencies for a single trip made by one individual.
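One way to make out-of-order lookups of this kind cheap is to index the AVL records once, per agency and vehicle, and then binary-search each vehicle's sorted timestamps. The sketch below is illustrative only and is not the project's actual pipeline: the record layout (agency, vehicle ID, timestamp in seconds, latitude, longitude), the class name AvlIndex, and the 120-second tolerance are all assumptions made for the example, and it holds the index in memory, which a multi-gigabyte file may not allow without partitioning.

```python
from bisect import bisect_left
from collections import defaultdict


class AvlIndex:
    """In-memory index of AVL pings keyed by (agency, vehicle).

    Hypothetical record layout: (agency, vehicle_id, timestamp_s, lat, lon).
    """

    def __init__(self):
        self._times = {}   # (agency, vehicle) -> sorted list of timestamps
        self._points = {}  # (agency, vehicle) -> list of (lat, lon), same order

    def load(self, records):
        """Group pings by vehicle and sort each group by time, once."""
        grouped = defaultdict(list)
        for agency, vehicle, ts, lat, lon in records:
            grouped[(agency, vehicle)].append((ts, lat, lon))
        for key, rows in grouped.items():
            rows.sort()
            self._times[key] = [r[0] for r in rows]
            self._points[key] = [(r[1], r[2]) for r in rows]

    def locate(self, agency, vehicle, ts, max_gap=120):
        """Return the (lat, lon) of the ping nearest in time to ts.

        Returns None if the vehicle is unknown or the nearest ping is more
        than max_gap seconds away (an assumed tolerance, not a known one).
        """
        times = self._times.get((agency, vehicle))
        if not times:
            return None
        i = bisect_left(times, ts)
        # The nearest ping is either just before or just after position i.
        candidates = [j for j in (i - 1, i) if 0 <= j < len(times)]
        best = min(candidates, key=lambda j: abs(times[j] - ts))
        if abs(times[best] - ts) > max_gap:
            return None
        return self._points[(agency, vehicle)][best]
```

Because each lookup is a binary search within one vehicle's pings, the order in which fare transactions arrive does not matter, and a trip that spans two agencies simply queries two keys of the same index.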
On the transportation side, we are interested in turning these data sets into decision support tools. We need better ways to store, visualize, summarize, and query these data. For example, we would like to show graphically where large numbers of transfers take place, and then allow users to drill down into those locations to determine which routes transfer to which routes, how long it takes riders to transfer, and how far they walk to perform those transfers. Similarly, we have ~7,000,000 origin/destination pairs (where people board and alight from transit vehicles). We need ways to summarize and display where and when these trips take place. We are interested in describing how ridership patterns changed when the two new light rail stations opened, as well as demonstrating how the available data can be used for service and transit operations planning.
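The drill-down described above can be thought of as a two-level aggregation: first by transfer location, then by route pair within each location. The following is a minimal sketch under assumed inputs. The field names (stop_id, from_route, to_route, wait_min, walk_m) and both function names are hypothetical and do not reflect the actual ORCA schema.

```python
from collections import defaultdict
from statistics import median


def summarize_transfers(transfers):
    """Aggregate transfer records by location, then by route pair.

    Each record is a dict with the assumed keys:
    stop_id, from_route, to_route, wait_min, walk_m.
    """
    by_stop = defaultdict(lambda: defaultdict(list))
    for t in transfers:
        pair = (t["from_route"], t["to_route"])
        by_stop[t["stop_id"]][pair].append((t["wait_min"], t["walk_m"]))

    summary = {}
    for stop, pairs in by_stop.items():
        detail = {}
        for pair, obs in pairs.items():
            detail[pair] = {
                "count": len(obs),
                "median_wait_min": median(w for w, _ in obs),
                "median_walk_m": median(d for _, d in obs),
            }
        summary[stop] = {
            "total": sum(d["count"] for d in detail.values()),
            "pairs": detail,
        }
    return summary


def busiest_stops(summary, n=10):
    """Return the n stop IDs with the most transfers, busiest first."""
    return sorted(summary, key=lambda s: summary[s]["total"], reverse=True)[:n]
```

A map layer could size each location by its total, and a click on a location could list the entries under "pairs". Medians are used here rather than means only because transfer waits tend to have long tails; that is a design choice for the sketch, not a finding from the data.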
On the social science side, we are interested in understanding when, where, and how often low-income users take transit, and how those travel patterns differ from those of other users. We are interested in expanding our initial work examining how employer transit subsidies affect transit use, and how a wide range of built environment variables affect transit use. The transit agencies are also interested in gaining a better understanding of the geographic connections found in the data. Which portions of the region are interacting the most, and how efficient are the movements between those geographic zones? Does the transit system effectively serve all geographic areas? Are lower-income groups being effectively served by transit?
We are also open to analyses you might suggest.