DRIVE Net: Seeing the Big Transportation Picture
Yinhai Wang, Professor, Civil & Environmental Engineering

With increasing numbers of commuters encountering congestion, complaining about traffic has become an American pastime. This is especially true in the greater Seattle area, where congestion has taken hold of freeways, arterials, and even the occasional residential street. Every commuter knows congestion is frustrating, both to be mired in and to anticipate waiting in. As traffic demand and congestion grow, efficient allocation of limited transportation resources becomes increasingly important.
While commuters often understand traffic simply in terms of the cars around them and the length of their commute, a small cadre of transportation engineers in the University of Washington (UW) Smart Transportation Applications and Research Laboratory (STAR Lab) sees a much larger picture, one composed of traffic data from many disparate sources and jurisdictions. Their goal is to get the vast amount of traffic data collected, stored, and organized so that it can be used for research and translated into useful information for the people who need it: legislators, transportation planners, engineers, researchers, and commuters.
The STAR Lab is located in the Civil & Environmental Engineering (CEE) department of the UW. Its primary goals are to support research in Intelligent Transportation Systems (ITS) and to provide hands-on training in the use and design of instruments and software applications for ITS students and professionals. Professor Yinhai Wang of the CEE department is the founder and director of the STAR Lab. He, along with his PhD students Yao-Jan Wu, Xiaolei Ma, Jonathan Corey, and Runze Yu, has created an eScience transportation platform called the Digital Roadway Interactive Visualization and Evaluation Network (DRIVE Net) for transportation data sharing, visualization, modeling, and analysis.
Professor Wang observes that, “in recent years, traffic detectors have been intensively deployed in major highway systems across the country. These sensors generate tremendous traffic data that is extremely valuable for traffic management, flow forecast, operations, and planning.” This data is potentially useful to a range of users, from traffic engineers to researchers, legislators, and the public. The data is also of interest beyond direct transportation applications, in areas including air quality, public health, and urban planning. The challenge, as with all computationally driven science, is how to manage the data efficiently and how to extract information in ways that are useful to a diverse range of interested parties.
DRIVE Net is designed to serve as a platform for managing, analyzing, and visualizing transportation-related data from various sources and assorted jurisdictions. As the datasets increase in size, efficiently searching for specific information and querying across diverse datasets also become significant challenges.
Traditional Approaches (and Limitations) to Working with Transportation Data
Traditionally, transportation engineers have relied on mathematical equations to describe and solve transportation problems, such as traffic flow dynamics. The development of these equations typically involves assumptions that are hard to verify. Consequently, solutions developed from mathematics may not be directly applicable to transportation engineering practice. Other problems are difficult to formulate mathematically, such as “What will the impact of the Alaskan Way Viaduct reconstruction be on peak hour travel times?”
With the limitations of mathematical approaches, researchers have increasingly turned to simulation to study traffic conditions and collect simulation outputs for use in research and decision making. One problem with simulation-based methods is that, for the simulation model to work correctly, it must be configured and calibrated based on field observations, but such observations are often not available. This leaves researchers to estimate or guess many parameters in the simulation model. As Jonathan said, “With traffic simulations, you’re always in ‘best guess’ territory, because you’re generally not calibrating on the conditions that you’re trying to test.” This is particularly true for scenarios such as major bridge and highway closures, where diverting traffic can significantly alter traffic patterns from the base data used to calibrate the model. Imagine modeling the closure of the I-90 and I-5 interchange and trying to get useful data from a model calibrated using a normal day’s traffic data.
Another issue is that most current traffic simulation packages cannot simulate drivers’ behaviors as accurately as researchers wish. When simulating lane changes, for example, current tools assume that whether a driver can merge into an adjacent lane depends solely on the availability of an acceptable gap, and they fail to account for human factors. In practice, many drivers in the adjacent lane may ignore a merging driver’s signal and intention, yet most of us know from experience that a courteous driver will likely create a gap eventually. Similarly, traffic flow dynamics and congestion evolution are difficult to describe accurately with mathematical equations or simulations because of the complexity of human factors and traffic flow characteristics.
Despite these limitations, researchers have been using methods driven by either mathematical equations or traffic simulation for decades due to data constraints. They have had no alternatives because there has been no platform or technology developed capable of bridging the gap between the scattered data collecting entities and the researchers who would use the data.
Bridging the Gap Between Data and the People who Need It
As in many sciences, a significant gap exists between the available traffic data and the people who need to use that data. On the one hand, those collecting data (such as city and state transportation agencies) say, “we have all these data and no one’s using it!” On the other hand, people complain “we need data and there isn’t any!” The hope of those developing DRIVE Net is that it will provide a way to bridge that gap.
Managing Data Volume
The first issue to address is managing the volume of data now available. Traffic sensors generate data continuously. DRIVE Net’s data currently grows by gigabytes each day, and new data sources and analysis systems are added as the team has opportunity. Interconnecting the various data sources and analysis modules is essential for all the elements of DRIVE Net to work as a system. Given the scale of the datasets involved, this is a daunting task made possible only through extensive automation.
Responding to User Queries
Once the data management issue is resolved, the next issue is how to respond to user queries. Finding data from a specific area at a specific time is relatively easy. More complicated queries may require analyses based on multiple data sources, such as traffic sensor data and air quality data. Identifying locations where traffic demand, roadway geometry, weather, and lighting conditions combine to form a crash hazard or congestion propagation point is much more difficult and requires innovative methodologies built into the system.
Researchers such as public health scientists want to get their hands on transportation data in order to investigate the impacts of traffic pollution on human health. But they haven’t had an easy way to access it, manage it, or query it. Typically, the biggest challenge in cross-specialty research has been finding data sources and penetrating the storage format. DRIVE Net is designed to generate or provide access to transportation-related datasets over the web in a clear and concise manner so that as many researchers as possible, from as many specialties as possible, can incorporate transportation impacts into their research.
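To make the idea of a multi-source query concrete, here is a minimal Python sketch that pairs traffic readings with air quality readings for the same place and hour. It is an illustration only and does not reflect DRIVE Net's actual storage or query interface; the record layout, the region names, the sample values, and the 35 mph congestion threshold are all assumptions made for the example.

```python
# Hypothetical records in the form (region, hour_of_day, value).
# None of these values are real measurements.
traffic_speed_mph = [
    ("I-5 downtown", 8, 22.0),
    ("I-5 downtown", 9, 31.0),
    ("SR 520 bridge", 8, 45.0),
]
air_quality_pm25 = [
    ("I-5 downtown", 8, 18.5),
    ("I-5 downtown", 9, 14.0),
    ("SR 520 bridge", 8, 9.0),
]


def join_by_region_and_hour(traffic, air):
    """Pair each traffic reading with the air reading for the same region and hour."""
    air_index = {(region, hour): value for region, hour, value in air}
    joined = []
    for region, hour, speed in traffic:
        key = (region, hour)
        if key in air_index:  # keep only rows present in both sources
            joined.append(
                {"region": region, "hour": hour,
                 "speed_mph": speed, "pm25": air_index[key]}
            )
    return joined


def congested_rows(joined, speed_threshold_mph=35.0):
    """Keep rows whose speed is below an (assumed) congestion threshold."""
    return [row for row in joined if row["speed_mph"] < speed_threshold_mph]


rows = join_by_region_and_hour(traffic_speed_mph, air_quality_pm25)
for row in congested_rows(rows):
    print(row["region"], row["hour"], row["speed_mph"], row["pm25"])
```

The point of the sketch is that the join key (here, region and hour) has to be agreed on before two independently collected datasets can be queried together, which is the hard part in practice.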
Incorporating eScience Methods
Professor Wang believes strongly that, in order to solve these issues, transportation professionals must incorporate computational data-driven (eScience) methods into their research and analysis. He and his students began formally working on DRIVE Net in 2006 using their spare time. His primary goal was, and continues to be, bridging the gap between data sources and researchers through an online eScience platform.
The Three Mission Objectives of DRIVE Net
One: Gather Data
The first of DRIVE Net's three mission objectives is to gather as much transportation data as possible. “We’re not restricting the ways that data is submitted. We’re a data sponge: we want as much transportation data as we can get our hands on. People can submit it in whatever form they have it in.” Currently, DRIVE Net imports data from WSDOT, Seattle DOT, Bellevue DOT, Lynnwood DOT, Washington Incident Tracking System, and Highway Safety Information System, among others. The data comes from many sources:
- loop detectors
- traffic controllers
- GPS tracking data
- traffic surveillance video cameras
- incident reports
- survey reports
- mobile sensors
- border crossing reports
- mountain pass condition reports
Two: Facilitate Scientific Explorations in Transportation Research
Data collection, quality control, and analysis often consume a large share of the budget for transportation research. Because there has been no platform for data sharing and analysis, collected data is only rarely reused, which limits the sample size available for each study. DRIVE Net intends to enable one-stop shopping for high quality transportation data. Furthermore, this regional map-based system provides various tools for data visualization and analysis. Modeling efforts can also be easily incorporated into the system through eScience design and support.
Three: Get Data into the Hands of Researchers
DRIVE Net developers want to put the data into consumable and searchable formats so that anyone interested in traffic conditions can query it according to their interests. For example, public health researchers can query the data to learn how many hours of congestion a specific region experiences, or where the worst congestion occurs. “If we can provide this information, public health researchers can easily incorporate transportation impacts into their research.” Similarly, legislators might be interested in knowing how much travel improvement has been made through a specific investment. This information could be used to quantify the benefit of public funds spent on roadway improvements in specific regions.
DRIVE Net vs. Existing Resources
When asked how DRIVE Net differs from existing resources such as the WSDOT and City of Bellevue traffic flow maps, Jonathan and Professor Wang explained, “If you want to plan a driving route that spans multiple regions, you have to check multiple sources and then compile that data.” And different sources focus on different parts of the picture: WSDOT has information about freeway traffic, but doesn’t provide information about arterial conditions. City resources provide an arterial picture, but don’t show the freeway situation. Any given trip a commuter makes will often cross at least three jurisdictions: the commuter’s home city, the freeway system, and the destination city. DRIVE Net incorporates data from multiple systems, so commuters would no longer have to check multiple data sites to see how their commute looks. “We know that there are a lot of traffic information systems operated at different transportation agencies, but travel is not constrained by jurisdictional boundaries. So we want to extract information from separated systems and combine them into one network picture to help travellers.”
Interactivity
As for the differences between DRIVE Net and other online traffic information systems such as Google Traffic, Professor Wang emphasized the interactive capability and analytical functions built into the DRIVE Net system. The system is designed, through a regional map-based interface, to be highly interactive. For example, a user can specify a travel origin and destination (on a limited set of the network) and the system will compute an optimum path that accounts for the actual real-time travel conditions on each segment of the trip, not just the distance. Such systems are computationally difficult to implement over large networks due to the uncertainty of traffic congestion evolution and the lack of driver behavior data. Improving these routing algorithms is a focus of ongoing research.
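The difference between routing on distance and routing on current travel time can be sketched with Dijkstra's shortest-path algorithm, a standard textbook technique. The article does not say which algorithm DRIVE Net uses, so the Python below is an assumption-laden illustration: the function name `fastest_path`, the tiny network, and the travel times in minutes are all hypothetical.

```python
import heapq


def fastest_path(travel_time_min, origin, destination):
    """Dijkstra's algorithm where each edge weight is a current travel time."""
    best = {origin: 0.0}      # lowest known travel time to each node
    previous = {}             # back-pointers for rebuilding the path
    queue = [(0.0, origin)]
    while queue:
        elapsed, node = heapq.heappop(queue)
        if node == destination:
            break
        if elapsed > best.get(node, float("inf")):
            continue  # stale queue entry; a faster route was already found
        for neighbor, minutes in travel_time_min.get(node, {}).items():
            candidate = elapsed + minutes
            if candidate < best.get(neighbor, float("inf")):
                best[neighbor] = candidate
                previous[neighbor] = node
                heapq.heappush(queue, (candidate, neighbor))
    if destination not in best:
        return None, float("inf")  # destination unreachable
    path = [destination]
    while path[-1] != origin:
        path.append(previous[path[-1]])
    path.reverse()
    return path, best[destination]


# Hypothetical network: the freeway is congested, so the arterial wins.
network = {
    "Home": {"Freeway on-ramp": 5.0, "Arterial": 4.0},
    "Freeway on-ramp": {"Work": 25.0},
    "Arterial": {"Work": 18.0},
}
path, minutes = fastest_path(network, "Home", "Work")
print(path, minutes)
```

Because the edge weights are travel times rather than distances, re-running the search with updated weights yields a different route as conditions change. Predicting how those weights will evolve during the trip is the harder research problem the article describes.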
Real-Time and Historical Data
Most importantly, DRIVE Net provides both real-time and historical data. This allows data-driven decision making to account for situations for which modeling may not be practical. Given enough time, many unusual situations will occur naturally. For example, if you are trying to analyze what would happen if the SR 520 bridge were to close or fail, it would be impractical to simulate such an event because the calibration datasets would be different from the conditions you are trying to analyze. However, DRIVE Net’s historical data would enable a researcher to locate periods when the bridge was closed for some reason, such as scheduled maintenance, an incident, or wind, and use the observed data to make more accurate predictions.
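As a rough illustration of using observed history instead of a simulation, the Python sketch below compares travel times on an alternate route on days when a bridge was closed with days when it was open. The field names (`sr520_closed`, `i90_travel_min`) and every value are hypothetical, invented only to show the shape of such a query.

```python
from statistics import mean

# Hypothetical daily records; none of these values are real observations.
history = [
    {"day": 1, "sr520_closed": False, "i90_travel_min": 14.0},
    {"day": 2, "sr520_closed": False, "i90_travel_min": 16.0},
    {"day": 3, "sr520_closed": True, "i90_travel_min": 31.0},
    {"day": 4, "sr520_closed": True, "i90_travel_min": 35.0},
]


def average_travel_time(records, closed):
    """Average alternate-route travel time over days matching the closure flag."""
    matching = [r["i90_travel_min"] for r in records if r["sr520_closed"] == closed]
    return mean(matching) if matching else None


print("open:", average_travel_time(history, closed=False))
print("closed:", average_travel_time(history, closed=True))
```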
The US DOT has been funding ongoing research projects under an umbrella called the Connected Vehicle Technology Challenge. These projects have the goal of bringing vehicles into the data picture. Connected vehicles would interact with roadside equipment (such as traffic signals) with the goal of increasing signal efficiency by telling the signal control system which direction each vehicle waiting at the intersection wants to go. This would allow the signal to optimally divide time between the various directions. Such systems will not spring forth fully formed, however. In the intermediate timeframe, such systems will be capable of generating additional data, but will not be present in enough vehicles to rely on for signal operation. There will be enough data to help inform signal timing parameters and other transportation decisions, if such data is available to and consumable by the people making the decisions, whether they be commuters, engineers, or policymakers.

Challenges
Of course, such an ambitious project has its share of challenges. For instance, DRIVE Net imports data from loop detectors, which measure traffic at fixed points in the roadway (typically vehicle counts and how long vehicles occupy the detector) rather than how long it takes a car to travel a stretch of roadway. This data comes from many disparate points, and to get travel conditions for an entire roadway segment, one must extrapolate from point detector data. So one constant question is, how well does the point data from detectors generalize to an entire link?
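One common simplification for going from point data to a whole link is to assume each detector's speed applies halfway to its neighbors on either side. The Python sketch below implements that assumption; it is not described in the article as DRIVE Net's method, and the function name, mileposts, and speeds are hypothetical.

```python
def link_travel_time_min(detectors, link_start_mi, link_end_mi):
    """Estimate link travel time from point speeds.

    detectors: list of (milepost, speed_mph) pairs.
    Each detector's speed is assumed to hold from the midpoint with the
    previous detector to the midpoint with the next one.
    """
    if not detectors:
        raise ValueError("at least one detector is required")
    detectors = sorted(detectors)
    total_hours = 0.0
    for i, (milepost, speed) in enumerate(detectors):
        if speed <= 0:
            raise ValueError("speeds must be positive")
        # The first and last detectors cover out to the ends of the link.
        zone_start = link_start_mi if i == 0 else (detectors[i - 1][0] + milepost) / 2
        zone_end = (link_end_mi if i == len(detectors) - 1
                    else (milepost + detectors[i + 1][0]) / 2)
        # Clip each zone to the link being measured.
        zone_start = max(zone_start, link_start_mi)
        zone_end = min(zone_end, link_end_mi)
        if zone_end > zone_start:
            total_hours += (zone_end - zone_start) / speed
    return total_hours * 60.0


# Two hypothetical detectors on a 4-mile link.
estimate = link_travel_time_min([(1.0, 60.0), (3.0, 30.0)], 0.0, 4.0)
print(round(estimate, 2), "minutes")
```

The estimate is only as good as the assumption that conditions between detectors resemble conditions at them, which is exactly the generalization question raised above.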
Importing data from so many diverse resources also presents challenges in setting up their internal connections for supporting system functions. Jonathan notes, “One of the things that we spend a lot of time on is network topology: matching data from the various agencies to our platform. Depending on the data source, we might have to spend time cleaning the data, e.g. loop detectors that don’t operate correctly. The big challenges here are quality control of data and matching up network and data source. We’re trying to figure out the best way to obtain diverse data sets in a reliable and sustainable way. We also want to automate the process by which we normalize data.” Some of the work that went into Jonathan’s master’s thesis, on automating a process for cleaning the data, may be applicable to this problem.
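Automated cleaning of loop detector data often starts with simple plausibility checks on each reading. The Python sketch below shows the general idea only; it is not the procedure from the thesis mentioned above, and the field names, the rules, and the `max_volume` limit are assumptions chosen for illustration.

```python
def reading_problems(reading, max_volume=20):
    """Return a list of reasons a reading looks implausible (empty if none).

    reading: dict with 'volume' (vehicles counted in the interval) and
    'occupancy_pct' (percent of the interval the loop was occupied).
    max_volume is an assumed per-interval upper bound, not a standard value.
    """
    problems = []
    volume = reading["volume"]
    occupancy = reading["occupancy_pct"]
    if volume < 0:
        problems.append("negative volume")
    if not 0.0 <= occupancy <= 100.0:
        problems.append("occupancy outside 0-100%")
    if volume > max_volume:
        problems.append("volume above plausible maximum")
    if volume > 0 and occupancy == 0.0:
        problems.append("vehicles counted but loop never occupied")
    return problems


def keep_plausible(readings, max_volume=20):
    """Drop readings that fail any plausibility check."""
    return [r for r in readings if not reading_problems(r, max_volume)]


# Hypothetical readings: the second and third should be rejected.
sample = [
    {"volume": 8, "occupancy_pct": 12.5},
    {"volume": 5, "occupancy_pct": 0.0},
    {"volume": 3, "occupancy_pct": 140.0},
]
print(keep_plausible(sample))
```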
As DRIVE Net’s data sources and analytical modules increase in size, number, and complexity, they will outstrip the currently available computing resources. The DRIVE Net team is already making plans to move DRIVE Net to a cloud service, where data and processing scalability will help offset growth in dataset sizes, user numbers, and analysis complexity.
Projects such as DRIVE Net are the next generation of traffic research. While traditional research involves mathematical models fit to relatively small datasets, DRIVE Net can draw upon all of the data in its stores to show the actual events. Additionally, DRIVE Net can automate the query and display of some analyses so that researchers will not need to perform basic analyses on the data themselves, speeding research.

