eScience Institute’s Data Science for Social Good Fellows

On Thursday, August 20th, the four project teams for eScience’s Data Science for Social Good summer program gathered one last time to make their final presentations in front of a room filled to capacity with distinguished guests. Ten weeks’ work now boiled down to thirty-minute presentations, and the recently christened Data Science for Social Good Fellows were eager to report on the merits of their respective projects.

The program was modeled after similar programs at Georgia Tech and the University of Chicago. An initial applicant pool of 144 students was narrowed down to sixteen graduate and undergraduate DSSG Fellows, split among four projects focused on solving 21st-century urban challenges using data science. Each team was joined by a project lead and an eScience Data Scientist, along with students from Alliances for Learning and Vision for Underrepresented Americans (ALVA), a University of Washington program that targets underrepresented students in a variety of disciplines for summer internships.

The teams confidently presented their projects and the processes and tools built to solve the big data problems inherent in each. For Predictors of Permanent Housing for Homeless Families in King, Snohomish, and Pierce County, part of the data problem lay in how each county handled its homelessness data and how each defined both a “family” and an occurrence of homelessness. “We spent the bulk of the summer,” said team member Joan Wang, “trying to find ways to process the data into an analyzable format.” One of the creative tools the team came up with was an immersive Sankey diagram that visualizes how families move into and out of homelessness and which programs contributed to a successful exit. (Click here to watch Predictors of Permanent Housing for Homeless’ final project presentation.)

The DSSG Fellows on the King County Metro Paratransit team set out to create a tool that analyzes historical usage data and combines it with other factors, including per-trip costs, to help mitigate service impacts and schedule riders more efficiently, or reschedule them when service is disrupted. Instead of generating an entirely new ride after a service disruption, the team’s tool can automatically locate nearby buses to pick up stranded riders, measuring cost savings in the process. (Click here to watch Paratransit’s final project presentation.)

The goal of Assessing Community Well-Being Through Open Data and Social Media is to give neighborhood communities a better understanding of the factors that affect their well-being. The team’s presentation focused on providing the means for crowd-sourced community networks to leverage social media and other open data sources, including Socrata and Crime Reports, so neighborhoods can identify issues affecting their community and create coordinated, neighborhood-positive responses. (Click here to watch Community Well-Being’s final presentation.)


eScience Institute’s Data Science for Social Good ALVA Students

The team behind Open Sidewalk Graph for Accessible Trip Plan faced an interesting “dirty” data problem. Their project was an information challenge: design an open-source software toolkit and set of algorithms to help those with limited mobility plan a commute, assembling disconnected sidewalk segments into a coherent map that provides easy routing for those needing to avoid steep hills, uncrossable intersections, stairs, or sidewalk-blocking construction. What made the project’s problem so interesting was the amount of time, thought, and manual and computational labor needed to “clean” intersections. The team had to turn a very ugly and unusable set of sidewalk maps, along with associated data containing broken or incomplete intersections and “noisy” sidewalks, into a software package that 1) quickly and easily produced a routable sidewalk map, 2) was usable for all cities, and 3) was free and open source. Talk about the ultimate complete makeover! (Click here to watch Open Sidewalk’s final presentation.)

All of the presentations and Fellows were met with enthusiastic applause throughout, and deservedly so. Each team took on a difficult project with big data problems and in a short amount of time produced elegant, reproducible, data-driven solutions. All four project teams (the Fellows, ALVA students, project leads, and eScience staff) should be immensely proud of what they’ve accomplished. And if the social get-together in the WRF Data Science Studio following the presentations was any indicator, they were jumping with joy at the summer program’s success.