Category: Incubator Project
-
Developing a Workflow for Managing Large Hydrologic Spatial Datasets to Assist Water Resources Management and Research
Project Lead: Nicoleta Cristea, Civil and Environmental Engineering, University of Washington
Project Collaborators: Jessica Lundquist, Ryan Currier, Karl Lapo
eScience Liaisons: Anthony Arendt, Rob Fatland
Large, spatially distributed datasets are increasingly abundant, but there is currently no workflow that efficiently manages, analyzes, and visualizes them, which limits their usefulness for water resources management and research. Within the…
-
Analysis of Large-Scale Patterns in Phytoplankton Diversity
Project Lead: Sophie Clayton (Oceanography)
eScience Liaison: Daniel Halperin
Microscopic algae (called phytoplankton) form the base of the oceanic food chain and are key players in the biogeochemical cycles of many climatically active elements. Ecological theory predicts that diverse ecosystems are more stable, i.e., more resistant to stressors, than less diverse ecosystems. However, data on the diversity of oceanic…
-
Innovation: Evidence from Patents
Project Lead: Matthew Denes (Finance and Business Economics)
eScience Liaison: Andrew Whitaker
One of the key drivers of long-term economic growth studied in economics and finance is technological innovation. A common proxy for innovative activity is patents. Patents provide researchers with a clear and well-recorded measure of innovation, where the number of patents and patent citations are…
-
Analysis of .Gov Web Archive Data
Project Lead: Emily Gade (Political Science)
eScience Liaison: Andrew Whitaker
Data are revolutionizing all fields of science, including political science. Managing unstructured data (particularly text) is a non-trivial challenge for social scientists, especially at a large scale. An example is the .gov dataset curated by the Internet Archive (IA). The IA curates web crawls from 1996 to…
-
Simulating Competition in the U.S. Airline Industry
Project Lead: Charlie Manzanares (Economics)
eScience Liaisons: Andrew Whitaker, Daniel Halperin
Since 2005, the U.S. airline industry has experienced the most dramatic merger activity in its history, which has reduced the number of major carriers in the U.S. from eight to four. My project seeks to provide novel estimates of changes in consumer and producer welfare in the…
-
Students’ Sleep and Academic Performance
Project Lead: Ângela M. Katsuyama, UW Biology
Advisor: Horacio O. de la Iglesia, UW Biology
eScience Liaisons: Bill Howe, Daniel Halperin
This project investigates the impact of sleep on academic performance in college. We hypothesize that poor academic performance in college students correlates with poor sleep behaviors. To address this hypothesis, we collected data from 72 senior students…
-
Kernel-Based Moving Object Detection
Project Lead: Andrew Becker, UW Astronomy
eScience Liaison: Daniel Halperin
With assistance from: Andrew Whitaker, Bill Howe
Kernel-Based Moving Object Detection (KBMOD) describes a new technique to discover faint moving objects in time-series imaging data. The essence of the technique is to filter each image with its own point-spread function (PSF) and normalize by the image noise, yielding a likelihood…
-
ASPASIA: Adult Service Providers and Some Incidental Addenda
Project Lead: Sam Henly, a PhD student in the UW Department of Economics
eScience Liaison: Andrew Whitaker, Data Scientist, eScience Institute
Most prostitution in the United States is organized through Internet media. This presents an opportunity for research into a market that, historically, has proved impenetrable to systematic investigation. ASPASIA is an effort to collect all of…
-
Scalable Manifold Learning for Large Astronomical Survey Data
Project Lead: Marina Meila, UW Department of Statistics
eScience Liaison: Jake VanderPlas, Director of Research – Physical Sciences, UW eScience Institute
Manifold Learning (ML), also known as non-linear dimension reduction, finds a non-linear representation of high-dimensional data with a small number of parameters. ML is data intensive; it has been shown statistically that the estimation accuracy depends…
-
Efficient Computation on Large Spatiotemporal Network Data
Project Lead: Ian Kelley, Ph.D., Research Consultant, Information School
eScience Liaison: Andrew Whitaker, Ph.D., Research Scientist, eScience Institute
The pervasive and rich data available in today’s networked computing environment provide many major opportunities for innovative data-intensive applications. Particularly challenging are data analysis projects that rely upon input from millions of sparse, high-dimensional, and dirty data files…