
Experimental diffusion analysis to extract tissue structure function in the diseased brain

Project Lead: Chad Curtis, PhD student, Chemical Engineering Department
eScience Liaison: Ariel Rokem


Diffusion and feature analysis of nanoparticles in the brain. (a) Mean squared displacements of nanoparticle trajectories, (b) Heatmap of diffusion coefficients in a small section of a rat brain slice. Image credit: Chad Curtis

Inflammation-mediated central nervous system (CNS) disease is a critically important but understudied area of research. Inflammation in the CNS, mediated by activated microglia and astrocytes, is implicated in the development of several neurologic disorders in both children and adults. Strategies to target microglia/astrocytes and treat neuroinflammation could not only slow disease progression, but also promote repair and regeneration, enabling normal development and maturation of the brain. Hence, there is a crucial need for tailorable therapeutic platforms targeting neuroinflammation for the treatment and management of neurological disorders. However, therapeutic delivery to the brain presents challenges on multiple levels due to the presence of the blood-brain barrier (BBB), the diffuse nature of most brain diseases, and the brain microenvironment, which a therapeutic must navigate to reach a target site.

The ways in which changes to the extracellular matrix, brain edema, glial cell function, and BBB disruption affect the diffusion, interactions, and cellular uptake of therapeutics following injury to the brain represent a major current knowledge gap. Nanotechnology, in the form of small, highly tailorable platforms, can provide a modality to survey the disease environment [7-10]; analysis of nanoparticle behavior and compartmentalization yields the information needed to fill this knowledge gap. Nanoparticle diffusion in the brain is subject to the geometry of the extracellular space (ECS) and is influenced by interactions with cells and proteins and by local fluid flows within the brain microenvironment.

The overarching objective of our research proposal is to use data science tools to extract tissue structure-function relationships from nanoparticle diffusion data obtained in the living brain. We utilize multiple particle tracking in a high-throughput, quantitative, real-time organotypic brain slice platform that captures the complexity of the in vivo tissue environment. In developing this technology, we aim to create both database and data analysis tools to extract robust data for a more in-depth understanding of tissue-specific structure-function relationships, particularly in the context of disease. We hypothesize that understanding the mechanisms of nanoparticle behavior and compartmentalization can elucidate changes in the brain microenvironment in the presence of injury, specifically inflammation. We will start by analyzing data we have collected in the normal healthy developing brain to establish region-based differences in diffusion that can be correlated with known anatomical and microenvironmental constraints, and with current diffusion-weighted MRI (DWI) data in humans.
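As a concrete illustration of the analysis in panel (a) of the figure above, the sketch below estimates a diffusion coefficient from one tracked trajectory via its mean squared displacement (MSD). It is a minimal example, not the project's actual pipeline, and assumes a 2-D trajectory sampled at a fixed frame interval:

```python
import numpy as np

def msd(positions):
    """Time-averaged mean squared displacement of one (N, 2) trajectory."""
    n = len(positions)
    lags = np.arange(1, n)
    out = np.empty(n - 1)
    for i, lag in enumerate(lags):
        disp = positions[lag:] - positions[:-lag]
        out[i] = np.mean(np.sum(disp**2, axis=1))
    return lags, out

def diffusion_coefficient(positions, dt, max_lag=10):
    """Fit MSD = 4*D*t over the first few lags (2-D Brownian motion)."""
    lags, m = msd(positions)
    t = lags[:max_lag] * dt
    slope = np.polyfit(t, m[:max_lag], 1)[0]
    return slope / 4.0  # D in units of length^2 / time
```

Estimating D per trajectory and binning the results by position is one way to build a heatmap like the one shown in panel (b).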

 

Deciphering climate clues via carbon flux simulation

Project Lead: Qiaoyun Peng, Department of Atmospheric Sciences
eScience Liaisons: Amanda Tan and Rob Fatland


A combination of machine learning methods and wavelet analysis of the micrometeorological drivers is used to identify the hierarchy of the climatic controls of the ecosystem carbon flux as well as their multidimensional functional relationships. Image credit: Qiaoyun Peng

Carbon and energy exchanges between the terrestrial biosphere and the atmosphere are important drivers of the Earth’s climate system. The net carbon exchange results from a balance between ecosystem uptake (photosynthesis) and losses (respiration), and can be measured and quantified using the eddy covariance flux method. Monitoring, mapping and modeling of carbon fluxes in different terrestrial ecosystems are essential for understanding the contribution of these ecosystems to the regional carbon budget. This information will be particularly useful for decision making regarding various carbon-related climate change mitigation strategies.
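In standard notation (not notation specific to this project), the net ecosystem exchange is the balance of ecosystem respiration and gross primary production, and the eddy covariance method estimates the turbulent CO2 flux as the time-averaged covariance of fluctuations in vertical wind speed and CO2 concentration:

```latex
\mathrm{NEE} = R_{\mathrm{eco}} - \mathrm{GPP},
\qquad
F_c = \overline{w'\,c'}
```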

Although CO2 inventory databases offer the potential for estimating regional ecosystem carbon budgets in some locations, ecological models have proven to be essential tools for expanding the coverage of these otherwise dispersed point-site measurements. Modeling CO2 flux from other micrometeorological variables is therefore vital for large-scale assimilation. However, the non-linearity of the relationship between CO2 flux and other micrometeorological parameters (such as energy fluxes) limits the ability of process-based carbon flux models to accurately estimate flux dynamics. Data-driven models built with machine learning (ML) methods (e.g., artificial neural networks, support vector machines, regression and model trees) learn empirical relationships from the patterns contained in the data; they can identify complex non-linear relationships and estimate land surface–atmosphere fluxes from the site level to regional or even global scales. This enables us to diagnose the state of the biosphere from observational data streams, providing valuable insights into local climate variations.
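As a minimal sketch of such a data-driven model (illustrative only; the file name and driver columns are assumptions, not the project's data), a random forest can be fit to map micrometeorological drivers to measured CO2 flux and then ranked by feature importance, one simple way to probe the hierarchy of climatic controls:

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

# Hypothetical half-hourly flux-tower records.
df = pd.read_csv("flux_tower.csv")
drivers = ["air_temp", "vpd", "net_radiation", "soil_moisture", "wind_speed"]

X_train, X_test, y_train, y_test = train_test_split(
    df[drivers], df["co2_flux"], test_size=0.2, random_state=0
)
model = RandomForestRegressor(n_estimators=300, random_state=0)
model.fit(X_train, y_train)
print("held-out R^2:", r2_score(y_test, model.predict(X_test)))

# Rank the climatic drivers by their learned importance.
for name, imp in sorted(zip(drivers, model.feature_importances_),
                        key=lambda p: -p[1]):
    print(f"{name}: {imp:.2f}")
```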

 

Incubating a DREAM

Project Leads: Christina Bandaragoda, senior research scientist, Watershed Dynamics Research Group, Civil & Environmental Engineering, with Purshottam Shivraj, Data Science Master’s Student, and Nicoleta Cristea, research associate, Mountain Hydrology Lab, Civil and Environmental Engineering
eScience Liaisons: Amanda Tan and Rob Fatland

By combining artificial and human intelligence to improve mountain flood prediction, researchers hope to facilitate a better understanding of floods in coastal Washington watersheds. During a November 2015 flood of the Skagit River near Sedro-Woolley, Wash., resident Greg Platt moves bicycles to higher ground. Photo credit: Skagit Valley Herald. The modeled flood shown, the largest recent flood, was caused by an atmospheric river (AR) event in 2006.

Snow and ice are critical natural reservoirs of water resources, and improved understanding of them is expected to improve management decisions, planning, climate impact assessment, and flood and drought resiliency. Before the incubator, hydrologic modeling predictions were reported with unknown uncertainty estimates. The work accomplished during the Winter Incubator resulted in a standard method for ensuring the ergodic, optimal model behavior required to support hypothesis testing in scientific decision making. Given the hours of model run time required, meeting the computing demand would not have been possible without the help of UW IT Cloud Services and Amazon Web Services.

Sparsity of temporal data at high elevations is a hurdle for every water resources researcher investigating hydrologic processes in watersheds with more than a 1000 m rise in elevation. Temperature is extrapolated to high elevations from low-elevation measurements using a lapse rate (C/km), which is correlated with precipitation phase (rain or snow) and with shortwave and longwave radiation. Across the continental United States, an annual average temperature lapse rate of 6.5 C/km is commonly used; in the Pacific Northwest, where clear-sky assumptions do not hold, an annual average temperature lapse rate of 4.8 C/km has been reported (Minder et al., 2012). Although annual average constant lapse rates are representative of the long-term average hydrologic response, the sensitivity and importance of sub-daily, sub-watershed distributed mountain microclimatology, and of the atmospheric mechanisms that control temperature lapse rates, for capturing atmospheric river, rain-on-snow, and snowmelt-driven flood events are not well understood. Interpolated and gridded shorter-time-scale lapse rates (e.g., daily) based on low-elevation weather observations (COOP stations) are generated at continental scales with no local control on lapse rate assumptions. We need to understand the tradeoff between capturing long-term watershed behavior (annual or monthly averages) and short-term events that depend on high-elevation atmospheric interactions (with limited observations).
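A minimal worked example of the extrapolation described above (values are illustrative, not observations from the Nooksack transect described below):

```python
def extrapolate_temperature(t_low_c, elev_low_m, elev_high_m,
                            lapse_c_per_km=6.5):
    """Estimate temperature at a high-elevation site from a low-elevation
    reading, assuming a constant lapse rate in C/km."""
    dz_km = (elev_high_m - elev_low_m) / 1000.0
    return t_low_c - lapse_c_per_km * dz_km

# A 10 C reading at 600 m maps to 2.2 C at 1800 m with the 6.5 C/km
# continental average, but to about 4.2 C with the 4.8 C/km PNW rate.
print(extrapolate_temperature(10.0, 600, 1800))       # 2.2
print(extrapolate_temperature(10.0, 600, 1800, 4.8))  # 4.24
```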

To address this problem we plan to develop a process to update gridded hydrometeorology climate forcings at a watershed scale using local datasets, interpolating across elevation ranges based on theoretical and empirical relationships (e.g., Unsworth and Monteith, 1975). We will use data collected since 2015 by researchers from the UW Watershed Dynamics Research Group and the Nooksack Indian Tribe; the temperature observation field campaign spans a North Fork Nooksack transect of Mt. Baker (600-1800 m elevation range). These results show that 3-hour time series of lapse rates vary by +/- 5 C/km from the 4.8 C/km annual average, depending on precipitation (or on clouds, as indicated by relative humidity). Our proposed method is to use the Dakota software (Adams et al., 2015) and the DiffeRential Evolution Adaptive Metropolis algorithm (DREAM; Vrugt, 2016) to calibrate a Landlab (Hobley et al., 2017) hydrometeorology component that uses low-elevation observations (following Livneh et al., 2015) and physics-based lapse rates to match high-elevation atmospheric model (WRF; Henn et al., 2017; Currier, 2016; Currier et al., 2017) monthly averages, running in a cloud computing environment (e.g., HydroShare). The feasibility of our proposed process is supported by the fact that our team members are published authors in each relevant field of expertise and on the associated projects (PREEVENTS, Landlab, HydroShare); during this incubator we will combine existing utilities to address this problem.
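To make the calibration target concrete, here is a minimal sketch of one plausible objective function: score a candidate lapse rate by how well the extrapolated low-elevation temperature record matches high-elevation (e.g., WRF-derived) monthly means. Names and structure are illustrative assumptions; the actual workflow couples Dakota's DREAM driver to a Landlab component:

```python
import numpy as np

def monthly_means(series, months):
    """Average a daily series by calendar month (months holds values 1-12)."""
    return np.array([series[months == m].mean() for m in range(1, 13)])

def objective(lapse_c_per_km, t_low_c, months, wrf_monthly_high_c, dz_km):
    """RMSE between lapse-rate-extrapolated and WRF monthly temperatures;
    a sampler such as DREAM would explore lapse_c_per_km to minimize this."""
    t_high_c = t_low_c - lapse_c_per_km * dz_km
    resid = monthly_means(t_high_c, months) - wrf_monthly_high_c
    return float(np.sqrt(np.mean(resid**2)))
```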

 

Political Twitter images project summary and goals

Project Leads: Nora Webb Williams, PhD Candidate, Department of Political Science, with Wesley Zuidema, PhD Student, Department of Political Science, John D. Wilkerson, Professor, Department of Political Science, and Andreu Casas, Moore-Sloan Research Fellow, New York University
eScience Liaison: Bernease Herman


This is the schema for the MySQL database that our research team built during the Incubator to organize and share our data. The image was built in MySQL Workbench. Image credit: Nora Webb Williams

How do outsider political groups use social media to mobilize supporters online? What types of social media techniques, messages and images are most likely to capture attention and motivate action? Prior research demonstrates that people are more responsive to visual cues than to text. We test whether images affect message sharing and followership, and if so, which types of images are the most effective at mobilizing supporters.

To begin to address these questions, we are tracking the Twitter posts of roughly 1,300 public affairs organizations (obtained from the Encyclopedia of Associations), national and state politicians (including every member of the 115th Congress), and news organizations. For each tracked account, we are streaming all tweets, collecting any accompanying images or videos, and periodically collecting account data (e.g., the number of account followers). We are also streaming tweets for every hashtag that one of these organizations uses more than once, with some standard exclusions. For example, if any organization uses the hashtag #LasVegasShooting more than once, we automatically start collecting the entire stream of #LasVegasShooting tweets by all organizations and individuals.
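The hashtag trigger rule can be illustrated with a short sketch (not the team's production collector; the exclusion list is a stand-in for the "standard exclusions" mentioned above):

```python
from collections import Counter

EXCLUDED = {"#news", "#breaking"}  # stand-ins for the standard exclusions
hashtag_counts = Counter()
tracked_hashtags = set()

def on_org_tweet(org, hashtags):
    """Count hashtag uses per organization; once an organization has used
    a hashtag more than once, add it to the tracked stream filters."""
    for tag in (t.lower() for t in hashtags):
        if tag in EXCLUDED:
            continue
        hashtag_counts[(org, tag)] += 1
        if hashtag_counts[(org, tag)] > 1:
            tracked_hashtags.add(tag)

on_org_tweet("org_a", ["#LasVegasShooting"])
on_org_tweet("org_a", ["#LasVegasShooting"])
print(tracked_hashtags)  # {'#lasvegasshooting'}
```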

One purpose of the methodology is to capture social mobilization efforts in their early stages – something we could not do if we were to focus on known successful cases. There are many potential questions that could be addressed with the data, however. The challenge, from a data management perspective, is that these overlapping processes are producing a large quantity of data. The Twitter data collection is ongoing, and we will soon embark on a secondary stage of data collection, hiring annotators on Mechanical Turk to provide labels for each collected image (for example, we will ask how much sadness a respondent feels after looking at a given image). Currently all of the data is stored in AWS S3 buckets.
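As one sketch of that storage step (the bucket name and key scheme are hypothetical, not the project's actual layout), each collected tweet can be archived to S3 as a JSON object:

```python
import json
import boto3

s3 = boto3.client("s3")

def archive_tweet(tweet):
    """Write one tweet payload to S3, keyed by account and tweet ID."""
    key = f"tweets/{tweet['user']['screen_name']}/{tweet['id']}.json"
    s3.put_object(
        Bucket="political-twitter-project",  # hypothetical bucket name
        Key=key,
        Body=json.dumps(tweet).encode("utf-8"),
    )
```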

 

Hitting the mark: targeting strategy development for SDSS V with a robotic fiber positioning system

Project Lead: Jennifer Sobeck, APOGEE-2 project manager, with Michael Blanton, associate professor, Department of Physics, New York University, and Jose Sanchez Gallego, research scientist, Department of Astronomy 
eScience Liaison: Jacob VanderPlas


This is an initial plot of potential telescope pointings and associated tile centers for the SDSS-V Survey for both the Northern and Southern Hemisphere Sites. Image credit: Jennifer Sobeck

In 2020, the fifth generation of the Sloan Digital Sky Survey (SDSS-V) will undertake a five-year spectroscopic survey of over six million objects, building on the two-decade SDSS legacy of high-quality data analysis, collaboration infrastructure, and product deliverables. SDSS-V will be groundbreaking: it will conduct simultaneous optical and near-infrared spectroscopic observations from both the Northern and Southern hemispheres with new hardware that allows for rapid reconfiguration to acquire regions of high target density and targets of opportunity, as well as perform time-domain monitoring. The Survey will also provide contiguous integral-field spectroscopic coverage of the Milky Way and Local Volume galaxies. In essence, SDSS-V will be the first-ever panoptic spectroscopic survey, generating a comprehensive spectral dataset from millions of sources spread across the entire sky.

SDSS-V will consist of three cornerstone programs: the Milky Way Mapper (MWM), a time-domain stellar spectroscopic survey of the Milky Way analyzing Galactic formation as well as the physics of its resident stars and interstellar medium; the Black Hole Mapper (BHM), a time-domain spectroscopic quasar survey probing black hole growth and mapping the X-ray sky; and the Local Volume Mapper (LVM), an integral-field survey of the Milky Way and its galactic neighbors exploring star formation and the physics of the interstellar medium. The MWM and BHM programs will constitute the multi-object spectrographic (MOS) component of SDSS-V and will jointly harness the newly built Robotic Fiber Positioning System (RFPS). The current SDSS plug-plate fiber system will be replaced with 500 robotic positioner arms, 300 of which will carry both an optical and a near-infrared fiber while the remaining 200 will carry only an optical fiber. The RFPS offers several advantages over the plug-plate system: it substantially reduces target reconfiguration time (from 20 minutes to under 2 minutes) and boosts survey efficiency by better accounting for atmospheric refraction (which consequently increases the available observing window). With the RFPS, the targeting plan can also be modified on short timescales to permit observations of transients and other targets of opportunity.

Planning the time-domain, multi-object component of the SDSS-V Survey will be far more complex than simply covering the sky. Vital to the fulfillment of the data acquisition goals (for the specified 6 million stars) will be the development of new algorithms that optimize the targeting strategy over the duration of SDSS-V.
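One building block of such an algorithm can be sketched as a greedy assignment of targets to positioners, where each robotic arm may only reach targets within its patrol radius. The geometry and priority scheme here are illustrative assumptions; the real SDSS-V problem must also fold in cadence, fiber type, and sky coverage:

```python
import numpy as np

def greedy_assign(positioners, targets, priorities, patrol_radius):
    """positioners: (P, 2) arm centers; targets: (T, 2) focal-plane
    positions; priorities: (T,) scores. Returns positioner -> target."""
    assigned, taken = {}, set()
    for t in np.argsort(-priorities):          # highest priority first
        dists = np.linalg.norm(positioners - targets[t], axis=1)
        reachable = [p for p in np.argsort(dists)
                     if dists[p] <= patrol_radius and p not in taken]
        if reachable:                          # nearest free arm wins
            assigned[reachable[0]] = t
            taken.add(reachable[0])
    return assigned
```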