The UW eScience Institute is pleased to announce the Winter Quarter 2022 Data Science Incubator Program

The goal of the Data Science Incubator is to enable new science by bringing together data scientists and domain scientists to work on focused, intensive, collaborative projects.  Our team of data scientists provides expertise in state-of-the-art technology and methods in statistics and machine learning, data manipulation and analytics at all scales, cloud and cluster computing, software design and engineering, visualization, and other topics. We invite short proposals (1-2 pages) for a remote one-quarter data-intensive research collaboration focusing on extracting insight from large, noisy, and/or heterogeneous datasets.

The program is open to any faculty, postdoc, staff, or student whose research can be significantly advanced by intensive collaboration with a data science expert. To apply, we require a short project proposal describing the science goals, the relevant datasets, and the expected technical challenges. The ideal proposal will clearly identify both the datasets involved and the questions to be answered, and will explain how the technical component of the project is critical to delivering exciting new findings. This year we are happy to announce the availability of cloud resources to support the incubator projects, and we welcome applications from teams that wish to use cloud computing.

Each project must include a project lead who is willing to work with the incubator staff for the equivalent of 16 hours a week, including attending a weekly meeting with all the project teams. We find that collaboration in shared virtual or physical spaces is important for deeper technical engagement and provides opportunities for “cross-pollination” among multiple concurrent projects. We anticipate that the Incubator will be primarily remote, but there may be a possibility of hybrid or in-person work depending on university policy and individual preferences and comfort levels. On the proposal form you will be asked whether you prefer to work fully remotely, hybrid, or in person. Your preference will not affect project selection.

Incubator projects are not “for-hire” software jobs — the project lead will work in collaboration with the data scientists and the broader eScience community. Each project lead will be responsible for successful project completion, with the eScience team providing guidance on methods, technologies, and best practices as well as general software engineering. Data scientists often make substantial contributions to Incubator projects. We expect that the work of data scientists and the role of the eScience Institute Incubator program will be properly attributed in any related talks, publications, software releases, etc.

How to Get Started

We will be holding two informational meetings; see the RSVP links below. We also recommend that anyone planning to submit a project for an Incubator consult with one of our Data Scientists during their Office Hours for guidance.

Frequently Asked Questions (FAQs)

Important Dates for the Winter 2022 Incubator:

The application form will ask for the following information:

  • Contact information for the project lead, the person who will be responsible for carrying out the project.
  • A description of your data, including at least its size, formats, current location, and any privacy and access restrictions. We strongly favor projects that have already collected the relevant data rather than “preparatory” projects that involve building software in anticipation of future data collection activities.
  • Project summary / objective (~1 page) similar to the Specific Aims sections in NIH and NSF proposals. This document should include the key science questions the data will help answer and the key technical challenges you face in answering these questions. For example: Do you need new methods or algorithms? Do you need to scale up existing methods? Do you need to integrate data so it can be analyzed? Do you need to publish data and/or code to improve collaborative opportunities and reproducibility?

Proposals are prioritized based on the following criteria:

  • Participant availability and engagement
  • Ability to answer fundamentally new research questions
  • Clarity and shovel-readiness
  • Capacity for measurable outcomes
  • Capabilities and interests of the incubator research staff

We expect that some good proposals will not meet every criterion, and that some great proposals may not be selected because of the limited bandwidth or expertise of our data scientists.

Examples of Past Incubator Projects

Is the Incubator Program right for you? Check out some of our past projects to get an idea of the work we do. Our team has a strong track record of building systems that get real use. Some of our previous collaborations are listed below.

Apryl Craig, PhD Candidate, Environmental & Forest Sciences – “Deer Fear: Using Accelerometers and Video Camera Collars to Understand if Wolves Change Deer Behavior”

Gabrielle Rocap, Professor, Oceanography – “Systems Level Analysis of Metabolic Pathways Across a Marine Oxygen Deficient Zone”

Charles Zhou, Staff Scientist, Anesthesiology & Pain Medicine – “Data Analytics for Demixing and Decoding Patterns of Population Neural Activity Underlying Addiction Behavior”

Kwong-Yu Wong, PhD Candidate, Economics – “Beneficial competition under rationing: evidence from food delivery service”

Stuart Ian Graham, PhD Candidate, Biology – “A Network Analysis of Tree Competition: Which tree species make the best neighbors?”

Emily Kalah Gade, PhD Candidate, Political Science – “Analysis of .gov Web Archive Data”

Julian Olden, Professor, SAFS – “Predicting human-mediated vectors for invasive species from mobile technology”

Jay Rutherford, PhD Candidate, Chemical Engineering – “Atmospheric particulate matter source identification using excitation emission fluorescence spectroscopy”

Marina Meilǎ, Professor, Statistics – “Scalable Manifold Learning for Large Astronomical Survey Data”

Or view all of our Past Projects.