Submit an Incubator Project

Work closely with data science professionals and students to make better use of your data.

The Data Science Incubator enables new science by bringing together data scientists and domain scientists to work on focused, intensive, collaborative projects. Our team of data scientists provides expertise in state-of-the-art technology and methods in statistics and machine learning, data manipulation and analytics at all scales, cloud and cluster computing, software design and engineering, visualization, and other topics. We invite short proposals (1-2 pages) for a remote one-quarter data-intensive research collaboration focusing on extracting insight from large, noisy, and/or heterogeneous datasets.

The program is open to any faculty, postdoc, staff, or student whose research can be significantly advanced by intensive collaboration with a data science expert. To apply, submit a short project proposal describing the science goals, the relevant datasets, and the expected technical challenges. The ideal proposal will clearly identify both the datasets involved and the questions to be answered, and will explain how the technical component of the project is critical to delivering exciting new findings. Cloud resources are also available to support Incubator projects, and we welcome applicants who wish to use cloud computing.

The Winter 2023 Incubator program will run January 3rd through March 9th, 2023. Remote informational Q&A sessions will be held on October 28th and November 3rd for anyone who wants to learn more about the program or has questions about submitting a proposal.

The deadline for 2023 proposals is November 15, 2022. We will announce the selected Incubator projects by December 9th, and there will be a program kick-off meeting on January 3rd, 2023.

“The Incubator program increased my exposure not only to methodologies but also to resources to help me handle the challenges of working with large datasets.”

Lauren Kuntz, Winter 2019 Incubator participant

Learn how eScience can help you make the most of your data.

Getting Started

Each project must include a Project Lead who is willing to work with the incubator staff for the equivalent of 16 hours a week, including attending a weekly meeting with all the project teams. We find that collaboration in shared virtual or physical spaces is important for deeper technical engagement and provides opportunities for “cross-pollination” among multiple concurrent projects. We anticipate that we will continue to support both remote and in-person participation in the Incubator program. On the proposal form you will be asked whether you prefer to work fully remotely, hybrid, or in person. Your preference will not affect project selection.

Incubator projects are not “for-hire” software jobs — the project lead will work in collaboration with the data scientists and the broader eScience community. Each project lead will be responsible for successful project completion, with the eScience team providing guidance on methods, technologies, and best practices as well as general software engineering. Data scientists often make substantial contributions to Incubator projects. We expect that the work of data scientists and the role of the eScience Institute Incubator program will be properly attributed in any related talks, publications, software releases, etc.

Application Details

The Data Science Incubator application will ask for the following information from you:

  • Contact information for the Project Lead, the person who will be responsible for carrying out the project.
  • A description of your data: the size, formats, where the data currently resides, and any privacy and access restrictions. We strongly favor projects that have already collected the relevant data rather than “preparatory” projects that involve building software in the anticipation of future data collection activities.
  • 1-page project summary/objective similar to the Specific Aims sections in NIH and NSF proposals (see below). This document should include the key science questions the data will help answer and the key technical challenges you face in answering these questions. For example: Do you need new methods or algorithms? Do you need to scale up existing methods? Do you need to integrate data so it can be analyzed? Do you need to publish data and/or code to improve collaborative opportunities and reproducibility?

Tips on NIH and NSF Specific Aims writing styles can be found here and here.

Proposal Priorities

Incubator proposals are evaluated based on the following criteria. We expect that some good proposals will not meet every criterion, and great proposals may not be selected due to limited bandwidth or expertise of our data scientists.

  • Participant availability and engagement
  • Ability to answer fundamentally new research questions
  • Clarity and shovel-readiness
  • Capacity for measurable outcomes
  • Capabilities and interests of the incubator research staff

See past Data Science Incubator projects here.