Work closely with data science professionals and students to make better use of your data.
The Data Science Incubator enables new science by bringing together data scientists and domain scientists to work on focused, intensive, collaborative projects. Our team of data scientists provides expertise in state-of-the-art technology and methods in statistics and machine learning, data manipulation and analytics at all scales, cloud and cluster computing, software design and engineering, visualization, and other topics. We invite short proposals (1-2 pages) for a one-quarter data-intensive research collaboration focusing on extracting insight from large, noisy, and/or heterogeneous datasets.
The program is open to any faculty, postdoc, staff, or student whose research can be significantly advanced by intensive collaboration with a data science expert. Applicants must submit a short project proposal describing the science goals, the relevant datasets, and the expected technical challenges. The ideal proposal will clearly identify both the datasets involved and the questions to be answered, and will explain how the technical component of the project is critical to delivering exciting new findings. Cloud resources are also available to support the Incubator projects, and we welcome applicants who wish to use cloud computing.
The Winter 2024 Incubator program will run January 4th through March 8th, 2024. There will be informational Q&A sessions on October 27th and November 6th for anyone who wants more information about the program or has questions about submitting a proposal. We also strongly recommend that anyone planning to submit an Incubator proposal consult with one of our Data Scientists during their Office Hours for guidance.
The deadline for 2024 proposals is November 14, 2023. We will announce the selected Incubator projects by December 12th, and there will be a program kick-off meeting on January 4th, 2024.
“The Incubator program increased my exposure not only to methodologies but also to resources to help me handle the challenges of working with large datasets.”
— Lauren Kuntz, Winter 2019 Incubator participant
Learn how eScience can help you make the most of your data.
Each project must include a Project Lead who is willing to work with the incubator staff for the equivalent of 16 hours a week, including attending a weekly meeting with all the project teams. We find that collaboration in shared virtual or physical spaces is important for deeper technical engagement and provides opportunities for “cross-pollination” among multiple concurrent projects. We anticipate that we will continue to support both remote and in-person participation in the Incubator program. On the proposal form you will be asked your preference for working fully remote, hybrid, or in-person. Preference will not impact project selection.
Incubator projects are not “for-hire” software jobs — the project lead will work in collaboration with the data scientists and the broader eScience community. Each project lead will be responsible for successful project completion, with the eScience team providing guidance on methods, technologies, and best practices as well as general software engineering. Data scientists often make substantial contributions to Incubator projects. We expect that the work of data scientists and the role of the eScience Institute Incubator program will be properly attributed in any related talks, publications, software releases, etc.
The Data Science Incubator application will ask for the following information from you:
- Contact information for the Project Lead, the person who will be responsible for carrying out the project.
- A description of your data: the size, formats, where the data currently resides, and any privacy and access restrictions. We strongly favor projects that have already collected the relevant data rather than “preparatory” projects that involve building software in the anticipation of future data collection activities.
- A 1-page project summary/objective similar to the Specific Aims sections in NIH and NSF proposals (see below). This document should include the key science questions the data will help answer and the key technical challenges you face in answering these questions. For example: Do you need new methods or algorithms? Do you need to scale up existing methods? Do you need to integrate data so it can be analyzed? Do you need to publish data and/or code to improve collaborative opportunities and reproducibility?
- October 27th: Information meeting. Time: 9:00 – 10:00 a.m. PT, via Zoom.
- November 6th: Information meeting. Time: 1:00 – 2:00 p.m. PT, in person in the WRF Data Science Studio or via Zoom.
- Info session slides will be posted following the sessions.
- DEADLINE November 14th: Proposals due by 11:59 p.m. PT.
- December 12th: Notification of proposal selections.
- January 4th: Kickoff meeting.
- The Incubator will run January 4th – March 8th, 2024.
Incubator proposals are evaluated on the following criteria. We expect that some good proposals will not meet every criterion, and great proposals may not be selected due to limited bandwidth or expertise of our data scientists.
- Participant availability and engagement
- Ability to answer fundamentally new research questions
- Clarity and shovel-readiness
- Capacity for measurable outcomes
- Capabilities and interests of the incubator research staff