What are the expectations regarding the source(s) of project data?
Data usage should be discussed on a case-by-case basis between Project Leads and Data Scientists during the project vetting process. As a Project Lead, it is your responsibility to secure the necessary rights for the data you intend to use, and it is very important that you answer the questions in the application form regarding any restrictions applicable to those data. Project Leads will be expected to demonstrate that any licenses or other permissions necessary to use the data for the DSSG program are in place prior to the final selection of their projects. For publicly available data sources, it is important that the proposed project is compatible with any terms and conditions attached to those data sources. For data licensed from non-public sources, the license terms must be compatible with the requirements of the DSSG program, as determined by DSSG program staff.
Can you help me collect data?
No. Collecting data is beyond the scope of our DSSG projects. Given the compressed timeline of the program, it is absolutely critical that the data already be in hand. Prospective Project Leads will need to ensure that all of the data they plan to use in the project is in their possession prior to the final selection of DSSG projects, with all needed licenses or other permissions secured and reviewed. However, DSSG teams will sometimes augment their analyses with additional data sources that they identify over the course of the summer. In these cases, the same standards will apply to ensure that appropriate permissions are in place for the DSSG program.
What are the expectations regarding any intellectual property generated by the project?
The eScience Institute does not claim ownership over the work produced in the program, but we do strongly encourage open access and open source practices to the greatest extent possible. Each participant will retain copyright in their individual contributions, with the expectation that these contributions will be available for publication under a standard open source license. We understand that some projects may have attributes that prevent full public disclosure. If you believe this is the case for your proposed project, this must be disclosed and discussed with eScience staff up front during the project vetting process. We expect that students and data scientists participating in the project will be included as authors on any publications based substantially on the work undertaken as part of the DSSG program, in accordance with their contributions and the conventions of the field. We also expect acknowledgment of eScience support in such publications. If you believe that your proposed project may produce or contribute to novel patentable ideas, please highlight this in your application so that we can evaluate any arrangements necessary to properly account for this special case.
What if I can’t be available for the 16 hour weekly time commitment required by the DSSG program?
Then this program probably isn’t the best fit, although for projects that have two Project Leads, coverage can sometimes be shared. You may be interested in learning about other collaborative data science opportunities at UW that have a different program structure. A description of several such opportunities can be found at the bottom of our Call for Proposals.
Do I have to spend two full days each week working with my DSSG team?
We require Project Leads to commit to being available to their teams on average 16 hours per week. For teams that collaborate in-person, this usually means working from our Data Science Studio the equivalent of two days per week. For teams that collaborate remotely, this usually means being available for meetings and responsive on email, Slack, or other channels of communication adopted by the team.
Are there any specific times that I am required to be present?
While you have the flexibility to schedule coworking time with your team as you see fit, there are some events that we expect Project Leads to be present for. During the first week of the program, we will hold several orientation and team development workshops, in which Project Lead participation is crucial. That means it’s important that you have a fair degree of flexibility in your schedule during the week beginning June 13, 2022, but we will develop the specific agenda in consultation with selected Project Leads. It will be important for Project Leads to be present for final public presentations and a closing reception at the end of the summer; that date will be determined prior to final project selection. Project Leads are also expected to participate in a couple of regularly scheduled program-wide meetings that recur weekly or biweekly throughout the summer. Each week we hold a “spotlight” meeting that showcases the progress of project teams and a program-wide meeting for check-ins and announcements that Project Leads are highly encouraged to attend; in the past, these have fallen on Monday and Tuesday mornings and Thursday afternoons, respectively. Additionally, every other week, we hold a one-hour meeting between all Project Leads, Data Scientists, and program administrators to ensure that the program is proceeding smoothly; the timing of this meeting will be established in consultation with all Project Leads selected into the program.
What if I need to be away for part of the DSSG program?
We understand that people have commitments over the summer, and we can accommodate absences of short duration. Prolonged or frequent absences would be detrimental to your project, and there are specific times that are crucial for Project Leads to be present; for example, any absence during the first or last two weeks of the program (June 13-24 and August 8-19) would be disruptive. Please let us know well in advance what your plans are and we will work with you to make sure your absence doesn’t impact the success of your project.
Will I be paid for the time I spend in the program?
Typically we provide stipends for the students who serve as Fellows in the DSSG program, but not for Project Leads because we expect that the projects are central components of work they are already doing within their departments, labs, or organizations. However, starting in 2022 we are making a pool of funds available for Project Leads in situations that warrant financial support. Please see the “Financial Support” section in the Call for Proposals for more information and a link to the application for funding.
What do I get out of the program?
Some of the potential benefits to Project Leads include: access to a team of talented and motivated students; a period of intensive focus on advancing their projects; exposure to new methods and approaches; guidance in best practices for software development, reproducible science, and human-centered design; experience working with highly interdisciplinary teams; networking opportunities that may lead to longer-term collaborations; and publicity from participating in the program.
What are the responsibilities of the Project Lead and how do they relate to other roles on the team?
The Project Lead is expected to write and submit the project proposal, and bears primary responsibility for project design and execution throughout the summer. Together with the Data Scientists from the eScience Institute, the Project Leads will co-manage the student teams. This may include ensuring that work is sensibly and fairly distributed, setting milestones for project deliverables, guiding the decision-making process on the team, and navigating relationships with relevant stakeholders. Because we consider stakeholders to be important to the success of a project, we ask all Project Leads to organize at least one virtual or in-person site visit for their teams, preferably in the first three weeks of the program (June 13 – July 1). Project Leads are also in the position of being a “domain expert” or “subject matter expert” on their teams. Because the student fellows come from such a diverse array of educational backgrounds, they will likely not be well-versed in the problem space that their DSSG project addresses; therefore, they will rely on their Project Leads to help them get familiar with relevant prior work and to understand the background, context, and social complexity of the problem they’re working on.
The Data Scientists from the eScience Institute will provide guidance on methods, technologies, and best practices in extracting knowledge from large, noisy, and/or heterogeneous datasets, as well as general software engineering.
Student responsibilities will vary from project to project, but their role may include developing code, selecting methods, conducting analyses, contributing to design, preparing documentation, and incorporating stakeholder perspectives into the project.
The project team may also include external mentors and stakeholders as appropriate.
Can the Project Lead role be shared?
Sometimes. In situations where two people have a pre-existing working relationship on the project, or each of them contributes crucial expertise, a shared Project Lead role can be considered.
I will have a student intern/research assistant working with me this summer. Can they participate in DSSG?
Probably. This has come up several times in the past and we have been able to incorporate the student interns in various ways. With enough advance notice, the student intern may apply to be a DSSG Fellow and will be considered alongside other applicants based on the merits of their application. In other cases, the intern has coordinated with the DSSG team without being considered a DSSG Fellow.
I have a team of collaborators that I work with in my organization on this project. How can they be involved in the DSSG program?
While it’s important for the DSSG teams to have a designated Project Lead who is their main collaborator and point of contact, we highly encourage regular communication and collaboration with other stakeholders on the projects, including the Project Lead’s team of collaborators. In the past, DSSG Fellows have conducted site visits to the offices of collaborators, provided presentations tailored to them, and held periodic conference calls with them throughout the summer.
What educational backgrounds and skills will the DSSG students have?
Typically, about 75 percent of our students are graduate students, while about 25 percent are junior or senior undergraduates. We intentionally select a cohort with a wide range of backgrounds. In the past, DSSG students have come from fields such as astronomy, biology, computer science, economics, design, mathematics, public policy, and sociology. This means the students bring a diverse set of perspectives, skills, and knowledge to the program. We look for students with at least a baseline level of programming experience and statistics training, but some students will have more advanced technical skills than others. Please keep in mind that this is an educational opportunity for students.
Will I get to choose the students on my team?
The short answer is no, not directly. During the application process, you will have the opportunity to articulate the kinds of skills or backgrounds you anticipate needing on your project, and if your project is chosen for the program, we will keep this in mind while doing cohort selection. Then, once we have selected a cohort of students, we will share all of the project descriptions with them and ask them to indicate which projects they are most interested in working on. When making final team assignments, we will try to balance the following considerations: which projects the students are interested in, what mix of skills the project needs to be successful, what skills the students already have, and what skills the students hope to learn.
Is there an opportunity to publish the work developed during the summer?
The eScience Institute strongly supports publicly sharing and disseminating work developed in the DSSG. During the program we encourage different opportunities for sharing outcomes: project websites and blogs, presentations, code repositories, dataset archives, etc. In the past, some DSSG teams have produced short papers during the summer to be presented at data-for-good oriented conferences, such as the Bloomberg Data for Good Exchange. Given that the program is short, any commitment to producing such public-facing outputs must be made with consideration for the goals, interests, and pragmatic constraints facing students and Project Leads. Typically, it is not possible to produce a scholarly journal article within the 10-week timeframe of the DSSG program, but peer-reviewed papers have been written based on DSSG work following the end of the summer program.
What happens to the projects at the end of the summer?
This is largely up to the Project Leads. One of the things we consider when reviewing proposals is the likelihood that a proposed project will be sustained beyond the summer. By infusing the program with best practices in reproducible science and human-centered design, we try to ensure that there is a smooth “hand-off” of the work to Project Leads, so that the project can be continued, implemented, or extended by them and their organizations. Having said that, there have been some cases where DSSG projects have led to longer-term collaborations with eScience and/or DSSG students, including additional funded research. eScience has also on occasion supported travel for Project Leads and students to present their DSSG projects at conferences following the end of the summer program.