New Ph.D. Tracks in "Big Data"

The University of Washington has a long history of advancing the methodology and practice of data science, including achievements by the Center for Statistics and the Social Sciences and the eScience Institute.

Now, UW is launching Ph.D. tracks in "Big Data" through a partnership between Computer Science & Engineering and Statistics.

The Big Data tracks will overlay the departments' regular quals requirements and lead to a new certificate en route to the Ph.D. degree.

The CSE track will have students select three out of the following four courses:

  • Data Management: Principles of Database Management Systems, CSE 544
  • Machine Learning: CSE 546
  • Data Visualization: A new course to be offered by Jeff Heer
  • Statistics: A new Big Data course offered jointly by CSE and Statistics, Stat 592 and CSE 5999

These tracks represent the first step toward a long-term vision that brings together CSE students, Statistics students, and various domain science students to form one cohort, work together, and pursue a unique program involving several key tenets: 

  • New curriculum: Courses that “level the playing field” in terms of computational and statistical backgrounds, and in terms of the domain knowledge needed to understand the needs of the scientific domains that require Big Data, with a foundation in each of our four core themes. The content of these courses covers three categories: skills that are currently taught in some form but need greater emphasis or depth (e.g., statistical analysis, basic programming), skills that are taught in computer science but not in the domain sciences (e.g., databases, parallel algorithms, machine learning), and skills that are not currently taught in any curriculum (e.g., deploying cloud services for science).
  • Multidisciplinary supervision: Every student will have a primary advisor in their core field and a secondary advisor in a complementary field.
  • Interdisciplinary Projects and Cyberinfrastructure Development: Each student will participate in projects providing exposure to research in complementary fields and contributing to or utilizing tools and services from the frontiers of research in computer science and statistics.
  • Industrial Practical Training: Each student will do two internships providing exposure to Big Data infrastructure and analysis in industry and/or national labs.

To increase multidisciplinary interaction, all students in this new program, regardless of their core research area, will have offices in the same lab.