UW Data Science Seminar: Tyler McCormick


4:30 pm – 5:20 pm


Electrical and Computer Engineering Building, Room 105
185 W Stevens Way NE, Seattle, WA, 98195

Please join us for a UW Data Science Seminar on Thursday, February 15th from 4:30 to 5:20 p.m. PST. The seminar will feature Tyler McCormick, a Senior Data Science Fellow and a Professor of Statistics and Sociology at UW.

The seminar will be held in the Electrical & Computer Engineering Building (ECE), Room 105.


“Robustly estimating heterogeneity in factorial data using Rashomon Partitions”

Abstract: Many statistical analyses begin with a fundamental question: How does the outcome vary with observable covariates? Do more experienced employees receive higher wages? Do vaccinated individuals get sick less frequently than those who are unvaccinated? Do trout in warm water eat less than trout in cold water? Does Venus’s atmosphere contain a higher fraction of nitrogen than Mars’s? Do patients taking Metformin have lower A1C measures than those taking DPP4-inhibitors?

In settings where these covariates are discrete, this problem yields a factorial-like structure, and it is impossible to enumerate all possible combinations of covariates for any scientifically interesting setting. In this paper, we propose an approach to enumerating heterogeneity in the relationship between an outcome and discrete covariates by creating a Rashomon Partitions Set (RPS). Each Rashomon partition consists of the feature combinations that maximize heterogeneity in the outcome space. We construct this by pooling similar feature combinations using priors over pooling patterns in an overarching Bayesian model.  We show that we can characterize the set of Rashomon Partitions in terms of its fraction of the overall posterior and size.  Further, we demonstrate that the RPS is enumerable in meaningful settings by leveraging the insight that many potential combinations of features are, in practice, nonsensical for pooling because they represent different dimensions in the covariate space.  We demonstrate RPS construction in the context of two practical settings: finding heterogeneity in outcomes of a randomized trial and examining racial disparities in health outcomes in a large clinical dataset.  This is joint work with Arun Chandrasekhar (Stanford Economics) and Aparajithan Venkateswaran (UW Statistics).

Bio: Tyler McCormick is a Professor of Statistics and Sociology at the University of Washington, where he is a core faculty member in the Center for Statistics and the Social Sciences. He is also a Senior Data Science Fellow at the eScience Institute. Tyler’s work develops statistical models that infer dependence structure in scientific settings where data are sparsely observed or observed subject to error. His recent projects include estimating features of social networks (e.g., the degree of clustering or how central an individual is) using data from standard surveys, inferring a likely cause of death (when deaths happen outside of hospitals) using reports from surviving caretakers, and quantifying and communicating uncertainty in predictive models for global health policymakers. He holds a Ph.D. in Statistics (with distinction) from Columbia University and is the recipient of an NIH New Innovator (DP2) Award, an NIH Career Development (K01) Award, an Army Research Office Young Investigator Program Award, and a Google Faculty Research Award. Tyler is the former Editor of the Journal of Computational and Graphical Statistics (JCGS) and a Fellow of the American Statistical Association.

The UW Data Science Seminar is an annual lecture series at the University of Washington that hosts scholars working across applied areas of data science, such as the sciences, engineering, humanities and arts, along with methodological areas of data science, such as computer science, applied math and statistics. Our presenters come from all domain fields and include occasional external speakers from regional partners, governmental agencies and industry.

The 2023-2024 seminars will be held in person and are free and open to the public.