Please join us for a UW Data Science Seminar on Thursday, February 1st from 4:30 to 5:20 p.m. PST. The seminar will feature Evan Komp, who recently completed his PhD in Chemical Engineering Data Science.
The seminar will be held in the Electrical & Computer Engineering Building (ECE), Room 105.
“Leveraging Nature’s Translation Between Low and High Temperature Proteins with Deep Learning”
Abstract: This work presents Neural Optimization for Melting-temperature Enabled by Leveraging Translation (NOMELT), a novel approach for designing and ranking high-temperature stable proteins using neural machine translation. Training the model required developing a new dataset of homologous protein pairs occurring in organisms adapted to low and high temperatures, which is described in detail. With 25 million protein pairs, the dataset is orders of magnitude larger than any dataset of its kind. By training on over 4 million of the highest quality pairs, the model demonstrates promising capability in targeting thermal stability. A designed variant of the Drosophila melanogaster Engrailed Homeodomain shows increased stability at high temperatures, as validated by estimators and molecular dynamics simulations. Furthermore, NOMELT achieves zero-shot predictive capabilities in ranking experimental melting and half-activation temperatures across two protein families. Unlike existing zero-shot predictors, it achieves this without requiring extensive homology data or massive training datasets, because it specifically learns thermophilicity rather than all natural variation. These findings underscore the potential of leveraging organismal growth temperatures in data-rich, context-dependent design of proteins for enhanced thermal stability.
Bio: Evan Komp recently finished his PhD in Chemical Engineering Data Science at the University of Washington under the amazing Prof. David Beck. During that time, he was awarded the data science fellowship from the Clean Energy Institute and completed a stint as a machine learning engineer in pharma. He has worked on a number of topics at the intersection of deep learning and the chemical sciences, including molecular properties, chemical reaction rates, and protein thermal stability. Evan is a strong advocate for the use of data science to help us develop a more sustainable, climate-resilient, and climate-friendly society, and as such encourages everyone who works in a compute-intensive environment to track their computation’s carbon emissions.
The UW Data Science Seminar is an annual lecture series at the University of Washington that hosts scholars working across applied areas of data science, such as the sciences, engineering, humanities and arts, along with methodological areas of data science, such as computer science, applied math and statistics. Our presenters come from all domain fields and include occasional external speakers from regional partners, governmental agencies and industry.
The 2023-2024 seminars will be held in person and are free and open to the public.