Dissecting Gene Regulation with Machine Learning: Discoveries and Challenges

Date: February 1, 2019
Time: 12:30pm to 1:30pm
Place: MH-2103

Speaker: Katie Pollard, PhD, Director, Gladstone Institute of Data Science & Biotechnology; Epi/Biostat; Investigator, Chan-Zuckerberg Biohub

Title: Dissecting Gene Regulation with Machine Learning: Discoveries and Challenges

Summary: Machine learning is a popular statistical approach in many fields, including genomics. We and others have used a variety of supervised machine-learning techniques to predict genes, regulatory elements, 3D interactions between regulatory elements and their target genes, and the effects of mutations on regulatory element function. I will highlight a few of these studies, emphasizing the strengths and weaknesses of different predictive models and the biological insights gained via variable importance analysis. Then I will talk about some of our recent work exploring the limitations of popular machine-learning methods in genomics, where the biology underlying the data used to train the models frequently violates one or both parts of the independent and identically distributed (IID) assumption. The talk will conclude with some thoughts on modeling non-IID data and interpreting over-fit models, with the aim of improving the application of supervised learning to biological data and emphasizing the mechanistic insights gained from modeling rather than performance statistics per se.

Event Type: DEB First Friday Seminar