Integrating gene expression, chromatin modification, and transcription factor binding human embryonic stem cell datasets to understand gene regulation in development

Irene Kaplow, Stanford University

The human genome sequence alone cannot explain all of the gene regulation that occurs in human cells. Modifications to chromatin, the complex of DNA and proteins that packages the genome, contribute substantially to gene regulation. In addition, transcription factor binding events, which often occur near the starts of genes, frequently play major roles in promoting gene expression. The ENCODE Consortium and the Epigenomic Roadmap Consortium have conducted genome-wide studies of gene expression, chromatin modifications, and some transcription factor binding events. However, for most cell lines, the consortia have data for only one time point. Thus, differences observed between cell lines might be due to some aspect of the tissue the cells come from that is not directly related to gene regulation. Julie Baker's lab at Stanford has gathered genome-wide data on chromatin modification, transcription factor binding, and gene expression from the same human embryonic stem cell line at multiple time points in development. We cluster the genes, using the ratio between the gene expression on each day of development and the gene expression on day 0 as features, so that genes with similar expression patterns over time are grouped together. We then fit a regression model to determine how well four types of transcription factor binding events and three types of histone modifications predict the gene expression pattern clusters. We plan to incorporate human embryonic stem cell data from the ENCODE and Epigenomic Roadmap consortia to obtain a more refined understanding of this regulation. We hope ultimately to identify transcription factor binding and histone modification events that cause some of the gene expression changes that we see in development.
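
The clustering step described above can be illustrated with a minimal sketch. The abstract does not name the clustering algorithm or say how the ratios are scaled, so the choices here are assumptions: the ratios are log2-transformed with a pseudocount of 1 to avoid division by zero, and a plain k-means with k = 2 stands in for whatever method was actually used. The gene names and expression values are invented toy data, not results from the study.

```python
import math
import random


def log_ratio_features(expression, pseudocount=1.0):
    """Turn each gene's time series into log2(day_t / day_0) for every t > 0.

    `expression` maps a gene name to a list of expression values, with
    index 0 holding the day 0 measurement. The pseudocount is an assumption.
    """
    features = {}
    for gene, series in expression.items():
        base = series[0] + pseudocount
        features[gene] = [math.log2((x + pseudocount) / base) for x in series[1:]]
    return features


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=50, seed=0):
    """Plain k-means (Lloyd's algorithm) on a list of equal-length vectors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: expression on days 0, 1, 2, 3 for four made-up genes.
expression = {
    "gene_up_1": [10, 20, 40, 80],
    "gene_up_2": [5, 11, 19, 41],
    "gene_down_1": [80, 40, 20, 10],
    "gene_down_2": [40, 19, 11, 5],
}

features = log_ratio_features(expression)
genes = list(features)
labels, centroids = kmeans([features[g] for g in genes], k=2)
clusters = dict(zip(genes, labels))
print(clusters)
```

Using ratios to day 0 rather than raw expression means that genes are grouped by the shape of their change over development, not by their absolute expression level. The resulting cluster labels could then serve as the response variable for a regression on transcription factor binding and histone modification features.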

Abstract Author(s): Irene Kaplow, Sofia Kyriazopoulou-Panagiotopoulou, Duygu Ucar, Julie Baker, and Daphne Koller