EMBER: Expectation Maximization of Binding and Expression pRofiles

Mark Maienschein-Cline, University of Chicago


Identifying the target genes regulated by transcription factors (TFs) is a fundamental step in understanding gene regulation. Recent advances in high-throughput sequencing technology, together with chromatin immunoprecipitation (ChIP), enable genome-wide mapping of TF binding sites, but function cannot be inferred from binding alone. This is especially true in mammalian systems, where regulation often occurs through long-range enhancers in gene-rich neighborhoods rather than through proximal promoters, which prevents straightforward assignment of a binding site to a target gene. Here, I present EMBER (Expectation Maximization of Binding and Expression pRofiles), a method that integrates high-throughput binding data (e.g., ChIP-chip or ChIP-seq) with gene expression data (e.g., DNA microarray) via an unsupervised machine-learning algorithm to infer the gene targets of sets of TF binding sites. The genes selected are those that match over-represented expression patterns, and these patterns can in turn provide information about multiple TF regulatory modes. In my poster, I outline the method and describe applications that validate the results of the algorithm.

Abstract Author(s): Mark Maienschein-Cline, Roger Sciammas, and Aaron R Dinner