Learning to Model Transcription Factor Binding Sites
Date: 2015/4/29

Speaker: Chun-Hsi Huang

Time: April 29th, 11:00 am-12:00 pm

Location: Room 220, Building 8, Yueyang Road Campus


A transcription factor (TF) is a protein or protein complex that regulates the expression of its target genes by physically binding to the regulatory regions of these genes. The binding sites of a TF naturally share a common pattern (motif). Given known binding sites of a TF, a model can be built to scan sequences for putative binding sites, and scientists routinely scan DNA sequences for transcription factor binding sites (TFBSs) in this way. Most of the available tools rely on position-specific scoring matrices (PSSMs) constructed from aligned binding sites. Public databases such as TRANSFAC, ORegAnno and PAZAR, however, store a large number of unaligned, variable-length DNA segments that contain binding sites of a TF. Moreover, data produced by chromatin immunoprecipitation (ChIP) experiments have become increasingly available and are an important source of unaligned TFBS-containing DNA segments. Because work on TFBS alignment has been limited, an alignment algorithm tailored to TFBSs is highly desirable.
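As a rough illustration of the PSSM-based scanning mentioned above, the following minimal Python sketch builds a log-odds matrix from already-aligned sites and slides it along a sequence. It is not the speaker's algorithm or any particular tool's implementation; the function names, the example sites, the uniform background frequency, the pseudocount and the score threshold are all assumptions made for illustration.

```python
import math

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PSSM (one dict per column) from equal-length aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("sites must already be aligned to the same length")
    pssm = []
    for pos in range(length):
        # Start every base at the pseudocount so unseen bases get a finite score.
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[pos]] += 1
        total = sum(counts.values())
        pssm.append({base: math.log2((count / total) / background)
                     for base, count in counts.items()})
    return pssm

def scan(sequence, pssm, threshold):
    """Return (offset, window, score) for every window scoring at or above threshold."""
    width = len(pssm)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        score = sum(column[base] for column, base in zip(pssm, window))
        if score >= threshold:
            hits.append((start, window, score))
    return hits

# Hypothetical aligned sites, invented for this example only.
sites = ["TGACTCA", "TGAGTCA", "TGACTCA", "TGAGTAA"]
pssm = build_pssm(sites)
hits = scan("CCGTGACTCAGGATTGAGTCATT", pssm, threshold=6.0)
for offset, window, score in hits:
    print(offset, window, round(score, 2))
```

The sketch requires equal-length, pre-aligned input, which is exactly the limitation the abstract points to: unaligned, variable-length segments from databases or ChIP experiments must be aligned before a matrix like this can be built.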

In this talk, we will discuss a learning algorithm for aligning known binding sites and how the algorithm can handle the intensive computation involved in aligning sequences produced by ChIP experiments. In addition, we will discuss a user-friendly integrated web tool that incorporates the learning algorithm and allows users to perform TFBS searches without leaving the website. Important features of the web tool include the acceptance of unaligned variable-length TFBSs, a large collection of TF models, automatic promoter sequence retrieval, visualization in the UCSC Genome Browser, as well as gene regulatory network inference and visualization based on binding specificities.


Dr. Chun-Hsi Huang is an Associate Professor in the Department of Computer Science and Engineering at the University of Connecticut, USA. He received his BS in 1989, MS in 1994 and PhD in 2001 from the National Chiao-Tung University in Taiwan, the University of Southern California and the State University of New York at Buffalo, respectively, all in Computer Science. His current research interests include Computational Biology, High-Performance Computing, and Combinatorial and Parallel Algorithms.

                        SIST-Seminar 15007