Bin Liu1, Longyun Fang2, Ren Long2, Xun Lan3, Kuo-Chen Chou4. 1. School of Computer Science and Technology, Key Laboratory of Network Oriented Intelligent Computation, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong 518055, China; Computational Biology, Gordon Life Science Institute, Belmont, MA 02478, USA. 2. School of Computer Science and Technology. 3. Department of Genetics, Stanford University, Stanford, CA 94305, USA. 4. Computational Biology, Gordon Life Science Institute, Belmont, MA 02478, USA; Center of Excellence in Genomic Medicine Research (CEGMR), King Abdulaziz University, Jeddah 21589, Saudi Arabia.
Abstract
MOTIVATION: Enhancers are short regulatory DNA elements. They can be bound by proteins (activators) to activate transcription of a gene, and hence play a critical role in promoting gene transcription in eukaryotes. With the avalanche of DNA sequences generated in the post-genomic age, it is a challenging task to develop computational methods for the timely identification of enhancers from extremely complicated DNA sequences. Although some efforts have been made in this regard, they were limited to identifying whether a query DNA element is an enhancer or not. According to their distinct levels of biological activity and regulatory effects on target genes, however, enhancers should be further classified as strong or weak. RESULTS: In view of this, a two-layer predictor called 'iEnhancer-2L' was proposed by formulating DNA elements with the 'pseudo k-tuple nucleotide composition', into which six DNA local parameters were incorporated. To the best of our knowledge, it is the first computational predictor ever established for identifying not only enhancers, but also their strength. Rigorous cross-validation tests have indicated that iEnhancer-2L holds very high potential to become a useful tool for genome analysis. AVAILABILITY AND IMPLEMENTATION: For the convenience of most experimental scientists, a web server for the two-layer predictor was established at http://bioinformatics.hitsz.edu.cn/iEnhancer-2L/, by which users can easily get their desired results without the need to go through the mathematical details. CONTACT: bliu@gordonlifescience.org, bliu@insun.hit.edu.cn, xlan@stanford.edu, kcchou@gordonlifescience.org SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
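As a rough illustration of the two ideas named in the RESULTS, the sketch below computes a plain k-tuple nucleotide composition (the normalized frequency of every k-mer) and chains two classifiers in a two-layer decision. It is not the authors' implementation: the abstract does not give the formula for the pseudo components, the six DNA local parameters, or the learning algorithm, so the pseudo components are omitted here, and the names `ktuple_composition`, `two_layer_predict`, `is_enhancer` and `is_strong` are hypothetical.

```python
from itertools import product


def ktuple_composition(seq, k=2):
    """Normalized frequency of every k-tuple (k-mer) over A, C, G, T.

    Only the plain composition part is sketched; the "pseudo" components
    built from DNA local parameters are NOT reproduced here (assumption:
    their definition is not available in the abstract).
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if kmer in counts:  # skip windows containing ambiguous bases such as N
            counts[kmer] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def two_layer_predict(seq, is_enhancer, is_strong, k=2):
    """Two-layer decision: enhancer or not, then strong or weak.

    `is_enhancer` and `is_strong` are hypothetical callables standing in
    for trained classifiers; each takes a feature vector and returns a bool.
    """
    features = ktuple_composition(seq, k)
    if not is_enhancer(features):
        return "non-enhancer"
    return "strong enhancer" if is_strong(features) else "weak enhancer"
```

With k = 2 the feature vector has 4^2 = 16 entries that sum to 1 for any sequence containing at least one unambiguous dinucleotide. The second layer is consulted only for sequences that the first layer accepts as enhancers, which mirrors the two-layer description above.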