Bin Liu (1,2), Kai Li (1), De-Shuang Huang (3), Kuo-Chen Chou (2,4)
1. School of Computer Science and Technology, Harbin Institute of Technology Shenzhen Graduate School, Shenzhen, Guangdong, China.
2. Gordon Life Science Institute, Belmont, MA, USA.
3. Institute of Machine Learning and Systems Biology, School of Electronics and Information Engineering, Tongji University, Shanghai, China.
4. Center for Informational Biology, University of Electronic Science and Technology of China, Chengdu, China.
Abstract
Motivation: Identifying enhancers and their strength is important because enhancers play a critical role in controlling gene expression. Although some bioinformatics tools have been developed, they are limited to discriminating enhancers from non-enhancers. Recently, a two-layer predictor called 'iEnhancer-2L' was developed that can also predict an enhancer's strength. However, its prediction quality needs further improvement to increase its practical value.
Results: A new predictor called 'iEnhancer-EL' is proposed that consists of two layers: the first (for identifying enhancers) is formed by fusing an array of six key individual classifiers, and the second (for predicting their strength) is formed by fusing an array of ten key individual classifiers. All of these key classifiers were selected from 171 elementary classifiers, each an SVM (Support Vector Machine) built on one of three feature types: kmer, subsequence profile and PseKNC (Pseudo K-tuple Nucleotide Composition). Rigorous cross-validation indicated that the proposed predictor outperforms the existing state-of-the-art predictor in this area.
Availability and implementation: A web server for iEnhancer-EL has been established at http://bioinformatics.hitsz.edu.cn/iEnhancer-EL/, through which users can obtain their desired results without working through the mathematical details.
Supplementary information: Supplementary data are available at Bioinformatics online.
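To make the two ingredients named in the Results concrete, the following is a minimal sketch of kmer feature extraction and of fusing the outputs of several classifiers. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the fusion rule, so a simple majority vote is used as a stand-in, SVM training is omitted, and the function names `kmer_frequencies` and `fuse_by_vote` as well as the example scores are hypothetical.

```python
from itertools import product
from typing import List, Sequence

ALPHABET = "ACGT"


def kmer_frequencies(seq: str, k: int) -> List[float]:
    """Return normalized frequencies of all 4**k k-mers, in lexicographic order."""
    seq = seq.upper()
    # Map each possible k-mer (AA, AC, AG, ...) to its position in the vector.
    index = {"".join(p): i for i, p in enumerate(product(ALPHABET, repeat=k))}
    counts = [0] * len(index)
    total = 0
    for i in range(len(seq) - k + 1):
        pos = index.get(seq[i:i + k])
        if pos is not None:  # skip windows containing non-ACGT symbols such as N
            counts[pos] += 1
            total += 1
    return [c / total for c in counts] if total else [0.0] * len(counts)


def fuse_by_vote(scores: Sequence[float], threshold: float = 0.5) -> int:
    """Assumed fusion rule: majority vote over per-classifier scores.

    Each score is treated as one classifier's probability for the positive
    class; ties go to the negative class.
    """
    votes = sum(1 for s in scores if s > threshold)
    return 1 if votes * 2 > len(scores) else 0


# Feature vector for a toy sequence (16 dinucleotide frequencies).
features = kmer_frequencies("ACGTAC", 2)

# Hypothetical outputs of six individual classifiers for one sequence.
label = fuse_by_vote([0.9, 0.8, 0.4, 0.7, 0.3, 0.6])
```

In a full pipeline each feature vector would be passed to a trained SVM, and the resulting per-classifier scores would be fused to give the final label for each layer.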