Muhammad Tahir (1), Maqsood Hayat (2), Muhammad Kabir (3). (1) Department of Computer Science, Abdul Wali Khan University Mardan, KP, Pakistan. (2) Department of Computer Science, Abdul Wali Khan University Mardan, KP, Pakistan. Electronic address: m.hayat@awkum.edu.pk. (3) School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, 210094, China.
Abstract
BACKGROUND AND OBJECTIVES: Enhancers are pivotal DNA elements that are widely used in eukaryotes to activate gene transcription. On the basis of their strength, enhancers are further classified into two groups: strong enhancers and weak enhancers. Given the large and growing volume of available DNA sequences, a fast, reliable and robust computational method is needed that not only identifies enhancers but also determines their strength. Considerable progress has been made in this regard; however, timely and precise identification of enhancers remains a challenging task. METHODS: A two-level computational model for the identification of enhancers and their subgroups is proposed. Two feature extraction techniques, di-nucleotide composition and tri-nucleotide composition, were adopted to extract numerical descriptors. Four classification methods, namely probabilistic neural network, support vector machine, k-nearest neighbor and random forest, were used for classification. RESULTS: Using the jackknife cross-validation test, the proposed method yielded an accuracy of 77.25% on dataset S1, which contains enhancers and non-enhancers, and an accuracy of 64.70% on dataset S2, which comprises strong and weak enhancer sequences. CONCLUSION: The predictive results indicate that the proposed method performs better than the existing approaches reported so far in the literature. It is thus anticipated that the developed method will be useful for basic research and academia.
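The feature extraction and two-level scheme described in METHODS can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that di- and tri-nucleotide composition mean normalized frequencies of overlapping 2-mers and 3-mers over the alphabet A, C, G, T, and that the two descriptors are concatenated into one 80-dimensional vector. The function names (`kmer_composition`, `dnc_tnc_features`, `two_level_predict`) and the classifier callables `is_enhancer` and `is_strong` are hypothetical placeholders; the abstract does not specify how the features or classifiers are combined.

```python
from itertools import product


def kmer_composition(seq, k):
    """Normalized frequencies of overlapping k-mers, ordered lexicographically over ACGT."""
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing ambiguous bases such as N
            counts[window] += 1
            total += 1
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[m] / total for m in kmers]


def dnc_tnc_features(seq):
    """Assumed descriptor: 16 di-nucleotide + 64 tri-nucleotide frequencies."""
    return kmer_composition(seq, 2) + kmer_composition(seq, 3)


def two_level_predict(seq, is_enhancer, is_strong):
    """Level 1 separates enhancers from non-enhancers; level 2 grades enhancer strength.

    is_enhancer and is_strong are hypothetical callables that take a feature
    vector and return a bool (for example, wrappers around trained classifiers).
    """
    features = dnc_tnc_features(seq)
    if not is_enhancer(features):
        return "non-enhancer"
    return "strong enhancer" if is_strong(features) else "weak enhancer"
```

In practice the two callables would wrap trained models, such as the support vector machine or random forest named in the abstract, each fitted on its own dataset (S1 for the first level, S2 for the second).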