Warning: Undefined array key "mm" in /www/wwwroot/www.ai-bt.com/si.php on line 10 Deprecated: trim(): Passing null to parameter #1 ($string) of type string is deprecated in /www/wwwroot/www.ai-bt.com/si.php on line 10 iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data.

Literature DB >> 31067315

iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data.

Zhen Chen¹, Pei Zhao², Fuyi Li³, Tatiana T Marquez-Lago^4,5, André Leier^4,5, Jerico Revote³, Yan Zhu⁶, David R Powell⁶, Tatsuya Akutsu⁷, Geoffrey I Webb⁸, Kuo-Chen Chou^9,10, A Ian Smith^3,11, Roger J Daly³, Jian Li⁶, Jiangning Song^3,8,11.

Abstract

With the explosive growth of biological sequences generated in the post-genomic era, one of the most challenging problems in bioinformatics and computational biology is to computationally characterize sequences, structures and functions in an efficient, accurate and high-throughput manner. A number of online web servers and stand-alone tools have been developed to address this to date; however, all these tools have their limitations and drawbacks in terms of their effectiveness, user-friendliness and capacity. Here, we present iLearn, a comprehensive and versatile Python-based toolkit, integrating the functionality of feature extraction, clustering, normalization, selection, dimensionality reduction, predictor construction, best descriptor/model selection, ensemble learning and results visualization for DNA, RNA and protein sequences. iLearn was designed for users that only want to upload their data set and select the functions they need calculated from it, while all necessary procedures and optimal settings are completed automatically by the software. iLearn includes a variety of descriptors for DNA, RNA and proteins, and four feature output formats are supported so as to facilitate direct output usage or communication with other computational tools. In total, iLearn encompasses 16 different types of feature clustering, selection, normalization and dimensionality reduction algorithms, and five commonly used machine-learning algorithms, thereby greatly facilitating feature analysis and predictor construction. iLearn is made freely available via an online web server and a stand-alone toolkit.

Entities: Disease

Keywords: automated modeling; bioinformatics; biomedical data mining; data clustering; feature selection; integrated platform; machine learning; sequence analysis

Year: 2020 PMID： 31067315 DOI： 10.1093/bib/bbz041

Source DB: PubMed Journal: Brief Bioinform ISSN： 1467-5463 Impact factor: 11.622

Keyword Cloud
Cited

64 in total

1. Computational prediction and interpretation of both general and specific types of promoters in Escherichia coli by exploiting a stacked ensemble-learning framework.

Authors: Fuyi Li; Jinxiang Chen; Zongyuan Ge; Ya Wen; Yanwei Yue; Morihiro Hayashida; Abdelkader Baggag; Halima Bensmail; Jiangning Song
Journal: Brief Bioinform Date: 2021-03-22 Impact factor: 11.622

2. DeepBL: a deep learning-based approach for in silico discovery of beta-lactamases.

Authors: Yanan Wang; Fuyi Li; Manasa Bharathwaj; Natalia C Rosas; André Leier; Tatsuya Akutsu; Geoffrey I Webb; Tatiana T Marquez-Lago; Jian Li; Trevor Lithgow; Jiangning Song
Journal: Brief Bioinform Date: 2021-07-20 Impact factor: 11.622

3. DeepCleave: a deep learning predictor for caspase and matrix metalloprotease substrates and cleavage sites.

Authors: Fuyi Li; Jinxiang Chen; André Leier; Tatiana Marquez-Lago; Quanzhong Liu; Yanze Wang; Jerico Revote; A Ian Smith; Tatsuya Akutsu; Geoffrey I Webb; Lukasz Kurgan; Jiangning Song
Journal: Bioinformatics Date: 2020-02-15 Impact factor: 6.937

4. Deep4mC: systematic assessment and computational prediction for DNA N4-methylcytosine sites by deep learning.

Authors: Haodong Xu; Peilin Jia; Zhongming Zhao
Journal: Brief Bioinform Date: 2021-05-20 Impact factor: 11.622

5. iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization.

Authors: Zhen Chen; Pei Zhao; Chen Li; Fuyi Li; Dongxu Xiang; Yong-Zi Chen; Tatsuya Akutsu; Roger J Daly; Geoffrey I Webb; Quanzhi Zhao; Lukasz Kurgan; Jiangning Song
Journal: Nucleic Acids Res Date: 2021-06-04 Impact factor: 16.971

6. Critical assessment of computational tools for prokaryotic and eukaryotic promoter prediction.

Authors: Meng Zhang; Cangzhi Jia; Fuyi Li; Chen Li; Yan Zhu; Tatsuya Akutsu; Geoffrey I Webb; Quan Zou; Lachlan J M Coin; Jiangning Song
Journal: Brief Bioinform Date: 2022-03-10 Impact factor: 11.622

7. BioSeq-Analysis2.0: an updated platform for analyzing DNA, RNA and protein sequences at sequence level and residue level based on machine learning approaches.

Authors: Bin Liu; Xin Gao; Hanyu Zhang
Journal: Nucleic Acids Res Date: 2019-11-18 Impact factor: 16.971

Review 8. Large-scale comparative review and assessment of computational methods for anti-cancer peptide identification.

Authors: Xiao Liang; Fuyi Li; Jinxiang Chen; Junlong Li; Hao Wu; Shuqin Li; Jiangning Song; Quanzhong Liu
Journal: Brief Bioinform Date: 2021-07-20 Impact factor: 11.622

9. HSM6AP: a high-precision predictor for the Homo sapiens N6-methyladenosine (m^6 A) based on multiple weights and feature stitching.

Authors: Jing Li; Shida He; Fei Guo; Quan Zou
Journal: RNA Biol Date: 2021-02-12 Impact factor: 4.652

10. Accurate identification of RNA D modification using multiple features.

Authors: Lijun Dou; Wenyang Zhou; Lichao Zhang; Lei Xu; Ke Han
Journal: RNA Biol Date: 2021-03-17 Impact factor: 4.652