iLearn: an integrated platform and meta-learner for feature engineering, machine-learning analysis and modeling of DNA, RNA and protein sequence data.

Zhen Chen, Pei Zhao, Fuyi Li, Tatiana T Marquez-Lago, André Leier, Jerico Revote, Yan Zhu, David R Powell, Tatsuya Akutsu, Geoffrey I Webb, Kuo-Chen Chou, A Ian Smith, Roger J Daly, Jian Li, Jiangning Song.

Abstract

With the explosive growth of biological sequences generated in the post-genomic era, one of the most challenging problems in bioinformatics and computational biology is to computationally characterize sequences, structures and functions in an efficient, accurate and high-throughput manner. A number of online web servers and stand-alone tools have been developed to address this to date; however, all these tools have their limitations and drawbacks in terms of their effectiveness, user-friendliness and capacity. Here, we present iLearn, a comprehensive and versatile Python-based toolkit, integrating the functionality of feature extraction, clustering, normalization, selection, dimensionality reduction, predictor construction, best descriptor/model selection, ensemble learning and results visualization for DNA, RNA and protein sequences. iLearn was designed for users that only want to upload their data set and select the functions they need calculated from it, while all necessary procedures and optimal settings are completed automatically by the software. iLearn includes a variety of descriptors for DNA, RNA and proteins, and four feature output formats are supported so as to facilitate direct output usage or communication with other computational tools. In total, iLearn encompasses 16 different types of feature clustering, selection, normalization and dimensionality reduction algorithms, and five commonly used machine-learning algorithms, thereby greatly facilitating feature analysis and predictor construction. iLearn is made freely available via an online web server and a stand-alone toolkit.
© The Author(s) 2019. Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
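
The first steps of the workflow summarized in the abstract (feature extraction followed by normalization) can be illustrated with a minimal, standard-library-only sketch. This is not iLearn's code or API: the function names `kmer_frequencies` and `min_max_normalize`, the choice of a k-mer composition descriptor, and the toy sequences are all assumptions made for illustration.

```python
from itertools import product


def kmer_frequencies(seq, k=2, alphabet="ACGT"):
    """Return k-mer frequencies for one sequence in a fixed lexicographic order.

    Hypothetical helper for illustration; not part of iLearn.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    total = 0
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows containing symbols outside the alphabet, e.g. N
            counts[window] += 1
            total += 1
    return [counts[m] / total if total else 0.0 for m in kmers]


def min_max_normalize(rows):
    """Rescale each feature column to [0, 1]; constant columns are set to 0."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


# Toy input (assumed data, not taken from the paper).
sequences = {"seq1": "ACGTACGT", "seq2": "AAAACCCC"}
features = [kmer_frequencies(s, k=2) for s in sequences.values()]
scaled = min_max_normalize(features)
```

Each sequence becomes a fixed-length vector (16 values for dinucleotides), so the resulting matrix could then be passed to feature selection, dimensionality reduction or a classifier, which are the later stages the abstract lists.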

Keywords:  automated modeling; bioinformatics; biomedical data mining; data clustering; feature selection; integrated platform; machine learning; sequence analysis

Year:  2020        PMID: 31067315     DOI: 10.1093/bib/bbz041

Source DB:  PubMed          Journal:  Brief Bioinform        ISSN: 1467-5463            Impact factor:   11.622


  64 in total

1.  Computational prediction and interpretation of both general and specific types of promoters in Escherichia coli by exploiting a stacked ensemble-learning framework.

Authors:  Fuyi Li; Jinxiang Chen; Zongyuan Ge; Ya Wen; Yanwei Yue; Morihiro Hayashida; Abdelkader Baggag; Halima Bensmail; Jiangning Song
Journal:  Brief Bioinform       Date:  2021-03-22       Impact factor: 11.622

2.  DeepBL: a deep learning-based approach for in silico discovery of beta-lactamases.

Authors:  Yanan Wang; Fuyi Li; Manasa Bharathwaj; Natalia C Rosas; André Leier; Tatsuya Akutsu; Geoffrey I Webb; Tatiana T Marquez-Lago; Jian Li; Trevor Lithgow; Jiangning Song
Journal:  Brief Bioinform       Date:  2021-07-20       Impact factor: 11.622

3.  DeepCleave: a deep learning predictor for caspase and matrix metalloprotease substrates and cleavage sites.

Authors:  Fuyi Li; Jinxiang Chen; André Leier; Tatiana Marquez-Lago; Quanzhong Liu; Yanze Wang; Jerico Revote; A Ian Smith; Tatsuya Akutsu; Geoffrey I Webb; Lukasz Kurgan; Jiangning Song
Journal:  Bioinformatics       Date:  2020-02-15       Impact factor: 6.937

4.  Deep4mC: systematic assessment and computational prediction for DNA N4-methylcytosine sites by deep learning.

Authors:  Haodong Xu; Peilin Jia; Zhongming Zhao
Journal:  Brief Bioinform       Date:  2021-05-20       Impact factor: 11.622

5.  iLearnPlus: a comprehensive and automated machine-learning platform for nucleic acid and protein sequence analysis, prediction and visualization.

Authors:  Zhen Chen; Pei Zhao; Chen Li; Fuyi Li; Dongxu Xiang; Yong-Zi Chen; Tatsuya Akutsu; Roger J Daly; Geoffrey I Webb; Quanzhi Zhao; Lukasz Kurgan; Jiangning Song
Journal:  Nucleic Acids Res       Date:  2021-06-04       Impact factor: 16.971

6.  Critical assessment of computational tools for prokaryotic and eukaryotic promoter prediction.

Authors:  Meng Zhang; Cangzhi Jia; Fuyi Li; Chen Li; Yan Zhu; Tatsuya Akutsu; Geoffrey I Webb; Quan Zou; Lachlan J M Coin; Jiangning Song
Journal:  Brief Bioinform       Date:  2022-03-10       Impact factor: 11.622

7.  BioSeq-Analysis2.0: an updated platform for analyzing DNA, RNA and protein sequences at sequence level and residue level based on machine learning approaches.

Authors:  Bin Liu; Xin Gao; Hanyu Zhang
Journal:  Nucleic Acids Res       Date:  2019-11-18       Impact factor: 16.971

8.  Large-scale comparative review and assessment of computational methods for anti-cancer peptide identification. (Review)

Authors:  Xiao Liang; Fuyi Li; Jinxiang Chen; Junlong Li; Hao Wu; Shuqin Li; Jiangning Song; Quanzhong Liu
Journal:  Brief Bioinform       Date:  2021-07-20       Impact factor: 11.622

9.  HSM6AP: a high-precision predictor for the Homo sapiens N6-methyladenosine (m6A) based on multiple weights and feature stitching.

Authors:  Jing Li; Shida He; Fei Guo; Quan Zou
Journal:  RNA Biol       Date:  2021-02-12       Impact factor: 4.652

10.  Accurate identification of RNA D modification using multiple features.

Authors:  Lijun Dou; Wenyang Zhou; Lichao Zhang; Lei Xu; Ke Han
Journal:  RNA Biol       Date:  2021-03-17       Impact factor: 4.652
