geneRFinder: gene finding in distinct metagenomic data complexities.

Raíssa Silva, Kleber Padovani, Fabiana Góes, Ronnie Alves.

Abstract

BACKGROUND: Microbes play a fundamental economic, social, and environmental role in our society. Metagenomics makes it possible to investigate microbes and their interactions in their natural environments (complex communities). How microbes act is usually estimated from the functions they perform in those environments, and their contribution is measured through their genes. Advances in next-generation sequencing technology have facilitated metagenomics research; however, they also create a heavy computational burden. Large and complex biological datasets are available as never before. Many gene predictors are available to aid the gene annotation process, but they do not handle the complexities of metagenomic data appropriately. There is also no standard metagenomic benchmark dataset for gene prediction. Thus, gene predictors may inflate their results by obfuscating low false discovery rates.
RESULTS: We introduce geneRFinder, a machine learning (ML)-based gene predictor that outperforms state-of-the-art gene prediction tools across the proposed benchmark using only one pre-trained Random Forest model. On high-complexity metagenomes, the average prediction rates of geneRFinder differed by 54% and 64% from those of Prodigal and FragGeneScan, respectively. The specificity of geneRFinder showed the largest gap: 79 percentage points against FragGeneScan and 66 percentage points more than Prodigal. According to McNemar's test, all percentage differences between the predictors' performances are statistically significant for all datasets at the 99% confidence level.
CONCLUSIONS: We provide geneRFinder, an approach for gene prediction in distinct metagenomic complexities, available at gitlab.com/r.lorenna/generfinder and https://osf.io/w2yd6/ . We also provide a novel, comprehensive benchmark dataset for gene prediction, available at https://sourceforge.net/p/generfinder-benchmark . The benchmark is based on the Critical Assessment of Metagenome Interpretation (CAMI) challenge and contains labeled data from gene regions.
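
The abstract reports that McNemar's test was used to compare the predictors. A minimal sketch of that kind of comparison follows. It assumes paired binary predictions from two tools on the same labeled regions and uses the chi-square form of the test with continuity correction. The abstract does not say which variant the authors used. The function name and the toy data are hypothetical, not taken from the paper or from geneRFinder's code.

```python
import math


def mcnemar_test(y_true, pred_a, pred_b):
    """McNemar's chi-square test with continuity correction.

    Compares two classifiers evaluated on the same labeled examples.
    Returns (statistic, p_value).
    """
    # b: examples predictor A gets right and predictor B gets wrong
    b = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a == t and p != t)
    # c: examples predictor A gets wrong and predictor B gets right
    c = sum(1 for t, a, p in zip(y_true, pred_a, pred_b) if a != t and p == t)
    if b + c == 0:
        # The predictors never disagree on correctness: no evidence of a difference.
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Toy data (hypothetical): 1 = gene region, 0 = non-gene region.
y_true = [1] * 10 + [0] * 30
pred_a = list(y_true)
pred_a[0] = pred_a[1] = 0          # predictor A misses two genes
pred_b = list(y_true)
for i in range(20, 35):
    pred_b[i] = 1                  # predictor B reports 15 false positives

stat, p_value = mcnemar_test(y_true, pred_a, pred_b)
significant_at_99 = p_value < 0.01  # 99% confidence level, as in the abstract
print(stat, p_value, significant_at_99)
```

Only the discordant pairs (cases where exactly one predictor is correct) enter the statistic, which is why the test suits paired comparisons on a shared benchmark. For small discordant counts, an exact binomial version is usually preferred.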

Keywords: Gene prediction; Machine learning; Metagenomics

Year: 2021 PMID: 33632132 PMCID: PMC7905635 DOI: 10.1186/s12859-021-03997-w

Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
