OCCAM: prediction of small ORFs in bacterial genomes by means of a target-decoy database approach and machine learning techniques.

Fabio R Cerqueira, Ana Tereza Ribeiro Vasconcelos.

Abstract

Small open reading frames (ORFs) have been systematically disregarded by automatic genome annotation. The difficulty of finding patterns in such short sequences is the main reason small ORFs are overlooked by computational procedures. However, advances in experimental methods show that small proteins can play vital roles in cellular activities. Hence, progress is urgently needed in the development of computational approaches that speed up the identification of potential small ORFs. In this work, our focus is on bacterial genomes. We improve a previous approach to identifying small ORFs in bacteria. Our method uses machine learning techniques and decoy subject sequences to filter out spurious ORF alignments. We show that an advanced multivariate analysis can be more sensitive than the simplistic and widely used e-value cutoff. This is particularly important for small ORFs, whose alignments present higher e-values than usual. Experiments with control datasets show that the machine learning algorithms used in our method to curate significant alignments can achieve an average sensitivity and specificity of 97.06% and 99.61%, respectively. This work therefore provides an important step toward more accurate computational tools for the identification of small ORFs in bacteria.
© The Author(s) 2020. Published by Oxford University Press.
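
The two ideas the abstract relies on, decoy-based filtering of alignments and evaluation by sensitivity and specificity, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the single-score cutoff, and the decoy-to-target ratio used as an error estimate are all assumptions made for illustration. The abstract describes a multivariate machine learning classifier in place of a one-dimensional cutoff like the one shown here.

```python
from typing import List, Optional, Tuple


def sensitivity_specificity(predicted: List[bool],
                            actual: List[bool]) -> Tuple[float, float]:
    """Return (sensitivity, specificity) of boolean predictions.

    Assumes `actual` contains at least one positive and one negative.
    """
    pairs = list(zip(predicted, actual))
    tp = sum(p and a for p, a in pairs)              # true positives
    tn = sum((not p) and (not a) for p, a in pairs)  # true negatives
    fp = sum(p and (not a) for p, a in pairs)        # false positives
    fn = sum((not p) and a for p, a in pairs)        # false negatives
    return tp / (tp + fn), tn / (tn + fp)


def decoy_threshold(target_scores: List[float],
                    decoy_scores: List[float],
                    max_fdr: float = 0.01) -> Optional[float]:
    """Lowest target score usable as a cutoff under a decoy-estimated error bound.

    Hypothetical helper: at cutoff t the error rate is estimated as
    (# decoy alignments scoring >= t) / (# target alignments scoring >= t).
    Returns None when no cutoff satisfies the bound.
    """
    best = None
    for t in sorted(set(target_scores), reverse=True):
        n_target = sum(s >= t for s in target_scores)  # >= 1, t is a target score
        n_decoy = sum(s >= t for s in decoy_scores)
        if n_decoy / n_target <= max_fdr:
            best = t
    return best


if __name__ == "__main__":
    # Made-up alignment scores, for illustration only.
    targets = [9.0, 8.0, 7.0, 3.0, 2.0]
    decoys = [2.5, 1.0]
    print(decoy_threshold(targets, decoys, max_fdr=0.1))
    print(sensitivity_specificity([True, True, False, False],
                                  [True, False, False, False]))
```

Alignments against decoy sequences are spurious by construction, so the number of decoy hits that pass a cutoff gives a rough estimate of how many spurious target hits pass it as well.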

Year:  2020        PMID: 33206960      PMCID: PMC7673341          DOI: 10.1093/database/baaa067

Source DB:  PubMed          Journal:  Database (Oxford)        ISSN: 1758-0463            Impact factor:   3.451


