Towards large-scale FAME-based bacterial species identification using machine learning techniques.

Bram Slabbinck, Bernard De Baets, Peter Dawyndt, Paul De Vos.

Abstract

Bacterial taxonomy has expanded enormously over the last decade. The swift pace at which bacterial species are defined and redefined seriously affects the accuracy and completeness of first-line identification methods. Consequently, back-end identification libraries need to be synchronized with the List of Prokaryotic names with Standing in Nomenclature. In this study, we focus on bacterial fatty acid methyl ester (FAME) profiling, a widely used first-line identification method. From the BAME@LMG database, we selected FAME profiles of individual strains belonging to the genera Bacillus, Paenibacillus and Pseudomonas, retaining only profiles obtained under standard growth conditions. The resulting data set covers 74, 44 and 95 validly published bacterial species, respectively, represented by 961, 378 and 1673 standard FAME profiles. Using machine learning techniques in a supervised strategy, we built computational models for genus and species identification. Three techniques were considered: artificial neural networks, random forests and support vector machines. Nearly perfect identification was achieved at genus level. Despite the known limited discriminative power of FAME analysis for species identification, the computational models gave good species identification results for the three genera. For Bacillus, Paenibacillus and Pseudomonas, random forests yielded sensitivity values of 0.847, 0.901 and 0.708, respectively. The random forest models outperformed those of the other machine learning techniques, and our machine learning approach also outperformed the Sherlock MIS (MIDI Inc., Newark, DE, USA). These results show that machine learning is very useful for FAME-based bacterial species identification. Besides good identification at species level, speed and ease of taxonomic synchronization are major advantages of this computational species identification strategy.
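
The sensitivity values quoted in the abstract can be read as the fraction of profiles of a species that are assigned to that species, i.e. TP / (TP + FN). The abstract does not say how the per-species values were combined into one figure per genus, so the minimal sketch below assumes an unweighted (macro) average over species. The function names and the toy labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def per_class_sensitivity(true_labels, predicted_labels):
    """Return {class: TP / (TP + FN)} for every class present in true_labels."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    # Number of profiles that truly belong to each class (TP + FN).
    totals = Counter(true_labels)
    # Number of profiles of each class that were predicted correctly (TP).
    hits = Counter(t for t, p in zip(true_labels, predicted_labels) if t == p)
    return {cls: hits[cls] / n for cls, n in totals.items()}


def macro_sensitivity(true_labels, predicted_labels):
    """Unweighted mean of the per-class sensitivities (an assumed averaging scheme)."""
    per_class = per_class_sensitivity(true_labels, predicted_labels)
    return sum(per_class.values()) / len(per_class)


# Toy labels invented for illustration only; they are not data from the study.
true_species = ["B. subtilis", "B. subtilis", "B. cereus",
                "B. cereus", "B. cereus", "B. pumilus"]
predicted_species = ["B. subtilis", "B. cereus", "B. cereus",
                     "B. cereus", "B. pumilus", "B. pumilus"]

if __name__ == "__main__":
    print(per_class_sensitivity(true_species, predicted_species))
    print(round(macro_sensitivity(true_species, predicted_species), 3))
```

With a macro average every species counts equally regardless of how many profiles represent it. This matters for data sets like the one described here, where the number of profiles per species is unlikely to be uniform.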

Year:  2009        PMID: 19237256     DOI: 10.1016/j.syapm.2009.01.003

Source DB:  PubMed          Journal:  Syst Appl Microbiol        ISSN: 0723-2020            Impact factor:   4.022


  4 in total

1.  New marker of FAME profile of Pseudomonas aurantiaca total lipids.

Authors:  R I Zhdanov; I I Salafutdinov; A Arslan; M Y Ibragimova
Journal:  Dokl Biochem Biophys       Date:  2012-09-02       Impact factor: 0.788

2.  From learning taxonomies to phylogenetic learning: integration of 16S rRNA gene data into FAME-based bacterial classification.

Authors:  Bram Slabbinck; Willem Waegeman; Peter Dawyndt; Paul De Vos; Bernard De Baets
Journal:  BMC Bioinformatics       Date:  2010-01-30       Impact factor: 3.169

3.  Data mining in the Life Sciences with Random Forest: a walk in the park or lost in the jungle?

Authors:  Wouter G Touw; Jumamurat R Bayjanov; Lex Overmars; Lennart Backus; Jos Boekhorst; Michiel Wels; Sacha A F T van Hijum
Journal:  Brief Bioinform       Date:  2012-07-10       Impact factor: 11.622

4.  What variables are important in predicting bovine viral diarrhea virus? A random forest approach.

Authors:  Gustavo Machado; Mariana Recamonde Mendoza; Luis Gustavo Corbellini
Journal:  Vet Res       Date:  2015-07-24       Impact factor: 3.683
