
Using neural networks to mine text and predict metabolic traits for thousands of microbes.

Timothy J Hackmann, Bo Zhang.

Abstract

Microbes can metabolize more chemical compounds than any other group of organisms. As a result, their metabolism is of interest to investigators across biology. Despite this interest, information on the metabolism of specific microbes is hard to access. It is buried in the text of books and journals, and investigators have no easy way to extract it. Here we investigate whether neural networks can extract this information and predict metabolic traits. As a proof of concept, we predicted two traits: whether microbes carry out one type of metabolism (fermentation) and whether they produce one metabolite (acetate). We collected written descriptions of 7,021 species of bacteria and archaea from Bergey's Manual. We read the descriptions and manually identified (labeled) which species were fermentative or produced acetate. We then trained neural networks to predict these labels. In total, we identified 2,364 species as fermentative, and 1,009 species as also producing acetate. Neural networks could predict which species were fermentative with 97.3% accuracy. Accuracy was even higher (98.6%) when predicting species also producing acetate. Phylogenetic trees of species and their traits confirmed that predictions were accurate. Our approach with neural networks can extract information efficiently and accurately. It paves the way for putting more metabolic traits into databases, giving investigators easy access to this information.
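
The workflow summarized in the abstract (labeled text descriptions in, a predicted trait out, accuracy as the score) can be illustrated with a minimal sketch. This is not the authors' network or data: it assumes the simplest possible single-layer network (logistic regression on bag-of-words counts), and the four species descriptions are invented placeholders, not text from Bergey's Manual. All function names are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase a description and split it into word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def build_vocab(texts):
    """Map every token seen in the training texts to a feature index."""
    vocab = {}
    for text in texts:
        for tok in tokenize(text):
            vocab.setdefault(tok, len(vocab))
    return vocab

def featurize(text, vocab):
    """Sparse bag-of-words counts {feature_index: count}; unseen words are ignored."""
    counts = Counter(tok for tok in tokenize(text) if tok in vocab)
    return {vocab[tok]: n for tok, n in counts.items()}

def predict_proba(features, weights, bias):
    """Sigmoid output of a single-layer network (one logistic unit)."""
    z = bias + sum(weights[i] * v for i, v in features.items())
    z = max(-30.0, min(30.0, z))  # clamp to keep exp() well-behaved
    return 1.0 / (1.0 + math.exp(-z))

def train(texts, labels, epochs=200, lr=0.5):
    """Fit weights by stochastic gradient descent on the log loss."""
    vocab = build_vocab(texts)
    weights = [0.0] * len(vocab)
    bias = 0.0
    data = [(featurize(t, vocab), y) for t, y in zip(texts, labels)]
    for _ in range(epochs):
        for feats, y in data:
            err = predict_proba(feats, weights, bias) - y  # d(loss)/dz
            for i, v in feats.items():
                weights[i] -= lr * err * v
            bias -= lr * err
    return vocab, weights, bias

def accuracy(texts, labels, vocab, weights, bias):
    """Fraction of descriptions whose thresholded prediction matches the label."""
    correct = sum(
        (predict_proba(featurize(t, vocab), weights, bias) >= 0.5) == bool(y)
        for t, y in zip(texts, labels)
    )
    return correct / len(labels)

# Invented placeholder descriptions (1 = fermentative, 0 = not fermentative).
train_texts = [
    "Ferments glucose; produces acetate and ethanol.",
    "Fermentative metabolism with lactate as the main product.",
    "Strictly aerobic with respiratory metabolism; oxygen is the terminal electron acceptor.",
    "Obligate aerobe; chemoorganotrophic, oxidizes acetate via respiration.",
]
train_labels = [1, 1, 0, 0]

vocab, weights, bias = train(train_texts, train_labels)
train_acc = accuracy(train_texts, train_labels, vocab, weights, bias)
p_new = predict_proba(featurize("Ferments glucose to lactate.", vocab), weights, bias)
print(f"training accuracy: {train_acc:.3f}")
print(f"P(fermentative) for a new description: {p_new:.3f}")
```

Accuracy on the training texts says little about generalization; a figure comparable to the reported 97.3% would have to be measured on descriptions held out from training.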

Year:  2021        PMID: 33651810      PMCID: PMC7954334          DOI: 10.1371/journal.pcbi.1008757

Source DB:  PubMed          Journal:  PLoS Comput Biol        ISSN: 1553-734X            Impact factor:   4.475


