Yueyu Jiang, Metin Balaban, Qiyun Zhu, Siavash Mirarab.
Abstract
Placing new sequences onto reference phylogenies is increasingly used for analyzing environmental samples, especially microbiomes. Existing placement methods assume that query sequences have evolved under specific models directly on the reference phylogeny. For example, they assume single-gene data (e.g., 16S rRNA amplicons) have evolved under the GTR model on a gene tree. Placement, however, often has a more ambitious goal: extending a (genome-wide) species tree given data from individual genes without knowing the evolutionary model. Addressing this challenging problem requires new directions. Here, we introduce Deep-learning Enabled Phylogenetic Placement (DEPP), an algorithm that learns to extend species trees using single genes without pre-specified models. In simulations and on real data, we show that DEPP can match the accuracy of model-based methods without any prior knowledge of the model. We also show that DEPP can update the multi-locus microbial tree-of-life with single genes with high accuracy. We further demonstrate that DEPP can combine 16S and metagenomic data onto a single tree, enabling community structure analyses that take advantage of both sources of data.
Keywords: Deep learning; Gene tree discordance; Metagenomics; Microbiome analyses; Neural networks; Phylogenetic placement
Year: 2022 PMID: 35485976 DOI: 10.1093/sysbio/syac031
Source DB: PubMed Journal: Syst Biol ISSN: 1063-5157 Impact factor: 9.160