Dana L Carper1, Travis J Lawrence1, Alyssa A Carrell1,2, Dale A Pelletier1, David J Weston1.
Abstract
BACKGROUND: Microbiomes are extremely important for their host organisms, providing many vital functions and extending their hosts' phenotypes. Natural studies of host-associated microbiomes can be difficult to interpret due to the high complexity of microbial communities, which hinders our ability to track and identify individual members along with the many factors that structure or perturb those communities. For this reason, researchers have turned to synthetic or constructed communities in which the identities of all members are known. However, due to the lack of tracking methods and the difficulty of creating a more diverse and identifiable community that can be distinguished through next-generation sequencing, most such in vivo studies have used only a few strains.
Keywords: 16S rRNA; Constructed community; In vivo experimentation; Microbiome; Synthetic community; Taxonomic profiling
Year: 2020 PMID: 32149021 PMCID: PMC7049465 DOI: 10.7717/peerj.8534
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Demonstration of custom nucleotide Hamming distance.
Demonstration of Python Hamming distance and custom nucleotide Hamming distance, which takes into account nucleotide ambiguities.
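The contrast described in this caption can be sketched in Python as follows. This is a minimal illustration, not the tool's own implementation: the function names `plain_hamming` and `nucleotide_hamming` are hypothetical, the ambiguity table is the standard IUPAC nucleotide code, and the treatment of gaps (a gap matches only another gap) is an assumption made for this example.

```python
# Standard IUPAC nucleotide codes mapped to the bases each can represent.
IUPAC = {
    "A": {"A"}, "C": {"C"}, "G": {"G"}, "T": {"T"}, "U": {"T"},
    "R": {"A", "G"}, "Y": {"C", "T"}, "S": {"C", "G"}, "W": {"A", "T"},
    "K": {"G", "T"}, "M": {"A", "C"},
    "B": {"C", "G", "T"}, "D": {"A", "G", "T"},
    "H": {"A", "C", "T"}, "V": {"A", "C", "G"},
    "N": {"A", "C", "G", "T"},
}


def plain_hamming(a, b):
    """Count positions where the characters differ (no ambiguity handling)."""
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    return sum(x != y for x, y in zip(a, b))


def nucleotide_hamming(a, b):
    """Count positions where the two symbols share no possible base.

    Assumption for this sketch: any symbol outside the IUPAC table
    (for example the alignment gap '-') matches only itself.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be the same length")
    distance = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == y:
            continue  # identical symbols, including gap versus gap
        bases_x = IUPAC.get(x)
        bases_y = IUPAC.get(y)
        if bases_x is None or bases_y is None or bases_x.isdisjoint(bases_y):
            distance += 1
    return distance
```

For example, `ACGA` and `ACGR` differ at one position under the plain comparison, but because `R` stands for A or G, the ambiguity-aware version treats them as compatible.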
Figure 2. Workflow schematic of the loop that adds new members to the community, starting with the pairwise distance dictionary.
Inset: schematic of adding members with fewest connections at a specified DNA distance. Circles represent individuals, and lines indicate that the connected individuals are at a sequence distance of 3. Green indicates user input of file.
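The caption does not spell out the selection rule in full, so the following is only a generic greedy heuristic in the same spirit, not the tool's algorithm. It assumes that a "connection" is a pair of sequences closer than the required minimum distance (so the two cannot coexist in the community), and that candidates with the fewest such connections are tried first. The function name `build_community` and the tuple-keyed distance dictionary are hypothetical.

```python
import random


def build_community(distances, min_distance, seed=None):
    """Greedily pick members so every chosen pair is >= min_distance apart.

    distances: dict mapping (id_a, id_b) tuples to an integer distance.
    Assumption for this sketch: a pair below min_distance is a conflict.
    """
    rng = random.Random(seed)
    members = sorted({m for pair in distances for m in pair})

    # Record, for each member, which other members are too close to it.
    conflicts = {m: set() for m in members}
    for (a, b), d in distances.items():
        if d < min_distance:
            conflicts[a].add(b)
            conflicts[b].add(a)

    # Shuffle first so ties are broken by the seed, then order by
    # number of conflicts (the sort is stable, so tie order is kept).
    rng.shuffle(members)
    members.sort(key=lambda m: len(conflicts[m]))

    community = []
    chosen = set()
    for m in members:
        if conflicts[m].isdisjoint(chosen):
            community.append(m)
            chosen.add(m)
    return community
```

Trying the least-connected candidates first tends to keep more members eligible later in the loop, which is the usual motivation for this ordering in greedy independent-set heuristics.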
Subsampled bacterial class proportions.
Bacterial class proportions used to subsample the community generated from the Ribosomal Database Project database and the actualized proportions of the resultant community.
| Bacterial class | Input proportions | Output proportions |
|---|---|---|
| Actinobacteria | 0.0885 | 0.0906 |
| Alphaproteobacteria | 0.1857 | 0.1875 |
| Anaerolineae | 0.004 | 0.0013 |
| Aquificae | 0.0003 | 0.0013 |
| Bacteroidia | 0.1 | 0.0982 |
| Betaproteobacteria | 0.1286 | 0.1301 |
| Chitinivibrionia | 0.004 | 0.0013 |
| Chloroflexia | 0.005 | 0.0051 |
| Deferribacteres | 0.0003 | 0.0013 |
| Deinococci | 0.0003 | 0.0026 |
| Deltaproteobacteria | 0.0418 | 0.0434 |
| Fibrobacteria | 0.0004 | 0.0026 |
| Fusobacteriia | 0.0003 | 0.0026 |
| Gammaproteobacteria | 0.4112 | 0.4133 |
| Gemmatimonadetes | 0.0073 | 0.0026 |
| Ktedonobacteria | 0.0097 | 0.0013 |
| Nitrospira | 0.0036 | 0.0051 |
| Planctomycetia | 0.009 | 0.0102 |
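To illustrate how a community might be subsampled toward target group proportions and then compared with what was actually obtained, here is a minimal Python sketch. It is an assumption-laden illustration rather than the tool's method: the function names, the rounding of each quota, and the rule that a quota is capped by the number of available members are all choices made for this example. Caps of this kind are one way realized proportions can drift from the requested ones.

```python
import random
from collections import defaultdict


def subsample_by_group(member_groups, proportions, total, seed=None):
    """Pick about `total` members, matching `proportions` per group.

    member_groups: dict mapping member id -> group label (e.g. class).
    proportions:   dict mapping group label -> target fraction.
    Assumption: each quota is round(fraction * total), capped by supply.
    """
    rng = random.Random(seed)
    by_group = defaultdict(list)
    for member, group in sorted(member_groups.items()):
        by_group[group].append(member)

    picked = []
    for group in sorted(proportions):
        quota = round(proportions[group] * total)
        available = by_group.get(group, [])
        picked.extend(rng.sample(available, min(quota, len(available))))
    return picked


def realized_proportions(picked, member_groups):
    """Return the fraction of picked members that falls in each group."""
    counts = defaultdict(int)
    for member in picked:
        counts[member_groups[member]] += 1
    n = len(picked)
    return {g: c / n for g, c in counts.items()} if n else {}
```

Comparing `realized_proportions` against the requested fractions gives the same kind of input-versus-output view as the table above.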
Figure 3. Benchmarking of distance database and create module.
Benchmarking of custom nucleotide Hamming distance function for DNA and the create module at various numbers of 16S rRNA sequences subsampled from the Ribosomal Database Project dataset.
Figure 4. Recovery and taxonomic assignment of sequences from Illumina simulator.
(A) The percent of sequences from the input community that were recovered from the Illumina simulator with the dereplication and the dada2 method. (B) The percent of sequences that had a taxonomic assignment or were not assigned.
| disco create --i-alignment RDP_aligned_sequences.fasta --p-editdistance 3 --p-seed 10 --i-metadata RDP_Metadata_Taxonomy.txt --o-community-list RDP_Community_ED3_seed10.txt |
| disco subsample --i-input-community RDP_Community_ED3_seed10.txt --p-seed 10 --p-group-by Class --p-proportion RDP_Class_Proportions_file.txt |