Martin Kolisko, Olga Flegontova, Anna Karnkowska, Gordon Lax, Julia M Maritz, Tomáš Pánek, Petr Táborský, Jane M Carlton, Ivan Čepička, Aleš Horák, Julius Lukeš, Alastair G B Simpson, Vera Tai.
Abstract
The small subunit ribosomal RNA (SSU rRNA) gene is a widely used molecular marker to study the diversity of life. Sequencing of SSU rRNA gene amplicons has become a standard approach for the investigation of the ecology and diversity of microbes. However, a well-curated database is necessary for correct classification of these data. While available for many groups of Bacteria and Archaea, such reference databases are absent for most eukaryotes. The primary goal of the EukRef project (eukref.org) is to close this gap and generate well-curated reference databases for major groups of eukaryotes, especially protists. Here we present a set of EukRef-curated databases for the excavate protists, a large assemblage that includes numerous taxa with divergent SSU rRNA gene sequences, which are prone to misclassification. We identified 6121 sequences, 625 of which were obtained from cultures, 3053 from cell isolations or enrichments and 2419 from environmental samples. We have corrected the classification for the majority of these curated sequences. The resulting publicly available databases will provide phylogenetically based standards for the improved identification of excavates in ecological and microbiome studies, as well as resources to classify new discoveries in excavate diversity.
Year: 2020 PMID: 33216898 PMCID: PMC7678783 DOI: 10.1093/database/baaa080
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Phylogenetic and compositional overview of the Excavata EukRef databases. (A) Maximum likelihood phylogenetic tree of SSU sequences in the Excavata database. Monophyletic clusters corresponding to deep-level taxa within the Excavata were collapsed at their common ancestral nodes when strongly supported by bootstrapping. The tree was constructed using a GTRCAT nucleotide substitution model. Bootstrap values are shown at nodes with at least 70% support. (B–E) Pie charts showing the proportion of sequences for metadata categories for each Excavata database. (B) Source of the organism from which the SSU sequence was derived. ‘Environmental’ indicates sequences obtained from the DNA of bulk environmental samples. ‘Culture’ indicates sequences obtained from organisms grown in culture and in culture collections. ‘Isolate’ indicates sequences obtained from organisms isolated from the environment, either as single cells or in enrichments, but not from established cultures. (C) Biotic relationship of the organism. ‘Symbiont’ consists of organisms with mutualist or commensal relationships. (D) Environment from which the organism was sampled. (E) Geographical location of the sampled environment.
List of Excavata databases
| Higher taxonomy | Database | # of SSU rRNA sequences | # of taxonomic ranks |
|---|---|---|---|
| | Fornicata | 103 | 8 |
| | Parabasalia | 715 | 9 |
| | Preaxostyla | 102 | 8 |
| | Jakobida | 114 | 8 |
| | Heterolobosea | 448 | 8 |
| | Euglenida | 856 | 10 |
| | Diplonemea | 525 | 9 |
| | Kinetoplastea | 3258 | 11 |
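
As a quick consistency check, the per-database counts in the table can be summed and compared with the total of 6121 sequences reported in the abstract. The short Python sketch below does this, and also sums the three source categories given in the abstract. The variable names are illustrative and not part of the EukRef resources. The abstract does not say why the source categories sum to fewer sequences than the total, so the sketch only reports the difference.

```python
# Sequence counts per database, copied from the table above.
db_counts = {
    "Fornicata": 103,
    "Parabasalia": 715,
    "Preaxostyla": 102,
    "Jakobida": 114,
    "Heterolobosea": 448,
    "Euglenida": 856,
    "Diplonemea": 525,
    "Kinetoplastea": 3258,
}

# Source categories as stated in the abstract (key names are hypothetical labels).
source_counts = {
    "culture": 625,
    "isolate_or_enrichment": 3053,
    "environmental": 2419,
}

REPORTED_TOTAL = 6121  # total number of sequences stated in the abstract

table_total = sum(db_counts.values())
source_total = sum(source_counts.values())
# Sequences not covered by the three source categories listed in the abstract.
unassigned = REPORTED_TOTAL - source_total

print(f"Sum over databases:         {table_total}")
print(f"Sum over source categories: {source_total}")
print(f"Not covered by a category:  {unassigned}")
```

The table total matches the reported 6121 sequences, while the three source categories account for 6097 of them.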