Elisa Banchi, Claudio G Ametrano, Samuele Greco, David Stanković, Lucia Muggia, Alberto Pallavicini.
Abstract
DNA metabarcoding combines DNA barcoding with high-throughput sequencing to identify different taxa within environmental communities. The ITS has already been proposed and widely used as a universal barcode marker for plants, but a comprehensive, updated and accurate reference dataset of plant ITS sequences has not been available so far. Here, we constructed reference datasets of Viridiplantae ITS1, ITS2 and entire ITS sequences, including both Chlorophyta and Streptophyta. The sequences were retrieved from NCBI, and the ITS region was extracted. The sequences underwent an identity check to remove misidentified records and were clustered at 99% identity to reduce redundancy and computational effort. For this step, we developed a script called 'better clustering for QIIME' (bc4q) to ensure that the representative sequences are chosen according to the composition of the cluster at a different taxonomic level. The three datasets obtained with the bc4q script are PLANiTS1 (100 224 sequences), PLANiTS2 (96 771 sequences) and PLANiTS (97 550 sequences); all are pre-formatted for QIIME, as this is the most widely used bioinformatic pipeline for metabarcoding analysis. As curated and updated reference databases, PLANiTS1, PLANiTS2 and PLANiTS are proposed as a reliable, pivotal first step towards a general standardization of plant DNA metabarcoding studies. The bc4q script is presented as a new tool useful for any research that involves sequence clustering.
Database URL: https://github.com/apallavicini/bc4q; https://github.com/apallavicini/PLANiTS
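The idea of choosing a cluster representative from the taxonomic composition of the cluster can be illustrated with a minimal sketch. This is not the bc4q implementation: it assumes clusters come from a cd-hit `.clstr` file, that a hypothetical `taxonomy` mapping gives each sequence ID a tuple of taxon names, and that "composition" means the most frequent taxon at one chosen rank. The function names `parse_clstr` and `pick_representative` are illustrative only.

```python
from collections import Counter


def parse_clstr(text):
    """Parse cd-hit .clstr text into a list of clusters (lists of sequence IDs)."""
    clusters = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">Cluster"):
            clusters.append([])
        else:
            # Member line, e.g. "0\t512nt, >seqA... *"
            seq_id = line.split(">", 1)[1].split("...", 1)[0]
            clusters[-1].append(seq_id)
    return clusters


def pick_representative(members, taxonomy, rank_index):
    """Return the first member whose taxon at rank_index is the most common one.

    Assumption: 'composition of the cluster' is read here as a simple
    majority vote at a single rank; the real script may differ.
    """
    taxa = [taxonomy[m][rank_index] for m in members]
    majority, _ = Counter(taxa).most_common(1)[0]
    for m in members:
        if taxonomy[m][rank_index] == majority:
            return m


# Hypothetical example data: cd-hit marks seqA with "*", but two of the
# three members of cluster 0 belong to genus "Ulva".
clstr_text = """>Cluster 0
0\t512nt, >seqA... *
1\t500nt, >seqB... at +/99.20%
2\t498nt, >seqC... at +/99.00%
>Cluster 1
0\t430nt, >seqD... *
"""
taxonomy = {
    "seqA": ("Ulvaceae", "Umbraulva"),
    "seqB": ("Ulvaceae", "Ulva"),
    "seqC": ("Ulvaceae", "Ulva"),
    "seqD": ("Rosaceae", "Rosa"),
}

clusters = parse_clstr(clstr_text)
reps = [pick_representative(c, taxonomy, rank_index=1) for c in clusters]
print(reps)
```

In this toy input the representative of the first cluster changes from the sequence cd-hit picked (`seqA`) to one that carries the majority genus, which is the kind of correction the abstract describes.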
Year: 2020 PMID: 32016319 PMCID: PMC6997939 DOI: 10.1093/database/baz155
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Schematic workflow of the bc4q script for the evaluation of cd-hit clusters. The identity level is set at 90%.
Taxa number in the PLANiTS reference datasets
| Taxonomic rank | PLANiTS1 | PLANiTS2 | PLANiTS |
|---|---|---|---|
| Phylum | 2 | 2 | 2 |
| Class | 22 | 22 | 22 |
| Order | 133 | 134 | 130 |
| Family | 538 | 567 | 527 |
| Genus | 9927 | 10 055 | 9737 |
| Species | 57 324 | 58 893 | 55 690 |
| Total seq. | 100 224 | 96 771 | 97 550 |
The total number of sequences (Total seq.) is also reported.
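Per-rank counts like those in the table above can be derived from a taxonomy mapping file with a short script. The sketch below is only an illustration: it assumes a tab-separated file with a sequence ID and a semicolon-separated lineage holding exactly six ranks (phylum to species), which may not match the actual PLANiTS file layout, and the function name `count_taxa` is hypothetical.

```python
RANKS = ("phylum", "class", "order", "family", "genus", "species")


def count_taxa(taxonomy_lines, ranks=RANKS):
    """Count distinct taxa per rank and the total number of sequences.

    Assumption: each line is "<seq_id>\t<rank1>;<rank2>;...", with ranks
    in the order given by `ranks`.
    """
    seen = {rank: set() for rank in ranks}
    total = 0
    for line in taxonomy_lines:
        line = line.strip()
        if not line:
            continue
        _seq_id, lineage = line.split("\t", 1)
        total += 1
        fields = [f.strip() for f in lineage.split(";")]
        for i, rank in enumerate(ranks):
            if i < len(fields) and fields[i]:
                # Store the lineage prefix so that identical names under
                # different parents are counted as separate taxa.
                seen[rank].add(tuple(fields[: i + 1]))
    counts = {rank: len(names) for rank, names in seen.items()}
    counts["total_seq"] = total
    return counts


# Hypothetical three-record example.
example = [
    "s1\tStreptophyta;Magnoliopsida;Rosales;Rosaceae;Rosa;Rosa canina",
    "s2\tStreptophyta;Magnoliopsida;Rosales;Rosaceae;Rosa;Rosa gallica",
    "s3\tChlorophyta;Ulvophyceae;Ulvales;Ulvaceae;Ulva;Ulva lactuca",
]
counts = count_taxa(example)
print(counts)
```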
List of plant orders
| Chlorophyta |
| Chlorodendrophyceae |
| |
| Chlorophyceae |
| |
| |
| |
| |
| |
| Phaeophilales |
| |
| |
| |
| Chloropicophyceae |
| |
| Mamiellophyceae |
| |
| |
| |
| Palmophyllophyceae |
| Palmophyllales |
| Prasinococcales* |
| Pedinophyceae |
| Marsupiomonadales* |
| |
| Scourfieldiales |
| Pyramimonadophyceae |
| |
| Trebouxiophyceae |
| |
| Ctenocladales |
| |
| |
| |
| Ulvophyceae |
| |
| Chlorocystidales |
| |
| Dasycladales |
| |
| Oltmansiellopsidales |
| Scotinosphaerales |
| |
| |
| |
| Streptophyta |
| Anthocerotopsida |
| Anthocerotales |
| |
| Notothyladales |
| Phymatocerotales |
| Bryopsida |
| Archidiales |
| |
| |
| |
| Buxbaumiales |
| |
| Diphysciales |
| |
| |
| |
| |
| |
| |
| |
| Hypnodendrales** |
| |
| |
| |
| |
| |
| |
| |
| |
| Charophyceae |
| |
| Cycadopsida |
| |
| Ginkgoopsida |
| |
| Gnetopsida |
| |
| |
| Welwitschiales |
| Jungermanniopsida |
| |
| |
| |
| |
| |
| Pleuroziales |
| |
| Ptilidiales** |
| Klebsormidiophyceae |
| |
| Liliopsida |
| |
| |
| |
| |
| |
| |
| |
| |
| Petrosaviales |
| |
| |
| Lycopodiopsida |
| |
| |
| |
| Magnoliopsida |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| Marchantiopsida |
| Blasiales |
| Lunulariales |
| |
| Neohodgsoniales |
| Sphaerocarpales |
| Pinopsida |
| |
| |
| |
| Polypodiopsida |
| |
| |
| Gleicheniales |
| Hymenophyllales |
| Marattiales |
| Ophioglossales |
| Osmundales |
| |
| |
| |
| |
| Polytrichopsida |
| |
| Sphagnopsida |
| |
| Takakiopsida |
| |
| Tetraphidopsida |
| Tetraphidales** |
| Zygnemophyceae |
| |
| |
| Class not assigned |
| |
| |
| Berberidopsidales |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| Escalloniales |
| |
| |
| Huerteales** |
| |
| |
| |
| |
| Paracryphiales |
| Picramniales |
| |
| Vahliales |
| |
| |
Orders in bold are the ones present in the PLANiTS reference dataset. *Orders present only in PLANiTS1; **orders present only in PLANiTS2. For each order, the total number of sequences identified in the three databases (PLANiTS, PLANiTS1 and PLANiTS2) is reported, with the corresponding number of sequences identified at the species level given in parentheses.