Chetan C Gaonkar, Roberta Piredda, Carmen Minucci, David G Mann, Marina Montresor, Diana Sarno, Wiebe H C F Kooistra.
Abstract
The species-rich diatom family Chaetocerotaceae is common in the coastal marine phytoplankton worldwide, where it is responsible for a substantial part of the primary production. Despite its relevance for the global cycling of carbon and silica, many species are still described only morphologically, and numerous specimens do not fit any described taxa. Nowadays, studies to assess plankton biodiversity deploy high-throughput sequencing metabarcoding of the 18S rDNA V4 region, but to translate the gathered metabarcodes into biologically meaningful taxa, there is a need for reference barcodes. However, 18S reference barcodes for this important family are still relatively scarce. We provide 18S rDNA and partial 28S rDNA reference sequences of 443 morphologically characterized chaetocerotacean strains. We gathered 164 of the 216 18S sequences and 244 of the 413 28S sequences of strains from the Gulf of Naples, Atlantic France, and Chile. Inferred phylogenies showed 84 terminal taxa in seven principal clades. Two of these clades included terminal taxa whose rDNA sequences contained spliceosomal and Group IC1 introns. Regarding the commonly used metabarcode markers in planktonic diversity studies, all terminal taxa can be discriminated with the 18S V4 hypervariable region; its primers fit their targets in all but two species, and the V4-tree topology is similar to that of the 18S. Hence V4-metabarcodes of unknown Chaetocerotaceae are assignable to the family. Regarding the V9 hypervariable region, most terminal taxa can be discriminated, but several contain introns in their primer targets. Moreover, poor phylogenetic resolution of the V9 region affects placement of metabarcodes of putative but unknown chaetocerotacean taxa and hence introduces uncertainty in taxonomic assignment, even of higher taxa.
Year: 2018 PMID: 30586452 PMCID: PMC6306197 DOI: 10.1371/journal.pone.0208929
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Position of group-I introns (GI) and spliceosomal introns (SP) in 18S and 28S sequences of Chaetocerotaceae.
| Location | Family | Position | | | | | | | | | | | | | | | | | | | | Primers affected | Region affected |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | SP | 385–386 | 132 | 106+ | ? | ||||||||||||||||||
| 2 | SP | 442–443 | 90+ | 104 | 110 | 107 | 96 | 125 | ? | ||||||||||||||
| 3 | SP | 548–549 | ? | 108 | ? | ||||||||||||||||||
| 4 | GI | 549–550 | 400 | 221+ | 399 | ? | 360+ | 453 | 547 | 467 | 457 | 425 | 181+ | ? | |||||||||
| 5 | SP | 888–889 | 130 | 124 | 110 | Ch-690F+R | V4 | ||||||||||||||||
| 6 | SP | 889–890 | 121 | Ch-690F+R | V4 | ||||||||||||||||||
| 7 | SP | 891–892 | 115 | 57+ | Ch-690F+R | V4 | |||||||||||||||||
| 8 | SP | 981–982 | ? | 51+ | 123 | ? | |||||||||||||||||
| 9 | SP | 989–990 | 162 | ? | 100 | 107 | 160–199 | ? | 123 | ||||||||||||||
| 10 | SP | 1011–1012 | 92+ | 115 | ? | 96 | 109 | 161 | 124 | ? | Ch-1147F+R | ||||||||||||
| 11 | SP | 1147–1148 | ? | 112 | ? | 101 | 94 | 104 | 133 | 141 | ? | ||||||||||||
| 12 | GI | 1151–1152 | ? | ? | 462–465 | 505–513 | ? | ||||||||||||||||
| 13 | SP | 1195–1196 | ? | 152 | 71+ | 110 | 100 | 104 | 105 | 121 | 141 | ? | |||||||||||
| 14 | SP | 1257–1258 | ? | ? | ? | Ch-1055F+R | |||||||||||||||||
| 15 | SP | 1274–1275 | ? | 80 | 106 | ? | 102 | 86–96 | 134 | ? | |||||||||||||
| 16 | SP | 1414–1415 | ? | 117 | 106+ | 99 | 108 | 35+ | 153 | ? | |||||||||||||
| 17 | SP | 1612–1613 | ? | ? | 98 | ? | Ch-1400F+R | ||||||||||||||||
| 18 | SP | 1615–1616 | ? | ? | 115 | 116 | 118 | 117 | 96–97 | 15 | ? | 154 | 98–154 | Ch-1400F+R & V9f | V9 | ||||||||
| 19 | SP | 1760–1761 | ? | ? | ? | ? | 109+ | ? | ? | ? | ? | 91+ | 91+ | ? | ? | 94+ | 92+ | ? | 73+ | ? | ? | Ch-V9r | V9 |
| #Inserts | |||||||||||||||||||||||
| added Σ bp | 400+ | 428+ | 399 | 57+ | 1073+ | 505 | 316+ | 1613+ | 1221+ | 1033+ | 1829+ | 932 | 970 | 534+ | 813+ | 196+ | 154 | 496+ | |||||
| 1 | SP | 758–759 | 112 | 108 | 112–116 | 115 | 205+ | 94–105 | |||||||||||||||
“Location” refers to the order of appearance in the 18S and 28S sequence alignment. Family: SP, spliceosomal intron; GI, Group IC1 intron. “Position” refers to the positions of the 18S core nucleotides flanking the 5′ and 3′ ends of the insert in the 18S rRNA secondary structure model of C. tenuissimus strain CHMS01 (from http://www.rna.icmb.utexas.edu, there listed as Chaetoceros sp.; S1 Fig). Figures in the body of the table under the terminal taxa indicate the maximum length of the inserts of the strains belonging to that terminal taxon in that location (123, exact length; 123–134, length range; 123+, length of the sequenced part, the actual insert being longer; ‘?’, presence of the insert unknown because sequencing of the region failed; 392+, long SP, possibly an SP inside another SP). Primers and marker regions (V4 and V9) affected by the insert are indicated to the right. ‘#Inserts’ signifies the number of inserts in the 18S of the terminal taxon; ‘added Σ bp’ indicates the extra length, in bp, added by the inserts to the 18S core sequence.
† Only Chaetoceros sp. Clade Na12A3 strain Na43B1
‡ Only C. diversus 2 strain Na56B3
§ Only C. decipiens strain Na12B4
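The cell notation described in the table note (exact length, length range, "+" for a partially sequenced insert, "?" for unknown) can be read mechanically. The following is a minimal Python sketch of one way to do so, not the authors' procedure. The names `InsertLength`, `parse_cell`, and `added_bp` are hypothetical, and the example values are illustrative rather than taken from a specific taxon column. It assumes that ranges contribute their upper bound to the total and that "?" cells are skipped; neither rule is stated in the source.

```python
import re
from typing import NamedTuple, Optional


class InsertLength(NamedTuple):
    low: int       # shortest reported length, in bp
    high: int      # longest reported length, in bp
    partial: bool  # True if only part of the insert was sequenced ("+")


def parse_cell(cell: str) -> Optional[InsertLength]:
    """Parse one table cell; return None for '?' or an empty cell."""
    text = cell.strip().replace("-", "–")  # accept hyphen or en dash in ranges
    if text in ("", "?"):
        return None
    match = re.fullmatch(r"(\d+)(?:–(\d+))?(\+?)", text)
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    low = int(match.group(1))
    high = int(match.group(2)) if match.group(2) else low
    return InsertLength(low, high, match.group(3) == "+")


def added_bp(cells) -> str:
    """Sum the insert lengths of one taxon, in the style of 'added Σ bp'.

    Assumptions (not from the source): ranges count with their upper
    bound, and unknown ('?') cells are ignored.
    """
    parsed = [p for p in map(parse_cell, cells) if p is not None]
    total = sum(p.high for p in parsed)
    is_lower_bound = any(p.partial for p in parsed)
    return f"{total}{'+' if is_lower_bound else ''}"


# Illustrative column: one unknown, one exact, one partial, one empty cell.
print(added_bp(["?", "115", "57+", ""]))
```

A total carrying a trailing "+" should be read as a lower bound, since at least one contributing insert was only partly sequenced.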
Fig 1. Maximum likelihood tree inferred with RAxML from 18S sequences of representative strains in terminal taxa in S2 Fig.
Figures on the left side of clades are bootstrap values (1000 replicates); values ≥90% have been marked “*”. Major clades are indicated with Roman numerals and subclades with “a-d.” Strain codes: Chaetoceros spp represent species requiring taxonomic description; the first code refers to the representative strain of the Clade as a proxy for the species name, the second code refers to the actual strain.
Fig 2. Maximum likelihood tree inferred with RAxML from partial 28S sequences of representative strains in terminal taxa in S4 Fig.
Figures on the left side of clades are bootstrap values (1000 replicates); values ≥90% have been marked “*”. Major clades are indicated with Roman numerals and subclades with “a-d.” For explanations, see text. Strain codes: Chaetoceros spp represent species requiring taxonomic description; the first code refers to the representative strain of the Clade, the second code to the actual strain.