Timothy G Stephens, Raúl A González-Pech, Yuanyuan Cheng, Amin R Mohamed, David W Burt, Debashish Bhattacharya, Mark A Ragan, Cheong Xin Chan.
Abstract
BACKGROUND: Dinoflagellates are taxonomically diverse and ecologically important phytoplankton that are ubiquitous in marine and freshwater environments. Mostly photosynthetic, dinoflagellates provide the basis of aquatic primary production; most taxa are free-living, while some form symbiotic or parasitic associations with other organisms. However, knowledge of the molecular mechanisms that underpin the adaptation of these organisms to diverse ecological niches is limited by the scarcity of genomic data, partly because of their large genomes, estimated at up to 250 Gbp. Currently available dinoflagellate genome data are restricted to Symbiodiniaceae (particularly symbionts of reef-building corals) and parasitic lineages, taxa that have smaller genome sizes, while genomic information from more diverse free-living species is still lacking.
Keywords: Cold adaptation; Dinoflagellates; Genome evolution; Genomics; Polarella glacialis
Year: 2020 PMID: 32448240 PMCID: PMC7245778 DOI: 10.1186/s12915-020-00782-8
Source DB: PubMed Journal: BMC Biol ISSN: 1741-7007 Impact factor: 7.431
Assembled genomes of P. glacialis compared to representative publicly available dinoflagellate genomes. A more comprehensive summary including all other available genomes is shown in Additional file 3: Supplementary Table 1. Estimated diploid genome size for P. glacialis isolates is shown in brackets
| | P. glacialis CCMP1383 | P. glacialis CCMP2088 | Symbiodiniaceae | Symbiodiniaceae | Symbiodiniaceae | Symbiodiniaceae | Parasitic | Parasitic |
|---|---|---|---|---|---|---|---|---|
| Reference | This study | This study | Aranda et al. | Shoguchi et al. | Liu et al. | Liu et al. | John et al. | Gornik et al. |
| %G+C | 45.91 | 46.15 | 50.51 | 43.46 | 44.83 | 45.72 | 55.92 | 47.31 |
| Total number of scaffolds | 33,494 | 37,768 | 9695 | 21,899 | 41,289 | 16,959 | 2351 | 869,500 |
| Total assembled bases (Gbp) | 2.98 | 2.76 | 0.81 | 0.61 | 1.03 | 1.05 | 0.09 | 4.77 |
| N50 length of scaffolds (bp) | 170,304 | 129,205 | 573,512 | 125,226 | 98,034 | 268,823 | 83,970 | 17,235 |
| Maximum scaffold length (bp) | 2,170,995 | 1,500,384 | 3,144,590 | 810,747 | 8,337,000 | 5,159,000 | 536,776 | 186,000 |
| Estimated genome size (Gbp) | 1.48 (3.02) | 1.30 (2.65) | 1.10 | 1.5 | 1.19 | 1.07 | 0.12 | 4.8 |
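The "N50 length of scaffolds" row uses the standard assembly-contiguity statistic: the scaffold length at which scaffolds of that length or longer together cover at least half of the assembled bases. As a minimal sketch of that definition (the function name `n50` and the toy lengths are illustrative assumptions, not taken from the study's pipeline):

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Scaffolds are sorted from longest to shortest; the N50 is the length
    of the scaffold at which the running total first reaches half of the
    total assembled bases.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths given")


# Toy example (hypothetical lengths): total is 30, half is 15,
# and the running sum 10 + 8 = 18 first reaches 15 at the scaffold of length 8.
toy_lengths = [2, 10, 6, 8, 4]
toy_n50 = n50(toy_lengths)
```

A higher N50 for a similar total assembly size indicates a more contiguous assembly, which is why the table reports it alongside scaffold count and maximum scaffold length.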
Fig. 1 Genomes of Polarella glacialis and repeat content. a GenomeScope 21-mer profile for CCMP1383. b Identification of conserved core eukaryote genes (using CEGMA) in the assembled P. glacialis genomes of CCMP1383 and CCMP2088 compared to the assembled genomes of Cladocopium goreaui and Fugacium kawagutii [12]. c Interspersed repeat landscape and proportion of distinct repeat classes in the assembled genome of CCMP1383, studied using sequence divergence under the Kimura evolutionary model. d Percentage of 3-mers in the assembled genome and the sequence data for CCMP1383 for the ten most abundant 3-mers
Fig. 2 DinoSL-type full-length transcripts in P. glacialis. a Percentage of DinoSL-type transcripts of P. glacialis based on the identified start position along the DinoSL sequence, shown for positions 1 through 12. b Structure and number of DinoSL and/or relic DinoSL containing IsoSeq transcripts from each isolate. c Distribution of distances (in bp) between DinoSL-type transcriptional units shown for transcriptomes of CCMP1383 and CCMP2088
Predicted gene models in P. glacialis compared to representative publicly available dinoflagellate genomes. A more comprehensive summary including gene models from all available dinoflagellate genomes is shown in Additional file 3: Supplementary Table 12. Genes from the Symbiodiniaceae taxa are based on the revised predictions from Chen et al. [39]
| | P. glacialis CCMP1383 | P. glacialis CCMP2088 | Symbiodiniaceae | Symbiodiniaceae | Symbiodiniaceae | Symbiodiniaceae | Parasitic |
|---|---|---|---|---|---|---|---|
| Genes | | | | | | | |
| Number of genes | 58,232 | 51,713 | 29,728 | 32,803 | 39,006 | 31,520 | 19,925 |
| Gene models supported by transcriptome (%) | 94.0 | 94.3 | 79.2 | 89.4 | 76.5 | 69.0 | 24.4 |
| G+C content of CDS (%) | 57.84 | 57.78 | 57.43 | 51.18 | 54.23 | 54.19 | 60.77 |
| Exons | | | | | | | |
| Number of exons per gene | 11.64 | 10.84 | 19.21 | 19.03 | 12.46 | 11.63 | 3.39 |
| Average length (bp) | 105.67 | 108.71 | 115.44 | 101.38 | 130.47 | 158.13 | 577.84 |
| Total length (Mb) | 71.60 | 60.94 | 65.92 | 63.30 | 63.42 | 57.96 | 39.08 |
| Introns | | | | | | | |
| Number of genes with introns (%) | 73.79 | 75.60 | 95.70 | 93.74 | 96.00 | 96.00 | 71.35 |
| Average length (bp) | 1408 | 1296 | 387.92 | 451.33 | 593.53 | 658.41 | 337.11 |
| Total length (Mb) | 837.95 | 636.20 | 210.00 | 267.00 | 265.35 | 220.58 | 16.08 |
| Intergenic regions | | | | | | | |
| Average length (bp) | 21,625 | 20,922 | 15,108 | 5983 | 9538 | 18,050 | 1525 |
Hematodinium sp. is not shown as no predicted genes were reported
Fig. 3 Comparison of predicted gene models between the two P. glacialis genomes. a The comparison of predicted proteins in CCMP1383 against those in CCMP2088 is shown, incorporating evidence from the corresponding transcriptome data. b Scenario of RNA editing that would disrupt the alignment of a transcript to the genome
Fig. 4 Intergenic regions and tandemly repeated genes. a Distribution of the sizes of intergenic regions (in bp; ≤ 30,000 bp) shown for the assembled P. glacialis genomes of CCMP1383 and CCMP2088. b Frequency of strand-orientation changes in ten-gene windows generated from the predicted genes from isolates of P. glacialis, Symbiodiniaceae, and the other alveolates Tetrahymena thermophila (ciliate) and Plasmodium falciparum 3D7 (apicomplexan). c The number of tandemly repeated and/or single-exon genes in CCMP1383 and CCMP2088, shown for genes encoding bacteriorhodopsin and peridinin chlorophyll a-binding proteins
Fig. 5 Evolutionary history of ice-binding domains in P. glacialis and dinoflagellates. Only a small part of the 1080-taxon maximum likelihood protein tree is shown. Support values, based on 2000 ultrafast bootstrap approximations, are shown at the internal nodes. Only values > 50% are shown. The unit of branch length is the number of substitutions per site
Fig. 6 Genome features of Polarella glacialis as a psychrophilic, free-living dinoflagellate. Summary of key genome features of P. glacialis, focusing on unidirectionality of coding genes, tandemly repeated genes, and single-exon genes
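The ten-gene-window statistic described for Fig. 4b can be illustrated with a short sketch. It assumes that genes on one scaffold are listed in positional order with their strand ("+" or "-"), that windows slide one gene at a time, and that a "change" is any adjacent pair of genes on opposite strands; the record does not state whether the study used sliding or non-overlapping windows, so the function name and these choices are assumptions for illustration only.

```python
from collections import Counter


def strand_changes_per_window(strands, window=10):
    """Count strand-orientation changes in each window of consecutive genes.

    strands: sequence of "+" / "-" for genes in positional order on one scaffold.
    Returns a Counter mapping number of changes (0 .. window - 1) to the
    number of windows showing that many changes.
    Assumption: windows slide by one gene; scaffolds with fewer genes than
    the window size contribute no windows.
    """
    counts = Counter()
    for start in range(len(strands) - window + 1):
        block = strands[start:start + window]
        # A change occurs wherever two neighbouring genes lie on opposite strands.
        changes = sum(1 for a, b in zip(block, block[1:]) if a != b)
        counts[changes] += 1
    return counts


# Hypothetical scaffold: ten genes on the forward strand followed by one on the reverse.
example = ["+"] * 10 + ["-"]
example_counts = strand_changes_per_window(example)
```

Under these assumptions, a genome with strongly unidirectional gene blocks would concentrate its windows at zero changes, whereas a genome with randomly oriented genes would show most windows with several changes.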