Mengjia Zhang, Nansheng Chen.
Abstract
The cosmopolitan Thalassionema species are often dominant components of the planktonic diatom flora and of sediment diatom assemblages in all but the polar regions, making important ecological contributions to primary productivity. Historical studies concentrated on their value as indicators of the marine environment, relying primarily on morphological features and essentially ignoring genomic information, which has hindered in-depth investigation of Thalassionema biodiversity. In this project, we constructed the complete chloroplast genomes (cpDNAs) of seven Thalassionema strains representing three different species. These are also the first cpDNAs constructed for any species in the order Thalassionematales, which includes 35 reported species and varieties. The Thalassionema cpDNAs showed typical quadripartite structures and ranged in size from 124,127 bp to 140,121 bp. Comparative analysis revealed that gene content is conserved both between and within species, apart from several gene losses and transfers. The cpDNAs also have expanded inverted repeat regions (IRs) and retain larger intergenic spacers than other diatom cpDNAs. In addition, substantial genome rearrangements were discovered not only among different Thalassionema species but also among strains of the same species, T. frauenfeldii, suggesting much higher diversity than previously reported. Besides confirming the phylogenetic position of Thalassionema species, this study estimated their emergence time at approximately 38 Mya. The availability of these cpDNAs not only improves understanding of Thalassionema species but also facilitates phylogenetic analysis of diatoms.
Keywords: Chloroplast genome; Comparative genomics; Divergence time; Thalassionema species
Year: 2022 PMID: 35477350 PMCID: PMC9044688 DOI: 10.1186/s12864-022-08532-6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 4.547
Fig. 1 Sampling sites of seven Thalassionema strains analyzed in this study
Fig. 2 Morphological and molecular identification of seven Thalassionema strains. (A-G) Micrographs of seven Thalassionema strains (broad girdle view, live material, DIC). (H) Phylogenetic tree based on maximum likelihood (ML) analysis of the 18S rDNA gene of Thalassionema strains. Thalassionema species were used as references (red) and S. acus was used as the outgroup taxon (blue)
Fig. 3 Gene maps of cpDNAs of seven Thalassionema strains. Genes shown on the inside of the map are transcribed in a clockwise direction, whereas those on the outside of the map are transcribed counterclockwise. The assignment of genes into different functional groups is indicated by different colors. The ring of bar graphs on the inner circle shows the GC content in dark gray
Chloroplast Genome Features of Thalassionema
| Species | | | | | | | | |
|---|---|---|---|---|---|---|---|---|
| Strains | CNS00831 | CNS00832 | CNS00838 | CNS00836 | CNS00837 | CNS00899 | CNS00894 | - |
| GenBank ID | OK574455 | OK637332 | OK637334 | OK574456 | OK637333 | OK637335 | OK574457 | JQ088178 |
| Total (%GC) | 124,165 (29.78%) | 124,127 (29.79%) | 124,131 (29.78%) | 140,121 (29.84%) | 140,120 (29.84%) | 139,091 (29.68%) | 131,583 (29.01%) | 116,251 (30.57%) |
| IRA | 9,470 | 9,451 | 9,453 | 13,282 | 13,282 | 15,881 | 9,052 | 6,796 |
| IRB | 9,470 | 9,451 | 9,453 | 13,282 | 13,282 | 15,880 | 9,050 | 6,795 |
| LSC | 62,496 | 62,496 | 62,496 | 69,833 | 69,832 | 66,603 | 67,639 | 61,723 |
| SSC | 42,729 | 42,729 | 42,729 | 43,724 | 43,724 | 40,727 | 45,842 | 40,937 |
| Total number of genes | 151 | 151 | 151 | 154 | 156 | 155 | 155 | 160 |
| PCGs | 120 | 120 | 120 | 121 | 121 | 121 | 121 | 127 |
| Total number of introns | 0 | 0 | 0 | 0 | 0 | 0 | 1 (in | 0 |
| ORFs | ||||||||
| tRNA genes | 27 | 27 | 27 | 27 | 27 | 27 | 27 | 27 |
| rRNA genes | ||||||||
| Other RNAs | ||||||||
| Coding sequence | 79.16% | 79.68% | 79.67% | 73.99% | 74.77% | 75.62% | 78.27% | 88.63% |
| Number of genes in IRs | 12 | 12 | 12 | 12 | 11 | 14 | 11 | 8 |
| Maximum | 986 | 986 | 986 | 2037 | 2037 | 1580 | 2174 | 324 |
| Minimum | 2 | 2 | 2 | 2 | 2 | 2 | 2 | 0 |
| Average | 163.06 | 157.94 | 157.96 | 224.62 | 215.24 | 206.39 | 179.27 | 78.91 |
Genes duplicated in the IR are only counted once
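A quadripartite chloroplast genome consists of a large single-copy region (LSC), a small single-copy region (SSC) and two inverted repeats (IRA, IRB), so the reported total length should equal the sum of the four regions. The following is a minimal sketch of that consistency check, using only the lengths transcribed from the table above. The dictionary layout and the helper names `quadripartite_total` and `ir_fraction` are illustrative assumptions and are not part of the study's pipeline.

```python
# Region lengths in bp, transcribed from the table above:
# accession/strain -> (LSC, SSC, IRA, IRB, reported total)
REGIONS = {
    "CNS00831": (62496, 42729, 9470, 9470, 124165),
    "CNS00832": (62496, 42729, 9451, 9451, 124127),
    "CNS00838": (62496, 42729, 9453, 9453, 124131),
    "CNS00836": (69833, 43724, 13282, 13282, 140121),
    "CNS00837": (69832, 43724, 13282, 13282, 140120),
    "CNS00899": (66603, 40727, 15881, 15880, 139091),
    "CNS00894": (67639, 45842, 9052, 9050, 131583),
    "JQ088178": (61723, 40937, 6796, 6795, 116251),
}


def quadripartite_total(lsc, ssc, ira, irb):
    """Total genome length implied by the four regions (hypothetical helper)."""
    return lsc + ssc + ira + irb


def ir_fraction(lsc, ssc, ira, irb):
    """Share of the genome occupied by both inverted repeats (hypothetical helper)."""
    return (ira + irb) / quadripartite_total(lsc, ssc, ira, irb)


if __name__ == "__main__":
    for name, (lsc, ssc, ira, irb, reported) in REGIONS.items():
        total = quadripartite_total(lsc, ssc, ira, irb)
        status = "ok" if total == reported else "MISMATCH"
        share = ir_fraction(lsc, ssc, ira, irb)
        print(f"{name}: sum={total} reported={reported} {status} IR share={share:.1%}")
```

The same loop also reports the share of each genome taken up by the two IRs, which is one simple way to see the IR expansion described in the abstract relative to the last column of the table.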
Fig. 4 Gene losses and transfers in the cpDNAs of three Thalassionema species compared to S. acus cpDNA. (A) Presence and absence in Thalassionema cpDNAs of 44 PCGs previously reported as lost in diatom cpDNAs. Blue squares represent the presence of the gene, and white squares indicate its absence. (B-D) Protein sequence alignments of petF, psaE and psaI in the cpDNA of S. acus and in the nuclear genomes of three Thalassionema species, respectively
Fig. 5 Expansion of IR regions in cpDNAs of seven Thalassionema strains and S. acus. (A) Comparative analysis of the boundaries of LSC, SSC and IR regions. (B) Comparative analysis of the length and components of IR regions
Fig. 6 Intergenic spacers of seven Thalassionema cpDNAs, compared with 55 published diatom cpDNAs
Fig. 7 Intra-species comparative analysis of cpDNAs. (A) Synteny comparison of cpDNAs of three T. bacillare strains. (B) Synteny comparison of cpDNAs of three T. frauenfeldii strains. (C) Gene order comparison of two T. frauenfeldii (CNS00899 and CNS00836) cpDNAs. Grey boxes represent the IR regions, and identical gene blocks are shown in boxes of the same color. (D) CIRCOS plot showing synteny between two T. frauenfeldii (CNS00899 and CNS00836) cpDNAs. Genes with the same color share similar functions
113 PCGs shared by cpDNAs of Bacillariophyta and Ochrophyta
| Category | Genes |
|---|---|
| Photosystem I | |
| Photosystem II | |
| Cytochrome b/f complex | |
| ATP synthase | |
| RubisCO subunit | |
| RNA polymerase | |
| Ribosomal proteins (SSU) | |
| Ribosomal proteins (LSU) | |
| Other genes | |
Fig. 8 Phylogenetic tree based on maximum likelihood (ML) analysis of the amino acid (aa) sequence dataset of 113 cpDNA PCGs in Bacillariophyta. The species Triparma laevis (AP014625) (Bolidophyceae, Ochrophyta) was used as the outgroup taxon. Numbers on the branches represent bootstrap support (%) from 1000 replicates
Fig. 9 Phylogenetic analysis based on syntenic comparison of three Thalassionema species cpDNAs. The species S. acus was used as the outgroup taxon. (A) Syntenic analysis of the three Thalassionema species cpDNAs using Mauve. (B-D) Pairwise comparison of the three cpDNAs. Genes with the same color share similar functions
Fig. 10 Emergence and divergence time estimation for Thalassionema strains. The estimation used Bayesian analysis based on the nucleotide sequences of 113 PCGs shared by 28 Bacillariophyta cpDNAs. The fossil calibration taxa are indicated with red points on the corresponding nodes. Horizontal bars represent 95% highest posterior density (HPD) intervals of the estimated divergence times