Yi-Ying Liao1, Yu Liu1, Xing Liu2, Tian-Feng Lü2, Ruth Wambui Mbichi3, Tao Wan1, Fan Liu4.
Abstract
Myriophyllum, among the most species-rich genera of aquatic angiosperms with ca. 68 species, is an extensively distributed hydrophyte lineage in the cosmopolitan family Haloragaceae. The chloroplast (cp) genome is useful in the study of genetic evolution, phylogenetic analysis, and molecular dating of controversial taxa. Here, we sequenced and assembled the whole chloroplast genome of Myriophyllum spicatum L. and compared it with those of other species in the order Saxifragales. The complete chloroplast genome sequence of M. spicatum is 158,858 bp long and displays a quadripartite structure with two inverted repeats (IR) separating the large single copy (LSC) region from the small single copy (SSC) region. Based on sequence identification and phylogenetic analysis, we identified a phylogenetically informative 4-kb inversion between trnE and trnC in Myriophyllum and placed it on a lineage specific to Myriophyllum and its close relatives. Divergence time estimation suggested that the trnE-trnC inversion possibly occurred between the Upper Cretaceous (72.54 MYA) and the middle Eocene (47.28 MYA), before the divergence of Myriophyllum from its most recent common ancestor. The unique 4-kb inversion might have been caused by nonrandom recombination associated with climate change around the K-Pg boundary, making it of interest for future evolutionary investigations.
Keywords: Haloragaceae; Myriophyllum spicatum; hydrophyte; inversion; structure variation
Year: 2020 PMID: 32211179 PMCID: PMC7083656 DOI: 10.1002/ece3.6125
Source DB: PubMed Journal: Ecol Evol ISSN: 2045-7758 Impact factor: 2.912
Figure 1. Myriophyllum spicatum L. (Haloragaceae, Myriophyllum), a perennial submerged aquatic plant widely distributed in Europe, Asia, and North Africa
Taxa used for cp DNA sequencing and PCR diagnosis of the inversion
| No. | Family | Species | Locality | Used for |
|---|---|---|---|---|
| 1 | Haloragidaceae | | Shanwei, Guangdong, China | cp DNA sequencing |
| 2 | Haloragidaceae | | UK | PCR diagnosis |
| 3 | Haloragidaceae | | Zhenjiang, Jiangsu, China | PCR diagnosis |
| 4 | Haloragidaceae | | Shanwei, Guangdong, China | PCR diagnosis |
| 5 | Haloragidaceae | | USA | PCR diagnosis |
| 6 | Haloragidaceae | | Australia | PCR diagnosis |
| 7 | Haloragidaceae | | Liangzi Lake, Ezhou, Hubei, China | PCR diagnosis |
| 8 | Haloragidaceae | | USA | PCR diagnosis |
| 9 | Haloragidaceae | | Iceland | PCR diagnosis |
| 10 | Haloragidaceae | | Germany | PCR diagnosis |
| 11 | Haloragidaceae | | USA | PCR diagnosis |
| 12 | Haloragidaceae | | Wuhan Botanical Garden, Wuhan, Hubei, China | PCR diagnosis |
| 13 | Haloragidaceae | | Australia | PCR diagnosis |
| 14 | Haloragidaceae | | Australia | PCR diagnosis |
| 15 | Haloragidaceae | | Fuyuan, Heilongjiang, China | PCR diagnosis |
Figure 2. The complete assembled chloroplast genome of M. spicatum. The inverted repeats (IRa, IRb) are indicated by thick black lines on the inner circle; they separate the genome into the large (LSC) and small (SSC) single copy regions. Genes drawn outside the circle are transcribed counterclockwise, while those inside are transcribed clockwise. Gene boxes are colored by functional group as shown in the key. The red arrows denote the location of the 4‐kb inversion
Genes present in Myriophyllum spicatum chloroplast genome
| Category | Group of genes | Genes |
|---|---|---|
| Photosynthesis‐related genes (47) | Rubisco (1) | |
| | Photosystem I (5) | |
| | Assembly/stability of photosystem I (2) | |
| | Photosystem II (15) | |
| | ATP synthase (6) | |
| | cytochrome b/f complex (6) | |
| | cytochrome c synthesis (1) | |
| | NADPH dehydrogenase (11) | |
| Transcription and translation‐related genes (59) | transcription (4) | |
| | ribosomal proteins (20) | |
| | translation initiation factor (1) | |
| | ribosomal RNA (4) | |
| | transfer RNA (30) | |
| Other genes (6) | RNA processing (1) | |
| | carbon metabolism (1) | |
| | fatty acid synthesis (1) | |
| | proteolysis (1) | |
| | conserved genes with unknown functions (2) | |
One and two superscript asterisks indicate one‐ and two‐intron‐containing genes, respectively. Genes located in the IR region are indicated by (x2) after the gene name.
Codon usage in Myriophyllum spicatum chloroplast genome
| Codon | Amino acid | Count | RSCU | tRNA | Codon | Amino acid | Count | RSCU | tRNA |
|---|---|---|---|---|---|---|---|---|---|
| UUU(F) | Phe (F) | 988 | 1.31 | | UCU(S) | Ser (S) | 564 | 1.68 | |
| UUC(F) | Phe (F) | 524 | 0.69 | trnF‐GAA | UCC(S) | Ser (S) | 304 | 0.90 | trnS‐GGA |
| UUA(L) | Leu (L) | 870 | 1.86 | trnL‐UAA | UCA(S) | Ser (S) | 409 | 1.22 | trnS‐UGA |
| UUG(L) | Leu (L) | 560 | 1.19 | trnL‐CAA | UCG(S) | Ser (S) | 198 | 0.59 | |
| CUU(L) | Leu (L) | 593 | 1.27 | | CCU(P) | Pro (P) | 424 | 1.58 | |
| CUC(L) | Leu (L) | 199 | 0.42 | | CCC(P) | Pro (P) | 202 | 0.75 | trnP‐UGG |
| CUA(L) | Leu (L) | 394 | 0.84 | trnL‐UAG | CCA(P) | Pro (P) | 315 | 1.17 | |
| CUG(L) | Leu (L) | 196 | 0.42 | | CCG(P) | Pro (P) | 135 | 0.50 | |
| AUU(I) | Ile (I) | 1,103 | 1.45 | | ACU(T) | Thr (T) | 542 | 1.61 | |
| AUC(I) | Ile (I) | 418 | 0.55 | trnI‐GAU | ACC(T) | Thr (T) | 243 | 0.72 | trnT‐GGU |
| AUA(I) | Ile (I) | 748 | 1.12 | trnI‐CAU | ACA(T) | Thr (T) | 407 | 1.21 | trnT‐UGU |
| AUG(M) | Met (M) | 590 | 0.88 | trnM‐CAU | ACG(T) | Thr (T) | 156 | 0.46 | |
| GUU(V) | Val (V) | 520 | 1.48 | | GCU(A) | Ala (A) | 637 | 1.82 | |
| GUC(V) | Val (V) | 175 | 0.50 | trnV‐GAC | GCC(A) | Ala (A) | 230 | 0.66 | |
| GUA(V) | Val (V) | 525 | 1.49 | trnV‐UAC | GCA(A) | Ala (A) | 401 | 1.15 | trnA‐UGC |
| GUG(V) | Val (V) | 186 | 0.53 | | GCG(A) | Ala (A) | 129 | 0.37 | |
| UAU(Y) | Tyr (Y) | 785 | 1.62 | | UGU(C) | Cys (C) | 227 | 1.52 | |
| UAC(Y) | Tyr (Y) | 185 | 0.38 | trnY‐GUA | UGC(C) | Cys (C) | 72 | 0.48 | |
| UAA(*) | Stop | 49 | 0.27 | | UGA(*) | Stop | 15 | 0.06 | trnS‐GCU |
| UAG(*) | Stop | 21 | 0.12 | | UGG(W) | Trp (W) | 460 | 1.94 | trnC‐GCA |
| CAU(H) | His (H) | 471 | 1.52 | | CGU(R) | Arg (R) | 340 | 1.48 | |
| CAC(H) | His (H) | 150 | 0.48 | | CGC(R) | Arg (R) | 99 | 0.43 | trnW‐CCA |
| CAA(Q) | Gln (Q) | 712 | 1.51 | trnH‐GUG | CGA(R) | Arg (R) | 354 | 1.54 | trnR‐ACG |
| CAG(Q) | Gln (Q) | 231 | 0.49 | trnQ‐UUG | CGG(R) | Arg (R) | 129 | 0.56 | |
| AAU(N) | Asn (N) | 968 | 1.52 | | AGU(S) | Ser (S) | 446 | 1.33 | |
| AAC(N) | Asn (N) | 305 | 0.48 | | AGC(S) | Ser (S) | 95 | 0.28 | |
| AAA(K) | Lys (K) | 1,080 | 1.51 | trnN‐GUU | AGA(R) | Arg (R) | 500 | 2.76 | trnR‐UCU |
| AAG(K) | Lys (K) | 350 | 0.49 | trnK‐UUU | AGG(R) | Arg (R) | 154 | 0.85 | |
| GAU(D) | Asp (D) | 887 | 1.63 | | GGU(G) | Gly (G) | 595 | 1.32 | |
| GAC(D) | Asp (D) | 201 | 0.37 | | GGC(G) | Gly (G) | 168 | 0.37 | trnG‐GCC |
| GAA(E) | Glu (E) | 1,007 | 1.50 | trnD‐GUC | GGA(G) | Gly (G) | 716 | 1.59 | trnG‐UCC |
| GAG(E) | Glu (E) | 338 | 0.50 | trnE‐UUC | GGG(G) | Gly (G) | 321 | 0.71 | |
Excluding pseudogenes.
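The RSCU column can be related to the Count column with a short calculation. The sketch below assumes the standard definition of relative synonymous codon usage, where a codon's count is divided by the mean count of all codons encoding the same amino acid; the record itself does not state which formula or software was used, and the function name `rscu` is purely illustrative. The counts are copied from the Phe and Leu rows of the codon usage table.

```python
from collections import defaultdict


def rscu(codon_counts, codon_to_aa):
    """Return {codon: RSCU}, assuming RSCU = count / mean count of the synonymous family."""
    family_total = defaultdict(int)
    family_size = defaultdict(int)
    for codon, count in codon_counts.items():
        aa = codon_to_aa[codon]
        family_total[aa] += count
        family_size[aa] += 1

    values = {}
    for codon, count in codon_counts.items():
        aa = codon_to_aa[codon]
        mean = family_total[aa] / family_size[aa]
        # Guard against an amino acid that never occurs in the data.
        values[codon] = count / mean if mean else 0.0
    return values


# Counts copied from the Phe and Leu rows of the codon usage table.
counts = {
    "UUU": 988, "UUC": 524,
    "UUA": 870, "UUG": 560, "CUU": 593, "CUC": 199, "CUA": 394, "CUG": 196,
}
amino_acid = {
    "UUU": "F", "UUC": "F",
    "UUA": "L", "UUG": "L", "CUU": "L", "CUC": "L", "CUA": "L", "CUG": "L",
}

values = rscu(counts, amino_acid)
for codon in counts:
    print(codon, round(values[codon], 2))
```

Under this assumed definition the Phe and Leu rows reproduce the tabulated values (for example 1.31 for UUU and 1.86 for UUA). Other rows are not checked here.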
Figure 3. Distribution of repeat sequences and SSRs in the M. spicatum chloroplast genome. GC content is shown
Figure 4. Comparison of five Saxifragales chloroplast genomes. The top gray arrows and thick black lines show genes with their orientation. The inversion is indicated by a thick red line. The y‐axis represents percent identity, ranging from 50% to 100%. The x‐axis represents the coordinate in the cp genome. Genome regions are color‐coded as protein‐coding (exon), intron, and conserved noncoding sequences (CNS)
Figure 5. Comparison of the borders of the LSC, IR, and SSC regions among five Saxifragales genomes. The adjacent border genes are indicated by boxes, with gene names and lengths in bp above or below the main line
Figure 6. Comparison of linearized maps of the plastid genomes of five Saxifragales plants. Syntenic blocks are shown above and gene maps are shown below. Unique regions are boxed in yellow, and the inversion event that occurred in M. spicatum is marked with a short red line
Figure 7. Inferred phylogenetic trees of 15 taxa of Haloragidaceae and related families based on maximum likelihood (ML) and Bayesian inference (BI) analyses of different combined datasets. (a) rpoB‐trnE+trnC‐trnT. (b) ITS+trnK+matK. (c) ITS+trnK+matK+rpoB‐trnE+trnC‐trnT. The ML bootstrap values (below) and Bayesian posterior probabilities (above) are given for each branch. The 4‐kb inversion rearrangement event is mapped onto the branches with a red arrow
Figure 8. (a) Illustration of the suggested flip‐flop recombination event in Haloragidaceae resulting in a 4‐kb inversion. The ribbons represent part of the chloroplast genome, and the genes are colored in purple. (b) Chronogram of Haloragaceae estimating the origin of the 4‐kb inversion under a Bayesian relaxed clock model, using the combined ITS, matK, trnK, rpoB‐trnE, and trnC‐trnT matrix. Gray bars at nodes indicate the 95% credibility intervals of the age estimates. The numbers near the nodes refer to the node ages. Red asterisks highlight the 4‐kb inversion rearrangement event
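At the sequence level, an inversion such as the one illustrated in Figure 8(a) can be modelled as replacing a segment with its reverse complement while the flanking sequence stays unchanged. The following is a minimal sketch on a toy sequence: the function names and the breakpoint coordinates are hypothetical and are not taken from the M. spicatum genome or from the authors' analysis.

```python
# Translation table mapping each base to its complement (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def invert_segment(seq, start, end):
    """Return seq with seq[start:end] inverted (0-based, end-exclusive coordinates)."""
    if not 0 <= start <= end <= len(seq):
        raise ValueError("breakpoints must satisfy 0 <= start <= end <= len(seq)")
    return seq[:start] + reverse_complement(seq[start:end]) + seq[end:]


# Toy example with hypothetical breakpoints at positions 3 and 6.
toy = "AAACCGTTT"
inverted = invert_segment(toy, 3, 6)
print(inverted)

# Applying the same inversion a second time restores the original order.
restored = invert_segment(inverted, 3, 6)
print(restored == toy)
```

Because a second inversion between the same breakpoints restores the original arrangement, the presence or absence of the inverted orientation can act as a two-state character, which is what makes such a rearrangement mappable onto a phylogeny.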