Weiwen Wang, Robert Lanfear.
Abstract
The chloroplast genome usually has a quadripartite structure consisting of a large single copy region and a small single copy region separated by two long inverted repeats. It has been known for some time that a single cell may contain at least two structural haplotypes of this structure, which differ in the relative orientation of the single copy regions. However, the methods required to detect and measure the abundance of the structural haplotypes are labor-intensive, and this phenomenon remains understudied. Here, we develop a new method, Cp-hap, to detect all possible structural haplotypes of chloroplast genomes of quadripartite structure using long-read sequencing data. We use this method to conduct a systematic analysis and quantification of chloroplast structural haplotypes in 61 land plant species across 19 orders of Angiosperms, Gymnosperms, and Pteridophytes. Our results show that there are two chloroplast structural haplotypes which occur with equal frequency in most land plant individuals. Nevertheless, species whose chloroplast genomes lack inverted repeats or have short inverted repeats have just a single structural haplotype. We also show that the relative abundance of the two structural haplotypes remains constant across multiple samples from a single individual plant, suggesting that the process which maintains equal frequency of the two haplotypes operates rapidly, consistent with the hypothesis that flip-flop recombination mediates chloroplast structural heteroplasmy. Our results suggest that previous claims of differences in chloroplast genome structure between species may need to be revisited.
Keywords: chloroplast genome structural heteroplasmy; flip-flop recombination; single copy inversion
Year: 2019 PMID: 31750905 PMCID: PMC7145664 DOI: 10.1093/gbe/evz256
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1. The three different structural haplotypes of chloroplast genomes detected in this study. The green region is the large single copy (LSC) region and the blue region is the small single copy (SSC) region. The two black regions are the inverted repeat (IR) regions. The arrow denotes 5′–3′ orientation. psbA is on the minus strand of the LSC region, whereas rrn23 is on the plus strand of the IR regions. ndhF is on the minus strand of the SSC region, whereas ccsA is on the plus strand of the SSC region. For ease of communication, we use the relative order of three genes (psbA in the LSC, and ndhF and ccsA in the SSC) to label two of the haplotypes “A” and “B.” In haplotype A, these genes are ordered psbA–ndhF–ccsA. In haplotype B, these genes are ordered psbA–ccsA–ndhF. In haplotype C, these three genes are ordered as in haplotype A, but the repeat regions are in-line rather than inverted.
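The three arrangements in the caption can be illustrated with a minimal sketch that assembles them from the LSC, SSC, and IR segments. This is not the Cp-hap implementation. The function names and toy sequences are hypothetical, and the sketch assumes that the SSC is supplied in its haplotype A orientation and that the sequence is linearized at the start of the LSC.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    complement = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(complement)[::-1]


def structural_haplotypes(lsc: str, ssc: str, ir: str) -> dict:
    """Build the three arrangements described in the caption.

    Assumption (not from the source): `ssc` is given in the
    orientation it has in haplotype A.
    """
    ir_inverted = reverse_complement(ir)
    return {
        # A and B differ only in the orientation of the SSC
        # relative to the LSC; the repeats are inverted in both.
        "A": lsc + ir + ssc + ir_inverted,
        "B": lsc + ir + reverse_complement(ssc) + ir_inverted,
        # C keeps the haplotype A gene order, but the second repeat
        # copy is in-line (direct) rather than inverted.
        "C": lsc + ir + ssc + ir,
    }


# Toy sequences, for illustration only.
haps = structural_haplotypes(lsc="AAAA", ssc="ACCG", ir="GGT")
```

In practice a long read would be assigned to a haplotype only if it spans a whole repeat copy together with flanking single copy sequence on both sides, because shorter reads cannot distinguish the arrangements.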
The Frequency of Existing Chloroplast Genome Haplotypes from 61 Species
| Division | Order | Species | Size (bp) | LSC (bp) | SSC (bp) | IR (bp) | HA | HB | Frequency | P value |
|---|---|---|---|---|---|---|---|---|---|---|
| Angiospermae | Poales | | 139,712 | 81,548 | 12,592 | 22,786 | 31 | 31 | 0.50 | 1.00 |
| | | | ∼135,199 | ∼79,447 | ∼12,668 | ∼21,542 | 403 | 373 | 0.52 | 0.30 |
| | | | ∼135,151 | ∼80,667 | ∼12,646 | ∼20,919 | 14 | 11 | 0.56 | 0.69 |
| | | | 136,462 | 81,671 | 12,701 | 21,045 | 347 | 329 | 0.51 | 0.51 |
| | | | ∼133,880 | ∼79,181 | ∼12,591 | ∼21,054 | 27 | 28 | 0.49 | 1.00 |
| | | | 134,750 | 80,816 | 12,334 | 20,800 | 40 | 38 | 0.51 | 0.91 |
| | | | 134,567 | 80,596 | 12,357 | 20,807 | 8 | 4 | 0.67 | 0.39 |
| | | | 136,133 | 81,832 | 12,499 | 20,901 | 5 | 5 | 0.50 | 1.00 |
| | | | 140,048 | 81,841 | 9,603 | 24,302 | 37 | 22 | 0.63 | 0.07 |
| | | | 141,176 | 83,046 | 12,540 | 22,795 | 51 | 56 | 0.48 | 0.70 |
| | | | 141,168 | 83,046 | 12,544 | 22,789 | 25 | 28 | 0.47 | 0.78 |
| | | | 140,754 | 83,733 | 12,503 | 22,259 | 94 | 71 | 0.57 | 0.09 |
| | | | 140,384 | 82,352 | 12,536 | 22,748 | 564 | 555 | 0.50 | 0.30 |
| | | | 135,854 | 81,348 | 12,582 | 20,962 | 35 | 54 | 0.39 | 0.06 |
| | Zingiberales | | ∼169,503 | ∼87,828 | ∼11,487 | ∼35,094 | 192 | 210 | 0.48 | 0.40 |
| | Asparagales | | ∼157,785 | ∼86,135 | ∼18,292 | ∼26,679 | 87 | 105 | 0.45 | 0.22 |
| | | | 157,785 | 86,135 | 18,292 | 26,679 | 200 | 202 | 0.50 | 0.96 |
| | Ranunculales | | 152,931 | 83,029 | 17,920 | 25,991 | 12 | 13 | 0.48 | 1.00 |
| | Ericales | | 156,964 | 88,759 | 20,541 | 23,832 | 122 | 94 | 0.56 | 0.11 |
| | Lamiales | | 155,326 | 86,564 | 17,370 | 25,696 | 65 | 63 | 0.51 | 0.93 |
| | | | ∼153,995 | ∼84,573 | ∼17,590 | ∼25,916 | 15 | 11 | 0.58 | 0.56 |
| | Solanales | | 86,749 | 50,978 | 7,063 | 14,354 | 5 | 8 | 0.38 | 0.58 |
| | | | 155,886 | 86,602 | 18,518 | 25,383 | 3 | 6 | 0.33 | 0.51 |
| | Asterales | | 152,765 | 84,103 | 18,596 | 25,033 | 39 | 31 | 0.56 | 0.40 |
| | Cucurbitales | | 158,757 | 87,625 | 18,556 | 26,288 | 5 | 4 | 0.56 | 1.00 |
| | Fabales | | 154,084 | 84,070 | 18,014 | 26,000 | 31 | 43 | 0.42 | 0.24 |
| | | | 156,391 | 85,946 | 18,797 | 25,824 | 17 | 17 | 0.50 | 1.00 |
| | | | 152,415 | 81,822 | 17,425 | 26,584 | 4 | 3 | 0.43 | 1.00 |
| | Malpighiales | | 157,033 | 85,129 | 16,600 | 27,652 | 58 | 78 | 0.39 | 0.40 |
| | | | 155,590 | 84,454 | 16,220 | 27,458 | 41 | 47 | 0.47 | 0.59 |
| | Rosales | | 155,691 | 85,606 | 18,173 | 25,956 | 31 | 25 | 0.55 | 0.51 |
| | | | 155,549 | 85,532 | 18,145 | 25,936 | 4 | 2 | 0.67 | 0.69 |
| | | | ∼157,859 | ∼85,977 | ∼19,120 | ∼26,381 | 114 | 128 | 0.47 | 0.40 |
| | | | 156,546 | 85,727 | 18,754 | 26,030 | 5 | 4 | 0.56 | 1.00 |
| | | | ∼155,760 | ∼85,430 | ∼18,768 | ∼25,781 | 9 | 7 | 0.56 | 0.80 |
| | Fagales | | 161,147 | 89,428 | 19,607 | 26,056 | 22 | 16 | 0.58 | 0.42 |
| | Brassicales | | 154,478 | 84,170 | 17,780 | 26,264 | 135 | 150 | 0.47 | 0.41 |
| | | | 152,860 | 83,030 | 17,760 | 26,035 | 51 | 38 | 0.57 | 0.20 |
| | | | 153,364 | 83,136 | 17,834 | 26,197 | 1,714 | 1,773 | 0.49 | 0.33 |
| | | | 153,483 | 83,282 | 17,775 | 26,213 | 1,097 | 1,077 | 0.51 | 0.68 |
| | Malvales | | 158,997 | 89,022 | 21,111 | 24,432 | 71 | 96 | 0.43 | 0.06 |
| | | | 160,230 | 88,722 | 20,274 | 25,617 | 38 | 47 | 0.45 | 0.39 |
| | | | 160,317 | 88,841 | 20,294 | 25,591 | 11 | 14 | 0.44 | 0.69 |
| | | | 159,422 | 88,073 | 20,183 | 25,583 | 333 | 330 | 0.50 | 0.94 |
| | | | 160,378 | 88,906 | 20,266 | 25,603 | 93 | 95 | 0.49 | 0.94 |
| | | | 160,301 | 88,817 | 20,280 | 25,602 | 623 | 586 | 0.52 | 0.30 |
| | | | 160,241 | 88,667 | 20,278 | 25,648 | 205 | 213 | 0.49 | 0.73 |
| | | | 160,313 | 88,824 | 20,271 | 25,609 | 50 | 52 | 0.49 | 0.92 |
| | | | 160,161 | 88,654 | 20,205 | 25,651 | 48 | 38 | 0.56 | 0.33 |
| | | | 160,433 | 88,932 | 20,273 | 25,614 | 42 | 61 | 0.41 | 0.08 |
| | | | 160,619 | 89,333 | 20,194 | 25,546 | 86 | 77 | 0.52 | 0.53 |
| | Myrtales | | ∼160,076 | ∼88,828 | ∼18,476 | ∼26,386 | 20 | 25 | 0.44 | 0.55 |
| | | | 160,076 | 88,828 | 18,476 | 26,386 | 13 | 16 | 0.45 | 0.71 |
| | | | 160,386 | 89,073 | 18,557 | 26,378 | 371 | 373 | 0.50 | 0.97 |
| | | | 159,942 | 88,787 | 18,421 | 26,367 | 1,431 | 1,459 | 0.50 | 0.62 |
| | | Branch tip A | | | | | 101 | 110 | 0.48 | 0.63 |
| | | Branch tip B | | | | | 186 | 184 | 0.50 | 0.96 |
| | | Branch tip C | | | | | 203 | 191 | 0.51 | 0.58 |
| | | Branch tip D | | | | | 132 | 114 | 0.54 | 0.28 |
| | | Branch tip E | | | | | 186 | 189 | 0.50 | 0.92 |
| | | Branch tip F | | | | | 234 | 255 | 0.48 | 0.37 |
| | | Branch tip G | | | | | 141 | 150 | 0.48 | 0.64 |
| | | Branch tip H | | | | | 248 | 266 | 0.48 | 0.45 |
| | Sapindales | | ∼156,262 | ∼86,018 | ∼18,072 | ∼26,086 | 104 | 94 | 0.53 | 0.52 |
| | | | 160,133 | 87,739 | 18,395 | 26,999 | 2 | 6 | 0.25 | 0.29 |
| | | | 161,231 | 85,299 | 18,692 | 28,620 | 110 | 119 | 0.48 | 0.60 |
| Gymnosperm | Pinales | | 121,530 | 66,444 | 54,288 | 399 | 0 | 898 | 0.00 | 2.2e-16 |
| | | | 124,049 | 67,411 | 55,758 | 440 | 0 | 2,845 | 0.00 | 2.2e-16 |
| Pteridophytes | Selaginellales | Selaginella tamariscina | 126,399 | 53,176 | 47,573 | 12,825 | 0 | 0 | - | - |
The long reads were mapped to the chloroplast genome of another species in the same genus because no chloroplast genome was available for this species, so the lengths of the LSC, SSC, and IR are approximate (∼).
P value ≤0.05.
Selaginella tamariscina presents only one haplotype, haplotype C, with 174 counts; its P value is 2.2e-16.
Size, the chloroplast genome size; HA/B, haplotype A/B, the number of reads supporting the LSC and SSC having identical/opposite orientation; Frequency, the proportion of haplotype A counts among haplotype A + B counts.
Fig. 2. The relationship between haplotype counts and haplotype frequencies. Each point is one species. The red line marks a frequency of 0.5. Blue dots have a binomial P value >0.05, whereas orange dots have a binomial P value ≤0.05. Two species, Pinus taeda and Picea sitchensis, showed evidence only of haplotype B, so their frequencies were 0. For the remaining 58 species, the frequencies of haplotypes A and B converge on 0.5 as sample size increases. One species (Selaginella tamariscina) is omitted from this figure because it contained only a third haplotype (haplotype C, fig. 1).
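The frequency and binomial P value reported for each sample can be recomputed from the two read counts. The sketch below assumes an exact two-sided binomial test against an expected frequency of 0.5; the source says only "binomial P value", so the exact test variant is an assumption, and the function names are hypothetical.

```python
from math import comb


def haplotype_frequency(ha: int, hb: int) -> float:
    """Proportion of haplotype A reads among haplotype A + B reads."""
    total = ha + hb
    if total == 0:
        raise ValueError("no reads support haplotype A or B")
    return ha / total


def binomial_two_sided_p(ha: int, hb: int) -> float:
    """Exact two-sided binomial P value for H0: frequency = 0.5.

    Assumption: because the null distribution is symmetric at 0.5,
    the two-sided P value is twice the smaller tail, capped at 1.
    """
    n = ha + hb
    if n == 0:
        raise ValueError("no reads support haplotype A or B")
    k = min(ha, hb)
    # Number of outcomes at least as extreme in the lower tail.
    lower_tail = sum(comb(n, i) for i in range(k + 1))
    # Integer arithmetic until the final division avoids overflow
    # for samples with thousands of reads.
    return min(1.0, 2 * lower_tail / 2 ** n)


# Counts taken from one row of the table (HA = 8, HB = 4).
print(round(haplotype_frequency(8, 4), 2),
      round(binomial_two_sided_p(8, 4), 2))  # 0.67 0.39
```

With small counts such as 8 versus 4, a frequency well away from 0.5 is still compatible with equal abundance, which is why the low-count points in the figure scatter widely around the red line.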