David Roy Smith, Robert W Lee.
Abstract
BACKGROUND: The magnitude of intronic and intergenic DNA can vary substantially both within and among evolutionary lineages; however, the forces responsible for this disparity in genome compactness are conjectural. One explanation, termed the mutational-burden hypothesis, posits that genome compactness is primarily driven by two nonadaptive processes, mutation and random genetic drift, the effects of which can be discerned by measuring the nucleotide diversity at silent sites (πsilent), defined as noncoding sites and the synonymous sites of protein-coding regions. The mutational-burden hypothesis holds that πsilent is negatively correlated with genome compactness. We used the model organism Chlamydomonas reinhardtii, which has a streamlined, coding-dense mitochondrial genome and a noncompact, intron-rich nuclear genome, to investigate the mutational-burden hypothesis. To measure πsilent, we sequenced the complete mitochondrial genome and portions of 7 nuclear genes from 7 geographical isolates of C. reinhardtii.
Year: 2008 PMID: 18495022 PMCID: PMC2412866 DOI: 10.1186/1471-2148-8-156
Source DB: PubMed Journal: BMC Evol Biol ISSN: 1471-2148 Impact factor: 3.260
Chlamydomonas reinhardtii strains used in this study.
| Strain | Mating Type | Strain Synonym | Geographical Origin (USA) | Abbreviationᵃ | Reference |
|---|---|---|---|---|---|
| CC-277 | | | Amherst, Massachusetts | MA-1 | Harris 1989 [20] |
| CC-1373 | | | South Deerfield, Massachusetts | MA-2 | Hoshaw and Ettl 1966 [41] |
| CC-1952 | | | Plymouth, Minnesota | MN | Gross et al. 1988 [42] |
| CC-2342 | | Jarvik 6 | Pittsburgh, Pennsylvania | PA-1 | Spanier |
| CC-2344 | | Jarvik 356 | Malvern, Pennsylvania | PA-2 | Spanier |
| CC-2931 | | Harris 6 | Durham, North Carolina | NC | Harris 1989 [20] |
| CC-2343 | | Jarvik 124 | Melbourne, Florida | FL | Spanier |
a Strain abbreviations are based on the USA state from which the strains were isolated.
Figure 1. Genetic map of the C. reinhardtii mitochondrial genome. Protein-coding regions and regions encoding structural RNAs are red and orange, respectively. S1–S4 represent the small-subunit rRNA-coding modules; L1–L8 represent the large-subunit rRNA-coding modules. The terminal inverted repeats (IR) are black. Intronic regions and their open reading frames are boxed in blue inside their associated genes. The C. reinhardtii strains (Table 1) in which the different introns occur are labelled in parentheses. Solid arrows denote the transcriptional polarities. Note: due to the presence/absence of introns among the different strains, the size of the C. reinhardtii mitochondrial genome can vary from 15,782 nt to 18,990 nt.
Figure 2. Partial genetic maps of the 7 C. reinhardtii nuclear genes. The bracketed segment beneath each map represents the region that was PCR amplified. Left of each map is the name of the gene, the approximate size of the region that was PCR amplified, and the location of the gene within the C. reinhardtii nuclear genome; locations are based on the C. reinhardtii draft nuclear genome sequence version 3.0 [8]. Exons are red; they are labelled with an "E" and a number denoting their position within the gene. Introns are blue and are labelled with a roman numeral denoting their location within the gene. Note: each of these genes is present only once in the C. reinhardtii nuclear genome.
Nucleotide diversity in the mitochondrial and nuclear compartments of Chlamydomonas reinhardtii.
| Compartment | Region | # of sites | S | # of indelsᵍ (length, nt) | π × 10⁻³ (SD × 10⁻³) | θW × 10⁻³ (SD × 10⁻³) | πsyn × 10⁻³ | πns × 10⁻³ | Tajima's D test (P value) |
|---|---|---|---|---|---|---|---|---|---|
| Mitochondrial | Complete genomeᵃ | 15280 | 134 | 15 (36) | 3.35 (0.68) | 3.66 (0.31) | --- | --- | -0.49 (> 0.10) |
| | Protein-codingᵇ | 8154 | 44 | 1 (6) | 2.06 (0.43) | 2.20 (0.33) | 8.52 | 0 | -0.38 (> 0.10) |
| | Structural RNA genesᶜ | 3502 | 23 | 4 (4) | 2.42 (0.42) | 2.68 (0.56) | --- | --- | -0.55 (> 0.10) |
| | | 1118 | 9 | 1 (3) | 3.07 (0.87) | 3.29 (1.10) | 7.88 | 1.89 | -0.35 (> 0.10) |
| | Intergenicᵈ | 2434 | 58 | 9 (23) | 8.92 (1.88) | 9.73 (1.28) | --- | --- | -0.48 (> 0.10) |
| | Silent sitesᵉ | 5152 | 99 | 12 (26) | 8.54 (1.03) | 9.18 (0.65) | --- | --- | --- |
| Nuclear | Intronic (overall) | 4294 | 359 | 47 (216) | 33.50 (3.15) | 33.81 (1.79) | --- | --- | -0.16 (> 0.10) |
| | Silent sitesᶠ | 4824 | 381 | 50 (219) | 31.96 (3.03) | 33.02 (1.88) | --- | --- | --- |
| | Exonic (overall) | 1614 | 25 | 1 (9) | 6.02 (0.99) | 6.58 (1.29) | 19.57 | 1.42 | -0.48 (> 0.10) |
| | Exonic (by gene) | 420 | 1 | 0 | 0.68 (0.47) | 0.97 (0.97) | 2.77 | 0 | -1.00 (> 0.10) |
| | | 300 | 2 | 0 | 2.54 (0.78) | 2.72 (1.92) | 9.69 | 0 | -0.27 (> 0.10) |
| | | 222 | 3 | 0 | 6.86 (1.88) | 5.52 (3.18) | 26.35 | 0 | 1.10 (> 0.10) |
| | | 408 | 10 | 1 (9) | 10.50 (2.70) | 10.00 (3.16) | 24.37 | 4.77 | 0.27 (> 0.10) |
| | | 264 | 9 | 0 | 9.74 (2.80) | 13.91 (4.64) | 41.14 | 0 | -1.59 (> 0.10) |
| | Intronic (by gene) | 578 | 77 | 8 (75) | 58.30 (7.41) | 54.30 (6.20) | --- | --- | 0.26 (> 0.10) |
| | | 621 | 43 | 3 (5) | 30.21 (6.07) | 28.26 (4.31) | --- | --- | 0.39 (> 0.10) |
| | | 691 | 37 | 8 (42) | 23.22 (5.58) | 21.86 (3.59) | --- | --- | 0.20 (> 0.10) |
| | | 560 | 30 | 4 (25) | 21.60 (3.03) | 21.87 (3.99) | --- | --- | -0.07 (> 0.10) |
| | | 847 | 98 | 12 (15) | 43.57 (4.53) | 48.67 (4.77) | --- | --- | -0.61 (> 0.10) |
| | | 790 | 47 | 9 (39) | 22.66 (4.50) | 24.28 (3.54) | --- | --- | -0.38 (> 0.10) |
| | | 207 | 27 | 3 (15) | 53.37 (8.18) | 53.37 (10.2) | --- | --- | 0.01 (> 0.10) |
Note--S, number of segregating sites; Indels, insertions and deletions; π, nucleotide diversity; θW, per-site theta from the Watterson estimator; πsyn, nucleotide diversity at synonymous sites; πns, nucleotide diversity at nonsynonymous sites; SD, standard deviation.
a Includes only 1 telomere.
b Includes all protein-coding genes excluding rtl.
c Includes rRNA- and tRNA-coding regions.
d Includes intergenic regions and 1 telomere.
e Includes synonymous sites, intergenic regions, and 1 telomere.
f Includes synonymous sites and intronic regions.
g Nucleotide diversity does not include indels. Consecutive indel sites are counted as a single event. Indel length refers to the sum of all indels.
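The summary statistics reported in this table follow standard population-genetic definitions, which the Python sketch below illustrates. It is not the authors' pipeline (the software they used is not named in this excerpt). The function names and the toy alignment are hypothetical, and alignment columns containing gaps are simply skipped, which is an assumption made in the spirit of footnote g.

```python
from itertools import combinations
from math import sqrt


def harmonic_terms(n):
    """Return a1 = sum(1/i) and a2 = sum(1/i^2) for i = 1..n-1."""
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / i ** 2 for i in range(1, n))
    return a1, a2


def nucleotide_diversity(seqs):
    """Pi: mean number of pairwise differences per site.

    Assumption: alignment columns containing a gap ('-') are skipped.
    """
    columns = [col for col in zip(*seqs) if "-" not in col]
    pairs = list(combinations(range(len(seqs)), 2))
    diffs = sum(col[i] != col[j] for col in columns for i, j in pairs)
    return diffs / (len(pairs) * len(columns))


def watterson_theta(s, n, sites):
    """Per-site Watterson estimator from S segregating sites in n sequences."""
    a1, _ = harmonic_terms(n)
    return s / (a1 * sites)


def tajimas_d(k, s, n):
    """Tajima's D from k (mean pairwise differences, not per site) and S."""
    a1, a2 = harmonic_terms(n)
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)
    return (k - s / a1) / sqrt(e1 * s + e2 * s * (s - 1))


# Hypothetical toy alignment of three sequences (not data from the study).
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(nucleotide_diversity(toy))

# Mitochondrial protein-coding row of the table: 8154 sites, S = 44, n = 7.
theta = watterson_theta(44, 7, 8154)
k = 2.06e-3 * 8154  # tabulated pi converted back to a per-region count
print(round(theta * 1e3, 2), round(tajimas_d(k, 44, 7), 2))
```

With the protein-coding values from the table (8154 sites, S = 44, 7 isolates), this sketch gives θW of about 2.20 × 10⁻³, in line with the tabulated value, and a Tajima's D of roughly -0.37, close to the tabulated -0.38. Small differences are expected because the tabulated π is rounded and the handling of gapped sites here is only an assumption.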
McDonald-Kreitman test comparing the ratio of nonsynonymous to synonymous differences within Chlamydomonas reinhardtii to that found between C. reinhardtii and Chlamydomonas incerta.
| Region | Site type | Polymorphisms within C. reinhardtii | Substitutions between C. reinhardtii and C. incerta | NI | G | P |
|---|---|---|---|---|---|---|
| Protein-codingᵃ | Nonsynonymous | 3 | 61 | 0.653 | 0.533 | 0.465 |
| | Synonymous | 36 | 478 | | | |
| | Nonsynonymous | 5 | 111 | 0.746 | 0.185 | 0.667 |
| | Synonymous | 4 | 119 | | | |
| Exonic (overall)ᵇ | Nonsynonymous | 1 | 18 | 0.159 | 4.837 | 0.027 |
| | Synonymous | 21 | 60 | | | |
| | Nonsynonymous | 0 | 0 | undef | undef | undef |
| | Synonymous | 1 | 1 | | | |
| | Nonsynonymous | 0 | 3 | 0.000 | undef | undef |
| | Synonymous | 2 | 4 | | | |
| | Nonsynonymous | 0 | 2 | 0.000 | undef | undef |
| | Synonymous | 3 | 18 | | | |
| | Nonsynonymous | 1 | 9 | 0.677 | 1.081 | 0.298 |
| | Synonymous | 6 | 18 | | | |
| | Nonsynonymous | 0 | 4 | 0.000 | undef | undef |
| | Synonymous | 9 | 19 | | | |
Note--NI, neutrality index (ratio of nonsynonymous to synonymous polymorphisms within C. reinhardtii compared to the ratio of nonsynonymous to synonymous fixed differences between C. reinhardtii and C. incerta); G, G-test of independence (determines if the proportion of nonsynonymous substitutions is independent of whether the substitutions are fixed or polymorphic); P, probability of G-test; undef, undefined. C. incerta data came from [6] and [7]. Note: in no case was the McDonald-Kreitman test statistically significant.
a Includes all protein coding genes excluding rtl.
b Concatenated exons from all 5 nuclear genes.
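The neutrality index and G-test defined in the note to this table can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' procedure: the function names are hypothetical, no continuity correction is applied (whether the authors used one is not stated here), the P value is taken from a chi-square distribution with 1 degree of freedom, and any zero cell makes the G-test return `None`, mirroring the "undef" entries.

```python
from math import erfc, log, sqrt


def neutrality_index(pn, ps, dn, ds):
    """NI = (Pn / Ps) / (Dn / Ds); None when a denominator is zero."""
    if ps == 0 or dn == 0 or ds == 0:
        return None
    return (pn / ps) / (dn / ds)


def g_test(pn, ps, dn, ds):
    """G statistic and P value (chi-square, 1 df) for the 2x2 table.

    Rows are nonsynonymous/synonymous; columns are polymorphic/fixed.
    Assumption: any zero cell yields (None, None), mirroring 'undef'.
    """
    if min(pn, ps, dn, ds) == 0:
        return None, None
    observed = [[pn, dn], [ps, ds]]
    total = pn + ps + dn + ds
    row_sums = [pn + dn, ps + ds]
    col_sums = [pn + ps, dn + ds]
    g = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            g += observed[i][j] * log(observed[i][j] / expected)
    g = max(2.0 * g, 0.0)  # guard against tiny negative rounding error
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    p = erfc(sqrt(g / 2.0))
    return g, p


# Mitochondrial protein-coding row: Pn = 3, Ps = 36, Dn = 61, Ds = 478.
ni = neutrality_index(3, 36, 61, 478)
g, p = g_test(3, 36, 61, 478)
print(round(ni, 3), round(g, 3), round(p, 3))
```

For the mitochondrial protein-coding row, the sketch returns values of approximately NI = 0.653, G = 0.533 and P = 0.465, which agree with the tabulated entries. Agreement for the other rows is not claimed here.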
Figure 3. Schema of the introns in the L5- and L7-rRNA-coding modules. The vertical arrows in A show the intron insertion sites within the C. reinhardtii mtDNA. B and C depict the introns in the L5- and L7-rRNA-coding modules, respectively; rRNA-coding regions are orange; introns are light blue; intronic open reading frames are boxed in dark blue within their respective introns; L5-frag refers to a duplicated segment of the L5-rRNA-coding module (the first 35 nt of the module are duplicated); bracketed portions of the map represent regions that were shown to be spliced out in mature transcripts. D depicts the intron insertion sites in the context of the large subunit (LSU) ribosomal RNA sequence of C. reinhardtii; arrows point to the region where the introns are inserted; numbers above the arrows denote the position of the residue that immediately precedes the insertion site: unbracketed numbers correspond to the residue in the 23S rRNA gene of Escherichia coli [44] and bracketed numbers correspond to the residue in the LSU-rRNA secondary-structure model of Boer and Gray [21]. Note: the C. reinhardtii strains in which these introns occur are shown in Figure 1.
Genbank accession numbers of the Chlamydomonas reinhardtii sequences employed in this study.
| Strain | | | | | | | | | |
|---|---|---|---|---|---|---|---|---|---|
| CC-277ᶜ | EU306622 | EU306630 | EU306651 | EU306644 | EU306632 | EU306658 | D50838 | U13167 | U13167 |
| CC-1373 | EU306617 | EU306625 | EU306646 | EU306639 | EU306633 | EU306653 | U70571 | U55911 | U55912 |
| CC-1952 | EU306621 | EU306626 | EU306647 | EU306640 | EU306634 | EU306654 | U70563 | U55893 | U55894 |
| CC-2342 | EU306620 | EU306627 | EU306648 | EU306641 | EU306635 | EU306655 | U70569 | U55905 | U55906 |
| CC-2343 | EU306623 | EU306628 | EU306649 | EU306642 | EU306636 | EU306656 | U70561 | U55889 | U55890 |
| CC-2344 | EU306619 | EU306629 | EU306650 | EU306643 | EU306637 | EU306657 | U70562 | U55891 | U55892 |
| CC-2931 | EU306618 | EU306624 | EU306645 | EU306638 | EU306631 | EU306652 | U70568 | U55901 | U55902 |
a Present study.
b Sequences from [18] except for D50838, which is from [17], and U13167, which is from [16].
c The mitochondrial genome from this strain was shown to be identical to that of C. reinhardtii CC-503.