David L Stern, Virginie Orgogozo.
Abstract
Is genetic evolution predictable? Evolutionary developmental biologists have argued that, at least for morphological traits, the answer is a resounding yes. Most mutations causing morphological variation are expected to reside in the cis-regulatory, rather than the coding, regions of developmental genes. This "cis-regulatory hypothesis" has recently come under attack. In this review, we first describe and critique the arguments that have been proposed in support of the cis-regulatory hypothesis. We then test the empirical support for the cis-regulatory hypothesis with a comprehensive survey of mutations responsible for phenotypic evolution in multicellular organisms. Cis-regulatory mutations currently represent approximately 22% of 331 identified genetic changes although the number of cis-regulatory changes published annually is rapidly increasing. Above the species level, cis-regulatory mutations altering morphology are more common than coding changes. Also, above the species level cis-regulatory mutations predominate for genes not involved in terminal differentiation. These patterns imply that the simple question "Do coding or cis-regulatory mutations cause more phenotypic evolution?" hides more interesting phenomena. Evolution in different kinds of populations and over different durations may result in selection of different kinds of mutations. Predicting the genetic basis of evolution requires a comprehensive synthesis of molecular developmental biology and population genetics.
Year: 2008 PMID: 18616572 PMCID: PMC2613234 DOI: 10.1111/j.1558-5646.2008.00450.x
Source DB: PubMed Journal: Evolution ISSN: 0014-3820 Impact factor: 3.694
Figure 1. Gene structure and definitions of cis-regulatory and coding regions and cis-regulatory and coding mutations. (A) A single gene encodes a complex set of instructions in the DNA sequence. The final gene product can either be a protein, via an mRNA intermediate, or a mature RNA molecule itself (transfer RNA, ribosomal RNA, micro RNA, etc.). Gray boxes indicate DNA regions that encode a protein product. The mRNA molecule is transcribed from the transcription initiation site to the polyadenylation signal and introns are spliced out. Many genes encode alternative mRNA splice variants that can be generated by alternative use of different exons (Graveley 2001; Xing and Lee 2006). This is indicated in the figure by lines above the gene connecting alternative exons. Alternative splice variants are usually expressed in different tissues and at different times in development. The mechanisms regulating splicing are not fully understood, but at least some of the information is encoded in the introns and must be recognized by cell-type-specific splicing factors (Lopez 1998). The mRNA contains 5′ and 3′ untranslated regions (UTRs), which are involved in mRNA stability, mRNA localization, and translation. The basal transcription apparatus binds upstream of the gene-coding region, often at a TA-rich sequence motif called a TATA box. Two enhancer modules are indicated to the left of the exons. Each module can contain binding sites for multiple transcription factors. In some cases, transcription factor binding sites are not clustered into discrete modules. (B) Genes can therefore be divided into coding regions, encompassing all of the exons, and cis-regulatory sequences, which include all other DNA that regulates gene expression. Cis-regulatory sequences include sequences that regulate transcription, RNA stability and splicing, and translation. (C) We define coding mutations as mutations that alter the amino acid sequence encoded by the mRNA or that alter the nucleotide sequence of a mature RNA molecule. (D) Cis-regulatory mutations can occur anywhere in the gene region, including noncoding sequence and coding sequence. In rare cases, synonymous mutations in coding regions alter gene regulation in cis, for example through modification of transcription factor binding sites or through modification of RNA stability (see text for further details). In principle, nonsynonymous mutations could alter both the polypeptide sequence and gene regulation, but no such examples have been reported yet.
The regulation of gene expression operates at multiple levels: transcription, alternative splicing, mRNA stability, mRNA cellular localization, translation, etc. (Stern 2003; Alonso and Wilkins 2005). All of these levels of gene regulation are, potentially, available for evolutionary modification (Alonso and Wilkins 2005). However, by far the majority of variation in the distribution of gene products during development is controlled at the transcriptional level (Davidson 2006).
Haploid genome sizes and the proportion of coding and noncoding regions for various eukaryotes (modified from tables 3.2 and 3.3 of Lynch 2007).
| Approximate haploid genome size (in Mb) | Proportion coding (%) | Proportion noncoding (%) | Estimated proportion of noncoding DNA that is regulatory |
|---|---|---|---|
| 12 | 74.2 | 25.8 | 22 |
| 30 | 45.9 | 54.1 | 2 |
| 23 | 52.8 | 47.2 | 2 |
| 100 | 26.4 | 73.6 | 12 |
| 137 | 19.4 | 80.6 | 20 |
| 2500 | 1.4 | 98.6 | 2 |
| 2900 | 1.4 | 98.6 | 2 |
The amount of regulatory DNA was estimated from islands of conserved DNA sequence between closely related species. See Lynch (2007) for details.
The distribution of evolutionarily relevant mutations in plants and animals.
| | Plants | Animals |
|---|---|---|
| Coding | 71 | 163 |
| Cis-regulatory | 26 | 48 |
| Other | 16 | 7 |
| Total | 113 | 218 |
| Null | 67 | 32 |
Cis-regulatory: includes mutations altering mRNA splicing.
Other: includes gene amplification, gene loss, stable DNA methylation, as well as four cases in plants where the mutations were mapped to a gene but not localized to a coding versus cis-regulatory change.
Null: number of total alleles that are presumed null based on existence of premature stop codons, altered splice sites that disrupt the protein, and deletions of part or all of the protein-coding sequence.
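The abstract's figure of "approximately 22% of 331" can be recovered from the counts in this table. The following is a minimal sketch of that arithmetic, not the authors' own analysis. It assumes the unlabeled second data row holds the cis-regulatory counts, which is consistent with the 22% quoted in the abstract; the names `counts` and `cis_share` are illustrative only.

```python
# Counts copied from the plants/animals table: (plants, animals).
# Assumption: the unlabeled row (26, 48) is the cis-regulatory row.
counts = {
    "Coding": (71, 163),
    "Cis-regulatory": (26, 48),
    "Other": (16, 7),
}


def cis_share(column):
    """Fraction of all mutations in one column that are cis-regulatory."""
    total = sum(row[column] for row in counts.values())
    return counts["Cis-regulatory"][column] / total


plants_share = cis_share(0)
animals_share = cis_share(1)

# Pool plants and animals to get the overall share.
overall_total = sum(plants + animals for plants, animals in counts.values())
overall_share = sum(counts["Cis-regulatory"]) / overall_total

print(f"plants:  {plants_share:.1%}")
print(f"animals: {animals_share:.1%}")
print(f"overall: {overall_share:.1%} of {overall_total}")
```

The "Null" row is left out on purpose: it counts alleles presumed to be null within the other categories, so it is not an additional class of mutation and does not contribute to the totals.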
Distribution of evolutionarily relevant mutations among phenotypic classes and among regulatory network levels.
| | Morphology | Physiology | Behavior | DGB member | Non-DGB member |
|---|---|---|---|---|---|
| Coding | 62 | 170 | 2 | 132 | 102 |
| Cis-regulatory | 43 | 29 | 2 | 34 | 37 |
| Other | 3 | 20 | 0 | 9 | 14 |
| Total | 108 | 219 | 4 | 175 | 153 |
| Null | 41 | 58 | 0 | 22 | 77 |
DGB member: gene is a known or presumptive member of a differentiation gene battery (DGB).
Non-DGB member: gene known or presumed to reside upstream of a DGB. Three genes could not be assigned to the DGB or non-DGB category because their function is unknown.
Cis-regulatory: includes mutations altering mRNA splicing.
Other: includes gene duplications, gene losses, stable DNA methylation, and four cases in which the mutations were mapped to a gene but not localized to a coding versus cis-regulatory change.
Null: alleles presumed null based on existence of premature stop codons, altered splice sites, and deletions of part or all of the protein-coding sequence.
Figure 2. Cumulative number of coding mutations, cis-regulatory mutations and other types of mutations (gene amplification, gene loss, etc.) that have been identified over time as responsible for phenotypic evolution. Results are from data in Appendix 1. Note that the slope for cis-regulatory mutations has increased in recent years. The current discovery rate of cis-regulatory mutations approximately equals the discovery rate of coding mutations. If this reflects the long-term trend, then we expect ultimately to observe approximately equal numbers of cis-regulatory and coding mutations.
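The extrapolation in this caption can be written out explicitly. This is a sketch under stated assumptions. The starting totals are the 234 coding and 74 cis-regulatory mutations from the plants-and-animals table. The common discovery rate $r$ (mutations per year, identical for both classes from now on) and the elapsed time $t$ are hypothetical symbols introduced here, not quantities given in the source.

$$
\frac{R(t)}{C(t)} = \frac{74 + r\,t}{234 + r\,t} \;\longrightarrow\; 1 \quad \text{as } t \to \infty
$$

The head start of 160 coding mutations never disappears; it only shrinks relative to the growing totals. Under these assumptions the ratio reaches 0.9 only once each class has gained $r\,t = 1366$ further mutations, since $(74 + 1366)/(234 + 1366) = 1440/1600 = 0.9$.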
Distribution of evolutionarily relevant mutations among taxonomic levels.
| | Domesticated | Intraspecific | Interspecific | Higher taxonomic level |
|---|---|---|---|---|
| Coding | 65 | 122 | 28 | 19 |
| Cis-regulatory | 23 | 24 | 24 | 3 |
| Other | 11 | 11 | 1 | 0 |
| Total | 99 | 157 | 53 | 22 |
| Null | 55 | 39 | 3 | 2 |
Includes recently diverged populations that experience reproductive isolation and are often considered different species, such as divergent stickleback populations.
Comparisons of species that are not sibling species.
Cis-regulatory: includes mutations altering mRNA splicing.
Other: includes gene duplications, gene losses, stable DNA methylation, and four cases in which the mutations were mapped to a gene but not localized to a coding versus cis-regulatory change.
Null: includes alleles presumed null based on existence of premature stop codons and deletions of part or all of the protein-coding sequence.
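To see how the cis-regulatory share shifts across taxonomic levels, the counts in this table can be turned into proportions. This is a minimal sketch, assuming the unlabeled second data row is the cis-regulatory row. The counts pool all trait classes, so the fractions will not match the individual panels of Figure 3, which split the data by morphology versus physiology and by DGB versus non-DGB genes.

```python
# Counts copied from the taxonomic-level table, one entry per column.
levels = ["Domesticated", "Intraspecific", "Interspecific", "Higher taxonomic level"]
coding = [65, 122, 28, 19]
cis_regulatory = [23, 24, 24, 3]  # assumption: the unlabeled row is cis-regulatory
other = [11, 11, 1, 0]

# Fraction of all mutations at each level that are cis-regulatory.
cis_fraction = {}
for level, n_coding, n_cis, n_other in zip(levels, coding, cis_regulatory, other):
    cis_fraction[level] = n_cis / (n_coding + n_cis + n_other)

for level, fraction in cis_fraction.items():
    print(f"{level:<24}{fraction:.3f}")
```

With these pooled counts the interspecific fraction (24 of 53) is clearly higher than the intraspecific one (24 of 157), which is the contrast the Figure 3 caption highlights.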
Figure 3. Evolutionarily relevant cis-regulatory mutations are more frequently found in interspecific comparisons than in intraspecific comparisons or among domesticated races. (A) The proportion of all mutations that are cis-regulatory mutations for morphological and physiological traits in the complete dataset. (B) Proportion of cis-regulatory mutations for morphological and physiological traits in the restricted dataset, where only one or two mutations per gene were included. Two mutations were included only if both coding and cis-regulatory mutations were found for a single gene. (C) Proportion of cis-regulatory mutations for DGB versus non-DGB genes in the complete dataset. (D) Proportion of cis-regulatory mutations for DGB versus non-DGB genes in the restricted dataset. The total number of mutations for each category is shown above the bars.
Statistical comparisons of the frequency of cis-regulatory and coding mutations for different phenotypic classes, different gene-network classes and different taxonomic levels.
| | Complete dataset: G | Complete dataset: P | Complete dataset: intraspecific vs. interspecific (Fisher's exact P) | Restricted dataset: G | Restricted dataset: P | Restricted dataset: intraspecific vs. interspecific (Fisher's exact P) |
|---|---|---|---|---|---|---|
| Morphology | 25.9 | <0.00002 | <0.00007 | 6.2 | <0.11 | <0.35 |
| Physiology | 6.5 | <0.09 | <0.45 | 4.0 | <0.26 | 1 |
| DGB | 16.8 | <0.0008 | <0.42 | 6.2 | <0.11 | 1 |
| Non-DGB | 27.0 | <0.000006 | <0.000002 | 7.6 | <0.06 | <0.05 |
Value of G test of independence for the number of cis-regulatory versus coding mutations for domesticated, intraspecific, interspecific, and intergeneric taxonomic levels.
P values for all G tests of independence were calculated using three degrees of freedom.
A test of the frequency of cis-regulatory versus coding mutations in intraspecific versus interspecific populations. Data from domesticated races were excluded and the interspecific and higher taxonomic level data were pooled.
The P value for a Fisher's exact test of independence is reported.
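The two tests named in these notes can be sketched with the Python standard library alone. This is an illustration, not the authors' code. The per-class counts behind each row of the statistics table are not given here, so the example applies both tests to the pooled coding and cis-regulatory counts from the taxonomic-level table, again assuming its unlabeled row is cis-regulatory. The function names are hypothetical. The chi-square tail formula used is the closed form for exactly three degrees of freedom, the value stated in the notes.

```python
import math


def g_statistic(table):
    """G statistic of independence for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    g = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            if observed > 0:  # a zero cell contributes nothing to the sum
                expected = row_totals[i] * col_totals[j] / grand_total
                g += observed * math.log(observed / expected)
    return 2.0 * g


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with 3 degrees of freedom."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k in the top-left cell, margins fixed.
        return math.comb(col1, k) * math.comb(n - col1, row1 - k) / math.comb(n, row1)

    p_observed = prob(a)
    k_min, k_max = max(0, row1 + col1 - n), min(row1, col1)
    # Sum every table that is at most as probable as the observed one.
    return sum(p for p in map(prob, range(k_min, k_max + 1))
               if p <= p_observed * (1 + 1e-9))


# Pooled counts: domesticated, intraspecific, interspecific, higher level.
coding = [65, 122, 28, 19]
cis_regulatory = [23, 24, 24, 3]  # assumption: the unlabeled row is cis-regulatory

g_all = g_statistic([coding, cis_regulatory])
p_all = chi2_sf_df3(g_all)

# Intraspecific vs. interspecific, with higher taxonomic levels pooled into
# interspecific and domesticated races excluded, as described in the notes.
p_fisher = fisher_exact_two_sided(122, 28 + 19, 24, 24 + 3)

print(f"G = {g_all:.2f}, P = {p_all:.2g}, Fisher's exact P = {p_fisher:.2g}")
```

As a consistency check, `chi2_sf_df3` applied to the complete-dataset G values in the table (25.9, 6.5, 16.8, 27.0) gives values just below the P thresholds listed beside them.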
Figure 4. Scanning electron micrograph of trichomes and bristles on a leg of Drosophila melanogaster. Trichomes are nonsensory cuticular extensions. Bristles are sensory organs innervated by single neurons.
Figure 5. Partial regulatory networks patterning (A) trichomes (modified from results in Chanut-Delalande et al. 2006; Overton et al. 2007) and (B) bristles in Drosophila melanogaster (modified from Calleja et al. 2002; Hartenstein 2004).