Mark J Berger1, Aaron M Wenger1, Harendra Guturu2, Gill Bejerano1,3,4,5.
Abstract
Genetic variation in cis-regulatory elements is thought to be a major driving force in morphological and physiological changes. However, identifying transcription factor binding events that code for complex traits remains a challenge, motivating novel means of detecting putatively important binding events. Using a curated set of 1154 high-quality transcription factor motifs, we demonstrate that independently eroded binding sites are enriched for independently lost traits in three distinct pairs of placental mammals. We show that these independently eroded events pinpoint the loss of hindlimbs in dolphin and manatee, degradation of vision in naked mole-rat and star-nosed mole, and the loss of external testes in white rhinoceros and Weddell seal. We additionally show that our method can be applied to more than two species. Our study presents a novel methodology to detect cis-regulatory mutations that help explain a portion of the molecular mechanism underlying complex trait formation and loss.
Year: 2018 PMID: 30137416 PMCID: PMC6182171 DOI: 10.1093/nar/gky741
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Given enough evolutionary time, independently eroded binding sites should congregate in genomic regions that encode independently lost traits. As species diverge, a phenotypic trait and the genomic regions required for that trait are inherited from the ancestral species. The trait of interest is necessary to maintain fitness, and therefore important trait-encoding transcription factor binding sites are under negative selection. As target species 1 evolves, a trait-loss event becomes fixed within the species. However, the sister species outgroup 1 still maintains the trait. Since the trait is lost in target species 1, all trait-dedicated information now evolves neutrally in that species. This leads to neutral erosion of trait-encoding transcription factor binding sites. Similarly, in target species 2, a trait-loss event (but not necessarily the same event as in target species 1) for the same trait becomes fixed in the population. Here too, sister species outgroup 2 still maintains the trait. Now all trait-encoding information in target species 2 evolves neutrally, and therefore the trait-encoding binding sites begin to erode. Using the sister species as outgroups, we can identify all transcription factor binding sites that have eroded in our target species but have been maintained in our outgroup species and many other reference species. We refer to these sites as independently eroded binding sites. This very unusual evolutionary signature is shown in Figure 2 and Table 1 to be strongest next to key genes for the development of an important independently lost complex trait.
Figure 2. Independently eroded binding sites are enriched next to key genes for independent complex trait losses. (A) Throughout the evolution of placental mammals, species have independently adapted to a variety of environments. In the three pairs of species shown in a simplified phylogenetic tree, we test whether independently eroded binding sites are statistically enriched for functions associated with these adaptations (see Supplementary Figure S1 for the phylogeny of all 58 mammals used in this study, and Figure 1 for the rationale). (B) Independently eroded binding sites can be the result of either a deletion (e.g. dolphin in this example) or mutations that decrease binding affinity (e.g. manatee). Bases identical to human are represented as dots and single dashes represent deleted bases. (C) Using a library of 1154 transcription factor motifs, we identify ∼9.7 million putative mammalian conserved binding sites. We consider a conserved site to be eroded in the target species if a motif match is absent in the target species, but present in any of the outgroup species. Binding sites are considered independently eroded if they are eroded along two distinct clades of species. For a given pair of species, the 5000 most significant independently eroded sites are agnostically tested against 3538 ontology terms from the MGI Gene Expression Database to identify a most significant shared function (see Table 1).
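The erosion rule in panel (C) can be illustrated with a minimal Python sketch. The data layout (one `(target_has_match, outgroup_matches)` pair per target species) and the function names are hypothetical and chosen only for illustration. The sketch also assumes that each target species listed belongs to a distinct clade, and it does not model motif scoring or the ranking of sites by significance.

```python
from __future__ import annotations


def is_eroded(target_has_match: bool, outgroup_matches: list[bool]) -> bool:
    """A conserved site is eroded in a target species if the motif match is
    absent in the target but present in any of its outgroup species."""
    return (not target_has_match) and any(outgroup_matches)


def is_independently_eroded(calls: dict[str, tuple[bool, list[bool]]]) -> bool:
    """Return True if the site is eroded in at least two target species.

    `calls` maps each target species (assumed here to sit in a distinct clade)
    to (target_has_match, outgroup_matches). This layout is hypothetical.
    """
    eroded_targets = [
        species
        for species, (target_has_match, outgroup_matches) in calls.items()
        if is_eroded(target_has_match, outgroup_matches)
    ]
    return len(eroded_targets) >= 2


# Illustrative calls for a single conserved site (made-up values).
site = {
    "dolphin": (False, [True, True]),   # match lost in dolphin, kept in outgroups
    "manatee": (False, [False, True]),  # match lost in manatee, kept in one outgroup
}
print(is_independently_eroded(site))
```

A site whose motif match is missing in the target and in every outgroup is not counted as eroded under this rule, since there is no evidence that the match was maintained elsewhere.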
Independently eroded binding sites congregate in the regulatory domain of important trait-relevant genes
| Parallel adaptations | Species | Top MGI expression term | Total eroded binding sites (genes) | False discovery rate | Fold enrichment | Affected target genes | Fraction of genes in top term affected |
|---|---|---|---|---|---|---|---|
| | Dolphin, Manatee | | 37 (10) | 1.30 × 10⁻⁴ | 2.85 | | 83.33% |
| ‘The dolphin… develops hindlimb buds that do form an apical ectodermal ridge which regresses.’ ( | | | | | | | |
| ‘Manatees… retain only small pelvic rudiments and no external hindlimbs.’ ( | | | | | | | |
| | Naked mole-rat, Star-nosed mole | | 40 (9) | 8.51 × 10⁻⁷ | 3.09 | | 56.25% |
| ‘[Naked mole-rats] have a degenerated eye and optic nerve, suggesting they have poor visual abilities.’ ( | | | | | | | |
| | White rhinoceros, Weddell seal | | 31 (11) | 6.60 × 10⁻⁴ | 2.97 | | 78.57% |
| ‘During the evolutionary history of Laurasiatheria, the scrotum disappears… [in] Rhinocerotidae (rhinoceros)… [and] Phocidae (seals).’ ( | | | | | | | |
| “Abnormalities of the [mesonephric] duct system are common in patients with cryptorchidism.” ( | | | | | | | |
See Figures 1 and 2 for the test we perform; TS, Theiler stage.
Motif mismatches are more common than sequence deletions in the creation of independently eroded transcription factor binding sites
| Species (sites) | Both mismatches | One mismatch, one deletion | Both deletions |
|---|---|---|---|
| Dolphin & Manatee (5011) | 3122 | 1510 | 379 |
| Naked mole-rat & Star-nosed mole (5121) | 2982 | 1682 | 457 |
| White rhinoceros & Weddell seal (5102) | 3384 | 1371 | 347 |
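As a quick arithmetic check of the table above, the following Python sketch converts the counts into percentage shares per species pair. The counts are copied from the table. The variable and function names are illustrative only.

```python
# Counts per species pair: (both mismatches, one mismatch + one deletion, both deletions),
# copied from the table above.
counts = {
    "Dolphin & Manatee": (3122, 1510, 379),
    "Naked mole-rat & Star-nosed mole": (2982, 1682, 457),
    "White rhinoceros & Weddell seal": (3384, 1371, 347),
}


def shares(row):
    """Return each category as a percentage of the row total, rounded to one decimal."""
    total = sum(row)
    return tuple(round(100 * value / total, 1) for value in row)


for pair, row in counts.items():
    both_mismatch, mixed, both_deleted = shares(row)
    print(f"{pair}: total={sum(row)}, both mismatches={both_mismatch}%, "
          f"mixed={mixed}%, both deletions={both_deleted}%")
```

In each pair the row total matches the site count given in the first column, and sites eroded by mismatches in both species make up more than half of the total.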
Independently eroded transcription factor binding sites congregate in the regulatory domains of important trait-relevant genes using three species
| Parallel adaptations | Species | Top MGI expression term | Total eroded binding sites (genes) | False discovery rate | Fold enrichment | Affected target genes | Fraction of genes in top term affected |
|---|---|---|---|---|---|---|---|
| | Dolphin, Killer whale, Manatee | | 35 (9) | 4.24 × 10⁻⁴ | 2.67 | | 75.00% |
| | White rhinoceros, Weddell seal, Dolphin | | 35 (7) | 2.10 × 10⁻¹⁸ | 9.09 | | 53.84% |
Figure 3. Examples of independently eroded binding sites next to important genes for independent complex trait loss. Bases identical to human are represented as dots, single dashes represent deletions and double dashes represent non-aligning bases in the query species. (A) In both dolphin and manatee, we find an independently eroded MYF5 binding motif upstream of LMX1B. This binding site falls within an active mouse limb enhancer at E10.5 (27). MYF5 is the first myogenic transcription factor to become active in the developing limb bud (59). LMX1B is a LIM homeodomain transcription factor responsible for dorsal cell fate in the developing limb (54,55). (B) In both naked mole-rat and star-nosed mole, we find an independently eroded SP4 binding motif upstream of NRP1. This binding site lies within a DNaseI peak in human retina embryo (125 days) (27). SP4 is a transcription factor that controls transcription of photoreceptor-specific genes in conjunction with CRX (60). NRP1 is a transmembrane receptor necessary for proper angiogenesis and arteriogenesis of the retina (57). (C) In both white rhinoceros and Weddell seal, we find an independently eroded MYBL2 binding motif upstream of WT1. This binding site intersects both a DNaseI hypersensitivity cluster and H3K27ac peaks from the ENCODE Project (27). WT1 is a transcription factor involved in both renal and gonadal development. Conditional inactivation of Wt1 in mice causes left-sided cryptorchidism with 40% penetrance (58).