Laia Capilla, Rosa Ana Sánchez-Guillén, Marta Farré, Andreu Paytuví-Gallart, Roberto Malinverni, Jacint Ventura, Denis M Larkin, Aurora Ruiz-Herrera.
Abstract
Determining how mammalian genomes have been reshuffled through structural changes is fundamental to understanding the dynamics of genome composition, the evolutionary relationships between species and, in the long run, speciation. In this work, we reveal the evolutionary genomic landscape in Rodentia, the most diverse and speciose mammalian order, by whole-genome comparisons of six rodent species and six representative outgroup mammalian species. The reconstruction of the evolutionary breakpoint regions across the rodent phylogeny shows an increased rate of genome reshuffling that is approximately two orders of magnitude greater than in the other mammalian species considered here. We identified novel lineage- and clade-specific breakpoint regions within Rodentia and analyzed their gene content, recombination rates and their relationship with constitutive lamina-associated domains, DNase I hypersensitivity sites and chromatin modifications. We detected an accumulation of protein-coding genes in evolutionary breakpoint regions, especially genes implicated in reproduction, pheromone detection and mating. Moreover, we found an association of the evolutionary breakpoint regions with active chromatin state landscapes, most probably related to gene enrichment. Our results have two important implications for understanding the mechanisms that govern and constrain mammalian genome evolution. The first is that the presence of genes related to species-specific phenotypes in evolutionary breakpoint regions reinforces the adaptive value of genome reshuffling. The second is that chromatin conformation, an aspect that has often been overlooked in comparative genomic studies, might play a role in modeling the genomic distribution of evolutionary breakpoints.
Keywords: rodents; evolutionary breakpoints; recombination; KRAB genes; epigenome; lamina associated domains
Year: 2016 PMID: 28175287 PMCID: PMC5521730 DOI: 10.1093/gbe/evw276
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Fig. 1.—EBRs mapped in the time tree of the mammalian species included in the study. The time tree was based on the divergence times (autocorrelated rates and hard-bounded constraints) described by Meredith et al. (2011), with the exception of two species (M. musculus and R. norvegicus) and one clade (Muridae), which were estimated from the time tree of dos Reis et al. (2012). In the upper section of each branch, the mean rate of EBRs per Myr and the range (in brackets) are shown. Numbers framed in squares represent mammalian phylogenetic nodes: 1: Boreoeutheria; 2: Euarchontoglires; 3: Catarrhini; 4: Hominoidea; 5: Rodentia; 6: Myodonta; 7: Muroidea; 8: Cricetidae + Muridae; 9: Muridae; 10: Laurasiatheria.
Fig. 2.—EBRs mapped in the mouse genome. The positions of the EBRs detected (lineage and clade-specific) are color-coded (see inset legend) along mouse (MMU, M. musculus) chromosomes. The number of protein-coding genes detected within each EBR is depicted on the right of each chromosome.
Gene Content in EBRs
| EBR Type | Protein-Coding Genes (P-value) | |
|---|---|---|
| Mouse specific | 0.029* | 2.53 |
| Muridae specific | 0.009** | 1.43 |
| Cricetidae + Muridae specific | 0.049* | 2.95 |
| Muroidea specific | 0.004** | 3.81 |
| Myodonta specific | 0.009** | 2.93 |
| Rodentia specific | 0.003** | 3.21 |
| All EBRs | 0.001** | 6.25 |
Note.—Results of a permutation test with 10,000 permutations. P-values are shown for each type of EBR detected in the mouse genome. Significant P-values indicate an accumulation of genes in the EBRs analyzed when compared with the rest of the mouse genome.
*P-value < 0.05.
**P-value < 0.01.
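The kind of permutation test summarized in the note can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the single linear toy genome, the uniform random relocation of regions, the function names `count_genes_in_regions` and `permutation_test`, and all coordinates below are hypothetical and chosen only for illustration.

```python
import random


def count_genes_in_regions(regions, gene_starts):
    """Count gene starts that fall inside any half-open (start, end) region."""
    return sum(any(s <= g < e for s, e in regions) for g in gene_starts)


def permutation_test(regions, gene_starts, genome_length, n_perm=10_000, seed=0):
    """One-sided test for gene accumulation in `regions`.

    Assumption: the genome is a single linear sequence and each region is
    relocated uniformly at random, keeping its length.
    """
    rng = random.Random(seed)
    observed = count_genes_in_regions(regions, gene_starts)
    lengths = [end - start for start, end in regions]
    n_extreme = 0
    for _ in range(n_perm):
        shuffled = []
        for length in lengths:
            start = rng.randrange(genome_length - length + 1)
            shuffled.append((start, start + length))
        if count_genes_in_regions(shuffled, gene_starts) >= observed:
            n_extreme += 1
    # Add-one correction keeps the estimated P-value above zero.
    p_value = (n_extreme + 1) / (n_perm + 1)
    return observed, p_value


# Hypothetical toy data: a 1-Mbp genome with a dense gene cluster in one region.
genome_length = 1_000_000
gene_starts = list(range(100_000, 120_000, 1_000)) + list(range(25_000, 1_000_000, 50_000))
regions = [(100_000, 120_000)]

observed, p_value = permutation_test(regions, gene_starts, genome_length)
print(observed, p_value)
```

A small P-value here means that randomly placed regions of the same size rarely contain as many genes as the observed ones. A real analysis would also have to respect chromosome boundaries and exclude unmappable sequence, which this sketch ignores.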
Fig. 3.—Genome-wide analysis of gene content and recombination rates. (A) Schematic representation of the genomic regions considered for the analysis (see “Materials and methods” section for details). (B) Distribution of protein-coding genes. The X-axis represents the genomic regions analyzed, whereas the Y-axis displays the mean number of genes detected per 10 kbp. (C) Distribution of recombination rates. The X-axis represents the genomic regions analyzed, whereas the Y-axis displays the mean recombination rate detected per 10 kbp. (D) Distribution of constitutive Lamina Associated Domains (cLADs). The X-axis represents the genomic regions analyzed, whereas the Y-axis displays the mean number of cLADs identified per 10-kbp window. Standard error bars are represented. Dotted lines represent genome-wide means. Asterisks indicate statistical significance (Kruskal–Wallis test, **P-value < 0.001).
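The per-window means and standard errors plotted in panels B to D can be sketched as follows. This is only an illustration under stated assumptions: the helper names `genes_per_window` and `mean_and_sem`, the non-overlapping tiling of a region, and the toy coordinates are hypothetical and are not taken from the study.

```python
import statistics


def genes_per_window(region_start, region_end, gene_starts, window=10_000):
    """Count gene starts in consecutive windows tiling [region_start, region_end)."""
    n_windows = -(-(region_end - region_start) // window)  # ceiling division
    counts = [0] * n_windows
    for g in gene_starts:
        if region_start <= g < region_end:
            counts[(g - region_start) // window] += 1
    return counts


def mean_and_sem(values):
    """Return the mean and the standard error of the mean of `values`."""
    mean = statistics.mean(values)
    if len(values) < 2:
        return mean, 0.0
    return mean, statistics.stdev(values) / len(values) ** 0.5


# Hypothetical toy region of 50 kbp containing four gene starts.
counts = genes_per_window(0, 50_000, [1_000, 2_000, 15_000, 42_000])
mean, sem = mean_and_sem(counts)
print(counts, mean, sem)
```

The same windowing applies to any feature reported per 10 kbp, such as cLADs; comparing the resulting distributions between region types would then require a separate test such as Kruskal–Wallis, which is not implemented here.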
Gene Clusters Found Enriched within EBRs
| Chr | EBR Start (bp) | EBR End (bp) | EBR Type | Gene Family | Gene ID | Distance from EBR Start (kbp) |
|---|---|---|---|---|---|---|
| 2 | 25,510,722 | 25,615,814 | Rodentia specific | Calycin | Lcn5: Lipocalin 5 | −2.8 |
| 2 | 25,510,722 | 25,615,814 | Rodentia specific | Calycin | Lcn6: Lipocalin 6 | −21.6 |
| 2 | 25,510,722 | 25,615,814 | Rodentia specific | Calycin | Lcn10: Lipocalin 10 | −27.5 |
| 2 | 25,510,722 | 25,615,814 | Rodentia specific | Calycin | Lcn13: Lipocalin 13 | −44.8 |
| 2 | 25,510,722 | 25,615,814 | Rodentia specific | Calycin | Lcn14: Lipocalin 14 | −81.8 |
| 2 | 26,481,623 | 26,536,687 | Mouse specific | Calycin | Lcn4: Lipocalin 4 | −41.6 |
| 11 | 32,168,628 | 32,232,893 | Mouse specific | Hemoglobin | Hba-X: Hemoglobin X | −7.7 |
| 11 | 32,168,628 | 32,232,893 | Mouse specific | Hemoglobin | Hba-a1 and Hba-a2: Hemoglobin alpha-like embryonic chain in Hba complex | −14.9 |
| 11 | 32,168,628 | 32,232,893 | Mouse specific | Hemoglobin | Hbq1b: Hemoglobin, theta 1B | −18.3 |
| 11 | 32,168,628 | 32,232,893 | Mouse specific | Hemoglobin | Hbq1a: Hemoglobin, theta 1A | −31.4 |
| 13 | 48,534,105 | 48,607,849 | Muridae specific | Krueppel associated box | Zfp169: zinc finger protein 169 | −50.4 |
| 17 | 15,680,043 | 15,701,318 | Muridae specific | Krueppel associated box | Prdm9: PR domain containing 9 | −11.3 |
| X | 20,596,836 | 20,735,882 | Mouse specific | Krueppel associated box | Zfp182: zinc finger protein 182 | −9.2 |
| X | 20,596,836 | 20,735,882 | Mouse specific | Krueppel associated box | Zfp300: zinc finger protein 300 | −59.4 |
| X | 20,596,836 | 20,735,882 | Mouse specific | Krueppel associated box | Ssxa1: Synovial sarcoma, X member A, breakpoint 1 | −96.1 |
Note.—For each EBR included in the table, we specify the mouse chromosome (Chr), the start and end positions (in bp), the corresponding gene enrichment cluster or gene family name, the gene ID and the distance of the gene start from the upstream region of the EBR (in kbp).
Fig. 4.—Heat maps representing the significant associations found when comparing Rodentia EBRs (left panel) and control genome-like regions (right panel) with epigenetic modifications in 58 different mouse cell lines, based on a 10,000-permutation test with randomization (P-value < 0.05). Red squares indicate positive association (enrichment, P-value ≤ 0.05); white squares indicate no statistical association (P-value > 0.05); blue squares indicate depletion (P-value ≤ 0.05). Black squares indicate that no data were available. The x-axis represents: (1x) Skeletal system, (2x) Muscular system, (3x) Circulatory system, (4x) Nervous system, (5x) Respiratory system, (6x) Digestive system, (7x) Excretory system, (8x) Endocrine system, (9x) Reproductive system, (10x) Lymphatic system, (11x) Stem cells, and (12x) Other. The y-axis shows: (1y) Histone modifications leading to “closed” chromatin, (2y) Histone modifications associated with “open” chromatin, (3y) DNase-seq, (4y) Transcription factors, (5y) Other.