Xiaoming Zhong1, Jiguang Peng1, Qing Sunny Shen1, Jia-Yu Chen1, Han Gao1, Xuke Luan2, Shouyu Yan1, Xin Huang1, Shi-Jian Zhang1, Luying Xu1, Xiuqin Zhang1, Bertrand Chin-Ming Tan3, Chuan-Yun Li4.
Abstract
Although population genetics studies have significantly accelerated the evolutionary and functional interrogation of genes and regulations, limited polymorphism data are available for rhesus macaque, a model animal closely related to human. Here, we report the first genome-wide effort to identify and visualize the population genetics profile in rhesus macaque. On the basis of whole-genome sequencing of 31 independent macaque animals, we profiled a comprehensive polymorphism map with 46,146,548 sites. The allele frequency for each polymorphism site, the haplotype structure, and multiple population genetics parameters were then calculated on a genome-wide scale. We further developed a dedicated interface, the RhesusBase PopGateway, to facilitate the visualization of these annotations, and highlighted the applications of this highly integrative platform in clarifying the selection signatures of genes and regulations in the context of primate evolution. Overall, the updated RhesusBase provides a comprehensive monkey population genetics framework for in-depth evolutionary studies of human biology.
Keywords: RhesusBase; population genetics; primate evolution; rhesus macaque; whole-genome sequencing
Year: 2016 PMID: 26882984 PMCID: PMC4839223 DOI: 10.1093/molbev/msw025
Source DB: PubMed Journal: Mol Biol Evol ISSN: 0737-4038 Impact factor: 16.240
Figure. Population genetics analyses in rhesus macaque. (A) Overview of the genome-wide study to identify and visualize the population genetics profile in rhesus macaque. (B) Dendrogram generated from autosomal polymorphism sites. IDs of Chinese-origin and Indian-origin macaque animals are marked in red and blue, respectively. Branch height represents dissimilarity. (C) Circos plot illustrating the genome-wide distribution of the polymorphism profiles across the rhesus macaque genome, as well as the population genetics parameters associated with individual profiles. Chromosomes are ordered clockwise, with the sequences binned in 3-Mb windows. Pie slices below show magnified views of selected circos subplots. Heights for the different parameters shown in the circos plot are relative to the overall distributions of the respective parameters along the genome (numbers at the lower edge of the pie slices indicate the range of values shown in the selected subcircle).
Comparison of Population Genetics Features for Protein-Coding Genes in Human and Rhesus Macaque.
| Nucleotide Diversity (π × 10³) | | | | Population Mutation Rate (θw × 10³) | | | |
|---|---|---|---|---|---|---|---|
| Autosome | | X Chromosome | | Autosome | | X Chromosome | |
| Median | Mean±SD | Median | Mean±SD | Median | Mean±SD | Median | Mean±SD |
| 0.772 | 1.203±1.409 | 0.429 | 0.849±1.176 | 1.649 | 1.895±1.182 | 1.097 | 1.388±1.104 |
| 0.144 | 0.339±0.513 | 0.061 | 0.202±0.405 | 0.552 | 0.693±0.542 | 0.340 | 0.457±0.409 |
| 0.331 | 0.481±0.533 | 0.151 | 0.291±0.411 | 0.746 | 0.861±0.547 | 0.439 | 0.551±0.417 |
| 0.574 | 0.747±0.720 | 0.277 | 0.434±0.518 | 1.081 | 1.198±0.67 | 0.670 | 0.777±0.500 |
| 0.484 | 0.594±0.481 | 0.240 | 0.338±0.351 | 0.915 | 0.99±0.475 | 0.543 | 0.607±0.356 |
| 0.758 | 0.818±0.452 | 0.418 | 0.459±0.331 | 1.184 | 1.222±0.387 | 0.736 | 0.775±0.288 |
| 0.711 | 0.817±0.546 | 0.397 | 0.467±0.361 | 1.186 | 1.229±0.477 | 0.709 | 0.772±0.347 |
| 2.738 | 3.328±2.670 | 1.297 | 1.787±1.785 | 4.446 | 4.989±2.913 | 2.024 | 2.553±1.947 |
| 0.419 | 0.775±1.037 | 0.263 | 0.585±0.837 | 0.939 | 1.329±1.283 | 0.488 | 0.841±0.945 |
| 0.977 | 1.263±1.109 | 0.450 | 0.681±0.748 | 1.709 | 2.006±1.343 | 0.706 | 0.969±0.850 |
| 1.65 | 2.035±1.709 | 0.758 | 1.071±1.083 | 2.847 | 3.179±1.984 | 1.240 | 1.583±1.302 |
| 1.277 | 1.512±1.087 | 0.537 | 0.737±0.703 | 2.146 | 2.368±1.300 | 0.839 | 1.059±0.818 |
| 2.274 | 2.334±1.054 | 1.134 | 1.247±0.904 | 3.327 | 3.395±1.165 | 1.515 | 1.625±0.928 |
| 2.107 | 2.256±1.135 | 1.012 | 1.161±0.717 | 3.3 | 3.355±1.276 | 1.402 | 1.533±0.748 |
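
The two statistics tabulated above have standard textbook definitions: nucleotide diversity (π) is the average number of pairwise differences per site, and the population mutation rate estimated by Watterson's θw is the number of segregating sites divided by the harmonic sum a_n = 1 + 1/2 + ... + 1/(n-1) and by the sequence length. The following is a minimal Python sketch of those definitions, assuming phased haplotypes supplied as equal-length strings with no missing data. The function names and the toy sequences are hypothetical, and this is not the pipeline used to build RhesusBase.

```python
from itertools import combinations


def nucleotide_diversity(haplotypes):
    """Per-site pi: mean number of differences over all haplotype pairs."""
    length = len(haplotypes[0])
    pairs = list(combinations(haplotypes, 2))
    total_diff = sum(
        sum(a != b for a, b in zip(h1, h2)) for h1, h2 in pairs
    )
    return total_diff / len(pairs) / length


def watterson_theta(haplotypes):
    """Per-site Watterson estimator: S / a_n / L."""
    n = len(haplotypes)
    length = len(haplotypes[0])
    # A site is segregating if more than one allele is observed in its column.
    segregating = sum(len(set(column)) > 1 for column in zip(*haplotypes))
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / a_n / length


# Toy data (hypothetical): 4 haplotypes, 8 sites, 2 segregating sites.
toy_haplotypes = ["ACGTACGT", "ACGTACGA", "ACGAACGT", "ACGTACGT"]
pi = nucleotide_diversity(toy_haplotypes)
theta_w = watterson_theta(toy_haplotypes)
print(f"pi x 10^3 = {pi * 1e3:.1f}, theta_w x 10^3 = {theta_w * 1e3:.1f}")
```

Multiplying the per-site values by 10³ puts them on the same scale as the table. The all-pairs loop is quadratic in the number of haplotypes, which is fine for a sample of 62 haplotypes but would normally be replaced by an allele-frequency-based formula for larger panels.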
Figure. Evolutionary and functional interrogation of single genes in RhesusBase PopGateway. (A) An expanded view of the RhesusBase Genome Browser at the CASP12 locus, shown with comprehensive position-based annotations, such as gene structure, polymorphisms, nucleotide diversity (π), population mutation rate (θ), Tajima’s D, and linkage disequilibrium. (B) A typical RhesusBase Population Genetics Page showing the comparative population genetics features of CASP12 in human and rhesus macaque. (C) For each gene of interest, three annotation pages (Gene Information, Gene Exon Usage, and Gene Expression) were developed to visualize, respectively, the basic gene annotations, the exon splicing efficiency, and the gene expression levels in a comparative view of human, rhesus macaque, and mouse. Detailed information can be accessed by clicking the corresponding links. For example, the transcript expression of the gene of interest is shown in RPKM (Reads Per Kilobase of exon model per Million mapped reads) and marked on the body map, with the color intensity corresponding to its relative expression level in each tissue of the different species.
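
The RPKM unit mentioned in the caption above normalizes a read count by both the exon-model length (in kilobases) and the sequencing depth (in millions of mapped reads). A minimal Python sketch of that arithmetic follows; the function name, parameter names, and example numbers are hypothetical and are not taken from RhesusBase.

```python
def rpkm(read_count, exon_length_bp, total_mapped_reads):
    """Reads Per Kilobase of exon model per Million mapped reads."""
    # 1e9 = 1e3 (bp -> kb) * 1e6 (reads -> millions of reads)
    return read_count * 1e9 / (exon_length_bp * total_mapped_reads)


# Hypothetical gene: 500 reads on a 2,000-bp exon model,
# in a library with 20 million mapped reads.
example_rpkm = rpkm(500, 2_000, 20_000_000)
print(example_rpkm)
```

Because both normalizations are linear, doubling the sequencing depth while doubling the read count leaves the RPKM value unchanged, which is what makes values comparable across tissues and libraries.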
Figure. Application of the RhesusBase PopGateway in illustrating primate regulations and their functional implications. (A) The ratios of nucleotide diversity (π) at nonsynonymous sites to that at synonymous sites are summarized in boxplots for human-specific pseudogenes and for their functional orthologs in rhesus macaque. (B–D) For a candidate genomic region under balancing selection, the haplotype structures in 31 macaque animals (62 haplotypes) are shown in a hierarchical clustering chart (B). The evolutionary relationships among these haplotypes are summarized in (C), with the size of each node proportional to the haplotype frequency and the numbers on branches representing nucleotide substitution sites. The pairwise mismatch distribution of these haplotypes was then calculated by counting the number of differences between all pairs of the 62 haplotypes (D). The multimodal distribution recapitulates the existence of two major types of haplotype.
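
The pairwise mismatch distribution in panel (D) is described as a count of differences between all pairs of haplotypes; for 62 haplotypes that is 62 × 61 / 2 = 1,891 pairs. A minimal Python sketch of this counting follows, assuming phased haplotypes as equal-length strings. The function name and the toy haplotypes are hypothetical and do not reproduce the data behind the figure.

```python
from collections import Counter
from itertools import combinations


def mismatch_distribution(haplotypes):
    """Map each pairwise difference count to the number of pairs showing it."""
    counts = Counter(
        sum(a != b for a, b in zip(h1, h2))
        for h1, h2 in combinations(haplotypes, 2)
    )
    return dict(sorted(counts.items()))


# Toy data (hypothetical): two divergent haplotype classes, two members each.
toy_classes = ["AAAAAA", "AAAAAT", "TTTTTT", "TTTTTA"]
distribution = mismatch_distribution(toy_classes)
print(distribution)
```

In this toy case, pairs within a class differ at one site and pairs between classes differ at five or six sites, so the distribution has two separated modes. That is the kind of pattern the caption interprets as evidence for two major haplotype types.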