Qiaoping Yuan, Zhifeng Zhou, Stephen G Lindell, J Dee Higley, Betsy Ferguson, Robert C Thompson, Juan F Lopez, Stephen J Suomi, Basel Baghal, Maggie Baker, Deborah C Mash, Christina S Barr, David Goldman.
Abstract
BACKGROUND: As a model organism in biomedicine, the rhesus macaque (Macaca mulatta) is the most widely used nonhuman primate. Although a draft genome sequence was completed in 2007, there has been no systematic genome-wide comparison of genetic variation in this species with that in humans. Comparative analysis of functional and nonfunctional diversity in this highly abundant and adaptable nonhuman primate could inform its use as a model for human biology and could reveal how variation in population history and size alters patterns and levels of sequence variation in primates.
Year: 2012 PMID: 22747632 PMCID: PMC3426462 DOI: 10.1186/1471-2156-13-52
Source DB: PubMed Journal: BMC Genet ISSN: 1471-2156 Impact factor: 2.797
Sequence coverage and putative SNPs detected in 14 rhesus macaques and 14 humans using parallel methods in a genic-enriched target region
| Measure | Human | Macaque |
| Genome size in reference assembly (Mb) | 3,080 | 2,864 |
| Non-gap reference genome size (Mb) | 2,858 | 2,647 |
| Unique coding sequence size in reference (Mb) | 32.5 | 31.8 |
| Sample number | 14 | 14 |
| Average 36-base reads per sample | 17.4 × 10^6 | 14.4 × 10^6 |
| Total length (Mb) of uniquely mapped reads | 8,770 | 7,266 |
| Mb in genome with ≥1x sequence coverage | 1,505 | 1,571 |
| Mb in genome with ≥3x sequence coverage | 426 | 435 |
| SNPs in dbSNP_B131 | 23,653,737 | 7,880 |
| SNPs in this study | 230,028 | 462,802 |
| Also in dbSNP_B131 | 206,267 (89.7%) | 34 (0.0%) |
| Transition (AG, GA, TC, CT) | 155,836 (67.7%) | 312,643 (67.4%) |
| Transversion (AC, CA, TG, GT) | 37,046 (16.1%) | 79,061 (17.1%) |
| Transversion (CG, GC) | 25,467 (11.1%) | 46,820 (10.1%) |
| Transversion (AT, TA) | 11,679 (5.1%) | 24,857 (5.4%) |
| Genes with SNPs | 14,675 | 16,797 |
| Genes with SNPs in exons | 11,200 | 12,466 |
| SNPs located in intergenic regions | 107,461 (46.7%) | 269,390 (58.2%) |
| SNPs located in 5 Kb upstream of TSS | 10,036 (4.4%) | 26,303 (5.7%) |
| SNPs located in UTR | 18,432 (8.0%) | 15,455 (3.3%) |
| SNPs located in intron | 79,875 (34.7%) | 130,443 (28.2%) |
| SNPs located in CDS | 14,224 (6.2%) | 21,211 (4.6%) |
| Synonymous | 8,329 (58.6%) | 13,798 (65.1%) |
| Non-synonymous | 5,877 (41.3%) | 7,367 (34.7%) |
| Damaging* | 1,741 (29.6%) | 1,525 (20.7%) |
| Nonsense | 18 (0.1%) | 46 (0.2%) |
*Damaging as evaluated by PolyPhen.
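The substitution counts in the table are enough to recompute a summary statistic such as the transition/transversion (Ti/Tv) ratio. The following is a minimal sketch, not the authors' pipeline: the dictionary and function names are hypothetical, and the assignment of the first value column to human and the second to macaque is an inference from the reference genome sizes and dbSNP totals rather than something the table states.

```python
# Counts copied from the table above. Column order (human, then macaque)
# is an assumption inferred from genome sizes and dbSNP totals.
SUBSTITUTIONS = {
    "human": {"transition": 155_836, "transversion": [37_046, 25_467, 11_679]},
    "macaque": {"transition": 312_643, "transversion": [79_061, 46_820, 24_857]},
}


def ti_tv_ratio(counts):
    """Return transitions divided by the sum of all transversion classes."""
    return counts["transition"] / sum(counts["transversion"])


for species, counts in SUBSTITUTIONS.items():
    print(f"{species}: Ti/Tv = {ti_tv_ratio(counts):.2f}")
```

With these counts both species come out near 2.1, which is consistent with the nearly identical transition percentages reported in the two columns.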
Figure 1. Average SNP density (SNPs/Kb) in humans and rhesus macaques, comparatively measuring variation in different genomic regions. A) SNPs/Kb is calculated for nucleotides with ≥4x sequence coverage. B) Average SNP density of human and macaque calculated for different sequencing coverages. SNP density based on all macaque samples is shown with the blue line. SNP density with macaque K20 omitted is shown with the green line, and SNP densities with other individual animals omitted one by one are shown with dashed blue lines. SNP density based on 100% Indian macaques only is shown with the pink dashed line. C) The SNP densities of five genomic regions in human and macaque. D) Human and macaque dN/dS ratios for cSNPs. Error bars in (C) and (D) indicate standard error of the mean.
Comparative SNP density based on sequencing studies of the human and the rhesus macaque
| Sample | Sequencing method | SNPs/Kb |
| Venter | Sanger method | 1.41* |
| Watson | 454 Sequencing System (Roche) | 1.46* |
| Chinese (YH) | Genome Analyzer (Illumina) | 1.35* |
| African (NA18507) | Genome Analyzer (Illumina) | 1.58* |
| African (NA18507) | SOLiD system (Applied Biosystems) | 1.69* |
| Korean (SJK) | Genome Analyzer (Illumina) | 1.50* |
| Korean (AK1) | Genome Analyzer (Illumina) | 1.51* |
| Proband (III-4) | SOLiD system (Applied Biosystems) | 1.50* |
| CEU, YRI | Genome Analyzer, SOLiD, 454 | 1.21-1.48** |
| Humans, in this study | Genome Analyzer (Illumina) | 1.07 (0.97-1.26) |
| Macaques, in this study | Genome Analyzer (Illumina) | 2.82 (1.88-3.71) |
*SNP counts are from Lupski et al. 2010 [19]; SNPs/Kb was calculated from the total SNPs reported, an estimated size of 2.85 × 10^9 for the sequenced human genome, and an estimate that 80% of the genome was accessible (for example, owing to mappability) [17].
**Based on SNPs detected and accessible genome size from the high coverage Pilot Trios data of the 1000 Genomes Project [17].
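The first footnote describes a simple normalization: total SNPs divided by the accessible portion of the genome, expressed per kilobase. A minimal sketch of that arithmetic follows. The function name and parameter names are hypothetical, the defaults are the two estimates quoted in the footnote, and the SNP count in the usage line is an invented illustration, not a value from any of the cited genomes.

```python
def snps_per_kb(total_snps, genome_size_bp=2.85e9, accessible_fraction=0.8):
    """Return SNP density per kilobase of accessible sequence.

    Defaults follow the footnote's estimates (2.85 x 10^9 genome size,
    80% accessible); both are assumptions that can be overridden.
    """
    if not 0 < accessible_fraction <= 1:
        raise ValueError("accessible_fraction must be in (0, 1]")
    accessible_bp = genome_size_bp * accessible_fraction
    return total_snps / accessible_bp * 1000


# Hypothetical SNP count, for illustration only.
example_density = snps_per_kb(3_000_000)
print(f"{example_density:.2f} SNPs/Kb")
```

Exposing the accessible fraction as a parameter makes it easy to see how sensitive the starred densities are to the 80% assumption.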
Figure 2. Average nucleotide diversity (θ) of humans and rhesus macaques, comparatively measuring variation in different genomic regions. A) θSNP is calculated for nucleotides with ≥6x sequence coverage. B) Nucleotide diversity of human and macaque calculated for different sequencing coverages. C) Average nucleotide diversity of five different genomic regions. D) Average nucleotide diversity for synonymous cSNPs and nsSNPs. Error bars in (C) and (D) indicate standard error of the mean.
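The caption does not say which estimator underlies θ. One common choice for SNP-count data is Watterson's estimator, which divides the number of segregating sites by the callable sequence length and by a harmonic-number correction for sample size. The sketch below assumes that estimator purely for illustration; the function names and the input values are hypothetical and are not taken from the study.

```python
def harmonic_number(k):
    """Return 1 + 1/2 + ... + 1/k."""
    return sum(1.0 / i for i in range(1, k + 1))


def watterson_theta(segregating_sites, n_chromosomes, callable_bp):
    """Watterson's per-site estimator: S / (a_n * L).

    a_n is the harmonic number of (n_chromosomes - 1). Whether the study
    used this estimator is an assumption, not stated in the caption.
    """
    if n_chromosomes < 2 or callable_bp <= 0:
        raise ValueError("need at least 2 chromosomes and a positive length")
    a_n = harmonic_number(n_chromosomes - 1)
    return segregating_sites / (a_n * callable_bp)


# Hypothetical inputs: 1,000 SNPs over 1 Mb of callable sequence in
# 14 diploid individuals (28 chromosomes). Only the sample size echoes
# the study; the SNP count and length are invented.
theta = watterson_theta(1_000, 28, 1_000_000)
print(f"theta = {theta:.2e} per site")
```

The harmonic correction is what allows samples of different sizes to be compared, since a larger sample is expected to reveal more segregating sites at the same underlying diversity.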
Figure 3. Analysis of homozygosity/heterozygosity for cSNPs, based on the idea that severely damaging variants are less likely to be homozygous. H.S: human synonymous cSNPs; M.S: macaque synonymous cSNPs; H.B: human nsSNPs predicted to be "benign" by PolyPhen; M.B: macaque nsSNPs predicted to be "benign" by PolyPhen; H.D: human nsSNPs predicted to be damaging by PolyPhen; M.D: macaque nsSNPs predicted to be damaging by PolyPhen.