Atsunori Higashino, Ryuichi Sakate, Yosuke Kameoka, Ichiro Takahashi, Makoto Hirata, Reiko Tanuma, Tohru Masui, Yasuhiro Yasutomi, Naoki Osada.
Abstract
BACKGROUND: The genetic background of the cynomolgus macaque (Macaca fascicularis) is made complex by the high genetic diversity, population structure, and gene introgression from the closely related rhesus macaque (Macaca mulatta). Herein we report the whole-genome sequence of a Malaysian cynomolgus macaque male with more than 40-fold coverage, which was determined using a resequencing method based on the Indian rhesus macaque genome.
Year: 2012 PMID: 22747675 PMCID: PMC3491380 DOI: 10.1186/gb-2012-13-7-r58
Source DB: PubMed Journal: Genome Biol ISSN: 1474-7596 Impact factor: 13.583
Figure 1. Overview of the Malaysian cynomolgus macaque genome sequencing and analysis. Fragment runs of eight slides and mate-pair runs of four slides (insert size: two slides for 600 to 800 bp and two slides for 800 to 1,000 bp) were performed on the SOLiD 3 Plus System. In total, 4.9 × 10^9 sequence reads were generated and mapped on the reference genome. After the high-quality reads were selected, single nucleotide variants (SNVs) and indel analyses were conducted.
Summary of SOLiD libraries and sequence reads
| Library | Read length (bp) | Insert size (bp) | Runs | Reads | Mapped reads | Analyzed reads<sup>a</sup> | Coverage depth of analyzed reads |
|---|---|---|---|---|---|---|---|
| Fragment | 50 | - | 8 | 2,648,128,521 | 1,976,720,560 (74.7%) | 1,974,496,337 (74.6%) | 33.4 |
| Mate-pair A | 25 (×2) | 600-800 | 2 | 906,783,481 | 621,175,871 (68.5%) | 355,589,008 (39.2%)<sup>b</sup> | 3.4 |
| Mate-pair B | 25 (×2) | 800-1,000 | 2 | 1,335,583,547 | 814,866,634 (61.0%) | 508,168,736 (38.0%)<sup>b</sup> | 4.8 |
| Total | - | - | 12 | 4,890,495,549 | 3,412,763,065 (69.8%) | 2,838,254,081 (58.0%) | 41.5 |

<sup>a</sup>Reads mapped on chrM and chrUr were removed. <sup>b</sup>'PCR or optical duplicates' (defined by Bioscope; mapped more than 100 loci) were removed, and properly paired reads were selected; each read of a pair was mapped on the same chromosome in a proper direction at a proper distance from each other.
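The arithmetic behind the Total row of the library table can be re-derived from the per-library counts. The following is a minimal sketch, not the authors' pipeline: the dictionary layout and the names `libraries`, `column_total`, and `totals` are hypothetical, and only the read counts are taken from the table. Coverage depth is not recomputed because the reference length used for that calculation is not given here, and recomputed percentages may differ from the printed ones in the last digit.

```python
# Read counts copied from the library table; the data layout is illustrative only.
libraries = {
    "Fragment":    {"reads": 2_648_128_521, "mapped": 1_976_720_560, "analyzed": 1_974_496_337},
    "Mate-pair A": {"reads":   906_783_481, "mapped":   621_175_871, "analyzed":   355_589_008},
    "Mate-pair B": {"reads": 1_335_583_547, "mapped":   814_866_634, "analyzed":   508_168_736},
}


def column_total(column):
    """Sum one count column ("reads", "mapped" or "analyzed") over all libraries."""
    return sum(lib[column] for lib in libraries.values())


totals = {col: column_total(col) for col in ("reads", "mapped", "analyzed")}

# Per-library fractions of generated reads that were mapped and then analyzed.
for name, lib in libraries.items():
    mapped_pct = 100 * lib["mapped"] / lib["reads"]
    analyzed_pct = 100 * lib["analyzed"] / lib["reads"]
    print(f"{name}: mapped {mapped_pct:.1f}%, analyzed {analyzed_pct:.1f}%")

print(f"Total reads: {totals['reads']:,}")
print(f"Total mapped: {totals['mapped']:,}")
print(f"Total analyzed: {totals['analyzed']:,}")
```

The three summed columns reproduce the counts in the Total row, which is a quick way to confirm that the table was transcribed consistently.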
Figure 2. SNV discovery rate and rhesus macaque genome quality. The red and blue lines represent the rates of homozygous and heterozygous SNVs, respectively, with given rhesus macaque genome sequence quality values (QVs). SNVs at sites having QV < 45 (left of the dashed line) were filtered out.
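The filtering rule in the Figure 2 caption (discard SNVs at sites whose rhesus reference QV is below 45) can be illustrated with a short sketch. This is an assumption-laden illustration rather than the authors' code: the `SnvCall` record, its field names, and the example calls are all hypothetical; only the threshold of 45 comes from the caption.

```python
from typing import NamedTuple

QV_THRESHOLD = 45  # threshold stated in the Figure 2 caption


class SnvCall(NamedTuple):
    """Hypothetical record for one SNV call (not a format used in the paper)."""
    chrom: str
    pos: int
    ref_qv: int    # quality value of the rhesus reference base at this site
    genotype: str  # "hom" or "het"


def filter_by_reference_quality(calls, threshold=QV_THRESHOLD):
    """Keep calls whose reference QV is at least `threshold` (drops QV < threshold)."""
    return [call for call in calls if call.ref_qv >= threshold]


# Made-up example calls, for illustration only.
example_calls = [
    SnvCall("chr1", 1000, 60, "het"),
    SnvCall("chr1", 2000, 30, "hom"),  # dropped: reference QV below 45
    SnvCall("chr2", 1500, 45, "hom"),  # kept: QV equal to the threshold
]

kept = filter_by_reference_quality(example_calls)
for call in kept:
    print(call.chrom, call.pos, call.genotype)
```

Sites with QV exactly 45 are kept here because the caption describes the removed set as QV < 45.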
Number of single nucleotide variants
| Chromosome | Heterozygous SNVs | Homozygous SNVs | A<sup>a</sup> | S<sup>b</sup> | UTR<sup>c</sup> | Intronic | Intergenic |
|---|---|---|---|---|---|---|---|
| Autosomes | 4,880,874 | 4,527,169 | 25,079 | 38,233 | 42,930 | 2,878,903 | 6,422,898 |
| X chromosomes | -<sup>d</sup> | 245,769 | 444 | 701 | 986 | 50,877 | 192,761 |
| Total | 4,880,874 | 4,772,938 | 25,523 | 38,934 | 43,916 | 2,929,780 | 6,615,659 |
<sup>a</sup>Number of nonsynonymous SNVs. <sup>b</sup>Number of synonymous SNVs. <sup>c</sup>Number of SNVs in untranslated regions. <sup>d</sup>Only homozygous SNVs were considered on the X chromosome.
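The SNV table can be cross-checked in two ways: the annotation categories in each row should add up to the number of called SNVs in that row, and the category totals follow from summing the autosome and X-chromosome rows. The sketch below does both and also derives the nonsynonymous-to-synonymous ratio. It assumes that each SNV is counted in exactly one annotation category, which the counts are consistent with but which is not stated in the table; all names (`snv_counts`, `ANNOTATION_COLUMNS`, `annotated_total`) are hypothetical.

```python
# Counts copied from the SNV table. The heterozygous count for X is set to 0
# because only homozygous SNVs were considered on the X chromosome.
snv_counts = {
    "Autosomes": {
        "het": 4_880_874, "hom": 4_527_169,
        "nonsyn": 25_079, "syn": 38_233, "utr": 42_930,
        "intronic": 2_878_903, "intergenic": 6_422_898,
    },
    "X chromosomes": {
        "het": 0, "hom": 245_769,
        "nonsyn": 444, "syn": 701, "utr": 986,
        "intronic": 50_877, "intergenic": 192_761,
    },
}

ANNOTATION_COLUMNS = ("nonsyn", "syn", "utr", "intronic", "intergenic")


def annotated_total(row):
    """Sum the annotation categories of one row (assumes one category per SNV)."""
    return sum(row[col] for col in ANNOTATION_COLUMNS)


# Row check: called SNVs (het + hom) versus the sum of annotation categories.
for name, row in snv_counts.items():
    called = row["het"] + row["hom"]
    print(f"{name}: called {called:,}, annotated {annotated_total(row):,}")

# Column sums over both rows, and the nonsynonymous/synonymous ratio.
column_sums = {
    col: sum(row[col] for row in snv_counts.values())
    for col in ("het", "hom") + ANNOTATION_COLUMNS
}
nonsyn_to_syn = column_sums["nonsyn"] / column_sums["syn"]
print(f"Nonsynonymous/synonymous ratio: {nonsyn_to_syn:.3f}")
```

Because the column sums are derived from the two data rows rather than copied from the Total row, the sketch also serves as an independent check of that row.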
Figure 3. Distribution of small-indel lengths identified in the Malaysian cynomolgus macaque genome. (a) Indels in coding regions. (b) Indels in intergenic regions. The indels were identified using the information from high-quality reads.
Figure 4. Estimation of ancestral population size of cynomolgus macaques with PSMC software. The x and y axes show population size and the time (thousand years (kyr)) from present, respectively. The blue rectangles represent the 95% confidence interval obtained with bootstrap resampling.
Figure 5. Screenshots of the macaque genome database. (a) cDNA clones, BAC clones, microsatellite markers, gene predictions, and SNVs on chromosome 1 are shown in the genome browser. (b) Detailed information on each SNV is linked from the browser.