Zheng Su1, Junjie Zhang2, Chanchal Kumar3, Cliona Molony4, Hongchao Lu5, Ronghua Chen5,6, David J Stone7, Fei Ling2, Xiao Liu1,8.
Abstract
Nonhuman primates (NHP) are important biomedical animal models for the study of human disease. Of these, the most widely used in biomedical research are currently species of the genus Macaca. However, evolutionary genetic divergence between human and NHP species makes human-based probes inefficient for capturing NHP genomic regions for sequencing and study. Here we introduce a new method to resequence the exome of NHP species using a capture design specifically targeted to the NHP, and demonstrate its superior performance on four NHP species or subspecies. Detailed investigation of biomedically relevant genes demonstrated superior capture by the new approach. We identified 28 genes that appeared to be pseudogenized and inactivated in macaque. Finally, we identified 187 genes showing strong evidence for positive selection across all branches of the primate phylogeny, including many novel findings.
Year: 2016 PMID: 27659771 PMCID: PMC5034232 DOI: 10.1038/srep33876
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. The experimental design for methods comparison.
Samples, replicates and alignment references used in this study.
| Sample ID | Population | Capture array | No. of replicates | Reference |
|---|---|---|---|---|
| MC1-HP | Mauritian cynomolgus macaque | HP | 4 | IR/CE |
| MC2-HP | Mauritian cynomolgus macaque | HP | 4 | IR/CE |
| IC1-HP | Indonesian cynomolgus macaque | HP | 4 | IR/CE |
| IC2-HP | Indonesian cynomolgus macaque | HP | 4 | IR/CE |
| MC1-MP | Mauritian cynomolgus macaque | MP | 4 | IR/CE |
| MC2-MP | Mauritian cynomolgus macaque | MP | 4 | IR/CE |
| IC1-MP | Indonesian cynomolgus macaque | MP | 4 | IR/CE |
| IC2-MP | Indonesian cynomolgus macaque | MP | 4 | IR/CE |
| CC1-MP | Vietnamese cynomolgus macaque | MP | 1 | IR/CE |
| CC2-MP | Vietnamese cynomolgus macaque | MP | 1 | IR/CE |
| CR1-MP | Chinese rhesus macaque | MP | 1 | IR/CR |
| CR2-MP | Chinese rhesus macaque | MP | 1 | IR/CR |
Figure 2. Exome coverage of human probes (HP) and monkey probes (MP) in relation to the amount of sequence data.
The figure shows the percentage of the whole exome covered by at least one read (1x) and by at least twenty reads (20x) on the HP and MP platforms, with reads aligned to the cynomolgus genome. With the same amount of raw sequence data, the 20x exome coverage is significantly higher with MP than with HP.
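The 1x and 20x coverage percentages described in this caption can be illustrated with a minimal sketch. It assumes per-base read depths over the target region are already available as a list of integers; the function name `breadth_of_coverage` and the example depths are hypothetical and are not taken from the study's pipeline.

```python
from typing import Iterable


def breadth_of_coverage(depths: Iterable[int], min_depth: int) -> float:
    """Return the percentage of target bases covered by at least `min_depth` reads."""
    depths = list(depths)
    if not depths:
        raise ValueError("no target bases given")
    covered = sum(1 for d in depths if d >= min_depth)
    return 100.0 * covered / len(depths)


# Hypothetical per-base depths over a tiny 10-base target region (illustration only).
example_depths = [0, 3, 25, 40, 19, 20, 0, 1, 22, 30]

pct_1x = breadth_of_coverage(example_depths, 1)
pct_20x = breadth_of_coverage(example_depths, 20)
print(f"1x: {pct_1x:.1f}%  20x: {pct_20x:.1f}%")  # prints "1x: 80.0%  20x: 50.0%"
```

In practice the depth list would come from an alignment of the captured reads to the chosen reference genome, restricted to exonic target bases.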
Figure 3. Distribution of gene coverage by at least one read.
Bars of different colors correspond to the performance of the MP and HP platforms with reads aligned to the Indian rhesus reference genome (IR) and the cynomolgus genome (CE). For example, the green bar (CE-MP) indicates the MP platform with reads aligned to the cynomolgus genome. The height of each bar represents the number of genes with the corresponding coverage. (A) Distribution of genes with coverage from 0–100%, divided into 10 windows of 10%. (B) Distribution of genes with coverage ranging from 90–100%. In both the CE and IR alignments, MP tends to provide more complete gene coverage than HP.
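The binning behind panel (A) can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the function name `bin_gene_coverage` is hypothetical, each window is assumed to include its lower bound, and a gene with exactly 100% coverage is assumed to fall in the last window.

```python
from typing import List, Sequence


def bin_gene_coverage(coverages: Sequence[float], width: int = 10) -> List[int]:
    """Count genes per coverage window; coverages are percentages in [0, 100]."""
    n_bins = 100 // width
    counts = [0] * n_bins
    for c in coverages:
        if not 0 <= c <= 100:
            raise ValueError(f"coverage out of range: {c}")
        # Assumption: windows are [lower, upper); 100% is placed in the last window.
        idx = min(int(c // width), n_bins - 1)
        counts[idx] += 1
    return counts


# Hypothetical per-gene coverage percentages (illustration only).
example_coverages = [0.0, 9.9, 10.0, 55.5, 90.0, 99.2, 100.0]
print(bin_gene_coverage(example_coverages))  # prints [2, 1, 0, 0, 0, 1, 0, 0, 0, 3]
```

Running the same function once per platform and reference combination (for example CE-MP and CE-HP) would give the bar heights to compare side by side.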
Figure 4. Coverage of biomedically relevant genes.
(A) The x-axis is the percentage difference in coverage of biomedical genes between MP and HP, and the y-axis is the cumulative number of genes with the corresponding difference in gene coverage. The blue curve illustrates the distribution of genes with better coverage in MP, while the green curve illustrates the genes with better coverage in HP. Mean values and standard deviations are shown across all replicates of all samples. (B) Gene coverage on the MP and HP platforms for sample MC1-1. Shown are all the biomedical genes covered above 90% by at least one read on the MP or HP platform. Green dots are the coverage of each gene in MP, and red dots are the coverage in HP. The blue curve corresponds to the coverage difference for each gene.
The top 50 genes showing evidence for positive selection.
| Ensembl protein ID | Chromosome | Gene description | P value | FDR | |
|---|---|---|---|---|---|
| ENSP00000348444 | 2 | titin | 8.29E-41 | 7.15E-37 | |
| ENSP00000355533 | 1 | ryanodine receptor 2 (cardiac) | 6.04E-26 | 3.47E-22 | |
| ENSP00000370077 | 9 | chromosome 9 open reading frame 93 | 1.69E-13 | 7.27E-10 | |
| ENSP00000296682 | 5 | PR domain containing 9 | 5.78E-12 | 1.66E-08 | |
| ENSP00000348799 | 16 | activating transcription factor 7 interacting protein 2 | 1.87E-11 | 4.14E-08 | |
| ENSP00000271332 | 1 | cadherin, EGF LAG seven-pass G-type receptor 2 | 1.92E-11 | 4.14E-08 | |
| ENSP00000341489 | 19 | synapse defective 1, Rho GTPase, homolog 1 (C. elegans) | 8.22E-11 | 1.58E-07 | |
| ENSP00000159087 | 19 | anoctamin 8 | 1.02E-10 | 1.76E-07 | |
| ENSP00000352572 | 21 | pericentrin | 1.43E-10 | 2.24E-07 | |
| ENSP00000366604 | 11 | splicing factor 1 | 2.11E-10 | 3.04E-07 | |
| ENSP00000273739 | 4 | slit homolog 2 (Drosophila) | 2.46E-10 | 3.27E-07 | |
| ENSP00000322218 | 12 | phosphatidylinositol transfer protein, membrane-associated 2 | 1.13E-09 | 1.39E-06 | |
| ENSP00000233607 | 19 | adenomatosis polyposis coli 2 | 2.43E-09 | 2.79E-06 | |
| ENSP00000354597 | 1 | DENN/MADD domain containing 4B | 3.01E-09 | 3.04E-06 | |
| ENSP00000419981 | 3 | DAZ interacting protein 3, zinc finger | 3.08E-09 | 3.04E-06 | |
| ENSP00000271610 | 1 | protein tyrosine phosphatase, receptor type, C | 3.18E-09 | 3.04E-06 | |
| ENSP00000358001 | 10 | transforming, acidic coiled-coil containing protein 2 | 9.29E-09 | 8.44E-06 | |
| ENSP00000386190 | 2 | chromosome 2 open reading frame 16 | 1.09E-08 | 9.36E-06 | |
| ENSP00000352011 | 17 | calcium channel, voltage-dependent, T type, alpha 1G subunit | 1.42E-08 | 1.15E-05 | |
| ENSP00000353717 | 17 | coiled-coil domain containing 144A | 2.17E-08 | 1.62E-05 | |
| ENSP00000379823 | 2 | collagen, type IV, alpha 3 (Goodpasture antigen) | 3.53E-08 | 2.53E-05 | |
| ENSP00000251020 | 16 | sal-like 1 (Drosophila) | 3.85E-08 | 2.55E-05 | |
| ENSP00000362405 | 2 | kinesin family member 1A | 4.08E-08 | 2.61E-05 | |
| ENSP00000317144 | 13 | progesterone immunomodulatory binding factor 1 | 4.88E-08 | 2.92E-05 | |
| ENSP00000262765 | 17 | glutamine rich 2 | 5.38E-08 | 3.09E-05 | |
| ENSP00000229794 | 6 | mitogen-activated protein kinase 14 | 6.32E-08 | 3.51E-05 | |
| ENSP00000420736 | 3 | forkhead box P1 | 7.24E-08 | 3.90E-05 | |
| ENSP00000395916 | 6 | eyes absent homolog 4 (Drosophila) | 7.91E-08 | 4.13E-05 | |
| ENSP00000264382 | 3 | sucrase-isomaltase (alpha-glucosidase) | 9.48E-08 | 4.81E-05 | |
| ENSP00000341565 | 11 | family with sequence similarity 111, member B | 1.02E-07 | 5.02E-05 | |
| ENSP00000357169 | 1 | Fc receptor-like 3 | 1.32E-07 | 6.33E-05 | |
| ENSP00000384570 | 22 | caspase recruitment domain family, member 10 | 1.42E-07 | 6.62E-05 | |
| ENSP00000272367 | 2 | C-type lectin domain family 4, member F | 3.24E-07 | 1.47E-04 | |
| ENSP00000408893 | 3 | interleukin 1 receptor accessory protein | 4.08E-07 | 1.80E-04 | |
| ENSP00000351338 | 3 | plexin B1 | 4.30E-07 | 1.85E-04 | |
| ENSP00000377003 | 12 | tetraspanin 8 | 6.07E-07 | 2.55E-04 | |
| ENSP00000356975 | 1 | ADAM metallopeptidase with thrombospondin type 1 motif, 4 | 9.23E-07 | 3.70E-04 | |
| ENSP00000383909 | 1 | DMRT-like family A2 | 1.08E-06 | 4.22E-04 | |
| ENSP00000420381 | 7 | calumenin | 1.14E-06 | 4.29E-04 | |
| ENSP00000342305 | 20 | small nuclear ribonucleoprotein polypeptides B and B1 | 1.14E-06 | 4.29E-04 | |
| ENSP00000345873 | 7 | solute carrier family 26, member 3 | 1.33E-06 | 4.79E-04 | |
| ENSP00000398045 | 1 | multiple EGF-like-domains 6 | 1.37E-06 | 4.83E-04 | |
| ENSP00000360916 | 9 | vav 2 guanine nucleotide exchange factor | 1.43E-06 | 4.94E-04 | |
| ENSP00000356116 | 6 | AT rich interactive domain 1B (SWI1-like) | 1.47E-06 | 4.97E-04 | |
| ENSP00000164024 | 3 | cadherin, EGF LAG seven-pass G-type receptor 3 (flamingo homolog, Drosophila) | 1.84E-06 | 6.09E-04 | |
| ENSP00000264818 | 19 | tyrosine kinase 2 | 2.36E-06 | 7.55E-04 | |
| ENSP00000410708 | 3 | ELL associated factor 2 | 2.47E-06 | 7.74E-04 | |
| ENSP00000263723 | 1 | chloride channel accessory 4 | 2.66E-06 | 8.18E-04 | |
| ENSP00000396032 | 18 | microtubule crosslinking factor 1 | 3.55E-06 | 1.06E-03 | |
| ENSP00000391295 | 11 | SIK family kinase 3 | 3.56E-06 | 1.06E-03 |
*Genes were ranked by P values. The P value and FDR were inferred from the species tree. Refer to Table S8 for the gene-tree result.