Jing Li1,3, Zhenxin Fan1,3, Feichen Shen4, Amanda L Pendleton4, Yang Song1, Jinchuan Xing5, Bisong Yue1, Jeffrey M Kidd4, Jing Li1,3.
Abstract
Copy number variation (CNV) can promote phenotypic diversification and adaptive evolution. However, the genomic architecture of CNVs among Macaca species has rarely been reported, and the roles of CNVs in the adaptation and evolution of macaques have not been well addressed. Here, we identified and characterized 1,479 genome-wide hetero-specific CNVs across nine Macaca species with bioinformatic methods, along with 26 CNV-dense regions and dozens of lineage-specific CNVs. The genes intersecting CNVs were overrepresented in nutritional metabolism, xenobiotics/drug metabolism, and immune-related pathways. Population-level transcriptome data showed that nearly 46% of CNV genes were differentially expressed across populations and that these genes were also mainly metabolic and immune-related, which implies a role for CNVs in the environmental adaptation of Macaca. Several CNVs overlapping drug metabolism genes were verified with genomic quantitative polymerase chain reaction, suggesting that these macaques may have different drug metabolism features. The CNV-dense regions, including 15 first reported here, represent unstable genomic segments in macaques where biological innovation may evolve. Twelve gains and 40 losses specific to the Barbary macaque contain genes with essential roles in energy homeostasis and immune defense, suggesting a genetic basis for its unique distribution in North Africa. Our study not only elucidated the genetic diversity across Macaca species from the perspective of structural variation but also provided suggestive evidence for the role of CNVs in adaptation and genome evolution. Additionally, our findings provide new insights into the application of diverse macaques to drug studies.
Keywords: adaptive evolution; drug metabolism; genetic diversity; macaque; structural variation
Year: 2020 PMID: 32970804 PMCID: PMC7846157 DOI: 10.1093/gbe/evaa200
Source DB: PubMed Journal: Genome Biol Evol ISSN: 1759-6653 Impact factor: 3.416
Information on Genome Data in This Study
| Scientific Names | Sample Identifier(s) | GenBank Accession(s) | Sequencing Platform(s) | # Reads | Genome Depth | Total Usable Base Pairs | Sex | Sample Origin(s) | Source(s) |
|---|---|---|---|---|---|---|---|---|---|
| | Mmul_8 | - | Illumina | 20,100,000 | 5.1× | - | Female | Washington National Primate Research Center | |
| | CR | SRA023856 | Illumina | 3,299,851,568 | 45.65× | 2,264,143,011 | Female | Yunnan, China | |
| | CE | SRA023855 | Illumina | 3,299,851,568 | 43.96× | 2,245,482,535 | Female | Vietnam | |
| | SM | SRX1470574 | Illumina | 1,001,034,260 | 34.55× | 2,280,352,231 | Female | Southwestern China | |
| | TM | SRP032525 | Illumina | 1,275,012,390 | 36.92× | 2,281,638,762 | Female | Sichuan, China | |
| | PM | SRX1022644 | Illumina | 770,413,198 | 25.59× | 2,246,079,419 | Female | Washington National Primate Research Center | Baylor College of Medicine |
| | JM | SRR11921216 | Illumina | 2,258,829,541 | 83.85× | 2,271,704,290 | Female | Kyoto Primate Research Center | Fan ZX, Zhou AB, Xing JC, Hey J, Osada N, Melnick DJ, Yue BS, Li J. (unpublished data) |
| | TwM | SRR11921217 | Illumina | 2,279,695,913 | 24.66× | 2,279,695,913 | Female | Kyoto Primate Research Center | Fan ZX, Zhou AB, Xing JC, Hey J, Osada N, Melnick DJ, Yue BS, Li J. (unpublished data) |
| | BM | SRR11921218, SRR11927939–SRR11927943 | Illumina | 2,226,490,341 | 45.91× | 2,226,490,341 | Female | Columbia University | In this study |
| | LM | SRR11921219, SRR11927944–SRR11927948 | Illumina | 2,241,953,780 | 46.49× | 2,241,953,780 | Female | Columbia University | In this study |
Fig. 1. Genomic patterns of duplications across the nine macaque species. (A) Cumulative lengths of duplications detected in three or more 1-kb windows across all chromosomes for each sample. (B) Proportion of cumulative duplication lengths for each chromosome. (C) Copy number (blue histograms) across the large duplication on chromosome 14 of Japanese macaque is visualized in UCSC genome browser relative to other macaques (species symbols on left) and in the context of Ensembl gene models (red). Copy number was estimated in windows containing 1 kb of nongap, nonmasked sequence. As a result, the genomic span of individual windows is variable and may include positions annotated as assembly gaps.
Fig. 2. Cumulative lengths of the shared duplications detected in three or more 1-kb windows on each chromosome in the nine Macaca species. (A) Average ratios of the cumulative lengths of shared duplications to the cumulative length of duplications per chromosome across the nine samples. Error bars represent the standard deviations of the ratios among the nine species. (B) The cumulative lengths (green bars) of the shared duplications per chromosome and the percentage (red line) of shared duplications on each chromosome in terms of length.
Fig. 3. Genomic distribution of all interspecific CNVs (detected in three or more 1-kb windows) across the nine Macaca species. The blue rectangles represent duplication CNVs and the red rectangles represent deletion CNVs.
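The windowing described in the Fig. 1 caption (windows holding 1 kb of nongap, nonmasked sequence, with calls kept only when they span three or more windows) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the boolean mask representation, and the copy number threshold of 2.5 are assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def build_windows(mask: Sequence[bool], usable_per_window: int = 1000) -> List[Tuple[int, int]]:
    """Return half-open (start, end) windows that each contain exactly
    `usable_per_window` usable positions.

    mask[i] is True when position i is a gap or masked base (assumed encoding).
    Windows tile the usable sequence, so their genomic span varies.
    """
    windows = []
    start = None  # first usable position of the window being built
    usable = 0
    for pos, is_masked in enumerate(mask):
        if is_masked:
            continue  # masked/gap positions widen the window but are not counted
        if start is None:
            start = pos
        usable += 1
        if usable == usable_per_window:
            windows.append((start, pos + 1))
            start, usable = None, 0
    return windows  # a trailing partial window is dropped


def duplicated_runs(copy_numbers: Sequence[float], min_windows: int = 3,
                    threshold: float = 2.5) -> List[Tuple[int, int]]:
    """Return half-open index ranges of consecutive windows whose copy number
    exceeds `threshold` (hypothetical cutoff) for at least `min_windows` windows."""
    runs = []
    run_start = None
    # A sentinel below any threshold closes a run that reaches the last window.
    for i, cn in enumerate(list(copy_numbers) + [float("-inf")]):
        if cn > threshold:
            if run_start is None:
                run_start = i
        else:
            if run_start is not None and i - run_start >= min_windows:
                runs.append((run_start, i))
            run_start = None
    return runs
```

Because masked positions are skipped rather than counted, a window that crosses a gap covers more than 1 kb of reference coordinates, which is why the caption notes that window spans are variable.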
Enrichment Outputs of Genes Intersecting the CNVs and Their 5-kb Flanking Sequences Using KOBAS 3.0: (A) Enriched KEGG Pathways and (B) Enriched GO Terms (Only Exhibiting the Highest Category in the Tree for GO Terms Containing Exactly the Same Genes)
| Term | ID | Input No. | Background No. | P Value | Corrected P Value | Input Genes | CNV ID |
|---|---|---|---|---|---|---|---|
| (A) Enriched KEGG pathways | | | | | | | |
| Pentose and glucuronate interconversions | mcc00040 | 6 | 21 | 1.9E-05 | 0.034 | | chr16-19414797–19421121, chr3-160661084–160869232, chr5-65506980–65623157, chr5-65628860–65729639, chr12-116232453–116238344, chr5-65466947–65505089 |
| Ascorbate and aldarate metabolism | mcc00053 | 5 | 14 | 4.0E-05 | 0.037 | | chr5-65628860–65729639, chr16-19414797–19421121, chr12-116232453–116238344, chr5-65506980–65623157, chr5-65466947–65505089 |
| Viral myocarditis | mcc05416 | 7 | 42 | 7.8E-05 | 0.047 | | chr1-188324769–188338450, chr4-33338105–33352404, chr4-33355720–33386474, chr4-33406118–33412221, chr4-30203434–30413162, chr6-154795623–154806159, chr1-14308663–14373529, chr4-33928581–33933207, chr3-39508480–39512271 |
| Retinol metabolism | mcc00830 | 7 | 45 | 1.1E-04 | 0.052 | | chr5-65506980–65623157, chr5-65628860–65729639, chr9-90297250–90344108, chr11-55695239–55754415, chr12-116232453–116238344, chr7-34700918–34706034, chr5-65466947–65505089 |
| Chemical carcinogenesis | mcc05204 | 7 | 55 | 3.5E-04 | 0.013 | | chr1-110697591–110747857, chr5-65506980–65623157, chr5-65628860–65729639, chr9-90297250–90344108, chr12-116232453–116238344, chr19-36763842–36769753, chr5-65466947–65505089 |
| Antigen processing and presentation | mcc04612 | 6 | 44 | 6.7E-04 | 0.15 | | chr19-50037741–50073879, chr4-33338105–33352404, chr4-33355720–33386474, chr4-33406118–33412221, chr4-30203434–30413162, chr11-10736952–10744681, chr11-10746926–10788644, chr4-33928581–33933207 |
| Porphyrin and chlorophyll metabolism | mcc00860 | 5 | 29 | 7.4E-04 | 0.15 | | chr5-65628860–65729639, chr12-116232453–116238344, chr5-65506980–65623157, chr5-65466947–65505089, chr15-85248111–85260241 |
| Drug metabolism: cytochrome P450 | mcc00982 | 6 | 48 | 0.001 | 0.17 | | chr1-110697591–110747857, chr5-65506980–65623157, chr5-65628860–65729639, chr9-90297250–90344108, chr12-116232453–116238344, chr5-65466947–65505089 |
| Type I diabetes mellitus | mcc04940 | 5 | 32 | 0.0011 | 0.17 | | chr4-33928581–33933207, chr2-148896350–148910750, chr4-33338105–33352404, chr4-33355720–33386474, chr4-33406118–33412221, chr4-30203434–30413162, chr5-105546464–105549989 |
| Metabolism of xenobiotics by cytochrome P450 | mcc00980 | 6 | 49 | 0.0011 | 0.17 | | chr1-110697591–110747857, chr5-65506980–65623157, chr5-65628860–65729639, chr12-116232453–116238344, chr19-36763842–36769753, chr5-65466947–65505089 |
| Starch and sucrose metabolism | mcc00500 | 5 | 36 | 0.0018 | 0.25 | | chr5-65628860–65729639, chr12-116232453–116238344, chr5-65506980–65623157, chr5-65466947–65505089, chr1-104405951–104421047 |
| RNA degradation | mcc03018 | 6 | 64 | 0.0039 | 0.46 | | chr2-148896350–148910750, chr5-105546464–105549989, chr20-14709561–14721562, chr8-99416856–99422405, chr9-92935279–92941800, chr9-2908518–2937562, chr9-2980923–3000708 |
| Drug metabolism: other enzymes | mcc00983 | 4 | 32 | 0.0072 | 0.46 | | chr5-65628860–65729639, chr12-116232453–116238344, chr5-65506980–65623157, chr5-65466947–65505089 |
| (B) Enriched GO terms | | | | | | | |
| Glucuronosyltransferase activity | GO:0015020 | 3 | 5 | 5.1E-04 | 0.13 | | chr12-116232453–116238344, chr5-65506980–65623157, chr5-65628860–65729639 |
| UDP-glycosyltransferase activity | GO:0008194 | 3 | 14 | 0.0053 | 0.46 | | chr12-116232453–116238344, chr5-65506980–65623157, chr5-65628860–65729639 |
| DNA conformation change | GO:0071103 | 3 | 15 | 0.0063 | 0.46 | | chrX-144618359–144621948, chr11-6773682–6777531, chrX-97761715–97768667 |
| Flavonoid metabolic process | GO:0009812 | 2 | 5 | 0.009 | 0.46 | | chr5-65506980–65623157, chr5-65628860–65729639 |
Note: The term description and ID are provided along with the number of genes identified near our CNVs (input) compared with the total Ensembl gene set of the rhesus macaque (background). The raw and corrected P values are indicated. Input gene names with enrichment signals (P ≤ 0.01) can be found in the second-to-last column.
The novel genes without gene symbols are indicated with Ensembl gene IDs.
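To make the input and background counts and the raw and corrected P values in the enrichment table more concrete, here is a minimal sketch of one common way such values are obtained: an upper-tail hypergeometric test followed by Benjamini-Hochberg correction. The source does not state which test or correction was selected in KOBAS 3.0, so both choices are assumptions, and the function names and the totals in the usage lines are hypothetical.

```python
from math import comb
from typing import List, Sequence


def hypergeom_upper_tail(k: int, K: int, n: int, N: int) -> float:
    """P(X >= k) when drawing n genes without replacement from N background
    genes, of which K belong to the term.

    k: input genes in the term (Input No.)
    K: background genes in the term (Background No.)
    n: total input genes; N: total background genes
    """
    upper = min(K, n)
    # math.comb returns 0 when the lower argument exceeds the upper one,
    # so impossible configurations contribute nothing to the sum.
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def benjamini_hochberg(pvalues: Sequence[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Usage with the counts of one table row (6 of 21 term genes in the input).
# The totals below are hypothetical placeholders, not values from the study.
HYPOTHETICAL_INPUT_TOTAL = 300
HYPOTHETICAL_BACKGROUND_TOTAL = 20000
p_raw = hypergeom_upper_tail(6, 21, HYPOTHETICAL_INPUT_TOTAL, HYPOTHETICAL_BACKGROUND_TOTAL)
```

With this correction, adjusted values never decrease as the raw P values increase, which is a quick consistency check when reading a table sorted by raw P value.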
Fig. 4. Copy number patterns of CYP2C76 and GSTM5 across the nine Macaca species. (A) CYP2C76 (chr9: 90,280,000–90,360,000) and (B) GSTM5 (chr1: 110,670,000–110,770,000). The black baselines in the tracks indicate copy number of two, and CNV regions are indicated with black dashed box. As in figure 1, copy number was estimated in windows containing 1 kb of nongap, nonmasked sequence.
Fig. 5. Lineage-specific interspecific CNVs are displayed on the branches of the NJ tree of nine Macaca species, which was generated by SNPhylo based on thinned genomic SNVs (500k sites). Bootstrap values are at each node, as determined by 1,000 bootstraps. Species groups defined by Zinner et al. (2013) and Roos et al. (2014) are labeled in green ovals on the tree.
Fig. 6. Comparison of copy number patterns determined by genomic qPCR and bioinformatic analysis. Copy numbers from qPCR (blue) and bioinformatically estimated (red) approaches are provided for genes (A) GSTM5, (B) GSTM1, and (C) UGT1A1. The whiskers stand for standard errors of copy numbers estimated by independent technical replicates of qPCR experiments.
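The whiskers in Fig. 6 are standard errors across technical replicates. As a small illustration, the standard error of the mean can be computed as the sample standard deviation divided by the square root of the number of replicates. The function name and the replicate values below are hypothetical, and the source does not describe how copy number was derived from the raw qPCR signal.

```python
from math import sqrt
from statistics import mean, stdev
from typing import Sequence, Tuple


def mean_and_standard_error(replicates: Sequence[float]) -> Tuple[float, float]:
    """Return (mean, standard error of the mean) for replicate copy number
    estimates. Requires at least two replicates."""
    if len(replicates) < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    # stdev uses the n - 1 (sample) denominator.
    return mean(replicates), stdev(replicates) / sqrt(len(replicates))


# Hypothetical replicate estimates for one gene in one sample.
avg, sem = mean_and_standard_error([1.0, 3.0])
```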