Siti Shuhada Mokhtar1, Christian R Marshall2, Maude E Phipps3, Bhooma Thiruvahindrapuram4, Anath C Lionel4, Stephen W Scherer2, Hoh Boon Peng1.
Abstract
Copy number variation (CNV) has been recognized as a major contributor to human genome diversity. It plays an important role in determining phenotypes and has been associated with a number of common and complex diseases. However, CNV data from diverse populations are still limited. Here we report the first investigation of CNV in the indigenous populations of Peninsular Malaysia. We genotyped 34 Negrito genomes from Peninsular Malaysia using the Affymetrix SNP 6.0 microarray and identified 48 putative novel CNVs, consisting of 24 gains and 24 losses, of which 5 were identified in at least 2 unrelated samples. These CNVs appear unique to the Negrito population and were absent from the DGV, HapMap3 and Singapore Genome Variation Project (SGVP) datasets. Gene ontology analysis revealed that genes within these CNVs were enriched in immune system processes (GO:0002376), response to stimulus (GO:0050896), metabolic processes (GO:0008152) and regulation of transcription (GO:0006355). The number of copy number gains in CNV regions (CNVRs) enriched with genes was significantly higher than the number of losses (P value <0.001). In view of the small population size, relative isolation and semi-nomadic lifestyle of this community, we speculate that these CNVs may be attributed to recent local adaptation of Negritos from Peninsular Malaysia.
Year: 2014 PMID: 24956385 PMCID: PMC4067311 DOI: 10.1371/journal.pone.0100371
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Primer sequences of the candidate genes and copy numbers measured in the SYBR Green qPCR assay.
| Locus name | CNV size spanned (bp) | Primer sequence | Expected amplicon size (bp) | Annealing temp (°C) | Copy number |
| ADH7 | 153,593 | Forward: gaaggcacaagctgctgttat Reverse: catcctgtctttgtcttggatct | 99 | 59.6 | 3 (2.80, 0.108) |
| CSMD1 | 301,535 | Forward: actctgaacggtgtcctggttt Reverse: ttcctaagctgcaaaggtgtg | 92 | 62.2 | 3 (3.11, 0.062) |
| SH2D4B | 15,464 | Forward: atgttctatgctgtggtggatg Reverse: acgaactttgtcagaaacgtga | 101 | 59.9 | 1 (0.45, 0.042) |
| NPAS3 | 25,484 | Forward: ctgttggcttagaggctgagat Reverse: agcccttgagatgattcctaca | 109 | 60 | 1 (1.32, 0.65) |
| WDR4 | 165,544 | Forward: acaggtttgtgagccgtatctc Reverse: tcaagaatccagaggtgagtga | 106 | 60 | 2 (2.10, 0.14) |
| LRRC30 | 9,547 | Forward: cttgcacgtgggctcgaatc Reverse: ggatgttgttgccctctgcg | 95 | 66.3 | - |
| TNFRSF1B | 83,214 | Forward: cattaggagatgtgtggtcctg Reverse: aacagtatgtcccgttctgtctc | 90 | 59.6 | 3 (3.09, 0.008) |
| PRIMER 1 | 36,283 | Forward: acagaacctaagcggaaatcct Reverse: aactggaagcaagatgctgact | 107 | 64.0 | 3 (3.40, 0.08) |
| PRIMER 2 | 65,481 | Forward: ccctgaagcgtgagtctctaat Reverse: tgataacacctctgcacattcc | 89 | 63.5 | 3 (2.50, 0.12) |
| PRIMER 3 | 42,399 | Forward: ggtcttcagtttgtgcttcagat Reverse: catcacttcctagcgccttc | 80 | 63.4 | 3 (2.90, 0.07) |
| PRIMER 4 | 63,260 | Forward: tcctaaagtttccgcaggag Reverse: ctcacttcactggtgtcaggtt | 99 | 63.2 | 1 (1.14, 0.32) |
| QCNV2 | 9,812 | Forward: caggcaagttcatatgttcca Reverse: agaggaatgccagatagagcag | 113 | 63.6 | 3 (2.90, 0.11) |
| QCNV4 | 4,021 | Forward: acttggtaaattgtgttga Reverse: tgtcagtcctgcattt | 104 | 52.4 | 2 (2.20, 0.17) |
WDR4 and QCNV4 showed a normal copy number and were therefore considered false positives. QCNV2 was detected as a CN gain by microarray, which was inconsistent with the qPCR validation, and was therefore also considered a false positive. Values in parentheses are the unrounded copy number calculated by relative quantification, followed by the standard deviation.
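The footnote says only that copy numbers were obtained by relative quantification. The following is a minimal sketch of one common form of that calculation, the 2^(-ΔΔCt) method, and not necessarily the authors' exact procedure. The Ct values, the two-copy calibrator, and the 1.5 and 2.5 calling thresholds are all hypothetical choices made for illustration.

```python
from statistics import mean, stdev


def copy_number_from_ct(ct_target, ct_reference, ct_target_cal, ct_reference_cal,
                        calibrator_copies=2):
    """Estimate copy number with the 2^(-ddCt) relative quantification method.

    The calibrator is assumed to carry `calibrator_copies` copies of the target.
    """
    ddct = (ct_target - ct_reference) - (ct_target_cal - ct_reference_cal)
    return calibrator_copies * 2 ** (-ddct)


def call_copy_state(copy_number, loss_below=1.5, gain_from=2.5):
    """Classify an unrounded copy number; the thresholds are assumptions."""
    if copy_number < loss_below:
        return "loss"
    if copy_number >= gain_from:
        return "gain"
    return "normal"


# Hypothetical triplicate Ct values (target, reference) for one test sample.
replicates = [(24.40, 24.00), (24.45, 24.02), (24.38, 23.97)]
# Hypothetical Ct values (target, reference) for a two-copy calibrator sample.
calibrator = (25.00, 24.00)

estimates = [copy_number_from_ct(t, r, *calibrator) for t, r in replicates]
summary = (mean(estimates), stdev(estimates))
print(f"copy number {summary[0]:.2f}, SD {summary[1]:.3f}, "
      f"state {call_copy_state(summary[0])}")
```

Under these assumed thresholds, the unrounded values in the table would be classified the same way as the gain, loss and normal calls reported there (for example 2.10 as normal and 0.45 as loss).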
General characteristics of CNV and CNVR among 34 Negrito genomes from Peninsular Malaysia.
| | GTC | Birdsuite | iPattern | Merged* |
| Number of CNVs | | | | |
| Gain | 530 | 735 | 1,430 | 330 |
| Loss | 803 | 1,901 | 2,262 | 781 |
| Complex CNV | 40 | | | |
| Total | 1,333 | 2,636 | 3,692 | 1,111 |
| Average number of CNVs per genome | | | | |
| Gain | 15.5 | 21.6 | 42.0 | 9.7 |
| Loss | 23.6 | 55.9 | 66.5 | 23.0 |
| Total | 39.2 | 77.5 | 108.6 | 32.7 |
| CNV size (bp) | | | | |
| Min | 1,000 | 1,019 | 1,010 | 1,134 |
| Max | 1,768,000 | 985,807 | 1,033,784 | 1,033,785 |
*Merged: stringent CNV calls made by at least 2 of the 3 algorithms applied.
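The merging rule above (a call is kept when at least 2 of the 3 algorithms support it) can be sketched as follows. The source does not state how overlapping calls were matched, so the 50% reciprocal overlap criterion, the `Call` record and the example coordinates are assumptions introduced only for illustration.

```python
from typing import NamedTuple


class Call(NamedTuple):
    algorithm: str  # e.g. "GTC", "Birdsuite" or "iPattern"
    sample: str
    chrom: str
    start: int      # inclusive
    end: int        # inclusive
    kind: str       # "gain" or "loss"


def reciprocal_overlap(a, b):
    """Return the smaller of the two overlap fractions, or 0.0 if disjoint."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    if shared <= 0:
        return 0.0
    return min(shared / (a.end - a.start + 1), shared / (b.end - b.start + 1))


def supported_calls(calls, min_algorithms=2, min_overlap=0.5):
    """Keep calls backed by at least `min_algorithms` distinct algorithms."""
    kept = []
    for call in calls:
        supporters = {call.algorithm}
        for other in calls:
            if (other.sample == call.sample and other.kind == call.kind
                    and reciprocal_overlap(call, other) >= min_overlap):
                supporters.add(other.algorithm)
        if len(supporters) >= min_algorithms:
            kept.append(call)
    return kept


# Hypothetical calls for one sample; the coordinates are made up.
calls = [
    Call("GTC", "S1", "chr1", 1_000, 5_000, "gain"),
    Call("Birdsuite", "S1", "chr1", 1_200, 5_200, "gain"),
    Call("iPattern", "S1", "chr2", 100, 2_000, "loss"),
]
stringent = supported_calls(calls)
print([c.algorithm for c in stringent])
```

This sketch returns every supported call rather than collapsing matching calls into a single merged interval; a real pipeline would also need a rule for choosing the merged boundaries.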
Figure 1. CNVR map of Negrito samples.
The ideogram summarizes the distribution of CNVRs on each human chromosome. Red indicates copy number loss, blue indicates copy number gain, and green indicates multi-allelic loci.
Figure 2. Length distribution of the CNVs in Negritos from Peninsular Malaysia.
Common CNVs with significantly different allele frequencies compared with the HapMap3 dataset.
| CNVs | Chr | Start | End | CNV frequencies | | | | | | | | | | |
| | | | | NEG | ASW | CEU | CHB | CHD | GIH | JPT | LWK | MEX | TSI | YRI |
| 1. | 2 | 40,780,879 | 40,803,110 | 0.21 | 0.00 | 5.6×10⁻³ | 0.06 | 0.02 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 | 0.00 |
| 2. | 2 | 52,605,074 | 52,635,046 | 0.88 | 0.17 | 0.53 | 0.80 | 0.80 | 0.70 | 0.86 | 0.04 | 0.44 | 0.49 | 0.09 |
| 3. | 3 | 37,957,108 | 37,961,932 | 0.56 | 0.07 | 0.11 | 0.15 | 0.16 | 0.21 | 0.03 | 0.01 | 0.11 | 0.27 | 0.03 |
| 4. | 4 | 78,495,579 | 78,500,367 | 0.12 | 0.00 | 0.00 | 0.03 | 0.07 | 0.00 | 0.03 | 0.00 | 0.01 | 0.00 | 0.01 |
| 5. | 6 | 202,353 | 326,149 | 0.03 | 0.15 | 0.29 | 0.26 | 0.37 | 0.16 | 0.29 | 0.17 | 0.27 | 0.30 | 0.09 |
| 6. | 15 | 32,487,975 | 32,617,680 | 0.44 | 0.21 | 0.09 | 0.16 | 0.17 | 0.18 | 0.22 | 0.10 | 0.15 | 0.17 | 0.21 |
| 7. | 16 | 14,897,364 | 15,016,088 | 0.09 | 0.30 | 0.24 | 0.42 | 0.35 | 0.30 | 0.36 | 0.33 | 0.31 | 0.21 | 0.22 |
| 8. | 17 | 41,750,187 | 42,107,479 | 0.18 | 0.00 | 0.00 | 0.01 | 0.02 | 0.00 | 0.03 | 0.00 | 0.02 | 0.00 | 0.00 |
Figure 3. UCSC Genome Browser view of the CNV on chromosome 3p22.2.
The figure was produced from custom tracks listing the Negrito CNV calls, uploaded to http://genome.ucsc.edu.
Population-specific CNVs in 34 Negrito genomes from Peninsular Malaysia.
| Chromosome cytoband | Start | End | Size (bp) | CNV frequency* | CNV type (gain/loss) | Genes involved | Disrupted genes |
| 1p36.22 | 12,117,038 | 12,200,251 | 83,214 | 0.03 | gain | TNFRSF1B, TNFRSF8 | TNFRSF8 |
| 1q43 | 237,656,846 | 237,667,786 | 10,941 | 0.03 | loss | - | - |
| 2p21 | 41,716,288 | 41,781,081 | 64,794 | 0.06 | loss | - | - |
| | 41,717,149 | 41,780,408 | 63,260 | 0.06 | loss | - | - |
| | 41,717,149 | 41,785,214 | 68,066 | 0.03 | loss | - | - |
| 2p12 | 75,457,878 | 75,486,632 | 28,755 | 0.03 | gain | - | - |
| | 75,469,904 | 75,486,632 | 16,729 | 0.03 | gain | - | - |
| 2q13 | 109,015,589 | 109,051,831 | 36,243 | 0.03 | loss | - | - |
| 2q37.1 | 232,156,839 | 232,246,171 | 89,333 | 0.03 | gain | C2orf57 | - |
| 3p26.1 | 8,096,025 | 8,119,974 | 23,950 | 0.03 | loss | - | - |
| 3q25.33 | 161,089,639 | 161,131,309 | 41,671 | 0.03 | gain | SCHIP1, IQCJ-SCHIP1 | SCHIP1 |
| 4q22.2 | 94,375,625 | 94,778,350 | 402,726 | 0.03 | loss | GRID2 | GRID2 |
| 4q23 | 100,542,893 | 100,696,485 | 153,593 | 0.03 | gain | RG9MTD2, C4orf17, ADH7 | RG9MTD2 |
| | 100,651,864 | 100,686,034 | 34,171 | 0.03 | gain | C4orf17 | C4orf17 |
| | 100,656,576 | 100,695,572 | 38,997 | 0.03 | gain | C4orf17, RG9MTD2 | C4orf17, RG9MTD2 |
| 4q31.23 | 149,474,120 | 149,512,767 | 38,648 | 0.03 | loss | NR3C2 | - |
| 4q32.3 | 169,960,297 | 169,996,579 | 36,283 | 0.03 | loss | PALLD | PALLD |
| | 169,961,956 | 169,979,040 | 17,085 | 0.03 | loss | PALLD | - |
| 6p12.2 | 51,564,000 | 51,583,519 | 19,520 | 0.06 | loss | - | - |
| 7p22.2 | 3,073,605 | 3,094,449 | 20,845 | 0.03 | gain | - | - |
| 8p23.2 | 2,672,753 | 2,738,233 | 65,481 | 0.03 | gain | - | - |
| | 2,672,753 | 2,974,287 | 301,535 | 0.03 | gain | CSMD1 | CSMD1 |
| | 2,675,472 | 2,766,578 | 91,107 | 0.03 | gain | - | - |
| | 2,780,146 | 2,947,279 | 167,134 | 0.03 | gain | CSMD1 | CSMD1 |
| 8q24.3 | 143,097,891 | 143,112,524 | 14,634 | 0.03 | loss | - | - |
| 9p23 | 10,734,135 | 10,754,097 | 19,963 | 0.03 | gain | - | - |
| 9p21.1 | 28,746,344 | 28,839,949 | 93,606 | 0.03 | loss | - | - |
| 9q21.33 | 86,893,365 | 86,953,896 | 60,532 | 0.03 | loss | - | - |
| 10q23.1 | 82,374,114 | 82,389,577 | 15,464 | 0.03 | loss | SH2D4B | - |
| 14q13.1 | 32,721,471 | 32,746,954 | 25,484 | 0.03 | loss | NPAS3 | - |
| 14q32.12 | 90,730,514 | 90,754,443 | 23,930 | 0.03 | gain | C14orf159 | C14orf159 |
| 18p11.23, 18p11.31 | 6,863,354 | 7,424,329 | 560,976 | 0.03 | gain | LRRC30, LAMA1, ARHGAP28, LOC400643 | ARHGAP28 |
| 18q21.33 | 58,951,168 | 58,970,363 | 19,196 | 0.03 | gain | BCL2 | - |
| 19p13.3 | 4,795,152 | 4,837,550 | 42,399 | 0.09 | gain | PLIN3 | PLIN3 |
| 20p13 | 2,436,294 | 2,547,940 | 111,647 | 0.03 | gain | ZNF343, TMC2 | ZNF343, TMC2 |
| | 2,436,294 | 2,555,429 | 119,136 | 0.03 | gain | ZNF343, TMC2 | ZNF343, TMC2 |
| | 2,436,294 | 2,556,157 | 119,864 | 0.03 | gain | ZNF343, TMC2 | ZNF343, TMC2 |
| 21q21.2 | 23,740,931 | 23,752,189 | 11,259 | 0.06 | loss | - | - |
| 21q22.3 | 43,121,093 | 43,286,636 | 165,544 | 0.03 | gain | NDUFV3, WDR4, PKNOX1 | PKNOX1 |
CNV coordinates are based on the NCBI human genome assembly (hg18).
*CNV frequencies were calculated from the 34 Negrito genomes genotyped.
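The Size and CNV frequency columns can be checked with simple arithmetic, sketched below. The sizes in the table are consistent with inclusive coordinates (end - start + 1). The carrier counts used here are not stated in the source; they are inferred from the rounded frequencies on the assumption that each frequency is the number of carriers divided by the 34 genomes.

```python
N_GENOMES = 34  # number of Negrito genomes genotyped, from the table footnote


def cnv_size(start, end):
    """Size in bp of a CNV given inclusive start and end coordinates."""
    return end - start + 1


def cnv_frequency(carriers, n_genomes=N_GENOMES):
    """Fraction of genomes carrying the CNV, rounded to two decimals."""
    return round(carriers / n_genomes, 2)


# (cytoband, start, end, reported size, reported frequency, assumed carriers)
rows = [
    ("1p36.22", 12_117_038, 12_200_251, 83_214, 0.03, 1),
    ("4q22.2", 94_375_625, 94_778_350, 402_726, 0.03, 1),
    ("19p13.3", 4_795_152, 4_837_550, 42_399, 0.09, 3),
]

for band, start, end, reported_size, reported_freq, carriers in rows:
    size_ok = cnv_size(start, end) == reported_size
    freq_ok = cnv_frequency(carriers) == reported_freq
    print(f"{band}: size matches = {size_ok}, frequency matches = {freq_ok}")
```

Under this reading, a frequency of 0.03 corresponds to a single carrier, 0.06 to two carriers and 0.09 to three carriers among the 34 genomes.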
Figure 4. Length distribution of the CNVs unique to the Negritos from Peninsular Malaysia.
Figure 5. Gene Ontology and pathway analyses of the gene set within the Negrito-specific CNVs using PANTHER and DAVID.
(a) PANTHER analysis suggests that the genes harboring the population-specific CNVs are mainly involved in immune system processes, response to stimulus and metabolic processes; (b) DAVID analysis suggests that these genes are involved in transcription and the regulation of RNA metabolic processes.
Pathways and biological processes of the genes underlying the population-specific CNVs in Negritos from Peninsular Malaysia.
| Pathways/Biological functions | GO Term | Genes |
| Immune systems and processes | GO:0002376 | TNFRSF8, CSMD1, SH2D4B, TNFRSF1B, LRRC30 |
| Response to stimulus | GO:0050896 | TNFRSF8, CSMD1, SH2D4B, TNFRSF1B |
| System process | GO:0003008 | SCHIP1, LAMA1, GRID2, PALLD |
| Metabolic processes | GO:0008152 | NDUFV3, WDR4, NPAS3, ADH7, PKNOX1 |
| Cellular processes | GO:0009987 | TNFRSF8, TNFRSF1B, LAMA1, GRID2 |
| Cell communication | GO:0007154 | TNFRSF8, TNFRSF1B, LAMA1, GRID2 |
| Transcription | GO:0006350 | NPAS3, NR3C2, ZNF343 |
| Regulation of transcription, DNA-dependent | GO:0006355 | PBX1, NPAS3, NR3C2, ZNF343 |
| Regulation of RNA metabolic process | GO:0051252 | PBX1, NPAS3, NR3C2, ZNF343 |
Analysis was performed using PANTHER and DAVID.