Yi Jiang, Hong Qin, Li Yang.
Abstract
Substantial health disparities exist between African Americans and Caucasians in the United States. Copy number variations (CNVs) are a form of human genetic variation that has been linked with complex diseases and that often occurs at different frequencies in African American and Caucasian populations. Here, we investigated whether CNVs with differential frequencies can contribute to health disparities from the perspective of gene networks. We inferred network clusters from human gene/protein networks based on two different data sources. We then evaluated each network cluster for the occurrence of known pathogenic genes and of genes located in CNVs with different population frequencies, and used false discovery rates to rank the clusters. This approach identified five clusters enriched both with known pathogenic genes and with genes located in CNVs whose frequencies differ between African Americans and Caucasians. These clustering patterns predict two candidate causal genes, located in four population-specific CNVs, that potentially play roles in health disparities.
Keywords: Clustering; Copy Number Variations (CNVs); Gene Ontology; Gene networks; Gene-disease association; Health disparities
Year: 2015 PMID: 25780754 PMCID: PMC4358638 DOI: 10.7717/peerj.677
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. Overview of our approach to identify CNVs associated with health disparities.
(A) Contingency table for Fisher’s exact test on pathogenic genes. (B) Contingency table for Fisher’s exact test on CNV genes.
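| (A) | Pathogenic genes | Non-pathogenic genes | Total |
|---|---|---|---|
| In cluster | q | m − q | m |
| Not in cluster | Q − q | N − m − Q + q | N − m |
| Total | Q | N − Q | N |

| (B) | CNV genes | Non-CNV genes | Total |
|---|---|---|---|
| In cluster | s | m − s | m |
| Not in cluster | S − s | N − m − S + s | N − m |
| Total | S | N − S | N |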
Notes.
For each cluster, contingency tables were constructed for right-tailed Fisher’s exact tests. (A) is used to test for enrichment of pathogenic genes, and (B) to test for enrichment of CNV genes (CNV_AA or CNV_CA genes). Q and q are the numbers of pathogenic genes in the whole network and in the current cluster, respectively. N and m are the numbers of genes in the whole network and in the current cluster, respectively. S and s are the numbers of CNV_AA or CNV_CA genes in the whole network and in the current cluster, respectively.
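As an illustration of the test described in these notes, the right-tailed p-value for table (A) can be computed as the upper tail of a hypergeometric distribution. The sketch below is a minimal Python version and not the authors' code: the function name `right_tailed_fisher` and the network totals `N` and `Q` in the usage example are hypothetical, while q = 8 and m = 11 are the pathogenic count and size listed for cluster AA1.

```python
from math import comb


def right_tailed_fisher(q: int, m: int, Q: int, N: int) -> float:
    """Return P(X >= q) when m genes are drawn from N, of which Q are pathogenic.

    q: pathogenic genes in the cluster    m: genes in the cluster
    Q: pathogenic genes in the network    N: genes in the network
    """
    if not (0 <= q <= min(m, Q) and m <= N and Q <= N):
        raise ValueError("inconsistent counts")
    upper = min(m, Q)  # largest possible number of pathogenic genes in the cluster
    tail = sum(comb(Q, k) * comb(N - Q, m - k) for k in range(q, upper + 1))
    return tail / comb(N, m)


# Usage: q and m come from cluster AA1; N and Q are hypothetical placeholders,
# because the network totals are not given in this record.
p_value = right_tailed_fisher(q=8, m=11, Q=500, N=10_000)
print(f"right-tailed p-value: {p_value:.3g}")
```

The same function applies to table (B) by passing s and S in place of q and Q.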
Figure 2. Graph representations of selected clusters for biological significance analysis.
Each rounded rectangle represents a gene and each gray line represents a gene–gene interaction. Black rounded rectangles represent non-pathogenic genes and orange rounded rectangles represent pathogenic genes. Genes labeled with red or blue ovals are located in African American CNVs or in Caucasian CNVs, respectively. Genes connected by green lines share the same GO terms. Within each cluster, different line types represent the enrichment of different GO terms. Line types are specific to each cluster: the same line type in different clusters refers to different GO terms.
Cluster analysis results for HPRDNet and MultiNet.
| Network | Cluster name | CNV_AA | CNV_CA | Pathogenic | Cluster size |
|---|---|---|---|---|---|
| HPRDNet | AA1 | | – | 8 | 11 |
| HPRDNet | AA2 | | – | 8 | 12 |
| HPRDNet | AA3 | | – | 8 | 13 |
| HPRDNet | CA1 | – | | 4 | 5 |
| MultiNet | AA4 | | – | 5 | 5 |
| MultiNet | CA1 | – | | 4 | 5 |
Notes.
Only selected clusters are listed. CNV_AA and CNV_CA are genes located in African American CNVs and in Caucasian CNVs, respectively.
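The abstract states that false discovery rates were used to rank network clusters, but this record does not name the procedure. Assuming a Benjamini-Hochberg adjustment (an assumption, not confirmed by the source), the ranking step could be sketched as follows. The function name `benjamini_hochberg` is illustrative and the p-values are hypothetical placeholders, not results from the study.

```python
from typing import Dict, List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so the adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical per-cluster p-values (placeholders only).
cluster_pvalues: Dict[str, float] = {"c1": 0.01, "c2": 0.04, "c3": 0.03, "c4": 0.005}
names = list(cluster_pvalues)
fdr = benjamini_hochberg([cluster_pvalues[name] for name in names])

# Rank clusters from the smallest adjusted value to the largest.
for name, value in sorted(zip(names, fdr), key=lambda pair: pair[1]):
    print(name, round(value, 4))
```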
Detected genes with potential roles in health disparity and their located CNVs.
| Gene | Chr | Gene coordinates | CNV region | CNV type | CNV occurrence preference |
|---|---|---|---|---|---|
| HSPB1 | 7 | 75,931,861–75,933,614 | 75,867,431–76,481,102 | Duplication | Only in African Americans |
| HSPB1 | 7 | 75,931,861–75,933,614 | 75,929,740–76,481,102 | Duplication | Only in African Americans |
| HSPB1 | 7 | 75,931,861–75,933,614 | 75,929,740–76,568,388 | Duplication | More in African Americans than in Caucasians |
| ATP2A1 | 16 | 28,889,726–28,915,830 | 28,306,730–28,936,772 | Duplication | Only in Caucasians |
Notes.
Chr, chromosome. CNV regions are regions of CNVs identified in more than one individual; all CNVs listed are duplications, that is, gains of one copy. CNV regions and types are from the CNV map (McElroy et al., 2009). CNV occurrence preference indicates the population in which the CNV occurs at a higher frequency.
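The statement that each gene is located in the listed CNVs can be checked with a simple interval comparison. The sketch below uses the coordinates from the table above and assumes that gene and CNV coordinates refer to the same genome assembly; the names `Interval`, `contains` and `fully_covered` are illustrative and not from the source.

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    start: int
    end: int


def contains(region: Interval, gene: Interval) -> bool:
    """True when the gene lies entirely inside the CNV region."""
    return region.start <= gene.start and gene.end <= region.end


# Gene coordinates from the table, keyed by chromosome.
GENES: Dict[str, Interval] = {
    "chr7": Interval(75_931_861, 75_933_614),
    "chr16": Interval(28_889_726, 28_915_830),
}

# CNV regions from the table, keyed by chromosome.
CNV_REGIONS: Dict[str, List[Interval]] = {
    "chr7": [
        Interval(75_867_431, 76_481_102),
        Interval(75_929_740, 76_481_102),
        Interval(75_929_740, 76_568_388),
    ],
    "chr16": [Interval(28_306_730, 28_936_772)],
}


def fully_covered(chrom: str) -> List[bool]:
    """Check the gene on one chromosome against each CNV region listed for it."""
    return [contains(region, GENES[chrom]) for region in CNV_REGIONS[chrom]]


for chrom in GENES:
    print(chrom, fully_covered(chrom))
```

With the tabulated coordinates, all four CNV regions fully contain the corresponding gene.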
Enriched GO terms with CNV-genes in the identified network clusters.
| Clusters | Involved genes | GO domain | GO ID | GO term |
|---|---|---|---|---|
| AA1 | | Molecular function | GO:0042802 | Identical protein binding |
| AA4 | | Biological process | GO:0043086 | Negative regulation of catalytic activity |
| AA4 | | Biological process | GO:0043066 | Negative regulation of apoptotic process |
| AA4 | | Biological process | GO:0043069 | Negative regulation of programmed cell death |
| AA4 | | Molecular function | GO:0042802 | Identical protein binding |
| AA4 | | Cellular component | GO:0030018 | Z disc |
| CA1 | | Biological process | GO:0090257 | Regulation of muscle system process |
| CA1 | | Biological process | GO:0006816 | Calcium ion transport |
| CA1 | | Cellular component | GO:0033017 | Sarcoplasmic reticulum membrane |
| CA1 | | Biological process | GO:0003012 | Muscle system process |
| CA1 | | Biological process | GO:0006874 | Cellular calcium ion homeostasis |
| CA1 | | Cellular component | GO:1902495 | Transmembrane transporter complex |
| CA1 | | Cellular component | GO:0016529 | Sarcoplasmic reticulum |
| CA1 | | Biological process | GO:0032470 | Positive regulation of endoplasmic reticulum calcium ion concentration |
| CA1 | | Cellular component | GO:0031095 | Platelet dense tubular network membrane |
Notes.
Biological relevance of the network clusters was analyzed with GOrilla (Eden et al., 2009), which searches for enriched Gene Ontology (GO) terms. Genes in the selected clusters were used as the target set, and all genes in the networks were used as the background set. Three GO domains were analyzed: biological process, molecular function and cellular component. The default p-value threshold (1 × 10⁻³) was used. Only enriched GO terms associated with the CNV_AA gene HSPB1 or the CNV_CA gene ATP2A1 are listed in the table.
When multiple enriched GO terms have similar meanings, only the most general term is presented.
Associated diseases of genes with enriched GO terms.
| Cluster | Gene | Associated disease |
|---|---|---|
| | | Axonal Charcot-Marie-Tooth disease type 2F |
| | | Distal hereditary motor neuronopathy type 2B |
| | | Multiple types of cataract 9 |
| | | Multiple types of cataract 16 |
| | | Dilated cardiomyopathy-1II |
| | | Myofibrillar myopathy-2 |
| | | Multiple types of cataract 3 |
| | | Brody myopathy |
| | | Acrokeratosis verruciformis |
| | | Darier disease |
| | | Dilated cardiomyopathy-1P |
| | | Familial hypertrophic cardiomyopathy-18 |
Notes.
Only GO terms whose annotated genes include CNV genes were examined, because our focus is the role of CNV genes in health disparities.