Chen Chi, Rasif Ajwad, Qin Kuang, Pingzhao Hu.
Abstract
Many cancers have been linked to copy number variations (CNVs) in the genomic DNA. Although there are existing methods to analyze CNVs from individual samples, cancer-causing genes are more frequently discovered in regions where CNVs are common among tumor samples, also known as recurrent CNVs. Integrating multiple samples and locating recurrent CNV regions remain a challenge, both computationally and conceptually. We propose a new graph-based algorithm for identifying recurrent CNVs using the maximal clique detection technique. The algorithm has an optimal solution, which means all maximal cliques can be identified, and guarantees that the identified CNV regions are the most frequent and that the minimal regions have been delineated among tumor samples. The algorithm has successfully been applied to analyze a large cohort of breast cancer samples and identified some breast cancer-associated genes and pathways.
Keywords: cancer; interval graph; maximal clique; recurrent copy number variation
Year: 2016 PMID: 27773988 PMCID: PMC5063805 DOI: 10.4137/CIN.S39368
Source DB: PubMed Journal: Cancer Inform ISSN: 1176-9351
Figure 1. Representing CNVs as an interval graph. (A) A, B, C, D, E, F are individual patient-level CNVs on a specific chromosome. Each CNV has a chromosomal start position and end position. (B) The interval graph whose vertices A, B, C, D, E, F are the individual patient-level CNVs in (A). An edge between two vertices indicates that the two patient-level CNVs share a common region on the chromosome.
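The construction described in Figure 1 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the coordinates in `cnvs` are hypothetical values chosen for the example, the helper names `overlaps` and `interval_graph_edges` are assumptions, and intervals are treated as closed (two CNVs that touch at a single position count as overlapping).

```python
from itertools import combinations

# Hypothetical patient-level CNVs on one chromosome: name -> (start, end).
# These coordinates are illustrative only and do not come from the paper.
cnvs = {
    "A": (100, 300),
    "B": (250, 500),
    "C": (280, 400),
    "D": (450, 700),
    "E": (650, 900),
    "F": (1000, 1200),
}


def overlaps(x, y):
    """Return True when two closed intervals share at least one position."""
    return x[0] <= y[1] and y[0] <= x[1]


def interval_graph_edges(intervals):
    """Return the set of vertex pairs (u, v), u < v, whose intervals overlap."""
    return {
        (u, v)
        for u, v in combinations(sorted(intervals), 2)
        if overlaps(intervals[u], intervals[v])
    }


edges = interval_graph_edges(cnvs)
print(sorted(edges))
```

The pairwise comparison is quadratic in the number of CNVs, which is fine for illustration; a sort-and-sweep over endpoints would scale better on large cohorts.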
Figure 2. Flowchart for implementing the proposed algorithm.
Figure 3. Comparison of the number of patients in each recurrent CNV in the Discovery set and the Validation set. The number of patients in validated recurrent gain CNV regions with genes (A) and without genes (B) and in validated recurrent loss CNV regions with genes (C) and without genes (D).
Notes: The red line represents the Discovery set and the black circle represents the Validation set.
Figure 4. Pathway enrichment map generated by Cytoscape. (A) Pathway enrichment map for the 67 validated recurrent CNV gain regions. (B) Pathway enrichment map for the 77 validated recurrent CNV loss regions. Each solid circle represents one pathway. Edge thickness represents the overlap between two pathways. Color represents P-value: the redder the color, the lower the P-value. Node size represents the size of the pathway. Pathways with similar biological meanings were clustered, and a name was assigned to each cluster in the map using the text-mining application “WordCloud” within Cytoscape. The generated name acts as a general representative of the cluster.
Gene ontology (GO) analysis via ConsensusPathDB.
| TERM_GOid | TERM_NAME | P-VALUE | Q-VALUE | TERM_CATEGORY | TERM_LEVEL |
|---|---|---|---|---|---|
| GO:0050906 | Detection of stimulus involved in sensory perception | 1.00E-05 | 0.000509 | B | 3 |
| GO:0070458 | Cellular detoxification of nitrogen compound | 1.27E-05 | 0.000509 | B | 3 |
| GO:1901685 | Glutathione derivative metabolic process | 2.38E-05 | 0.000509 | B | 3 |
| GO:0018916 | Nitrobenzene metabolic process | 2.54E-05 | 0.000509 | B | 3 |
| GO:0051410 | Detoxification of nitrogen compound | 4.23E-05 | 0.000677 | B | 3 |
| GO:0009593 | Detection of chemical stimulus | 7.65E-05 | 0.00102 | B | 3 |
| GO:0043295 | Glutathione binding | 0.00023104 | 0.002633 | M | 3 |
| GO:0016765 | Transferase activity, transferring alkyl or aryl (other than methyl) groups | 0.00029258 | 0.002633 | M | 3 |
| GO:0006790 | Sulfur compound metabolic process | 0.00091456 | 0.010452 | B | 3 |
| GO:0042277 | Peptide binding | 0.00115433 | 0.006849 | M | 3 |
| GO:0006575 | Cellular modified amino acid metabolic process | 0.00135798 | 0.01358 | B | 3 |
| GO:0098553 | Lumenal side of endoplasmic reticulum membrane | 0.00155266 | 0.029828 | C | 3 |
| GO:0038023 | Signaling receptor activity | 0.001658 | 0.006849 | M | 3 |
| GO:0042605 | Peptide antigen binding | 0.00190249 | 0.006849 | M | 3 |
| GO:0016021 | Integral component of membrane | 0.00248563 | 0.029828 | C | 3 |
| GO:0009636 | Response to toxic substance | 0.00490397 | 0.043591 | B | 3 |
| GO:0006805 | Xenobiotic metabolic process | 0.00568268 | 0.045461 | B | 3 |
| GO:0009410 | Response to xenobiotic stimulus | 0.00673018 | 0.048947 | B | 3 |
| GO:0009593 | Detection of chemical stimulus | 4.11E-05 | 0.0025821 | B | 3 |
| GO:0050906 | Detection of stimulus involved in sensory perception | 4.92E-05 | 0.0025821 | B | 3 |
| GO:0016021 | Integral component of membrane | 6.04E-05 | 0.0017514 | C | 3 |
| GO:0038023 | Signaling receptor activity | 0.0005567 | 0.0133617 | M | 3 |
| GO:0007586 | Digestion | 0.0005624 | 0.0196841 | B | 3 |
| GO:0098553 | Lumenal side of endoplasmic reticulum membrane | 0.002344 | 0.0339873 | C | 3 |
| GO:0009812 | Flavonoid metabolic process | 0.0026886 | 0.0705755 | B | 3 |
| GO:0042605 | Peptide antigen binding | 0.0028693 | 0.0332701 | M | 3 |
Abbreviations: B, biological process; M, molecular function; C, cellular component; term level, the level of the GO term in the GO hierarchy.
| Sort all the vertices in terms of their chromosomal end positions |
| Initialize |
| For each vertex |
| For each vertex |
| return |
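As a minimal sketch of how maximal cliques can be enumerated in an interval graph, the following Python uses a standard endpoint sweep. It is not necessarily the authors' exact procedure, which sorts vertices by chromosomal end position. The function name `maximal_cliques`, the closed-interval convention, and the coordinates in `cnvs` are assumptions made for illustration. Each reported clique is a set of patient-level CNVs that all overlap one another, together with the region they share.

```python
def maximal_cliques(intervals):
    """Enumerate maximal cliques of an interval graph with an endpoint sweep.

    intervals: dict mapping name -> (start, end), treated as closed intervals.
    Returns a list of (members, (region_start, region_end)) in sweep order,
    where the region is the stretch shared by every member of the clique.
    """
    events = []
    for name, (start, end) in intervals.items():
        events.append((start, 0, name))  # kind 0: start, sorts before an end at a tie
        events.append((end, 1, name))    # kind 1: end
    events.sort()

    active = set()   # intervals currently covering the sweep position
    grew = False     # True if an interval was added since the last report
    cliques = []
    for pos, kind, name in events:
        if kind == 0:
            active.add(name)
            grew = True
        else:
            # The active set is maximal exactly when it has grown since the
            # last report and is about to shrink.
            if grew:
                region_start = max(intervals[n][0] for n in active)
                cliques.append((frozenset(active), (region_start, pos)))
                grew = False
            active.remove(name)
    return cliques


# Hypothetical coordinates, for illustration only.
cnvs = {
    "A": (100, 300), "B": (250, 500), "C": (280, 400),
    "D": (450, 700), "E": (650, 900), "F": (1000, 1200),
}
result = maximal_cliques(cnvs)
for members, region in result:
    print(sorted(members), region)
```

Because an interval graph on n vertices has at most n maximal cliques, the sweep costs O(n log n) for sorting plus the work of copying each reported clique. Keeping only cliques above a chosen size would give candidate recurrent regions; any such threshold is a separate modelling decision not shown here.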