Asoke K Talukder, Mahima Agarwal, Kenneth H Buetow, Patrice P Denèfle.
Abstract
Existing methods struggle to quantify and track the constant evolution of cancers because of the high heterogeneity of mutations. However, structural variations associated with nucleotide number changes show repeatable patterns in localized regions of the genome. Here we introduce SPKMG, which generalizes nucleotide-number-based properties of genes, in statistical terms, at the genome-wide scale. It is measured from the normalized amount of aligned NGS reads in the exonic regions of a gene. SPKMG values are calculated within OncoTrack. Because SPKMG values are continuous numeric variables, they provide a statistical metric for tracking DNA-level changes. We show that SPKMG measures of cancer DNA show a normative pattern at the genome-wide scale. The analysis leads to the discovery of core cancer genes and also provides novel dynamic insights into the stage of cancer, including cancer development, progression, and metastasis. This technique will allow exome data to also be used for quantitative LOH/CNV analysis for tracking tumour progression and evolution with a higher efficiency.
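The abstract describes the metric only as a normalized amount of aligned reads in the exonic regions of a gene and does not give the formula. As a rough illustration of that kind of normalization, the sketch below computes reads per kilobase of exon per million aligned reads. This is an assumed stand-in, not the SPKMG definition; the function name, the gene names, and the input numbers are all hypothetical.

```python
def normalized_gene_coverage(read_counts, exon_lengths_bp, total_aligned_reads):
    """Reads per kilobase of exon per million aligned reads, per gene.

    Assumption: an RPKM-style normalization used only for illustration;
    the exact SPKMG formula is not stated in the abstract.
    """
    if total_aligned_reads <= 0:
        raise ValueError("total_aligned_reads must be positive")
    per_million = total_aligned_reads / 1_000_000
    result = {}
    for gene, count in read_counts.items():
        length_kb = exon_lengths_bp[gene] / 1_000  # total exonic length in kb
        if length_kb <= 0:
            raise ValueError(f"exon length for {gene} must be positive")
        result[gene] = count / (length_kb * per_million)
    return result


# Hypothetical inputs: two genes in a sample with 10 million aligned reads.
values = normalized_gene_coverage(
    read_counts={"GENE_A": 500, "GENE_B": 100},
    exon_lengths_bp={"GENE_A": 2_000, "GENE_B": 1_000},
    total_aligned_reads=10_000_000,
)
```

Dividing by both exonic length and sequencing depth makes the values comparable across genes and across samples, which is what lets a continuous per-gene number be used in case/control statistics.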
Year: 2016 PMID: 27412732 PMCID: PMC4944131 DOI: 10.1038/srep29647
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Architecture of the OncoTrack pipeline.
Figure 2. Clustering of breast cancer and healthy individuals.
(a) MDS plot of SPKMG values of 11 breast cancer and 20 healthy samples. Distances were calculated using log fold changes. Marker 1 and Marker 2 in this figure are the top two leading log fold change dimensions in the data. (b) SPKMG heatmap and hierarchical clustering of the breast cancer and healthy samples.
Genes with highest statistical significance in Breast Cancer case/control comparative analysis.
| Gene | Description | P-value | Number of Unique Samples (COSMIC) | COSMIC Studies | COSMIC References |
|---|---|---|---|---|---|
| TCEAL5 | transcription elongation factor A (SII)-like 5 | 3.26E-67 | 28,859 | 13 | 14 |
| PRB4 | proline-rich protein BstNI subfamily 4 | 1.44E-34 | 28,981 | 24 | 32 |
| PRB2 | proline-rich protein BstNI subfamily 2 | 6.63E-33 | 28,772 | 24 | 26 |
| SPRR2D | small proline-rich protein 2D | 1.83E-27 | 28,792 | 8 | 8 |
| FLG | Filaggrin | 6.01E-27 | 28,983 | 43 | 92 |
| FAM47C | family with sequence similarity 47, member C | 1.23E-26 | 28,796 | 35 | 52 |
| MKI67 | marker of proliferation Ki-67 | 1.26E-26 | 28,903 | 38 | 77 |
| OTOP1 | Otopetrin 1 | 4.45E-25 | 28,815 | 38 | 42 |
| AHNAK | AHNAK nucleoprotein | 5.35E-25 | 29,029 | 41 | 83 |
| OR10A2 | olfactory receptor, family 10, subfamily A, member 2 | 2.01E-24 | 28,838 | 19 | 23 |
| MYH1 | myosin, heavy chain 1, skeletal muscle, adult | 2.83E-24 | 29,073 | 41 | 81 |
| CSH2 | chorionic somatomammotropin hormone 2 | 6.53E-24 | 28,814 | 15 | 9 |
| SPRR2B | small proline-rich protein 2B | 7.71E-24 | 28,792 | 7 | 9 |
| GH2 | growth hormone 2 | 3.07E-23 | 28,815 | 21 | 19 |
| CSH1 | chorionic somatomammotropin hormone 1 (placental lactogen) | 4.41E-23 | 24,523 | 20 | 19 |
| TCEAL6 | transcription elongation factor A (SII)-like 6 | 4.88E-23 | 28,814 | 20 | 18 |
| KIR3DL1 | killer cell immunoglobulin-like receptor, three domains, long cytoplasmic tail, 1 | 8.85E-23 | 28,831 | 6 | 11 |
| MAGEC1 | melanoma antigen family C1 | 4.17E-22 | 28,937 | 40 | 72 |
| KRT33A | keratin 33A, type I | 5.09E-22 | 28,816 | 28 | 25 |
| OR10G8 | olfactory receptor, family 10, subfamily G, member 8 | 2.18E-21 | 28,816 | 26 | 33 |
| RFPL1 | ret finger protein-like 1 | 2.69E-21 | 28,815 | 24 | 18 |
| MAGEA3 | melanoma antigen family A3 | 2.72E-21 | 28,814 | 19 | 16 |
| KRT6B | keratin 6B, type II | 3.18E-21 | 28,815 | 31 | 35 |
| PDE4DIP | phosphodiesterase 4D interacting protein | 4.75E-21 | 25,256 | 39 | 67 |
| SPRR2F | small proline-rich protein 2F | 6.69E-21 | 28,792 | 12 | 10 |
| KRTAP21-1 | keratin associated protein 21-1 | 7.27E-21 | 28,855 | 9 | 12 |
| IFNA21 | interferon, alpha 21 | 7.94E-21 | 24,540 | 13 | 15 |
| OR4A47 | olfactory receptor, family 4, subfamily A, member 47 | 7.99E-21 | 28,815 | 23 | 32 |
All 28 most significant genes have been discovered in multiple cancer studies.
Figure 3. Enrichment networks from XomPathways for breast cancer comparative analysis.
(a) Enriched pathways – gene network. (b) Gene-gene network, with most central gene highlighted in red. (c) Pathway-pathway interaction network, with most central pathway in red.
Figure 4. Breast cancer mutual information network.
(a) The entire MI network for breast cancer samples, consisting of 1405 genes and 1172 edges. (b) 9 clusters with 10 or more genes, identified and isolated from the whole network. (c) The degree distribution follows a power law with an alpha of 2.44.
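The caption does not say how mutual information (MI) between genes was estimated or how edges were selected. As a minimal sketch of the underlying quantity, the function below computes the plug-in MI (in bits) between two genes whose per-sample values have already been discretized into labels. The discretization step, the label names, and the function name are assumptions made for illustration only.

```python
import math
from collections import Counter


def mutual_information(x_labels, y_labels):
    """Plug-in mutual information (bits) between two discrete label sequences.

    Assumption: each gene's per-sample values were binned into labels
    beforehand; the source does not describe its estimator.
    """
    if len(x_labels) != len(y_labels) or not x_labels:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x_labels)
    count_x = Counter(x_labels)
    count_y = Counter(y_labels)
    count_xy = Counter(zip(x_labels, y_labels))
    mi = 0.0
    for (x, y), c in count_xy.items():
        # p(x,y) * log2( p(x,y) / (p(x) * p(y)) ), written with raw counts
        mi += (c / n) * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical discretized values for three genes across four samples.
gene_a = ["lo", "lo", "hi", "hi"]
gene_b = ["lo", "lo", "hi", "hi"]   # identical pattern to gene_a
gene_c = ["lo", "hi", "lo", "hi"]   # independent of gene_a

mi_dependent = mutual_information(gene_a, gene_b)
mi_independent = mutual_information(gene_a, gene_c)
```

In a network built this way, an edge would typically be drawn between two genes when their MI exceeds some threshold; the threshold used in the paper is not given here.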
Figure 5. Overlap of significant genes from the comparative analyses for HxT, HxN and HxTN.
Figure 6. ESCC mutual information network for 19 tumor samples.
(a) The entire MI network for ESCC tumor samples, with 1520 genes and 1308 edges. (b) 17 clusters with 10 or more members, identified and isolated from the complete network. (c) The degree distribution of the tumor MI network follows a power law with an alpha of 2.54.
Figure 7. ESCC mutual information network for 19 germline samples.
(a) The entire MI network for ESCC germline samples, consisting of 2680 genes and 3597 edges. (b) 6 clusters with 10 or more genes, identified and isolated from the MI network. (c) The degree distribution for the complete MI network follows a power law with an alpha of 2.52.
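Figures 4, 6, and 7 each report a power-law exponent (alpha) for the degree distribution, but the captions do not state how it was fitted. One common choice, shown here as an assumed example rather than the authors' procedure, is the approximate maximum-likelihood estimator for discrete data, alpha ≈ 1 + n / Σ ln(k_i / (k_min − 0.5)). The function name and the degree list are hypothetical.

```python
import math


def estimate_power_law_alpha(degrees, k_min=1):
    """Approximate MLE of the power-law exponent for integer node degrees.

    Assumption: this estimator is a common default for discrete data;
    the fitting method behind the reported alpha values is not stated.
    """
    if k_min < 1:
        raise ValueError("k_min must be at least 1")
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    # Every term is positive because k >= k_min > k_min - 0.5.
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum


# Hypothetical degree sequence for a small network.
alpha = estimate_power_law_alpha([1, 1, 1, 1, 2, 2, 3, 5], k_min=1)
```

An exponent between 2 and 3, as reported for all three networks, is the range usually described as scale-free: most genes have few connections and a small number act as hubs.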