Mingon Kang1, Dong-Chul Kim2, Chunyu Liu3, Jean Gao1.
Abstract
Human diseases are abnormal medical conditions that involve complex interactions among multiple biological components. Nevertheless, most research has relied on a single type of genetic data, such as Single Nucleotide Polymorphism (SNP) or Copy Number Variation (CNV). To fully understand complex human diseases, epigenetic modifications and transcriptional regulation must be considered alongside genomic variants. We call a collection of such heterogeneous data "multiblock data." In this paper, we propose a novel Multiblock Discriminant Analysis (MultiDA) method that provides a new integrative genomic model for multiblock analysis and an efficient algorithm for discriminant analysis. The integrative genomic model is built on representative genomic data: SNP, CNV, DNA methylation, and gene expression. The algorithm identifies discriminative factors in the multiblock data; such discriminant analysis is essential for discovering biomarkers in computational biology. The performance of MultiDA was assessed in intensive simulation experiments, in which it outperformed related methods. As a target application, we applied MultiDA to human brain data on psychiatric disorders. The findings and the gene regulatory network derived from the experiment are discussed.
Year: 2015 PMID: 26075260 PMCID: PMC4450020 DOI: 10.1155/2015/783592
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Figure 1. The conceptual graphic representation of the integrative genomic model. A rectangle represents a manipulated variable, and a circle represents a latent variable. The graphic representation illustrates the structure model that shows the relationship between SNP, CNV, DNA methylation, gene expression, and disease phenotype.
Algorithm 1. Discriminant multiblock analysis.
Generation functions.
| Function | Model |
|---|---|
| Type1(…) | … |
| Type2(…) | … |
| Type3(…) | … |
| Type4(…) | … |
Scheme of the simulation data.
| Simulation data | Generation model type | Column index |
|---|---|---|
| … | … | 1 ≤ … |
| | … | 6 ≤ … |
| | … | 11 ≤ … |
| | … | 41 ≤ … |
| … | … | 1 ≤ … |
| | … | 6 ≤ … |
| | … | 11 ≤ … |
| | … | 61 ≤ … |
| … | … | 1 ≤ … |
| | … | 6 ≤ … |
| | … | 11 ≤ … |
| | … | 211 ≤ … |
Figure 2. Performance comparison in simulation study: (a) True Positive Rate; (b) Positive Predictive Value; (c) Accuracy.
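The three measures named in the Figure 2 caption can be illustrated with a minimal sketch. It assumes the standard confusion-matrix definitions and that performance is scored per variable (selected versus truly discriminative); this excerpt does not state the paper's exact definitions, and the function name `selection_metrics` and the toy indices are hypothetical.

```python
def selection_metrics(selected, truth, n_features):
    """Score a set of selected variables against the known true set.

    Assumes the standard definitions (not confirmed by this excerpt):
      TPR = TP / (TP + FN), PPV = TP / (TP + FP), ACC = (TP + TN) / N.
    """
    selected, truth = set(selected), set(truth)
    tp = len(selected & truth)        # truly discriminative and selected
    fp = len(selected - truth)        # selected but not discriminative
    fn = len(truth - selected)        # discriminative but missed
    tn = n_features - tp - fp - fn    # correctly left unselected
    tpr = tp / (tp + fn) if (tp + fn) else 0.0
    ppv = tp / (tp + fp) if (tp + fp) else 0.0
    acc = (tp + tn) / n_features
    return {"TPR": tpr, "PPV": ppv, "ACC": acc}


# Toy example: 10 variables, 5 truly discriminative, 4 selected.
metrics = selection_metrics(selected=[0, 1, 2, 7],
                            truth=[0, 1, 2, 3, 4],
                            n_features=10)
print(metrics)
```

In this toy case three of the four selected variables are correct and two true variables are missed, so the measures separate sensitivity (TPR) from the reliability of the selections (PPV).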
Genes identified by MultiDA in the psychiatric disorder data (GE: gene expression; DM: DNA methylation; MAF: minor allele frequency).
| Gene | Chromosome | Location | Source | ID | MAF | Reference |
|---|---|---|---|---|---|---|
| HTR7 | 10 | 10q21-q24 | GE | 7934970 | | [ |
| APOE | 19 | 19q13.2 | DM | cg14123992 | | [ |
| TRPM1 | 15 | 15q13.3 | DM | cg18085517 | | |
| EPHB1 | 3 | 3q21-q23 | CNV | CNP12652 | | |
| NPY | 7 | 7p15.1 | CNV | CNP2267 | | [ |
| QKI | 6 | 6q26 | SNP | rs1336225 | 0.18 | |
| SLC15A1 | 13 | 13q32.3 | SNP | rs9517421 | 0.17 | [ |
| NPAS3 | 14 | 14q13.1 | SNP | rs1124910 | 0.25 | [ |
| C15orf53 | 15 | 15q14 | SNP | rs1433876 | 0.29 | [ |
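The MAF column reports a minor allele frequency for each SNP. As a rough illustration of how such a value is typically obtained, the sketch below assumes genotypes coded as 0, 1, or 2 copies of one allele, with `None` for missing calls. The coding, the helper name `minor_allele_frequency`, and the toy genotypes are assumptions for illustration, not details taken from the paper.

```python
def minor_allele_frequency(genotypes):
    """Return the minor allele frequency for one biallelic SNP.

    Assumes each genotype is the count (0, 1, or 2) of one allele per
    diploid sample; None marks a missing call and is skipped.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    # Each diploid sample contributes two alleles.
    freq = sum(called) / (2 * len(called))
    # The minor allele is whichever of the two is rarer.
    return min(freq, 1.0 - freq)


# Toy example: 8 samples, one of them missing.
maf = minor_allele_frequency([0, 1, 0, 2, None, 1, 0, 0])
print(round(maf, 4))
```

Taking the minimum of the frequency and its complement makes the result independent of which allele was chosen as the counted one.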
Figure 3. The gene regulatory network retrieved from the STRING database for the genes identified by MultiDA. The legend shows the data source of each gene.