SungHwan Kim, Jose D Herazo-Maya, Dongwan D Kang, Brenda M Juan-Guardela, John Tedrow, Fernando J Martinez, Frank C Sciurba, George C Tseng, Naftali Kaminski.
Abstract
BACKGROUND: The increased multi-omics information on carefully phenotyped patients in studies of complex diseases requires novel methods for data integration. Unlike continuous intensity measurements from most omics data sets, phenome data contain clinical variables that are binary, ordinal and categorical.
Year: 2015 PMID: 26560100 PMCID: PMC4642618 DOI: 10.1186/s12864-015-2170-4
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Flow chart of the Integrative Phenotyping Framework (iPF). The framework includes the following steps: (1) data preprocessing, (2) feature concatenation, (3) dimension reduction, (4) feature smoothing, and (5) clustering for subtype discovery and visualization
Fig. 2 Overview of integrative clustering in the integrative phenotyping framework (iPF): (a) vertically combined multiple omics data sets; (b) a distance matrix between any two features within and across omics data sets; (c) multidimensional scaling (MDS) mapping to a two-dimensional Euclidean space; (d) smoothed feature intensities in the reduced 2D space for each patient; (e) unsupervised clustering to identify potential disease subtypes, with averaged feature intensities shown as representative plots for each cluster
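Steps (a) to (d) of the Fig. 2 caption can be illustrated with a minimal sketch. The caption does not specify the distance measure, the MDS variant, or the smoother, so everything here is an assumption made for illustration: distance is taken as 1 - |Pearson correlation|, the mapping uses classical (Torgerson) MDS, smoothing uses a Gaussian kernel with a hypothetical `bandwidth`, and the input is random toy data rather than study data. The clustering step (e) is left out.

```python
import numpy as np

def feature_distance(X):
    """Feature-by-feature distance for X (features x patients).
    Assumed measure: 1 - |Pearson correlation|."""
    return 1.0 - np.abs(np.corrcoef(X))

def classical_mds(D, n_components=2):
    """Classical (Torgerson) MDS: embed a distance matrix in n_components dims."""
    n = D.shape[0]
    J = np.eye(n) - np.ones((n, n)) / n          # centering matrix
    B = -0.5 * J @ (D ** 2) @ J                  # double-centered squared distances
    eigvals, eigvecs = np.linalg.eigh(B)
    order = np.argsort(eigvals)[::-1][:n_components]
    scale = np.sqrt(np.clip(eigvals[order], 0.0, None))  # drop negative eigenvalues
    return eigvecs[:, order] * scale

def smooth_features(X, coords, bandwidth=0.5):
    """Replace each feature's values by a Gaussian-weighted average over
    features that lie nearby in the reduced space (weights sum to 1 per row)."""
    sq = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * bandwidth ** 2))
    W /= W.sum(axis=1, keepdims=True)
    return W @ X

# Toy data (hypothetical): 3 "mRNA" and 2 "miRNA" features for 40 patients.
rng = np.random.default_rng(0)
base = rng.normal(size=40)
mrna = np.vstack([base + 0.1 * rng.normal(size=40),
                  rng.normal(size=40),
                  rng.normal(size=40)])
mirna = np.vstack([base + 0.1 * rng.normal(size=40),
                   rng.normal(size=40)])

X = np.vstack([mrna, mirna])            # (a) vertical combination of features
D = feature_distance(X)                 # (b) distances within and across omics
coords = classical_mds(D)               # (c) 2D coordinates, one row per feature
smoothed = smooth_features(X, coords)   # (d) smoothed intensities per patient
```

In this toy setup the first mRNA feature and the first miRNA feature share a signal, so they end up close in `D` and are smoothed towards each other, which is the cross-omics behaviour the caption describes.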
Fig. 3 Graphical illustration of the scheme for comparing heterogeneous data sets in iPF. Steps 1-2: compare and combine all possible pairs of omics data sets until the clustering results are homogeneous. Step 3: continue pairwise comparison until any heterogeneous data sets are identified
Fig. 4 iPF clustering results for multiple omics data sets using all samples (clinical, mRNA + miRNA). Three clusters are generated from each omics data source; the feature topology plots for clusters from the first and second omics data sources are shown on the left and on the top, respectively
Summary of significant features grouped in each cluster (clusters A, E, and I) a
| Variable | Total | Cluster A | Cluster E | Cluster I | p (ANOVA) | p (A & E) | p (A & I) | p (E & I) |
|---|---|---|---|---|---|---|---|---|
| Age, yrs | 63.5 | 65.7 | 55 | 66.1 | 2.91E-07 | 7.68E-02 | 1.24E-07 | 5.05E-02 |
| Gender, % female | 41.2 | 39.5 | 65.1 | 30 | 7.51E-04 | 2.44E-02 | 7.23E-01 | 7.42E-04 |
| Body Mass Index, BMI | 28.6 | 28 | 27.5 | 29.8 | 3.29E-02 | 1.00E+00 | 7.24E-02 | 1.07E-01 |
| FEV1 % predicted | 61.7 | 48 | 64.3 | 73.4 | 1.12E-13 | 4.37E-05 | 1.53E-13 | 3.61E-02 |
| FVC % predicted | 69.2 | 72.4 | 69.7 | 65.8 | 1.85E-02 | 1.00E+00 | 1.79E-02 | 2.93E-01 |
| FEV1/FVC ratio | 0.9 | 0.653 | 0.93 | 1.12 | 3.62E-29 | 1.59E-09 | 1.49E-25 | 2.75E-09 |
| DLCO | 53.9 | 59.2 | 57.3 | 47 | 2.55E-03 | 1.00E+00 | 8.14E-03 | 1.46E-02 |
| Total lung capacity, mean | 5.28 | 6.55 | 4.87 | 4.19 | 4.69E-19 | 1.16E-07 | 2.32E-18 | 4.98E-02 |
| CT % emphysema | 7.19 | 14.4 | 1.88 | 1.01 | 4.46E-13 | 1.73E-07 | 3.77E-11 | 3.70E-01 |
| Lung reticular volume, ml | 309 | 63.8 | 198 | 662 | 5.86E-17 | 2.22E-02 | 3.03E-16 | 2.32E-07 |
| Diagnosis, % IPF | 38.7 | 1.32 | 9.3 | 90 | 4.01E-33 | 1.16E-01 | 6.00E-28 | 6.75E-18 |
| Diagnosis, % Emphysema | 19.6 | 43.4 | 14 | 0 | 4.29E-11 | 3.20E-03 | 1.12E-10 | 1.99E-03 |
a The average values of 12 selected demographic and clinical variables in each sub-cluster group, with p-values from Kruskal-Wallis ANOVA (all three groups) and from the Kruskal-Wallis rank sum test (pairwise groups)
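For readers unfamiliar with the test named in the footnote, the following is a minimal sketch of the Kruskal-Wallis statistic in plain Python. It is not the authors' code. It only supports two or three groups, because those are the cases in the table and their chi-square tail probabilities have closed forms. The function names and the input values are hypothetical, and no multiple-testing adjustment is applied.

```python
import math

def average_ranks(values):
    """Rank values from 1..N, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0            # mean of 1-based positions i+1..j+1
        for pos in range(i, j + 1):
            ranks[order[pos]] = avg
        i = j + 1
    return ranks

def kruskal_wallis(*groups):
    """Return (H, p) for 2 or 3 groups, using the standard tie correction.
    Assumes the pooled values are not all identical."""
    k = len(groups)
    if k not in (2, 3):
        raise ValueError("this sketch only handles 2 or 3 groups")
    pooled = [v for g in groups for v in g]
    n = len(pooled)
    ranks = average_ranks(pooled)

    total, start = 0.0, 0
    for g in groups:
        rank_sum = sum(ranks[start:start + len(g)])
        total += rank_sum ** 2 / len(g)
        start += len(g)
    h = 12.0 / (n * (n + 1)) * total - 3.0 * (n + 1)

    # Tie correction: divide by 1 - sum(t^3 - t) / (N^3 - N).
    counts = {}
    for v in pooled:
        counts[v] = counts.get(v, 0) + 1
    ties = sum(c ** 3 - c for c in counts.values())
    h = max(h / (1.0 - ties / (n ** 3 - n)), 0.0)

    # Chi-square upper tail with k - 1 degrees of freedom (closed forms).
    if k == 3:
        p = math.exp(-h / 2.0)               # df = 2
    else:
        p = math.erfc(math.sqrt(h / 2.0))    # df = 1
    return h, p

# Hypothetical measurements for three groups (not the study data).
group_a = [1, 2, 3, 4, 5]
group_e = [6, 7, 8, 9, 10]
group_i = [11, 12, 13, 14, 15]

h_all, p_all = kruskal_wallis(group_a, group_e, group_i)   # all three groups
h_pair, p_pair = kruskal_wallis(group_a, group_e)          # one pairwise test
```

In practice a library routine such as `scipy.stats.kruskal` would normally be used instead of a hand-written version.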
Fig. 5 Heatmap for the four modules of gene expression and miRNA features that significantly differentiate the three clusters (COPD, Intermediate, and ILD). We performed gene co-expression cluster analysis using partitioning around medoids (PAM) to identify four gene and miRNA modules. For clustering, the intensities of the 88 miRNAs were reversed in sign (by multiplying the expression intensities by -1) to reflect that most miRNAs have inhibitory effects on mRNA expression
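The sign flip described in the Fig. 5 caption can be shown with a small sketch: once miRNA profiles are multiplied by -1, a miRNA that is anti-correlated with a gene becomes positively correlated with it and can fall into the same module. This is an illustration under assumptions, not the authors' procedure. The distance (1 - Pearson correlation), the feature names, and the toy profiles are hypothetical, and the PAM-style routine below uses a greedy BUILD phase followed by a first-improvement SWAP phase.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def total_cost(dist, medoids):
    """Sum over all points of the distance to the nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))

def pam(dist, k):
    """PAM-style k-medoids on a precomputed distance matrix."""
    n = len(dist)
    medoids = []
    # BUILD: greedily add the point that lowers the total cost the most.
    while len(medoids) < k:
        best = min((c for c in range(n) if c not in medoids),
                   key=lambda c: total_cost(dist, medoids + [c]))
        medoids.append(best)
    # SWAP: accept any medoid/non-medoid exchange that lowers the cost.
    improved = True
    while improved:
        improved = False
        current = total_cost(dist, medoids)
        for pos in range(k):
            for c in range(n):
                if c in medoids:
                    continue
                trial = medoids[:pos] + [c] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < current:
                    medoids, current, improved = trial, cost, True
    labels = [min(range(k), key=lambda j: dist[i][medoids[j]]) for i in range(n)]
    return medoids, labels

# Hypothetical profiles across six samples (not the study data).
genes = {
    "gene_a": [1, 2, 3, 4, 5, 6],
    "gene_b": [1.2, 1.9, 3.1, 4.2, 4.8, 6.1],
    "gene_c": [1, 5, 1, 5, 1, 5],
    "gene_d": [2, 6, 2, 5, 1, 6],
}
mirnas = {
    "mir_x": [6, 5, 4, 3, 2, 1],   # anti-correlated with gene_a and gene_b
    "mir_y": [5, 1, 5, 1, 5, 1],   # anti-correlated with gene_c and gene_d
}

# Flip the sign of every miRNA profile before clustering.
flipped = {name: [-v for v in profile] for name, profile in mirnas.items()}

names = ["gene_a", "gene_b", "mir_x", "gene_c", "gene_d", "mir_y"]
profiles = {**genes, **flipped}
dist = [[1.0 - pearson(profiles[a], profiles[b]) for b in names] for a in names]

medoids, labels = pam(dist, k=2)
modules = dict(zip(names, labels))
```

With these toy profiles, `modules` places `mir_x` with `gene_a` and `gene_b`, and `mir_y` with `gene_c` and `gene_d`. Without the flip, each miRNA would sit at distance 2 from the genes it tracks inversely.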