Sebastian J Teran Hidalgo, Shuangge Ma.
Abstract
BACKGROUND: Omics profiling is now a routine component of biomedical studies. In the analysis of omics data, clustering is an essential step that serves multiple purposes, for example revealing the unknown functionalities of omics units and assisting dimension reduction in outcome model building. A prominent trend in the most recent omics studies is multilayer profiling, which collects multiple types of genetic, genomic, epigenetic, and other measurements on the same subjects. Clustering methods tailored to multilayer omics data are still limited in the literature. Both directly applying existing clustering methods to multilayer omics data and clustering each layer first and then combining across layers are "suboptimal" in that they do not accommodate the interconnections within and across layers in an informative way.
Keywords: Clustering; Multilayer omics data; NCut
Year: 2018 PMID: 29703159 PMCID: PMC5991460 DOI: 10.1186/s12864-018-4580-6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1 Multilayer omics data and clustering. Three data types are considered: proteins in the upper layer, gene expressions in the middle layer, and CNVs in the lower layer. One dot represents one variable. Two dots are connected by a line if the corresponding variables are interconnected. Left panel: the true data structure with four clusters. Middle panel: MuNCut clustering. Right panel: K-means clustering. For K-means and MuNCut, different clusters are represented using different colors. a Truth. b MuNCut. c K-means
Data analysis: concordance between the results of different clustering methods. Each cell reports M(B|A), where B and A are the clustering methods in the column and row, respectively
| BRCA | MuNCut | KM* | SC* | HC* |
|---|---|---|---|---|
| MuNCut | 100% | 59.4% | 72.7% | 80.1% |
| KM* | 44.7% | 100% | 74% | 82.3% |
| SC* | 34.5% | 46.7% | 100% | 90.1% |
| HC* | 36.3% | 49.6% | 85.9% | 100% |

| CESC | MuNCut | KM* | SC* | HC* |
|---|---|---|---|---|
| MuNCut | 100% | 48.3% | 44.6% | 52.5% |
| KM* | 37.8% | 100% | 51.3% | 64.7% |
| SC* | 38.7% | 56.9% | 100% | 61.5% |
| HC* | 35.6% | 56.2% | 48.1% | 100% |
Fig. 2 Analysis of BRCA data: stability of heatmaps. a MuNCut; b KM*; c SC*; d HC*. The (i,j)th entry is the probability that the ith and jth elements belong to the same cluster. Higher/lower probabilities are presented using warmer/colder colors
Fig. 3 Analysis of CESC data: stability of heatmaps. a MuNCut; b KM*; c SC*; d HC*. The (i,j)th entry is the probability that the ith and jth elements belong to the same cluster. Higher/lower probabilities are presented using warmer/colder colors
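The (i,j)th entries described in the stability heatmap captions can be illustrated with a minimal sketch. It assumes that the clustering has been repeated several times (for example on resampled data) and that every element receives a label in every run; the resampling scheme, the function name `coclustering_probability`, and the toy labels are illustrative assumptions, not the authors' procedure.

```python
import numpy as np

def coclustering_probability(label_runs):
    """Estimate the probability that elements i and j share a cluster.

    label_runs: array-like of shape (n_runs, n_items); entry [r, i] is the
    cluster label of element i in run r. Labels only need to be consistent
    within a run, not across runs.
    """
    labels = np.asarray(label_runs)
    n_runs, n_items = labels.shape
    prob = np.zeros((n_items, n_items))
    for run in labels:
        # Indicator matrix: True where elements i and j share a label in this run.
        prob += (run[:, None] == run[None, :])
    # Average the indicators over runs to get empirical probabilities.
    return prob / n_runs

# Hypothetical toy input: 3 clustering runs over 4 elements.
runs = [[0, 0, 1, 1],
        [1, 1, 0, 0],
        [0, 1, 1, 1]]
P = coclustering_probability(runs)
print(np.round(P, 2))
```

The resulting matrix is symmetric with ones on the diagonal, so it can be passed directly to a heatmap routine; entries near 1 or 0 indicate pairs whose co-membership is stable across runs.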
Simulation results for Scenario I
| n | q | h | ρ | MuNCut | KM | SC | HC | KM* | SC* | HC* | LC | FGC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 200 | 400 | 0.15 | 0.20 | 0.023 | 0.411 | 0.47 | 0.565 | 0.13 | 0.126 | 0.159 | 0.155 | 0.157 |
| 200 | 400 | 0.15 | 0.40 | 0.016 | 0.364 | 0.468 | 0.585 | 0.134 | 0.115 | 0.17 | 0.160 | 0.159 |
| 200 | 400 | 0.25 | 0.20 | 0.054 | 0.368 | 0.474 | 0.587 | 0.131 | 0.123 | 0.157 | 0.155 | 0.152 |
| 200 | 400 | 0.25 | 0.40 | 0.068 | 0.363 | 0.477 | 0.586 | 0.133 | 0.117 | 0.193 | 0.151 | 0.149 |
| 400 | 400 | 0.15 | 0.20 | 0.022 | 0.373 | 0.460 | 0.588 | 0.129 | 0.124 | 0.165 | 0.160 | 0.159 |
| 400 | 400 | 0.15 | 0.40 | 0.014 | 0.364 | 0.468 | 0.585 | 0.129 | 0.123 | 0.174 | 0.160 | 0.160 |
| 400 | 400 | 0.25 | 0.20 | 0.048 | 0.367 | 0.462 | 0.586 | 0.127 | 0.115 | 0.175 | 0.153 | 0.151 |
| 400 | 400 | 0.25 | 0.40 | 0.063 | 0.361 | 0.464 | 0.584 | 0.12 | 0.11 | 0.176 | 0.148 | 0.147 |
| 200 | 800 | 0.15 | 0.20 | 0.095 | 0.322 | 0.44 | 0.576 | 0.122 | 0.124 | 0.152 | 0.150 | 0.149 |
| 200 | 800 | 0.15 | 0.40 | 0.103 | 0.319 | 0.432 | 0.575 | 0.127 | 0.129 | 0.173 | 0.146 | 0.145 |
| 200 | 800 | 0.25 | 0.20 | 0.111 | 0.33 | 0.366 | 0.582 | 0.126 | 0.123 | 0.153 | 0.141 | 0.162 |
| 200 | 800 | 0.25 | 0.40 | 0.128 | 0.315 | 0.433 | 0.577 | 0.129 | 0.134 | 0.17 | 0.134 | 0.138 |
| 400 | 800 | 0.15 | 0.20 | 0.092 | 0.318 | 0.423 | 0.577 | 0.134 | 0.114 | 0.168 | 0.148 | 0.148 |
| 400 | 800 | 0.15 | 0.40 | 0.102 | 0.324 | 0.428 | 0.579 | 0.111 | 0.107 | 0.149 | 0.143 | 0.143 |
| 400 | 800 | 0.25 | 0.20 | 0.109 | 0.319 | 0.431 | 0.579 | 0.119 | 0.115 | 0.162 | 0.138 | 0.154 |
| 400 | 800 | 0.25 | 0.40 | 0.123 | 0.324 | 0.427 | 0.58 | 0.135 | 0.139 | 0.174 | 0.188 | 0.133 |
| 200 | 1200 | 0.15 | 0.20 | 0.11 | 0.312 | 0.384 | 0.578 | 0.139 | 0.139 | 0.157 | 0.145 | 0.156 |
| 200 | 1200 | 0.15 | 0.40 | 0.104 | 0.304 | 0.395 | 0.577 | 0.135 | 0.14 | 0.17 | 0.138 | 0.144 |
| 200 | 1200 | 0.25 | 0.20 | 0.124 | 0.308 | 0.4 | 0.576 | 0.132 | 0.131 | 0.153 | 0.212 | 0.162 |
| 200 | 1200 | 0.25 | 0.40 | 0.131 | 0.309 | 0.395 | 0.582 | 0.133 | 0.136 | 0.168 | 0.207 | 0.212 |
| 400 | 1200 | 0.15 | 0.20 | 0.112 | 0.316 | 0.388 | 0.58 | 0.122 | 0.124 | 0.154 | 0.141 | 0.154 |
| 400 | 1200 | 0.15 | 0.40 | 0.122 | 0.314 | 0.396 | 0.58 | 0.123 | 0.123 | 0.161 | 0.160 | 0.126 |
| 400 | 1200 | 0.25 | 0.20 | 0.127 | 0.315 | 0.403 | 0.573 | 0.13 | 0.132 | 0.162 | 0.186 | 0.197 |
| 400 | 1200 | 0.25 | 0.40 | 0.127 | 0.309 | 0.40 | 0.579 | 0.135 | 0.137 | 0.173 | 0.157 | 0.231 |
n is the sample size; q is the number of omics measurements in each layer;
h measures the strength of regulation across layers; ρ is the correlation coefficient among CNVs
Simulation results for Scenario II
| n | q | h | ρ | MuNCut | KM | SC | HC | KM* | SC* | HC* | LC | FGC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 200 | 400 | 0.15 | 0.20 | 0.026 | 0.365 | 0.462 | 0.582 | 0.13 | 0.122 | 0.188 | 0.139 | 0.155 |
| 200 | 400 | 0.15 | 0.40 | 0.025 | 0.411 | 0.476 | 0.564 | 0.133 | 0.119 | 0.202 | 0.158 | 0.157 |
| 200 | 400 | 0.25 | 0.20 | 0.095 | 0.409 | 0.475 | 0.566 | 0.131 | 0.122 | 0.19 | 0.163 | 0.163 |
| 200 | 400 | 0.25 | 0.40 | 0.118 | 0.409 | 0.473 | 0.563 | 0.124 | 0.12 | 0.202 | 0.157 | 0.155 |
| 400 | 400 | 0.15 | 0.20 | 0.024 | 0.412 | 0.469 | 0.564 | 0.13 | 0.125 | 0.204 | 0.155 | 0.152 |
| 400 | 400 | 0.15 | 0.40 | 0.024 | 0.413 | 0.475 | 0.567 | 0.129 | 0.123 | 0.197 | 0.155 | 0.153 |
| 400 | 400 | 0.25 | 0.20 | 0.096 | 0.413 | 0.469 | 0.564 | 0.128 | 0.113 | 0.20 | 0.162 | 0.159 |
| 400 | 400 | 0.25 | 0.40 | 0.111 | 0.411 | 0.479 | 0.565 | 0.125 | 0.134 | 0.203 | 0.153 | 0.151 |
| 200 | 800 | 0.15 | 0.20 | 0.113 | 0.399 | 0.436 | 0.561 | 0.129 | 0.118 | 0.179 | 0.152 | 0.174 |
| 200 | 800 | 0.15 | 0.40 | 0.132 | 0.397 | 0.443 | 0.560 | 0.138 | 0.138 | 0.194 | 0.144 | 0.143 |
| 200 | 800 | 0.25 | 0.20 | 0.132 | 0.405 | 0.432 | 0.562 | 0.127 | 0.12 | 0.18 | 0.181 | 0.151 |
| 200 | 800 | 0.25 | 0.40 | 0.142 | 0.397 | 0.442 | 0.56 | 0.138 | 0.137 | 0.197 | 0.208 | 0.164 |
| 400 | 800 | 0.15 | 0.20 | 0.106 | 0.402 | 0.443 | 0.560 | 0.129 | 0.129 | 0.184 | 0.148 | 0.149 |
| 400 | 800 | 0.15 | 0.40 | 0.134 | 0.394 | 0.452 | 0.559 | 0.14 | 0.137 | 0.198 | 0.141 | 0.140 |
| 400 | 800 | 0.25 | 0.20 | 0.13 | 0.391 | 0.431 | 0.546 | 0.125 | 0.122 | 0.189 | 0.180 | 0.149 |
| 400 | 800 | 0.25 | 0.40 | 0.141 | 0.396 | 0.429 | 0.561 | 0.143 | 0.142 | 0.196 | 0.165 | 0.213 |
| 200 | 1200 | 0.15 | 0.20 | 0.127 | 0.383 | 0.412 | 0.554 | 0.137 | 0.131 | 0.161 | 0.145 | 0.176 |
| 200 | 1200 | 0.15 | 0.40 | 0.143 | 0.404 | 0.441 | 0.558 | 0.14 | 0.138 | 0.186 | 0.218 | 0.149 |
| 200 | 1200 | 0.25 | 0.20 | 0.137 | 0.393 | 0.417 | 0.558 | 0.142 | 0.14 | 0.178 | 0.224 | 0.219 |
| 200 | 1200 | 0.25 | 0.40 | 0.148 | 0.393 | 0.434 | 0.56 | 0.141 | 0.14 | 0.188 | 0.163 | 0.238 |
| 400 | 1200 | 0.15 | 0.20 | 0.126 | 0.398 | 0.426 | 0.559 | 0.14 | 0.142 | 0.183 | 0.194 | 0.147 |
| 400 | 1200 | 0.15 | 0.40 | 0.14 | 0.396 | 0.427 | 0.559 | 0.142 | 0.141 | 0.184 | 0.192 | 0.221 |
| 400 | 1200 | 0.25 | 0.20 | 0.126 | 0.401 | 0.428 | 0.560 | 0.139 | 0.141 | 0.181 | 0.165 | 0.220 |
| 400 | 1200 | 0.25 | 0.40 | 0.142 | 0.397 | 0.420 | 0.559 | 0.143 | 0.147 | 0.187 | 0.165 | 0.242 |
n is the sample size; q is the number of omics measurements in each layer;
h measures the strength of regulation across layers; ρ is the correlation coefficient among CNVs
Simulation results for Scenario III
| n | q | h | ρ | MuNCut | KM | SC | HC | KM* | SC* | HC* | LC | FGC |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 200 | 400 | 0.15 | 0.20 | 0.064 | 0.359 | 0.459 | 0.583 | 0.124 | 0.125 | 0.186 | 0.172 | 0.200 |
| 200 | 400 | 0.15 | 0.40 | 0.108 | 0.354 | 0.464 | 0.582 | 0.124 | 0.126 | 0.194 | 0.171 | 0.171 |
| 200 | 400 | 0.25 | 0.20 | 0.126 | 0.360 | 0.462 | 0.584 | 0.127 | 0.128 | 0.186 | 0.192 | 0.224 |
| 200 | 400 | 0.25 | 0.40 | 0.141 | 0.355 | 0.468 | 0.578 | 0.131 | 0.129 | 0.198 | 0.147 | 0.144 |
| 400 | 400 | 0.15 | 0.20 | 0.06 | 0.356 | 0.457 | 0.583 | 0.121 | 0.123 | 0.185 | 0.171 | 0.169 |
| 400 | 400 | 0.15 | 0.40 | 0.097 | 0.354 | 0.46 | 0.587 | 0.12 | 0.124 | 0.193 | 0.164 | 0.162 |
| 400 | 400 | 0.25 | 0.20 | 0.121 | 0.358 | 0.456 | 0.585 | 0.122 | 0.123 | 0.185 | 0.174 | 0.152 |
| 400 | 400 | 0.25 | 0.40 | 0.138 | 0.357 | 0.462 | 0.586 | 0.124 | 0.124 | 0.191 | 0.134 | 0.136 |
| 200 | 800 | 0.15 | 0.20 | 0.122 | 0.314 | 0.434 | 0.578 | 0.13 | 0.132 | 0.189 | 0.175 | 0.172 |
| 200 | 800 | 0.15 | 0.40 | 0.134 | 0.315 | 0.431 | 0.579 | 0.139 | 0.134 | 0.19 | 0.212 | 0.172 |
| 200 | 800 | 0.25 | 0.20 | 0.142 | 0.32 | 0.402 | 0.567 | 0.128 | 0.128 | 0.19 | 0.202 | 0.195 |
| 200 | 800 | 0.25 | 0.40 | 0.144 | 0.318 | 0.414 | 0.58 | 0.146 | 0.144 | 0.196 | 0.166 | 0.206 |
| 400 | 800 | 0.15 | 0.20 | 0.121 | 0.321 | 0.421 | 0.578 | 0.129 | 0.129 | 0.174 | 0.191 | 0.153 |
| 400 | 800 | 0.15 | 0.40 | 0.144 | 0.321 | 0.427 | 0.577 | 0.146 | 0.144 | 0.193 | 0.165 | 0.216 |
| 400 | 800 | 0.25 | 0.20 | 0.141 | 0.315 | 0.424 | 0.578 | 0.127 | 0.128 | 0.173 | 0.152 | 0.197 |
| 400 | 800 | 0.25 | 0.40 | 0.143 | 0.312 | 0.439 | 0.579 | 0.131 | 0.134 | 0.188 | 0.168 | 0.228 |
| 200 | 1200 | 0.15 | 0.20 | 0.138 | 0.307 | 0.391 | 0.578 | 0.139 | 0.139 | 0.168 | 0.207 | 0.212 |
| 200 | 1200 | 0.15 | 0.40 | 0.146 | 0.314 | 0.389 | 0.575 | 0.148 | 0.147 | 0.19 | 0.160 | 0.235 |
| 200 | 1200 | 0.25 | 0.20 | 0.136 | 0.30 | 0.374 | 0.575 | 0.136 | 0.133 | 0.169 | 0.187 | 0.225 |
| 200 | 1200 | 0.25 | 0.40 | 0.144 | 0.308 | 0.405 | 0.572 | 0.146 | 0.145 | 0.189 | 0.169 | 0.232 |
| 400 | 1200 | 0.15 | 0.20 | 0.136 | 0.316 | 0.406 | 0.573 | 0.138 | 0.139 | 0.163 | 0.159 | 0.223 |
| 400 | 1200 | 0.15 | 0.40 | 0.144 | 0.30 | 0.389 | 0.571 | 0.146 | 0.145 | 0.189 | 0.165 | 0.239 |
| 400 | 1200 | 0.25 | 0.20 | 0.141 | 0.316 | 0.376 | 0.577 | 0.135 | 0.139 | 0.183 | 0.171 | 0.228 |
| 400 | 1200 | 0.25 | 0.40 | 0.141 | 0.308 | 0.391 | 0.575 | 0.139 | 0.14 | 0.186 | 0.146 | 0.219 |
n is the sample size; q is the number of omics measurements in each layer;
h measures the strength of regulation across layers; ρ is the correlation coefficient among CNVs