Sunny Raj1, Faraz Hussain2, Zubir Husein3, Neslisah Torosdagli3, Damla Turgut3, Narsingh Deo3, Sumanta Pattanaik3, Chung-Che Jeff Chang4, Sumit Kumar Jha3.
Abstract
BACKGROUND: Polychromatic flow cytometry is a popular technique that is widely used in the medical sciences, especially for studying phenotypic properties of cells. The high dimensionality of the data generated by flow cytometry usually makes it difficult to visualize. The naive solution of plotting a two-dimensional graph for every pair of observables becomes impractical as the number of dimensions increases. A natural solution is to project the data from the original high-dimensional space to a lower-dimensional space while approximately preserving the overall relationship between the data points. The expert can then easily visualize and analyze this low-dimensional embedding of the original dataset.
Keywords: Automated synthesis; Biomedical informatics; Flow cytometry; High-dimensional data; High-fidelity visualization; Symbolic decision procedures
Year: 2017 PMID: 28617220 PMCID: PMC5471952 DOI: 10.1186/s12859-017-1662-4
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Table 1. Maximum distortions produced by the MDS approach and SANJAY when 10 randomly chosen high-dimensional data points from 30 flow cytometry datasets were projected onto two dimensions
| Dataset ID | Maximum distortion for MDS | Maximum distortion for SANJAY | Ratio of maximum distortions (MDS/SANJAY) | Dataset ID | Maximum distortion for MDS | Maximum distortion for SANJAY | Ratio of maximum distortions (MDS/SANJAY) |
|---|---|---|---|---|---|---|---|
| 1 | 3197.8 | 1000 | 3.197 | 16 | 3150.4 | 1200 | 2.625 |
| 2 | 2711.1 | 1200 | 2.259 | 17 | 2497.2 | 1100 | 2.270 |
| 3 | 1953.0 | 1000 | 1.953 | 18 | 2925.5 | 1400 | 2.089 |
| 4 | 2917.2 | 1200 | 2.431 | 19 | 3813.3 | 1300 | 2.933 |
| 5 | 3483.5 | 1400 | 2.488 | 20 | 3700.8 | 1300 | 2.846 |
| 6 | 2925.9 | 1100 | 2.659 | 21 | 3011.8 | 1200 | 2.509 |
| 7 | 4233.0 | 1800 | 2.351 | 22 | 3252.4 | 1000 | 3.252 |
| 8 | 2898.0 | 1300 | 2.229 | 23 | 3381.4 | 1200 | 2.817 |
| 9 | 1876.7 | 1300 | 1.443 | 24 | 2963.9 | 1100 | 2.694 |
| 10 | 4314.1 | 1500 | 2.876 | 25 | 3428.3 | 1600 | 2.142 |
| 11 | 3543.6 | 1400 | 2.531 | 26 | 2712.2 | 1200 | 2.260 |
| 12 | 2449.8 | 1300 | 1.884 | 27 | 3679.7 | 1500 | 2.453 |
| 13 | 3835.2 | 1500 | 2.556 | 28 | 3286.0 | 1200 | 2.738 |
| 14 | 4153.3 | 1000 | 4.153 | 29 | 2449.7 | 1000 | 2.449 |
| 15 | 2858.6 | 1000 | 2.858 | 30 | 4160.0 | 1400 | 2.971 |
On average, the maximum distortion produced by MDS was 2.56 times that produced by SANJAY.
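The 2.56 figure can be checked directly against the table. The short sketch below recomputes each MDS/SANJAY ratio from the tabulated maximum distortions and averages them. It assumes the reported value is the arithmetic mean of the 30 per-dataset ratios, which the source does not state explicitly; the variable names are illustrative only.

```python
# (MDS maximum distortion, SANJAY maximum distortion) for dataset IDs 1..30,
# copied from the table above.
table1 = [
    (3197.8, 1000), (2711.1, 1200), (1953.0, 1000), (2917.2, 1200), (3483.5, 1400),
    (2925.9, 1100), (4233.0, 1800), (2898.0, 1300), (1876.7, 1300), (4314.1, 1500),
    (3543.6, 1400), (2449.8, 1300), (3835.2, 1500), (4153.3, 1000), (2858.6, 1000),
    (3150.4, 1200), (2497.2, 1100), (2925.5, 1400), (3813.3, 1300), (3700.8, 1300),
    (3011.8, 1200), (3252.4, 1000), (3381.4, 1200), (2963.9, 1100), (3428.3, 1600),
    (2712.2, 1200), (3679.7, 1500), (3286.0, 1200), (2449.7, 1000), (4160.0, 1400),
]

# Per-dataset ratio of maximum distortions (MDS / SANJAY).
ratios = [mds / sanjay for mds, sanjay in table1]

# Assumption: the reported average is the plain mean of these ratios.
mean_ratio = sum(ratios) / len(ratios)
print(round(mean_ratio, 2))  # → 2.56
```

Under that assumption the mean of the ratios agrees with the reported 2.56; the smallest ratio is for dataset 9 and the largest for dataset 14.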
Fig. 1 Steps for generating the structural representation of flow cytometry data for use in the SANJAY visualization synthesis technique
Table 2. Average distortions produced by the MDS approach and SANJAY when 10 randomly chosen high-dimensional data points from 30 flow cytometry datasets were projected onto two dimensions
| Dataset ID | Average distortion for MDS | Average distortion for SANJAY | Dataset ID | Average distortion for MDS | Average distortion for SANJAY |
|---|---|---|---|---|---|
| 1 | 1042.4 | 540.8 | 16 | 1034.4 | 733.8 |
| 2 | 1024.4 | 653.3 | 17 | 919.5 | 623.0 |
| 3 | 649.2 | 537.5 | 18 | 1056.8 | 822.4 |
| 4 | 897.4 | 765.3 | 19 | 1117.4 | 757.5 |
| 5 | 1089.6 | 806.3 | 20 | 989.5 | 773.6 |
| 6 | 1069.4 | 634.0 | 21 | 1057.5 | 684.8 |
| 7 | 1374.4 | 1010.7 | 22 | 1412.6 | 605.7 |
| 8 | 949.8 | 709.4 | 23 | 915.0 | 712.8 |
| 9 | 765.9 | 752.5 | 24 | 824.3 | 741.1 |
| 10 | 1011.7 | 892.9 | 25 | 1178.1 | 1033.5 |
| 11 | 1050.4 | 882.8 | 26 | 949.2 | 713.3 |
| 12 | 1050.3 | 760.0 | 27 | 1114.2 | 833.6 |
| 13 | 1241.7 | 849.7 | 28 | 935.4 | 611.7 |
| 14 | 985.7 | 613.4 | 29 | 1004.8 | 561.3 |
| 15 | 1249.6 | 612.4 | 30 | 1178.4 | 874.1 |
Fig. 2 Plots of the two-dimensional projections synthesized by the SANJAY algorithm for 1000 randomly chosen data points from 6 flow cytometry datasets (dataset IDs 9, 24, 11, 14, 17, and 5, respectively, in Table 1). For these and 24 other flow cytometry datasets, Table 1 lists the maximum distance distortion when 12-dimensional flow cytometry data are projected onto two dimensions, and Table 2 lists the average distortions
Maximum distortions produced by SANJAY and the random projections (RP) technique when 10 randomly chosen high-dimensional data points from 30 flow cytometry datasets were projected onto two dimensions
| Dataset ID | Maximum distortion for SANJAY | Maximum distortion for random projections | Ratio of maximum distortions (RP/SANJAY) | Dataset ID | Maximum distortion for SANJAY | Maximum distortion for random projections | Ratio of maximum distortions (RP/SANJAY) |
|---|---|---|---|---|---|---|---|
| 1 | 1000 | 4069 | 4.07 | 16 | 1200 | 6732 | 5.61 |
| 2 | 1200 | 4179 | 3.48 | 17 | 1100 | 4298 | 3.90 |
| 3 | 1000 | 3982 | 3.98 | 18 | 1400 | 4922 | 3.51 |
| 4 | 1200 | 5289 | 4.40 | 19 | 1300 | 6719 | 5.16 |
| 5 | 1400 | 5045 | 3.60 | 20 | 1300 | 5583 | 4.29 |
| 6 | 1100 | 5092 | 4.62 | 21 | 1200 | 5311 | 4.42 |
| 7 | 1800 | 5364 | 2.98 | 22 | 1000 | 4447 | 4.44 |
| 8 | 1300 | 3566 | 2.74 | 23 | 1200 | 4731 | 3.94 |
| 9 | 1300 | 4357 | 3.35 | 24 | 1100 | 6251 | 5.68 |
| 10 | 1500 | 4262 | 2.84 | 25 | 1600 | 5919 | 3.69 |
| 11 | 1400 | 4945 | 3.53 | 26 | 1200 | 5385 | 4.48 |
| 12 | 1300 | 4370 | 3.36 | 27 | 1500 | 4886 | 3.25 |
| 13 | 1500 | 4747 | 3.16 | 28 | 1200 | 5884 | 4.90 |
| 14 | 1000 | 7029 | 7.02 | 29 | 1000 | 5398 | 5.30 |
| 15 | 1000 | 6161 | 6.16 | 30 | 1400 | 3900 | 2.78 |
Average distortions produced by SANJAY and the random projections (RP) technique when 10 randomly chosen high-dimensional data points from 30 flow cytometry datasets were projected onto two dimensions
| Dataset ID | Average distortion for SANJAY | Average distortion for random projections | Ratio of average distortions (RP/SANJAY) | Dataset ID | Average distortion for SANJAY | Average distortion for random projections | Ratio of average distortions (RP/SANJAY) |
|---|---|---|---|---|---|---|---|
| 1 | 540.8 | 1289.2 | 2.38 | 16 | 733.8 | 1791.5 | 2.44 |
| 2 | 653.3 | 1226.5 | 1.87 | 17 | 623.0 | 1361.3 | 2.18 |
| 3 | 537.5 | 1095.5 | 2.03 | 18 | 822.4 | 1480.3 | 1.80 |
| 4 | 765.3 | 1637.1 | 2.13 | 19 | 757.5 | 1912.7 | 2.52 |
| 5 | 806.3 | 1654.7 | 2.05 | 20 | 773.6 | 1806.0 | 2.33 |
| 6 | 634.0 | 1555.5 | 2.45 | 21 | 684.8 | 1535.2 | 2.24 |
| 7 | 1010.7 | 1608.8 | 1.59 | 22 | 605.7 | 1440.1 | 2.37 |
| 8 | 709.4 | 1111.8 | 1.56 | 23 | 712.8 | 1355.4 | 1.90 |
| 9 | 752.5 | 1439.5 | 1.91 | 24 | 741.1 | 1944.2 | 2.62 |
| 10 | 892.9 | 1376.7 | 1.54 | 25 | 1033.5 | 1943.4 | 1.88 |
| 11 | 882.8 | 1578.5 | 1.78 | 26 | 713.3 | 1762.9 | 2.47 |
| 12 | 760.0 | 1395.6 | 1.83 | 27 | 833.6 | 1519.0 | 1.82 |
| 13 | 849.7 | 1363.1 | 1.60 | 28 | 611.7 | 1648.0 | 2.69 |
| 14 | 613.4 | 2084.7 | 3.39 | 29 | 561.3 | 1513.4 | 2.70 |
| 15 | 612.4 | 1916.6 | 3.12 | 30 | 874.1 | 1047.5 | 1.19 |
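To make the quantities in these tables concrete, the following is a minimal sketch of how a random-projection baseline and its maximum and average distortions might be computed. It is not the authors' implementation. It assumes that distortion means the absolute difference between a pair's Euclidean distance before and after projection, and that the projection matrix has Gaussian entries scaled by 1/sqrt(k); neither detail is given in this excerpt. The input is synthetic (10 points in 12 dimensions with an arbitrary value range), not real flow cytometry data, and all function names are hypothetical.

```python
import math
import random
from itertools import combinations


def euclidean(p, q):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def random_projection(points, target_dim, rng):
    """Project points with a Gaussian random matrix scaled by 1/sqrt(target_dim).

    Assumption: this is one common form of random projection; the source
    does not say which variant was used.
    """
    source_dim = len(points[0])
    scale = 1.0 / math.sqrt(target_dim)
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(source_dim)]
              for _ in range(target_dim)]
    return [[sum(w * x for w, x in zip(row, p)) for row in matrix]
            for p in points]


def distortions(high, low):
    """Return (maximum, average) absolute change in pairwise distance.

    Assumption: distortion is |d_high - d_low| for each pair of points.
    """
    diffs = [abs(euclidean(high[i], high[j]) - euclidean(low[i], low[j]))
             for i, j in combinations(range(len(high)), 2)]
    return max(diffs), sum(diffs) / len(diffs)


rng = random.Random(0)
# Synthetic stand-in: 10 points in 12 dimensions; the 0..4096 range is arbitrary.
points = [[rng.uniform(0, 4096) for _ in range(12)] for _ in range(10)]
projected = random_projection(points, 2, rng)
max_d, avg_d = distortions(points, projected)
print(f"max distortion: {max_d:.1f}, average distortion: {avg_d:.1f}")
```

The same `distortions` helper could be applied to any other two-dimensional embedding of the same points, which is how a side-by-side comparison like the one tabulated above would be set up under these assumptions.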