| Literature DB >> 23396282 |
Critical assessment of automated flow cytometry data analysis techniques
Nima Aghaeepour, Greg Finak, Holger Hoos, Tim R Mosmann, Ryan Brinkman, Raphael Gottardo, Richard H Scheuermann.
Abstract
Traditional methods for flow cytometry (FCM) data processing rely on subjective manual gating. Recently, several groups have developed computational methods for identifying cell populations in multidimensional FCM data. The Flow Cytometry: Critical Assessment of Population Identification Methods (FlowCAP) challenges were established to compare the performance of these methods on two tasks: (i) mammalian cell population identification, to determine whether automated algorithms can reproduce expert manual gating, and (ii) sample classification, to determine whether analysis pipelines can identify characteristics that correlate with external variables (such as clinical outcome). This analysis presents the results of the first FlowCAP challenges. Several methods performed well when compared with manual gating or external variables using statistical performance measures, suggesting that automated methods have reached a sufficient level of maturity and accuracy for reliable use in FCM data analysis.
Year: 2013 PMID: 23396282 PMCID: PMC3906045 DOI: 10.1038/nmeth.2365
Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547
Participating algorithms: algorithms that were applied in at least one challenge
| Algorithm name | Availabilitya | Brief descriptionb | SN/ref.c |
|---|---|---|---|
| Cell population identification | |||
| ADICyt | Commercially available | Hierarchical clustering and entropy-based merging | 1.1.1/– |
| CDP | Python source code | Bayesian nonparametric mixture models, calculated using massively parallel computing on GPUs | 1.1.2/ref. |
| FLAME | R package | Multivariate finite mixtures of skew and heavy-tailed distributions | 1.1.3/ref. |
| FLOCK | C source code | Grid-based partitioning and merging | 1.1.4/ref. |
| flowClust/Merge | Two R/BioC packages | | 1.1.5/refs. |
| flowKoh | R source code | Self-organizing maps | 1.1.6/– |
| flowMeans | R/BioC package | | 1.1.7/ref. |
| FlowVB | Python source code | | 1.1.8/– |
| L2kmeans | JAVA source code | Discrepancy learning | 1.1.9/ref. |
| MM, MMPCA | Windows and Linux executable | Density-based Misty Mountain clustering | 1.1.10/ref. |
| NMFcurvHDR | R source code | Density-based clustering and non-negative matrix factorization | 1.1.11/ref. |
| SamSPECTRAL | R/BioC package | Efficient spectral clustering using density-based downsampling | 1.1.12/ref. |
| SWIFT | MATLAB source code | Weighted iterative sampling and mixture modeling | 1.1.13/ref. |
| RadialSVM | MATLAB source code | Supervised training of radial SVMs using example manual gates | 1.1.14/ref. |
| Ensemble clustering | R/CRAN package | Combines the results of all participating algorithms | Online |
| Sample classification | |||
| 2DhistSVM | Pseudocode | 2D histograms of all pairs of dimensions and support vector machines | 1.2.1/– |
| admire-lvq | MATLAB source code | 1D features and learning vector quantization | 1.2.2/– |
| biolobe | Pseudocode | | 1.2.3/– |
| daltons | MATLAB source code | Linear discriminant analysis and logistic regression | 1.2.4/– |
| DREAM–A | Pseudocode | 2D and 3D histograms and cross-validation of several classifiers | 1.2.5/– |
| DREAM–B | Pseudocode | 1D Gaussian mixtures and support vector machines | 1.2.6/– |
| DREAM–C | Pseudocode | 1D gating and several different classifiers | 1.2.7/– |
| DREAM–D | Pseudocode | 4D clustering and bootstrapped | 1.2.8/– |
| EMMIXCYTOM, uqs | R source code | Skew- | 1.2.9/– |
| fivebyfive | Pseudocode | 1D histograms and support vector machines | 1.2.10/– |
| flowBin | R package | High-dimensional cluster mapping across multiple tubes and support vector machines | 1.2.11/– |
| flowCore-flowStats | R source code | Sequential gating, normalization, and a beta-binomial model | 1.2.12/ref. |
| flowPeakssvm, Kmeanssvm | R package | | 1.2.13/ref. |
| flowType, flowType FeaLect | Two R/BioC packages | 1D gates extrapolated to multiple dimensions and bootstrapped LASSO classification | 1.2.14/refs. |
| jkjg | JAVA source code | 1D Gaussian and logistic regression | 1.2.15/– |
| PBSC | C source code | Multidimensional clustering and cross-sample population matching using a relative distance order | 1.2.16/ref. |
| PRAMS | R source code | 2D clustering and logistic regression | 1.2.17/– |
| Pram Spheres, CIHC | Pseudocode | Genetic algorithm and gradient boosting | 1.2.18/– |
| Random Spheres | Pseudocode | Hypersphere-based Monte Carlo optimization | 1.2.18/– |
| SPADE, BCB | MATLAB, Cytoscape, R/BioC | Density-based sampling, | 1.2.19/ref. |
| SPCA+GLM | Pseudocode | 1D probability binning and principal-component analysis | 1.2.20/– |
| SWIFT | MATLAB source code | SWIFT clustering and support vector machines | 1.2.21/ref. |
| Team21 | Python source code | 1D relative entropies | 1.2.22/– |
aSee Supplementary Table 3 for algorithm contact information.
bSee Supplementary Note 1 for more details about each program.
cSupplementary Note 1 section (SN) and reference citation.
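The ensemble-clustering entry above combines the per-cell labels produced by all participating algorithms. The sketch below shows one standard way to do this (a co-association matrix over the individual labelings, followed by a simple greedy grouping). The function names `coassociation` and `ensemble_cluster` and the 0.5 agreement threshold are illustrative assumptions, not the FlowCAP implementation (see Supplementary Note 1 and the Online Methods for the actual procedure).

```python
def coassociation(labelings):
    """Fraction of input clusterings that place each pair of cells in the
    same population. `labelings` is a list of per-cell label lists, one per
    algorithm; cluster IDs need not agree across algorithms."""
    n = len(labelings[0])
    co = [[0.0] * n for _ in range(n)]
    for labels in labelings:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    co[i][j] += 1
    m = len(labelings)
    return [[v / m for v in row] for row in co]

def ensemble_cluster(labelings, threshold=0.5):
    """Greedy consensus: each still-unassigned cell seeds a new population
    and pulls in every unassigned cell whose co-association with the seed
    is at least `threshold` (an assumed cutoff)."""
    co = coassociation(labelings)
    n = len(co)
    out = [-1] * n
    next_id = 0
    for i in range(n):
        if out[i] == -1:
            for j in range(n):
                if out[j] == -1 and co[i][j] >= threshold:
                    out[j] = next_id
            next_id += 1
    return out
```

For example, three algorithms that largely agree on a 6-cell split produce a single consensus partition even though their cluster IDs differ.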
Summary of results for the cell identification challengesa
| Algorithm | GvHD | DLBCL | HSCT | WNV | ND | Mean | Runtime h:mm:ssb | Rank scorec |
|---|---|---|---|---|---|---|---|---|
| Challenge 1: completely automated | | | | | | | | |
| ADICyt | | | | | | 0.89 | 4:50:37 | 52 |
| flowMeans | | | | | 0.85 (0.76, 0.92) | 0.89 | 0:02:18 | 49 |
| FLOCK | | 0.86 (0.83, 0.89) | | | 0.91 (0.89, 0.92) | 0.86 | 0:00:20 | 45 |
| FLAME | | | | 0.80 (0.76, 0.84) | 0.90 (0.89, 0.90) | 0.88 | 0:04:20 | 44 |
| SamSPECTRAL | | | 0.86 (0.82, 0.90) | 0.85 (0.82, 0.88) | 0.75 (0.60, 0.85) | 0.85 | 0:03:51 | 39 |
| MMPCA | | | 0.85 (0.82, 0.88) | 0.64 (0.51, 0.71) | 0.76 (0.75, 0.77) | 0.80 | 0:00:03 | 29 |
| FlowVB | | 0.87 (0.85, 0.90) | 0.75 (0.70, 0.79) | 0.81 (0.78, 0.83) | 0.85 (0.84, 0.86) | 0.82 | 0:38:49 | 28 |
| MM | | | 0.73 (0.66, 0.80) | 0.69 (0.60, 0.75) | 0.75 (0.74, 0.76) | 0.78 | 0:00:10 | 28 |
| flowClust/Merge | 0.69 (0.55, 0.79) | 0.84 (0.81, 0.86) | 0.81 (0.77, 0.85) | 0.77 (0.74, 0.79) | 0.73 (0.58, 0.85) | 0.77 | 2:12:00 | 24 |
| L2kmeans | 0.64 (0.57, 0.72) | 0.79 (0.74, 0.83) | 0.70 (0.65, 0.75) | 0.78 (0.75, 0.81) | 0.81 (0.80, 0.82) | 0.74 | 0:08:03 | 20 |
| CDP | 0.52 (0.46, 0.58) | 0.87 (0.85, 0.90) | 0.50 (0.48, 0.52) | 0.71 (0.68, 0.75) | 0.88 (0.86, 0.90) | 0.70 | 0:00:57 | 19 |
| SWIFT | 0.63 (0.56, 0.70) | 0.67 (0.62, 0.71) | 0.59 (0.55, 0.62) | 0.69 (0.64, 0.74) | 0.87 (0.86, 0.88) | 0.69 | 1:14:50 | 15 |
| Ensemble clustering | 0.88 | 0.94 | 0.97 | 0.88 | 0.94 | 0.92 | – | 64 |
| Challenge 2: manually tuned | | | | | | | | |
| ADICyt | | | | | | 0.89 | 4:50:37 | 34 |
| SamSPECTRAL | | 0.90 (0.86, 0.93) | | | | 0.89 | 0:06:47 | 31 |
| FLOCK | 0.88 (0.85, 0.91) | 0.86 (0.83, 0.89) | | | 0.89 (0.87, 0.91) | 0.86 | 0:00:15 | 23 |
| FLAME | 0.87 (0.84, 0.90) | 0.87 (0.82, 0.90) | | | 0.87 (0.86, 0.87) | 0.85 | 0:04:20 | 23 |
| SamSPECTRAL-FK | 0.85 (0.81, 0.89) | 0.90 (0.86, 0.92) | 0.76 (0.71, 0.81) | | | 0.86 | 0:04:25 | 23 |
| CDP | | 0.89 (0.86, 0.91) | 0.90 (0.88, 0.92) | 0.75 (0.71, 0.78) | 0.86 (0.85, 0.88) | 0.83 | 0:00:18 | 19 |
| flowClust/Merge | 0.69 (0.53, 0.78) | 0.87 (0.85, 0.90) | 0.77 (0.75, 0.79) | 0.88 (0.81, 0.91) | | 0.83 | 2:12:00 | 18 |
| NMFcurvHDR | | 0.84 (0.83, 0.86) | 0.70 (0.67, 0.74) | 0.81 (0.77, 0.84) | 0.83 (0.83, 0.84) | 0.79 | 1:39:42 | 13 |
| Ensemble clustering | 0.87 | 0.94 | 0.98 | 0.87 | 0.92 | 0.91 | – | 41 |
| Challenge 3: assignment of cells to populations with predefined number of populations | | | | | | | | |
| ADICyt | | | | | | 0.95 | 0:10:49 | 26.2 |
| SamSPECTRAL | | | | | | 0.92 | 0:02:30 | 26.2 |
| flowMeans | | | 0.95 (0.93, 0.96) | | | 0.93 | 0:00:01 | 23.4 |
| TCLUST | | | 0.93 (0.90, 0.95) | | | 0.93 | 0:00:40 | 23.4 |
| FLOCK | 0.92 (0.89, 0.94) | | | | | 0.92 | 0:00:02 | 22.2 |
| CDP | | | 0.76 (0.72, 0.81) | | | 0.84 | 0:00:21 | 16.9 |
| flowClust/Merge | | 0.90 (0.86, 0.94) | 0.83 (0.79, 0.88) | | | 0.87 | 0:49:24 | 15.9 |
| FLAME | | 0.90 (0.86, 0.93) | 0.86 (0.82, 0.91) | | | 0.87 | 0:03:20 | 15.9 |
| SWIFT | | 0.00 (0.00, 0.00) | 0.88 (0.84, 0.92) | | | 0.59 | 0:01:37 | 11.9 |
| flowKoh | 0.85 (0.80, 0.90) | 0.85 (0.82, 0.88) | 0.87 (0.84, 0.91) | | | 0.86 | 0:00:42 | 9.5 |
| NMF | 0.74 (0.69, 0.78) | 0.84 (0.80, 0.88) | 0.80 (0.76, 0.84) | | | 0.79 | 0:01:00 | 7.5 |
| Ensemble clustering | 0.95 | 0.97 | 0.98 | | | 0.97 | – | 35 |
| Challenge 4: supervised approaches trained using human-provided gates | | | | | | | | |
| RadialSVM | 0.84 (0.80, 0.87) | | | | | 0.92 | 0:00:18 | 21 |
| flowClust/Merge | | | | 0.84 (0.82, 0.86) | 0.89 (0.88, 0.90) | 0.90 | 5:31:50 | 19 |
| randomForests | 0.78 (0.74, 0.83) | 0.81 (0.79, 0.83) | 0.87 (0.84, 0.90) | | | 0.85 | 0:02:06 | 15 |
| FLOCK | 0.82 (0.77, 0.87) | | 0.86 (0.76, 0.93) | 0.86 (0.82, 0.89) | 0.86 (0.77, 0.92) | 0.86 | 0:00:05 | 13 |
| CDP | 0.78 (0.68, 0.87) | | 0.75 (0.71, 0.78) | 0.86 (0.84, 0.88) | 0.83 (0.80, 0.86) | 0.83 | 0:00:15 | 11 |
| Ensemble clustering | 0.91 | 0.94 | 0.95 | 0.92 | 0.94 | 0.93 | – | 26 |
aIn each data set/challenge, the top algorithm (highest mean F-measure) and the algorithms with overlapping confidence intervals with the top algorithm are boldface (see Online Methods for F-measure calculations).
bRun time was calculated as time per CPU per sample.
cAlgorithms are sorted by rank score within each challenge (see Online Methods for rank score calculations). Data sets: GvHD, graft-versus-host disease; DLBCL, diffuse large B-cell lymphoma; WNV, symptomatic West Nile virus; ND, normal donors; HSCT, hematopoietic stem cell transplant.
Figure 1 | F-measure results of the cell population identification challenges.
Average manual and algorithm F-measures against the manual consensus cluster are plotted as a function of the number of populations included, ranked from most consistent to least consistent. For a given population, consistency was defined as the agreement among manual gates, calculated as the average manual F-measure against the manual consensus cluster for that population. All populations across all samples were included in this calculation; as such, the numbers on the x axis should be multiplied by 12 and 30 (for GvHD and HSCT, respectively) to reflect the total number of populations across all samples in the reference. Individual manual gating results are plotted as gray lines. (a) Graft-versus-host disease (GvHD) data set. (b) Hematopoietic stem cell transplant (HSCT) data set.
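The F-measure used throughout these comparisons scores an algorithm's partition against a reference (manual) gating. The sketch below is a paraphrase of the usual FlowCAP-style computation, not the paper's scoring code (the exact matching rules are in the Online Methods): for each reference population, take the best-matching predicted cluster by F-measure (harmonic mean of precision and recall), then average with population-size weights.

```python
from collections import Counter

def f_measure(reference, predicted):
    """Size-weighted mean, over reference populations, of the best
    per-cluster F-measure (harmonic mean of precision and recall)."""
    ref_sizes = Counter(reference)            # reference population -> size
    pred_sizes = Counter(predicted)           # predicted cluster -> size
    overlap = Counter(zip(reference, predicted))  # joint cell counts
    total, n = 0.0, len(reference)
    for r, r_size in ref_sizes.items():
        best = 0.0
        for p, p_size in pred_sizes.items():
            inter = overlap[(r, p)]
            if inter:
                prec = inter / p_size
                rec = inter / r_size
                best = max(best, 2 * prec * rec / (prec + rec))
        total += r_size * best
    return total / n
```

A perfect partition scores 1.0 regardless of how cluster IDs are named; disagreement on a population's boundary (as in Fig. 3c,d) lowers that population's contribution.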
Figure 2 | Per-population pairwise comparisons in the cell population identification challenges.
Average F-measures for all pairs of results for the five cell populations across all samples in the hematopoietic stem cell transplant (HSCT) data set are shown as heat maps. The color of each square reflects the pairwise agreement between two methods for one cell population; the position in the matrix reflects the pattern of agreement across all methods, as determined by hierarchical clustering. The manual-gate consensus cluster for each sample was used as the reference for matching that sample's automated results. The dendrogram groups the algorithms and manual gates by the similarity of their pairwise F-measures. EC, ensemble clustering.
Figure 3 | Comparison of manual-gate consensus and ensemble clustering results.
Dots are color-coded by population membership as determined by ensemble clustering, with donor-derived (CD45.2+) granulocytes/monocytes in green and donor-derived lymphocytes in red. Colored polygons enclose regions corresponding to the consensus clustering of manual gates. Fluorochromes used: FITC, fluorescein isothiocyanate; PE, phycoerythrin; APC, allophycocyanin. (a,b) Sample for which all of the cell populations have been accurately identified. (c,d) Sample in which the tail of the blue population has been misclassified as orange by the algorithms, resulting in a lower F-measure for the blue population. The red, blue, green, purple and orange cell populations match cell population 1–5 of Figure 2, respectively.
Performance of algorithms in the sample-classification challenges on the validation cohorta
| | Challenge 1: HEUvsUE | | | Challenge 2: AML | | | Challenge 3: HVTN | | |
| Algorithm | Recall | Precision | Accuracy | Recall | Precision | Accuracy | Recall | Precision | Accuracy |
|---|---|---|---|---|---|---|---|---|---|
| FlowCAP | | | | | | | | | |
| 2DhistSVMb | 0.50 | 0.091 | 0.50 | 0.00 | 0.95 | 0.99 | | | |
| EMMIXCYTOM | | | | 0.95 | 0.95 | 0.99 | | | |
| flowBin | 0.012 | 0.00 | 0.45 | 0.10 | 0.30 | 0.92 | | | |
| flowCore-flowStats | 0.56 | 0.455 | 0.55 | 1.00 | 1.00 | 1.00 | | | |
| flowPeakssvm | | | | 1.00 | 1.00 | 1.00 | | | |
| flowType | 0.58 | 0.636 | 0.59 | 0.95 | 0.95 | 0.99 | 0.88 | 0.71 | 0.81 |
| flowType-FeaLect | 0.55 | 0.545 | 0.55 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| Kmeanssvm | | | | 1.00 | 1.00 | 1.00 | | | |
| PBSC | 0.33 | 0.273 | 0.36 | 0.75 | 0.75 | 0.94 | 0.95 | 0.95 | 0.95 |
| PRAMS | | | | 1.00 | 1.00 | 1.00 | | | |
| Pram Spheres | 0.36 | 0.364 | 0.36 | 0.90 | 0.90 | 0.90 | | | |
| Random Spheres | | | | 0.95 | 0.95 | 0.99 | | | |
| SPADE | | | | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |
| SWIFT | 0.67 | 0.545 | 0.64 | 1.00 | 1.00 | 1.00 | | | |
| DREAM | | | | | | | | | |
| admire-lvq | | | | 1.00 | 1.00 | 1.00 | | | |
| bcb | | | | 1.00 | 1.00 | 1.00 | | | |
| biolobe | | | | 1.00 | 1.00 | 1.00 | | | |
| cihc | | | | 1.00 | 0.95 | 0.99 | | | |
| daltons | | | | 1.00 | 1.00 | 1.00 | | | |
| DREAM–A | | | | 0.95 | 0.95 | 0.99 | | | |
| DREAM–B | | | | 1.00 | 0.85 | 0.98 | | | |
| DREAM–C | | | | 1.00 | 0.85 | 0.98 | | | |
| DREAM–D | | | | 0.95 | 0.95 | 0.99 | | | |
| fivebyfive | | | | 0.95 | 1.00 | 0.99 | | | |
| jkjg | | | | 1.00 | 1.00 | 1.00 | | | |
| SPCA+GLM | | | | 0.89 | 0.85 | 0.97 | | | |
| team21 | | | | 1.00 | 1.00 | 1.00 | | | |
| uqs | | | | 1.00 | 0.95 | 0.99 | | | |
aNot all algorithms were applied in all challenges. In particular, a large number of algorithms participated through the DREAM project, which included only the AML data set. Data sets: HEUvsUE, HIV-exposed–uninfected versus unexposed; AML, acute myeloid leukemia; HVTN, identification of antigen stimulation groups of post–HIV vaccine T cells.
bContact information of the participating teams can be found in Supplementary Table 3.
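The recall, precision, and accuracy columns are the standard binary-classification quantities. As a quick reference (an illustration, not the challenge scoring scripts):

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Recall, precision, and accuracy for binary labels.

    recall    = TP / (TP + FN): fraction of positive samples found
    precision = TP / (TP + FP): fraction of positive calls that are correct
    accuracy  = fraction of all samples labeled correctly
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    return recall, precision, accuracy
```

With the disease condition (e.g., AML) as the positive class, a 1.00/1.00/1.00 row corresponds to zero misclassified samples in the validation cohort.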
Figure 4 | Acute myeloid leukemia (AML) subject detected as an outlier by the algorithms.
(a) Total number of misclassifications for each sample in the test set (sample nos. 180–359) of the AML data set. (b–g) Forward scatter (FSC)/side scatter (b–d) and FSC/CD34 (e–g) plots of representative normal (b,e) and AML (c,f) samples and the outlier sample no. 340 (d,g), with the CD34+ cells highlighted in red. Cell proportions of the CD34+ population are reported as blast frequency (freq.) percentages.