Michael Y Lee, Jacob S Bedia, Salil S Bhate, Graham L Barlow, Darci Phillips, Wendy J Fantl, Garry P Nolan, Christian M Schürch.
Abstract
BACKGROUND: Algorithmic cellular segmentation is an essential step for the quantitative analysis of highly multiplexed tissue images. Current segmentation pipelines often require manual dataset annotation and additional training, significant parameter tuning, or a sophisticated understanding of programming to adapt the software to the researcher's needs. Here, we present CellSeg, an open-source, pre-trained nucleus segmentation and signal quantification software based on the Mask region-based convolutional neural network (Mask R-CNN) architecture. CellSeg is accessible to users with a wide range of programming skills.
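As a rough illustration of the instance-mask output a Mask R-CNN produces, here is a minimal sketch using torchvision's COCO-pretrained implementation. This is not CellSeg's own model or weights (CellSeg ships nucleus-trained weights); the random 512×512 tensor is a placeholder for a real nuclear-channel crop.

```python
import torch
import torchvision

# COCO-pretrained Mask R-CNN (torchvision >= 0.13); illustrates the
# architecture's output format only, not CellSeg's nucleus-trained model.
model = torchvision.models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

# Placeholder for a grayscale nuclear crop replicated to 3 channels,
# CHW layout, values in [0, 1].
img = torch.rand(3, 512, 512)

with torch.no_grad():
    pred = model([img])[0]  # dict with 'boxes', 'labels', 'scores', 'masks'

# Each detected instance gets a soft mask; threshold to binary per-object masks.
masks = pred["masks"][:, 0] > 0.5
```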
Keywords: CODEX; Deep learning; Image analysis; Mask R-CNN; Multiplexed imaging; Pre-trained model; Segmentation
Year: 2022 PMID: 35042474 PMCID: PMC8767664 DOI: 10.1186/s12859-022-04570-9
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 CellSeg pipeline. Overview of the CellSeg software, with the following steps. (1) Extract the nuclear channel and crop the images to segment. (2) Segment each image crop with CellSeg. (3) Stitch together the segmented crops. (4) Expand the boundaries of cells using mask expansion. (5) Perform lateral bleed compensation, then compute and output single-cell statistics for N markers
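A hypothetical Python skeleton of these five steps is sketched below. The `segment` callable stands in for CellSeg's Mask R-CNN crop segmenter, and the naive non-overlapping tiling glosses over how CellSeg actually handles objects at crop borders; only `expand_labels` and `regionprops_table` are real scikit-image calls.

```python
import numpy as np
from skimage.segmentation import expand_labels
from skimage.measure import regionprops_table

def run_pipeline(multiplexed, nuclear_ch, segment, crop=512, expand_px=2):
    """Sketch of the Fig. 1 pipeline. `multiplexed` is an H x W x N image;
    `segment` maps a 2D crop to an integer label mask (0 = background)."""
    nuclear = multiplexed[..., nuclear_ch]                 # (1) nuclear channel
    labels = np.zeros(nuclear.shape, dtype=np.int32)
    offset = 0
    for y in range(0, nuclear.shape[0], crop):             # (1) crop,
        for x in range(0, nuclear.shape[1], crop):         # (2) segment,
            tile = segment(nuclear[y:y + crop, x:x + crop])
            tile = np.where(tile > 0, tile + offset, 0)    # unique global IDs
            labels[y:y + tile.shape[0], x:x + tile.shape[1]] = tile  # (3) stitch
            offset = labels.max()
    labels = expand_labels(labels, distance=expand_px)     # (4) mask expansion
    stats = [regionprops_table(labels, multiplexed[..., m],
                               properties=("label", "mean_intensity"))
             for m in range(multiplexed.shape[-1])]        # (5) per-marker stats;
    return labels, stats                                   # bleed compensation
                                                           # follows (see Fig. 5)
```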
Fig. 2 Training and benchmarking CellSeg performance on the 2018 Kaggle data challenge. A Information on the Kaggle dataset used to develop, train, and test CellSeg. CellSeg's final performance was assessed on a test set provided by the Kaggle data challenge using the mean average precision (mAP) score. B CellSeg segmentation of a representative fluorescence image from the Kaggle test set. White arrowheads: cells with blurred nuclear boundaries. C CellSeg segmentation of a representative H&E-stained brightfield image. Red arrows: nuclear debris. D CellSeg performance compared to other top-performing segmentation algorithms in the Data Science Bowl. Columns show mean average precision (mean AP) scores reported on the Kaggle DSB2018 stage 2 test set and average F1 scores. For nucleAIzer, the scores reported in the original publication [29] are displayed. For StarDist, brightfield and fluorescence images were segmented using the 2D_versatile_he and 2D_versatile_fluo pre-trained models, respectively. For Cellpose, the pre-trained nuclei segmentation model was used (see “Methods” section for testing details)
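For reference, a minimal numpy sketch of the DSB2018-style metric named above: precision TP / (TP + FP + FN) averaged over IoU thresholds 0.5 to 0.95 in steps of 0.05, for a single ground-truth/prediction pair. The official Kaggle evaluation additionally averages over all test images, which is omitted here.

```python
import numpy as np

def dsb_average_precision(gt, pred, thresholds=np.arange(0.5, 1.0, 0.05)):
    """gt, pred: integer label masks (0 = background), each with >= 1 object."""
    gt_ids = np.unique(gt); gt_ids = gt_ids[gt_ids > 0]
    pr_ids = np.unique(pred); pr_ids = pr_ids[pr_ids > 0]
    # Pairwise IoU between every ground-truth and predicted object.
    iou = np.zeros((len(gt_ids), len(pr_ids)))
    for i, g in enumerate(gt_ids):
        gm = gt == g
        for j, p in enumerate(pr_ids):
            pm = pred == p
            inter = np.logical_and(gm, pm).sum()
            iou[i, j] = inter / (gm.sum() + pm.sum() - inter)
    scores = []
    for t in thresholds:
        # At IoU > 0.5, matches are necessarily one-to-one (label masks are
        # disjoint), so counting rows/columns with any hit is a valid matching.
        hit = iou > t
        tp = hit.any(axis=1).sum()
        fp = len(pr_ids) - hit.any(axis=0).sum()
        fn = len(gt_ids) - tp
        scores.append(tp / (tp + fp + fn))
    return float(np.mean(scores))
```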
Fig. 3 CellSeg performance on diverse human FFPE tissues. CellSeg performance on representative tissue images from a multi-tumor tissue microarray imaged with CODEX; all images show the DRAQ5 nuclear stain. A. Healthy spleen shows small cells. B. Dermatofibrosarcoma protuberans (DFSP) shows spindly nuclei. C. Glioblastoma multiforme (GBM) shows large, misshapen cells. D. Hepatocellular carcinoma (HCC) shows large, round cells. E. Seminoma shows a blend of large tumor cell nuclei and small nuclei from tumor-infiltrating lymphocytes. F. T-cell acute lymphoblastic leukemia (T-ALL) shows densely packed cells. Scale bar, 20 μm. Fluorescence intensity was increased relative to the original images for visualization purposes
Fig. 4 CellSeg performs comparably to established deep learning-based segmentation algorithms on diverse human FFPE tissues. Representative images from the tissues described in Fig. 3 are shown. A. StarDist, Cellpose, and CellSeg show comparable performance on spleen. B. StarDist oversegments several spindly nuclei in DFSP (arrows), while CellSeg and Cellpose segment nuclei accurately. C. StarDist and CellSeg segment more low-intensity objects in GBM (arrows). D. All three algorithms perform similarly well on HCC. E. StarDist and CellSeg segment more low-intensity objects in seminoma (arrows). F. All three algorithms perform relatively poorly on T-ALL. Scale bar, 20 μm
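For readers who want to reproduce a comparison like Figs. 2 and 4, the pre-trained models named in the captions can be invoked roughly as below (StarDist's 2D_versatile_fluo model and Cellpose's nuclei model; Cellpose API as of v2.x). The random image is a placeholder, and the exact preprocessing used in the paper's Methods may differ.

```python
import numpy as np
from csbdeep.utils import normalize
from stardist.models import StarDist2D
from cellpose import models

img = np.random.rand(512, 512).astype(np.float32)  # placeholder nuclear image

# StarDist: pre-trained fluorescence model, percentile-normalized input.
stardist = StarDist2D.from_pretrained("2D_versatile_fluo")
sd_labels, _ = stardist.predict_instances(normalize(img))

# Cellpose: pre-trained nuclei model; channels=[0, 0] means grayscale input.
cellpose = models.Cellpose(model_type="nuclei")
cp_labels, flows, styles, diams = cellpose.eval(img, diameter=None,
                                                channels=[0, 0])
```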
Fig. 5 Testing lateral bleed compensation on a CODEX dataset of colorectal cancer samples. A. Schematic demonstrating post-processing of the CellSeg segmentation, with the following steps. (1) Grow cell boundaries by a user-defined number of pixels (growth of two pixels shown). (2) Compute the inverse of the cell–cell adjacency matrix. (3) Multiply the inverted adjacency matrix by the marker pixel-intensity vector to obtain the compensated single-cell quantification table. B. Effects of lateral bleed compensation on double-positive cell populations in the CRC dataset for three pairs of mutually exclusive markers (CD8 vs. CD4, Cytokeratin vs. CD45, CD20 vs. CD3). Data shown are from one of the two CRC TMAs (TMA A), with comparable bleed compensation performance for the other TMA (TMA B, data not shown)
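A minimal numpy sketch of the linear model described in panel A, assuming a precomputed matrix of shared-boundary pixel counts; how CellSeg actually weights the adjacency entries may differ from this illustration.

```python
import numpy as np

def bleed_compensate(raw, contact):
    """Sketch of adjacency-based lateral bleed compensation.

    raw     : (n_cells, n_markers) summed pixel intensities per cell.
    contact : (n_cells, n_cells) shared-boundary pixel counts (0 diagonal).
    """
    contact = contact.astype(float)
    boundary = contact.sum(axis=1, keepdims=True)
    # Fraction of each cell's boundary shared with every neighbor.
    frac = np.divide(contact, boundary, out=np.zeros_like(contact),
                     where=boundary > 0)
    # Mixing model: observed = A @ true, i.e. each cell also picks up a
    # boundary-proportional share of its neighbors' signal.
    A = np.eye(len(raw)) + frac
    true = np.linalg.solve(A, raw)   # equivalent to A^-1 @ observed
    return np.clip(true, 0, None)    # intensities cannot be negative
```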
Fig. 6 Recapitulating previously identified cell populations using CellSeg. A. Visualization of identified populations in a representative CRC tissue. Points on the scatter plots show the positions of cells on the tissue image displayed in (B). Population identities were obtained through gating. B. Fluorescence image of a representative CRC tissue, showing the expression of the six phenotyping markers used in (A). C. Population correlation analysis between CellSeg and WTS. Each point corresponds to a TMA spot, where the X value is the gated population count from WTS and the Y value is the count computed from CellSeg. The least-squares regression line is displayed along with the r² value. T cells are defined as CD45+CD3+CD20−Cytokeratin−; macrophages as CD45+CD20−CD3− and CD68+, CD163+, or CD68+CD163+; B cells as CD45+CD20+CD3−Cytokeratin−; and tumor as Cytokeratin+CD45−
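The gate definitions above translate directly into boolean logic. Below is a hedged pandas sketch, assuming a DataFrame of per-cell marker-positivity calls (e.g. thresholded from the compensated single-cell table); the column names are illustrative, not CellSeg's actual output schema.

```python
import pandas as pd

def gate_populations(cells: pd.DataFrame) -> pd.Series:
    """Assign each cell a population label from boolean marker-positivity
    columns (CD45, CD3, CD20, CD68, CD163, Cytokeratin)."""
    t   = cells.CD45 & cells.CD3 & ~cells.CD20 & ~cells.Cytokeratin
    b   = cells.CD45 & cells.CD20 & ~cells.CD3 & ~cells.Cytokeratin
    mac = cells.CD45 & ~cells.CD20 & ~cells.CD3 & (cells.CD68 | cells.CD163)
    tum = cells.Cytokeratin & ~cells.CD45
    out = pd.Series("other", index=cells.index)
    out[t] = "T cell"
    out[b] = "B cell"
    out[mac] = "macrophage"
    out[tum] = "tumor"
    return out

# Usage: per-spot counts for the Fig. 6C correlation would then be, e.g.,
# gate_populations(cells).value_counts().
```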