Bernd Lahrmann, Niels Halama, Hans-Peter Sinn, Peter Schirmacher, Dirk Jaeger, Niels Grabe.
Abstract
The upcoming quantification and automation in biomarker-based histological tumor evaluation will require computational methods capable of automatically identifying tumor areas and differentiating them from the stroma. As no single generally applicable tumor biomarker is available, pathology routinely uses morphological criteria as a spatial reference system. Here we present and evaluate a method capable of performing this classification in immunofluorescence histological slides using solely a DAPI background stain. The restriction to a single color channel makes this inherently challenging. We formed cell graphs based on the topological distribution of the tissue cell nuclei and extracted the corresponding graph features. By using topological, morphological and intensity-based features, we could systematically quantify and compare the discrimination capability that individual features contribute to the overall algorithm. We show that when classifying fluorescence tissue slides in the DAPI channel, morphological and intensity-based features clearly outperform topological ones, which have been used exclusively in related previous approaches. We assembled the 15 best features to train a support vector machine based on keratin-stained tumor areas. On a test set of TMAs with 210 cores of triple-negative breast cancers, our classifier was able to distinguish between tumor and stroma tissue with an overall accuracy of 88%. Our method yields first results on the discrimination capability of feature groups, which is essential for automated tumor diagnostics. It also provides an objective spatial reference system for the multiplex analysis of biomarkers in fluorescence immunohistochemistry.
Year: 2011 PMID: 22164226 PMCID: PMC3229509 DOI: 10.1371/journal.pone.0028048
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Microscopic image examples of different TMA cores.
(a) Representation of all three channels of a fluorescently stained core in RGB color space. Glyphs originate from the TMA preparation. Red represents the stromal marker (Vimentin), green the tumor marker (CK19), and blue the DAPI channel highlighting the cell nuclei; (b) the DAPI channel of (a) as an intensity image: in general, tumor cells are darker and more tightly connected than stromal cells; (c) another DAPI image of a core with a high density of cells; (d) a core with a lower density of cells, illustrating the high heterogeneity among the cores.
Figure 2. A flowchart showing the individual steps of our methodology.
After the images are obtained, pre-processing steps enhance the image quality for the subsequent segmentation, and a watershed segmentation is applied. The cell graphs are then generated and their features are computed. The last step uses an SVM to classify the graphs as either tumor or stroma.
Figure 3. The different image processing and graph generation steps.
(a) Original image of the DAPI channel; (b) image after shading correction and noise removal; (c) result of the watershed segmentation, with the segmented cells highlighted by a green contour; (d) the image after removal of single cells; (e) cells connected by the graph generation step, where cells marked with the same color belong to the same sub-graph; (f) cell graph representation of the cells. The red dots are the nodes, which represent the cells; the black lines are the edges between them.
Figure 4. Conceptual representation of cell graphs.
(a) Artificial sketch of three different cell types: tumor cells in blue, lymphocytes in white, and fibroblasts in purple. (b) Cell graph representation of (a). Cells are depicted as nodes, and links between them represent biological relations.
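The idea of turning nuclei into nodes and linking related cells can be illustrated with a minimal sketch. It assumes that two nuclei are linked whenever their centroids lie within a fixed pixel distance; this linking rule, the threshold `max_dist`, the function names and the coordinates are illustrative assumptions, not the rule used in the study.

```python
from itertools import combinations
from math import dist


def build_cell_graph(centroids, max_dist):
    """Link every pair of nuclei whose centroids lie within max_dist (assumed rule)."""
    adjacency = {i: set() for i in range(len(centroids))}
    for i, j in combinations(range(len(centroids)), 2):
        if dist(centroids[i], centroids[j]) <= max_dist:
            adjacency[i].add(j)
            adjacency[j].add(i)
    return adjacency


def connected_subgraphs(adjacency):
    """Split the graph into connected sub-graphs, each a sorted list of node ids."""
    seen, components = set(), []
    for start in adjacency:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in adjacency[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        components.append(sorted(component))
    return components


# Hypothetical nucleus centroids in pixel coordinates.
centroids = [(0, 0), (3, 4), (6, 8), (50, 50), (53, 54), (200, 200)]
graph = build_cell_graph(centroids, max_dist=6.0)
# Keep only sub-graphs with more than one cell; isolated nuclei are dropped.
subgraphs = [c for c in connected_subgraphs(graph) if len(c) > 1]
print(subgraphs)  # [[0, 1, 2], [3, 4]]
```

Each resulting sub-graph corresponds to one group of same-colored cells in Figure 3(e) and is the unit that is later classified as tumor or stroma.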
Graph metrics used to train the classifier and their description.
| Name | Description |
| --- | --- |
| (1) Nr. of nodes | Defines the number of nodes in a graph. |
| (2) Nr. of edges | Total number of edges in a graph. |
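| (3) Average degree | The average degree of the nodes of the graph. The degree of a node is defined as the number of its edges, i.e. the number of its neighbouring nodes. |
| (4) Diameter | The eccentricity $\epsilon(v)$ of a node $v$ is the largest shortest-path distance between $v$ and any other node of the graph. The maximum eccentricity is called the graph diameter: $d = \max_{v} \epsilon(v)$. |
| (5) Radius | The minimum graph eccentricity is called the graph radius: $r = \min_{v} \epsilon(v)$. |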
| (6) Nr. of central points | Number of nodes that have eccentricity equal to the radius. |
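| (7) Average clustering coefficient | Here, the average of the clustering coefficients of the nodes of the graph is used as a global metric. The clustering coefficient $C_i$ of a node $v_i$ is given as $C_i = \frac{2E_i}{k_i(k_i - 1)}$, where $k_i$ is the number of neighbours of $v_i$ and $E_i$ is the number of edges among those neighbours. |
| (8) Nr. of end nodes | The number of nodes with degree equal to one. |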
| (9) Percentage of end nodes. | The percentage of end nodes of a graph. |
| (10) Hop-plot exponent | The hop-plot exponent is the slope of the hop-plot values as a function of $h$ on a log-log scale. The hop-plot value reflects the size of the neighbourhood between nodes within $h$ hops. |
| (11) Average area | The average area of the cells of the graph. |
| (12) Average eccentricity | The eccentricity is the ratio of the distance between the foci of the ellipse and its major axis length. The value is between 0 and 1 (an ellipse whose eccentricity is 0 is a circle, while an ellipse whose eccentricity is 1 is a line segment). With major axis length $a$ and minor axis length $b$, the eccentricity is given as $e = \frac{\sqrt{a^2 - b^2}}{a}$. |
| (13) Average equivalent diameter | The equivalent diameter specifies the diameter of a circle with the same area $A$ as the cell: $d_{eq} = \sqrt{4A/\pi}$. |
| (14) Average extent | The extent specifies the proportion of pixels in the smallest rectangle containing the cell that are also in the region. Computed as the cell area divided by the area of the smallest rectangle containing the cell. |
| (15) Average major axis length | The length (in pixels) of the major axis of the ellipse that has the same normalized second central moments as the cell. |
| (16) Average minor axis length | The length (in pixels) of the minor axis of the ellipse that has the same normalized second central moments as the cell. |
| (17) Average max Intensity | Maximum intensity of the cell. |
| (18) Average min Intensity | Minimum intensity of the cell. |
| (19) Average mean intensity | Mean intensity of the cell. |
| (20) Average perimeter | Perimeter: the distance around the boundary of the cell. |
| (21) Average STD Intensity | Standard deviation of the intensity of a cell. |
| (22) Average median intensity | Median intensity level of a cell. |
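Several of the metrics in the table follow directly from their definitions, as the following sketch illustrates. It assumes a connected, undirected graph stored as a dictionary that maps each node to the set of its neighbours; that representation and all function names are choices made for this example. The hop-plot exponent (10) and the intensity features are omitted, and the shape helpers take the area and axis lengths as given rather than measuring them from an image.

```python
from collections import deque
from math import pi, sqrt


def hop_distances(adjacency, source):
    """Breadth-first search: hop count from source to every reachable node."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in adjacency[node]:
            if neighbour not in distances:
                distances[neighbour] = distances[node] + 1
                queue.append(neighbour)
    return distances


def clustering_coefficient(adjacency, node):
    """C_i = 2 E_i / (k_i (k_i - 1)); taken as 0 when the node has fewer than 2 neighbours."""
    neighbours = list(adjacency[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(
        1
        for a in range(k)
        for b in range(a + 1, k)
        if neighbours[b] in adjacency[neighbours[a]]
    )
    return 2.0 * links / (k * (k - 1))


def topological_features(adjacency):
    """Metrics (1)-(9) for one connected, undirected cell graph."""
    n = len(adjacency)
    degrees = [len(nbrs) for nbrs in adjacency.values()]
    ecc = {v: max(hop_distances(adjacency, v).values()) for v in adjacency}
    radius = min(ecc.values())
    end_nodes = sum(1 for d in degrees if d == 1)
    clustering = [clustering_coefficient(adjacency, v) for v in adjacency]
    return {
        "nodes": n,
        "edges": sum(degrees) // 2,
        "average_degree": sum(degrees) / n,
        "diameter": max(ecc.values()),
        "radius": radius,
        "central_points": sum(1 for e in ecc.values() if e == radius),
        "average_clustering": sum(clustering) / n,
        "end_nodes": end_nodes,
        "percent_end_nodes": 100.0 * end_nodes / n,
    }


def equivalent_diameter(area):
    """Diameter of the circle that has the same area as the cell."""
    return sqrt(4.0 * area / pi)


def ellipse_eccentricity(major_axis, minor_axis):
    """Distance between the foci divided by the major axis length."""
    return sqrt(1.0 - (minor_axis / major_axis) ** 2)


def extent(area, bbox_width, bbox_height):
    """Cell area divided by the area of its bounding rectangle."""
    return area / (bbox_width * bbox_height)


# Hypothetical graph: a triangle (0, 1, 2) with a two-node tail (3, 4).
example = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2, 4}, 4: {3}}
features = topological_features(example)
print(features["diameter"], features["radius"], features["central_points"])  # 3 2 2
```

In the example, nodes 2 and 3 are the central points because their eccentricity equals the radius, and node 4 is the only end node.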
Figure 5. The results of the classification.
(a–d) Original RGB core images; (e–h) the corresponding DAPI channel of the cores in (a–d) as an intensity image; (i–l) results of the classification step: green = cells classified as tumor cells, blue = cells classified as stroma cells.
Accuracy of the watershed cell segmentation.
| Total nuclei | Correctly segmented | Over-segmented | Under-segmented |
| --- | --- | --- | --- |
| 5162 | 4860 | 272 | 30 |
| 100% | 94.1% (±3.75) | 5.3% (±4.0) | 0.6% (±0.3) |
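The percentage row can be checked against the nucleus counts with a few lines of arithmetic. The snippet below uses only the counts from the table; the values in parentheses cannot be recomputed from these totals and are presumably the spread across cores, which is an assumption here.

```python
# Nucleus counts taken from the table above.
counts = {"correct": 4860, "over": 272, "under": 30}
total = sum(counts.values())

# Share of each segmentation outcome in percent of all nuclei.
shares = {name: 100.0 * n / total for name, n in counts.items()}

print(total)  # 5162
for name, share in shares.items():
    print(f"{name}: {share:.1f}%")
```

The three counts add up to the 5162 nuclei in the first column, and the rounded shares match the reported 94.1%, 5.3% and 0.6%.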
The F-scores of each feature in descending order.
| Features | Type | F-score |
| --- | --- | --- |
| | I | 0.240 |
| | M | 0.212 |
| | M | 0.209 |
| | M | 0.182 |
| | I | 0.143 |
| | I | 0.108 |
| | M | 0.063 |
| | I | 0.056 |
| | M | 0.051 |
| | T | 0.039 |
| | T | 0.038 |
| | T | 0.038 |
| | T | 0.038 |
| | T | 0.038 |
| | T | 0.038 |
| Average extent | M | 0.038 |
| Average min Intensity | I | 0.016 |
| Nr. of nodes | T | 0.008 |
| Average clustering coefficient | T | 0.007 |
| Nr. of edges | T | 0.006 |
| Number of end nodes | T | 0.004 |
| Number of central points | T | 0.001 |
| | M | 0.144 |
| | I | 0.112 |
| | T | 0.023 |
The table shows the evaluated features sorted by decreasing value for tissue classification (F-score). For each feature, the type is given as morphological (M), intensity (I), or topological (T).
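How such a ranking can be produced is sketched below. The sketch assumes that the F-score is the Fisher-style score often paired with SVM feature selection, in which the squared distances of the two class means from the overall mean are divided by the sum of the two within-class sample variances; the text here does not state the exact formula. The feature values are invented for illustration.

```python
from statistics import mean, variance


def f_score(tumor_values, stroma_values):
    """Between-class separation divided by within-class sample variance (assumed definition)."""
    overall_mean = mean(list(tumor_values) + list(stroma_values))
    between = (mean(tumor_values) - overall_mean) ** 2 + (
        mean(stroma_values) - overall_mean
    ) ** 2
    within = variance(tumor_values) + variance(stroma_values)
    return between / within


# Hypothetical per-graph feature values: (tumor graphs, stroma graphs).
features = {
    "average_mean_intensity": ([0.30, 0.35, 0.32, 0.28], [0.60, 0.55, 0.62, 0.58]),
    "nr_of_edges": ([12.0, 30.0, 18.0, 25.0], [14.0, 28.0, 20.0, 22.0]),
}

# Rank the features by decreasing F-score, as in the table.
ranking = sorted(
    ((name, f_score(tumor, stroma)) for name, (tumor, stroma) in features.items()),
    key=lambda item: item[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

A feature whose class means are far apart relative to its spread within each class scores high, so in this invented example the intensity feature ranks above the edge count.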
The average classification accuracies.
| | Training set | Test set 1 | Test set 2 | Test set 3 | Test set 4 | Test set 5 | Average 1–5 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Overall | 88.47 (±6.68) | 87.65 (±8.19) | 90.30 (±6.44) | 88.68 (±7.19) | 88.76 (±6.98) | 88.59 (±9.83) | 88.80 (±7.73) |
| Tumor | 89.26 (±10.20) | 87.56 (±13.29) | 87.83 (±12.47) | 88.00 (±17.64) | 88.98 (±10.01) | 87.71 (±14.13) | 88.02 (±13.51) |
| Stroma | 85.14 (±10.95) | 81.19 (±11.62) | 91.45 (±6.21) | 82.97 (±15.12) | 80.02 (±12.35) | 86.90 (±13.69) | 84.67 (±11.80) |
The table shows the accuracies of the training set and the accuracies of the slides from the test set.
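The following sketch shows one way to obtain overall, tumor and stroma accuracies and to average them over slides. It assumes that accuracy is counted per classified cell against a reference label; the function name and the labels are invented for the example. Only the five per-slide values at the end are taken from the Overall row of the table.

```python
from statistics import mean


def class_accuracies(true_labels, predicted_labels):
    """Percentage of correctly classified cells, overall and per class."""
    pairs = list(zip(true_labels, predicted_labels))
    result = {"overall": 100.0 * sum(t == p for t, p in pairs) / len(pairs)}
    for label in ("tumor", "stroma"):
        hits = [t == p for t, p in pairs if t == label]
        result[label] = 100.0 * sum(hits) / len(hits)
    return result


# Hypothetical reference and predicted labels for ten cells.
truth = ["tumor"] * 6 + ["stroma"] * 4
predicted = ["tumor"] * 5 + ["stroma"] + ["stroma"] * 3 + ["tumor"]
accuracies = class_accuracies(truth, predicted)
print(accuracies["overall"], accuracies["stroma"])  # 80.0 75.0

# Overall accuracies of the five test slides, taken from the table above.
overall_per_slide = [87.65, 90.30, 88.68, 88.76, 88.59]
average_overall = mean(overall_per_slide)
print(f"{average_overall:.2f}")  # 88.80
```

The unweighted mean of the five test-slide values reproduces the reported overall average of 88.80.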
The accuracies of the slides of the test set with the additional single node classification.
| | Test set 1 | Test set 2 | Test set 3 | Test set 4 | Test set 5 | Average 1–5 |
| --- | --- | --- | --- | --- | --- | --- |
| Overall | 84.02 (±7.83) | 87.46 (±6.85) | 86.12 (±8.23) | 86.28 (±5.88) | 86.69 (±8.51) | 86.12 (±7.46) |
| Tumor | 85.62 (±13.26) | 86.05 (±12.57) | 86.31 (±17.08) | 88.39 (±9.77) | 86.64 (±13.41) | 86.60 (±13.22) |
| Stroma | 75.94 (±13.11) | 84.58 (±13.63) | 80.23 (±14.24) | 76.29 (±11.81) | 81.72 (±13.55) | 79.75 (±13.27) |