Marc Aubreville, Christof A Bertram, Taryn A Donovan, Christian Marzahl, Andreas Maier, Robert Klopfleisch.
Abstract
Canine mammary carcinoma (CMC) has been used as a model to investigate the pathogenesis of human breast cancer, and the same grading scheme is commonly used to assess tumor malignancy in both. One key component of this grading scheme is the density of mitotic figures (MF). Current publicly available datasets on human breast cancer only provide annotations for small subsets of whole slide images (WSIs). We present a novel dataset of 21 WSIs of CMC completely annotated for MF. For this, a pathologist screened all WSIs for potential MF and structures with a similar appearance. A second expert blindly assigned labels, and for non-matching labels, a third expert assigned the final labels. Additionally, we used machine learning to identify previously undetected MF. Finally, we performed representation learning and two-dimensional projection to further increase the consistency of the annotations. Our dataset consists of 13,907 MF and 36,379 hard negatives. We achieved a mean F1-score of 0.791 on the test set and of up to 0.696 on a human breast cancer dataset.
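The label-resolution rule described in the abstract (two experts, with a third deciding on disagreement) can be illustrated with a minimal sketch. The function name `consensus_label` and the label strings are hypothetical and chosen only for illustration; the authors' actual annotation tooling is not described here.

```python
from typing import Optional

# Hypothetical label strings, used only for this illustration.
MITOTIC = "mitotic figure"
NON_MITOTIC = "non-mitotic"


def consensus_label(first: str, second: str, third: Optional[str] = None) -> str:
    """Resolve one candidate cell's label from up to three expert opinions.

    If the first two experts agree, their label is final. Otherwise the
    third expert's label is used as the final label.
    """
    if first == second:
        return first
    if third is None:
        raise ValueError("experts disagree and no third opinion was given")
    return third


# Agreement: no third opinion needed.
print(consensus_label(MITOTIC, MITOTIC))
# Disagreement: the third expert decides.
print(consensus_label(MITOTIC, NON_MITOTIC, third=NON_MITOTIC))
```

Raising an error when the third opinion is missing, rather than defaulting to one class, keeps unresolved candidates visible instead of silently biasing the label counts.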
Year: 2020 PMID: 33247116 PMCID: PMC7699627 DOI: 10.1038/s41597-020-00756-z
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Fig. 1. Examples of mitotic figures and structures with a similar appearance. Due to ambiguities, precise classification for some candidates is not straightforward.
Fig. 2. Generation of the object detection-augmented and expert-labeled dataset (ODAEL). Adapted from Bertram et al.[12].
Overview of the individual slides of the final dataset.
| Case No. | File name | tumor area | No. of mitotic figures (MEL/ODAEL/CODAEL) | No. of non-mitotic cells (MEL/ODAEL/CODAEL) | set |
|---|---|---|---|---|---|
| 1 | 4eee7b944ad5e46c60ce.svs | 66.06 mm2 | 47/61/64 | 114/196/193 | test |
| 2 | a8773be388e12df89edd.svs | 37.01 mm2 | 64/71/74 | 204/591/588 | train |
| 3 | deb768e5efb9d1dcbc13.svs | 187.43 mm2 | 92/96/84 | 287/472/484 | train |
| 4 | e09512d530d933e436d5.svs | 214.97 mm2 | 87/98/102 | 602/742/738 | test |
| 5 | 72c93e042d0171a61012.svs | 26.29 mm2 | 130/151/140 | 375/680/691 | train |
| 6 | 2d56d1902ca533a5b509.svs | 49.32 mm2 | 139/155/153 | 228/365/367 | test |
| 7 | 084383c18b9060880e82.svs | 41.71 mm2 | 157/173/160 | 404/547/560 | train |
| 8 | da18e7b9846e9d38034c.svs | 253.10 mm2 | 187/210/211 | 991/1,354/1,353 | train |
| 9 | 13528f1921d4f1f15511.svs | 339.93 mm2 | 283/301/292 | 963/1,127/1,136 | test |
| 10 | d0423ef9a648bb66a763.svs | 273.88 mm2 | 378/411/354 | 1,143/1,596/1,653 | train |
| 11 | 69a02453620ade0edefd.svs | 45.35 mm2 | 634/642/612 | 1,407/1,505/1,535 | test |
| 12 | d37ab62158945f22deed.svs | 226.39 mm2 | 578/651/674 | 1,105/1,725/1,702 | train |
| 13 | d7a8af121d7d4f3fbf01.svs | 426.92 mm2 | 716/746/720 | 1,832/2,373/2,399 | train |
| 14 | 460906c0b1fe17ea5354.svs | 112.24 mm2 | 673/742/754 | 1,199/2,480/2,468 | train |
| 15 | b1bdee8e5e3372174619.svs | 231.84 mm2 | 812/861/869 | 1,260/1,832/1,824 | test |
| 16 | c4b95da36e32993289cb.svs | 257.01 mm2 | 1,097/1,114/1,085 | 2,454/2,944/2,973 | train |
| 17 | 022857018aa597374b6c.svs | 325.81 mm2 | 1,290/1,344/1,320 | 2,463/3,106/3,130 | test |
| 18 | 50cf88e9a33df0c0c8f9.svs | 269.25 mm2 | 1,197/1,339/1,337 | 1,632/2,550/2,552 | train |
| 19 | 3d3d04eca056556b0b26.svs | 513.28 mm2 | 1,383/1,465/1,447 | 2,110/2,933/2,951 | train |
| 20 | 2191a7aa287ce1d5dbc0.svs | 96.38 mm2 | 1,449/1,485/1,462 | 2,155/2,609/2,632 | train |
| 21 | fa4959e484beec77543b.svs | 365.91 mm2 | 1,949/2,035/1,993 | 3,598/4,408/4,450 | train |
| total | | 4,360.07 mm2 | 13,342/14,151/13,907 | 26,526/36,135/36,379 | |
For each slide, the number of mitotic figures and number of non-mitotic structures (hard negatives) is given for each of the three dataset variants: manually expert labeled (MEL), object detection-augmented and expert labeled (ODAEL) and clustering and object detection-augmented and expert labeled (CODAEL).
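As a rough illustration of how the per-slide counts relate to mitotic density, the sketch below divides the CODAEL mitotic figure count by the tumor area for three rows copied from the table. It also scales the result to the 10 HPF area of 2.37 mm2 given in the Fig. 4 caption. This whole-slide average is an assumption-laden simplification: it is not the mitotic count used for grading, which refers to a single region of that size.

```python
# (case number, tumor area in mm^2, CODAEL mitotic figure count),
# copied from the slide overview table.
slides = [
    (1, 66.06, 64),
    (11, 45.35, 612),
    (21, 365.91, 1993),
]

# Area of 10 high power fields as stated in the Fig. 4 caption.
TEN_HPF_AREA_MM2 = 2.37


def density_per_mm2(count: int, area_mm2: float) -> float:
    """Average number of mitotic figures per mm^2 of tumor area."""
    return count / area_mm2


for case, area, count in slides:
    density = density_per_mm2(count, area)
    per_10_hpf = density * TEN_HPF_AREA_MM2
    print(f"case {case}: {density:.2f} MF/mm^2, "
          f"about {per_10_hpf:.1f} per 10 HPF area on average")
```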
Fig. 3. Generation of the Clustering and Object Detection-Augmented Expert Labeled (CODAEL) dataset variant by reassessment of mitotic figures (red) and hard negatives (blue) in a clustered visual representation.
Fig. 4. Statistical overview of the count of mitotic figures per area of 10 high power fields (10 HPF, 2.37 mm2) (bottom right). For better visualization, the dataset was split up into two groups (according to the overall sum of mitotic figures). Whiskers indicate absolute maximum, boxes indicate second to third quartile. The dashed red and green lines represent cut-off values. The four images (top row and bottom left) are examples of mitotic figure distribution through the histological section (H&E stain) using the clustering-aided (CODAEL) dataset variant. Red outlines indicate tumor region. Green dots indicate mitotic figures. The green rectangle in each image indicates the region of maximum mitotic count in an area encompassing 10 HPF (2.37 mm2).
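Finding the region of maximum mitotic count can be sketched as a search for the axis-aligned window that contains the most annotated points. The sketch below assumes mitotic figure coordinates in millimetres and a square window of 2.37 mm2; the window shape, the function name `max_count_in_window`, and the brute-force search are illustrative assumptions, not the authors' procedure.

```python
import math


def max_count_in_window(points, width, height):
    """Return the largest number of points inside any width x height window.

    Brute force: an optimal axis-aligned window can always be shifted so
    that its left edge passes through some point's x and its bottom edge
    through some point's y, so only those corner positions are tested.
    """
    best = 0
    xs = sorted({x for x, _ in points})
    ys = sorted({y for _, y in points})
    for x0 in xs:
        for y0 in ys:
            inside = sum(
                1
                for x, y in points
                if x0 <= x <= x0 + width and y0 <= y <= y0 + height
            )
            best = max(best, inside)
    return best


# Square window with the 10 HPF area from the caption (assumed shape).
side_mm = math.sqrt(2.37)

# Hypothetical mitotic figure positions in mm.
points = [(0.1, 0.1), (0.5, 0.4), (1.2, 1.0), (5.0, 5.0), (5.2, 5.1)]
print(max_count_in_window(points, side_mm, side_mm))
```

The brute-force search is cubic in the number of points, so for slides with thousands of annotations a grid-based count or a sweep-line method would be the more practical choice.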
Cell classification experiment, based on cropped-out patches of the manually expert labeled (MEL), object detection-augmented and expert labeled (ODAEL) and the clustering + object detection-augmented and expert labeled (CODAEL) dataset variants.
| Metric | MEL variant | ODAEL variant | CODAEL variant |
|---|---|---|---|
| Precision | 0.675 ± 0.055 | 0.642 ± 0.040 | 0.679 ± 0.044 |
| Recall | 0.898 ± 0.023 | 0.883 ± 0.031 | 0.891 ± 0.028 |
| ROC AUC | 0.926 ± 0.014 | 0.930 ± 0.002 | 0.944 ± 0.007 |
ROC AUC indicates the area under the receiver operating characteristic curve. Values represent mean ± standard deviation of five independent training and inference runs.
Performance assessment (F1 score) for mitotic figure detection on the test set of the three different dataset variants, mean and standard deviation for five independent training and inference runs.
| Network | MEL | ODAEL | CODAEL |
|---|---|---|---|
| Single stage (RetinaNet) | 0.681 ± 0.014 | 0.702 ± 0.023 | 0.735 ± 0.013 |
| Dual stage (RetinaNet + ResNet-18) | 0.707 ± 0.013 | 0.785 ± 0.003 | 0.791 ± 0.012 |
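For reference, the F1 score reported above is the harmonic mean of precision and recall, which can be computed directly from detection counts. The sketch below uses hypothetical true positive, false positive and false negative counts for five runs and summarizes them as mean ± standard deviation. The use of the sample standard deviation is an assumption, since the source does not state which estimator was used.

```python
from statistics import mean, stdev


def f1_score(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall."""
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


# Hypothetical (TP, FP, FN) counts for five independent runs.
runs = [
    (800, 190, 210),
    (790, 200, 220),
    (810, 210, 200),
    (805, 185, 205),
    (795, 205, 215),
]

scores = [f1_score(tp, fp, fn) for tp, fp, fn in runs]

# stdev() is the sample standard deviation (n - 1 in the denominator).
print(f"F1 = {mean(scores):.3f} ± {stdev(scores):.3f}")
```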
Fig. 5. Patches containing mitotic figures from our canine dataset (left), the AMIDA13 cases within TUPAC16 (middle), and the remaining cases of TUPAC16 (right). The clear difference in color representation causes a domain shift.
Mitotic figure detection performance (F1 Score), when trained on the final canine mammary carcinoma (CMC) dataset and tested on the TUPAC16 dataset and its subsets (including AMIDA-13), without any domain adaptation and with only threshold optimization (TO) and model selection (MS), or with transfer learning (TL) performed on the target domain. Values given are mean and standard deviations of five independent training and inference runs. The histological images (cases) were obtained with Aperio ScanScope XT (A) or Leica SCN400 (L) scanner, both with a resolution of 0.25 microns per pixel (400X magnification).
| Training conditions | Test dataset | Test labels | Test cases | Scanner | F1, single stage | F1, dual stage |
|---|---|---|---|---|---|---|
| only CMC | TUPAC-train | original labels | 73 | L, A | 0.528 ± 0.029 | 0.544 ± 0.014 |
| only CMC | TUPAC-test | original labels | 34 | L | 0.322 ± 0.032 | 0.268 ± 0.039 |
| only CMC | AMIDA-train | original labels | 12 | A | 0.524 ± 0.022 | 0.574 ± 0.019 |
| only CMC | AMIDA-test | original labels | 11 | A | 0.546 ± 0.044 | 0.579 ± 0.026 |
| TO and MS on AMIDA-train | AMIDA-test | original labels | 11 | A | 0.584 | 0.628 |
| only CMC | TUPAC-train | re-labeled | 73 | L, A | 0.564 ± 0.038 | 0.573 ± 0.019 |
| only CMC | TUPAC-test | re-labeled | 34 | L | 0.298 ± 0.044 | 0.218 ± 0.036 |
| only CMC | AMIDA-train | re-labeled | 12 | A | 0.592 ± 0.043 | 0.645 ± 0.016 |
| only CMC | AMIDA-test | re-labeled | 11 | A | 0.594 ± 0.047 | 0.635 ± 0.028 |
| TO and MS on AMIDA-train | AMIDA-test | re-labeled | 11 | A | 0.628 | 0.696 |
| TL on AMIDA-train | AMIDA-test | re-labeled | 11 | A | 0.720 ± 0.022 | 0.733 ± 0.007 |
| Measurement(s) | Mitotic Figure • Slide Image • non-mitotic structures • anatomical phenotype annotation |
| Technology Type(s) | Pathology Report • hematoxylin and eosin stain • machine learning |
| Factor Type(s) | breast cancer tissue |
| Sample Characteristic - Organism | Canis |