
Mitotic counts in breast cancer should be standardized with a uniform sample area.

Abstract

BACKGROUND: Mitotic rate is routinely assessed in breast cancer cases and based on the assessment of 10 high power fields (HPF), a non-standard sample area, as per the College of American Pathologists cancer checklist. The effect of sample area variation has not been assessed.
METHODS: A computer model making use of the binomial distribution was developed to calculate the misclassification rate in 1,000,000 simulated breast specimens using the extremes of field diameter (FD) and mitotic density cutoffs (3 and 8 mitoses/mm2), and for a sample area of 5 mm2. Mitotic counts were assumed to be a random sampling problem using a mitotic rate distribution derived from an experimental study (range 0-16.4 mitoses/mm2). The cellular density was 2500 cell/mm2.
RESULTS: For the smallest microscopes (FD = 0.40 mm, area 1.26 mm2) 16% of cases were misclassified, compared to 9% of the largest (FD 0.69 mm, area 3.74 mm2), versus 8% for 5 mm2. An excess of 27% of score 2 cases were misclassified as 1 or 3 for the lower FD.
CONCLUSION: Mitotic scores based on ten HPFs of a small field area microscope are less reliable measures of the mitotic density than in a bigger field area microscope; therefore, the sample area should be standardized. When mitotic counts are close to the cut-offs the score is less reproducible. These cases could benefit from using larger sample areas. A measure of mitotic density variation due to sampling may assist in the interpretation of the mitotic score.


Keywords: Breast cancer; Cancer grading; HPF; Mitotic counting; Reproducibility; Sampling; Simulation


Year: 2017 PMID: 28202066 PMCID: PMC5312435 DOI: 10.1186/s12938-016-0301-z

Source DB: PubMed Journal: Biomed Eng Online ISSN: 1475-925X Impact factor: 2.819

Background

Tumour growth rate is a prognostic marker and can be evaluated by its correlate at the cellular level: mitoses. Thus, mitotic counts are used in a wide range of neoplasms to predict prognosis, and highly proliferative neoplasms (with many mitoses) usually have a worse prognosis. Mitotic counts are performed by a pathologist, who counts mitotic figures at high magnification. As mitotic figures are rare in relation to the number of cells, 10 high power fields (HPF) are typically examined. As cellularity is time-consuming to quantify, mitoses/area is often used instead of mitoses/cell. Considered as a sampling problem, mitotic counting is typically biased in two ways: (1) many pathologists do not start the count until they have found one mitosis, and (2) pathologists count mitoses in the area of the tumour they consider to be the most mitotically active (usually the most poorly differentiated portion). The former introduces a systematic bias that consistently skews the results towards a higher mitotic score. The latter is not a significant factor if one frames the problem as an assessment of the poorly differentiated region of the tumour (as opposed to the tumour as a whole).
In breast pathology, mitotic counts are a part of the Nottingham score and have been demonstrated to be a histomorphologic predictor of outcome. The procedure for counting mitoses was laid out in the original paper that described the scoring system [1], and has subsequently been clarified in the CAP protocol, which states that the count should be done on the “most mitotically active area”. The system recognizes that mitotic rate per area is a strong predictor and essentially standardizes the cut-points (3 and 8 mitoses/mm2) to two separate mitotic rates (mitoses/area), creating a three-tier system. The system pseudo-standardizes the sample area to 10 HPF, where one HPF is the field area seen with the 40x objective and is dependent on the field diameter of the microscope.
Mitotic counting in breast pathology has been considered a sampling problem, and it has been studied experimentally and modelled mathematically [3]. Experimentally assessing the reproducibility of mitotic counts rigorously is an onerous proposition, and it would be prohibitively expensive to do a large study from which the misclassification errors due to sampling could be accurately assessed. The reproducibility of mitotic counts in the context of breast pathology is “low” [3]; thus, small sample sizes (< 50 cases) are not sufficient. Unselected breast cancer cases are usually mitotic score 1, and this makes a study of misclassification more onerous, as the effect size is smaller than it would be if the cases were equally distributed among the different scores. A computer simulation of this problem is an elegant solution, as it can avoid the onerous labour and control for many confounders. Populations of simulated specimens can be randomly sampled and compared to the true mitotic rates. Such a gold standard comparison is not possible using glass slides and pathologists. This can be done millions of times to determine the distribution of correctly and incorrectly classified simulated specimens, and is a powerful tool for demonstrating the effect of sampling 10 HPFs from microscopes with different field areas.
The area of a HPF measured in mm2 may vary considerably from microscope to microscope. Using the smallest and largest field diameters from the table in the College of American Pathologists (CAP) checklist for Invasive Breast Cancer [2], a microscope with a field diameter of 0.40 mm has a HPF area of 0.13 mm2, while one with a field diameter of 0.69 mm has a HPF area of 0.37 mm2. This is almost a three-fold difference in the area sampled in one HPF. HPF is a tenuous pseudo-standard and, unfortunately, this measure is widely used throughout pathology, from eosinophilic esophagitis [4] to gastrointestinal stromal tumours [5] and breast cancer [6], to mention a few.
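The HPF areas quoted above follow from treating the field of view as a circle, so that area = π(FD/2)². The short Python sketch below reproduces that arithmetic for the two field diameters named in the text; the function name `hpf_area_mm2` is only an illustrative choice and is not taken from the original model.

```python
import math


def hpf_area_mm2(field_diameter_mm):
    """Area of one circular high power field, in mm2, from its diameter in mm."""
    return math.pi * (field_diameter_mm / 2.0) ** 2


# The two extremes of field diameter quoted from the CAP checklist table.
for fd in (0.40, 0.69):
    one_field = hpf_area_mm2(fd)
    print(f"FD {fd:.2f} mm: 1 HPF = {one_field:.2f} mm2, 10 HPF = {10 * one_field:.2f} mm2")

# Ratio of the largest to the smallest field area (the "almost three-fold" difference).
print(f"ratio = {hpf_area_mm2(0.69) / hpf_area_mm2(0.40):.2f}")
```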
It has been recognized that the HPF, an often used measure of area, has no uniform definition and is a significant and under-recognized source of variability that can lead to misclassification, affecting reproducibility. As a result, there has been a gradual move toward establishing standardized sample areas, e.g. 5 mm2 for gastrointestinal stromal tumour [5]; however, the HPF stubbornly persists in many areas of pathology. Sampling problems similar to mitotic counting are all around us, and differences in sample size significantly affect predictions. A public opinion survey with 1000 individuals is more representative of the population than one with 500 individuals. This paper will show that the same applies to the sample area, and will demonstrate how large the effect of sample size is in the context of breast cancer mitotic scoring. It will rigorously assess the impact of the sample area on the mitotic score in the three-tier system used in breast pathology, and it will demonstrate that a standardized sample area is essential.

Methods

An in silico model was developed based on the binomial distribution [7], using the software GNU Octave (www.gnu.org/software/octave/). The model assumes mitotic counting is a sampling problem. It assessed the classification and misclassification rates of 1,000,000 simulated breast specimens, using the sample areas for the extremes of the field diameter range (FD = 0.40 mm, 10 HPF = 1.26 mm2; FD = 0.69 mm, 10 HPF = 3.74 mm2) in the College of American Pathologists (CAP) checklist [6], as well as an area of 5.00 mm2, which would be equivalent to 40 HPF at a FD of 0.40 mm, or almost 14 HPF at a FD of 0.69 mm. Each simulated breast specimen was assigned a true mitotic density, based on an experimentally determined distribution from Meyer et al. The mitotic rate in the population ranged from 0 to 16.4 mitoses/mm2 and is similar to the distribution seen in an unselected population; however, it has more cases in the higher mitotic score categories. Then, in essence, the cumulative probability of each simulated mitotic count was calculated based on the knowledge of (1) the true mitotic rate per area, (2) the sample area, and (3) the cellular density. One such (sampled) mitotic count-probability distribution is seen in Fig. 1, which shows the (sampled) mitotic count (abscissa) versus the cumulative probability (ordinate).

Fig. 1

Cumulative probability (based on the binomial distribution) versus the mitotic count for a set of parameters (sample area = 2 mm2, cells/area = 2500 cells/mm2, true mitotic rate = 4.0 mitoses/mm2)

The sampling processes, i.e. the simulated mitotic counts, were each represented by a random number between 0 and 1, which was considered equivalent to the percentile score of all possible sampling results. Thus, each random number could be substituted for the cumulative probability in the (sampled) mitotic count–probability distribution (e.g. Fig. 1) and thereby converted to a sampled mitotic count. For example, if the randomly generated number is 0.42644, the mitotic count is determined to be 7, as the random number is less than the cumulative probability for 7 mitoses (0.45285) and greater than the cumulative probability for 6 mitoses (0.31318). The sampled mitotic counts were subsequently converted into (sampled) mitotic scores. The true and sampled mitotic scores were determined from the true and sampled mitotic densities, using the cutoffs of 3 and 8 mitoses/mm2 as in the 2013 version of the CAP checklist [2], and the misclassification or agreement was tabulated. For each of the 1,000,000 cases the cellular density (2500 cells/mm2) was held constant. The number of specimens with a particular mitotic rate for the sample population is shown in Fig. 1, which was interpolated to generate a table of mitotic rates and their relative frequency in the population of specimens.
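The conversion of a random number into a sampled mitotic count can be sketched in a few lines of Python. This is not the authors' GNU Octave code; it is a minimal illustration that assumes the number of cells examined is the sample area times the cellular density, and that each cell is independently in mitosis with probability equal to the true mitotic rate divided by the cellular density. The function names and the `max_count` truncation are hypothetical. The parameters are those of Fig. 1 (sample area 2 mm2, 2500 cells/mm2, true mitotic rate 4.0 mitoses/mm2).

```python
from bisect import bisect_left


def binomial_cdf_table(n_cells, p_mitosis, max_count):
    """Return [P(X <= 0), ..., P(X <= max_count)] for X ~ Binomial(n_cells, p_mitosis)."""
    pmf = (1.0 - p_mitosis) ** n_cells  # P(X = 0)
    running_total = 0.0
    cdf = []
    for k in range(max_count + 1):
        running_total += pmf
        cdf.append(running_total)
        # Recurrence: P(X = k + 1) = P(X = k) * (n - k) / (k + 1) * p / (1 - p)
        pmf *= (n_cells - k) / (k + 1) * p_mitosis / (1.0 - p_mitosis)
    return cdf


def sampled_count(random_number, cdf):
    """Smallest count whose cumulative probability is >= the random number."""
    return bisect_left(cdf, random_number)


# Parameters of Fig. 1; the derived quantities below are assumptions of this sketch.
sample_area = 2.0      # mm2
cell_density = 2500.0  # cells/mm2
true_rate = 4.0        # mitoses/mm2

n_cells = round(sample_area * cell_density)  # 5000 cells examined
p_mitosis = true_rate / cell_density         # probability that a given cell is in mitosis

cdf = binomial_cdf_table(n_cells, p_mitosis, max_count=60)
print(f"P(X <= 6) = {cdf[6]:.5f}, P(X <= 7) = {cdf[7]:.5f}")
print("sampled count for 0.42644:", sampled_count(0.42644, cdf))
```

With these inputs the random number 0.42644 falls between the cumulative probabilities for 6 and 7 mitoses, so the sketch returns the count of 7 described in the worked example above.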

Results

The percentage of agreement between the true mitotic rate and the sampled mitotic rate for the three sampled areas, 1.26, 3.74, and 5.00 mm2, is given in Tables 1, 2, and 3 respectively, for 1,000,000 simulated breast specimens. Owing to rounding, some totals do not add up to 100%. The accuracy for each sample area was also calculated. The misclassification rates are 16, 9, 8, 5, 4 and 4% for sample areas of 1.26, 3.74, 5, 10, 15 and 20 mm2 respectively (see Table 4). If one frames the comparison between the true mitotic score and the score generated by the simulated mitotic count as an inter-rater reliability problem, Cohen’s kappa is applicable as a measure. The kappa is 0.76, 0.87, 0.89, 0.92, 0.93 and 0.94 for sample areas of 1.26, 3.74, 5, 10, 15 and 20 mm2 respectively.

Table 1

Percentage classification for sample area 1.26 mm2

	True score 1, n = 655,196 (%)	True score 2, n = 245,122 (%)	True score 3, n = 99,682 (%)	Totals, n = 1,000,000 (%)
Sample score 1	96	37	<1	72
Sample score 2	4	55	25	18
Sample score 3	<1	8	75	9

Table 2

Percentage classification for sample area 3.74 mm2

	True score 1, n = 682,735 (%)	True score 2, n = 209,385 (%)	True score 3, n = 107,880 (%)	Totals, n = 1,000,000 (%)
Sample score 1	95	14	0	68
Sample score 2	5	82	18	22
Sample score 3	0	4	82	10

Table 3

Percentage classification for sample area 5.00 mm2

	True score 1, n = 684,813 (%)	True score 2, n = 212,349 (%)	True score 3, n = 102,838 (%)	Totals, n = 1,000,000 (%)
Sample score 1	96	12	0	68
Sample score 2	4	83	13	22
Sample score 3	0	5	87	10

Table 4

Accuracy by area of all classifications of simulated breast specimens

Area (mm²)	Percentage correct	Percent misclassified
1.26	84	16
3.74	91	9
5.00	92	8
10.0	95	5
15.0	96	4
20.0	96	4


Discussion

The three-tier system reproducibly separates score 1 and score 3; with the smallest sample area (1.26 mm2), less than 1% of true score 3 cases are misclassified as score 1. Intuitively, the middle group should have a higher misclassification rate than the other two, as it directly interfaces with both of them. The model reproduces this expected pattern; the middle group has the highest misclassification rate. The smallest field diameter microscopes (0.40 mm), due to sampling error, incorrectly categorize an additional 7% of all tumours compared with the largest field diameter microscopes (0.69 mm), and an additional 27% of cases in the mitotic score 2 group. The results reproduce findings by Meyer et al.; the misclassification rate is quite high (9–16% of cases). There is a clear trend to less misclassification with greater sample areas. The misclassification rate has a non-linear relationship with the sample area: incremental increases in area yield successively smaller reductions in the misclassification rate, as shown by the flattening of the misclassification rate from 5 to 20 mm2 (Table 4). The accuracy strongly depends on the true mitotic rate, as a plot of the fraction incorrect versus the true mitotic rate demonstrates (Fig. 2).

Fig. 2

Misclassification versus true mitotic rate
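The dependence of misclassification on the true mitotic rate can be explored without simulation by summing binomial probabilities directly. The Python sketch below is an illustration under stated assumptions rather than the authors' model: it assumes a constant cellular density of 2500 cells/mm2, and it assumes that a density below 3 mitoses/mm2 is score 1, a density of at least 3 and below 8 is score 2, and a density of 8 or more is score 3 (the text gives the cutoffs but not how ties at the cutoffs are handled). All function names are hypothetical.

```python
CUTOFFS = (3.0, 8.0)   # mitoses/mm2, as given in the text
CELL_DENSITY = 2500.0  # cells/mm2, held constant as in the text


def mitotic_score(density):
    """Three-tier score from a mitotic density (boundary handling is an assumption)."""
    if density < CUTOFFS[0]:
        return 1
    if density < CUTOFFS[1]:
        return 2
    return 3


def misclassification_probability(true_rate, area_mm2):
    """Probability that the sampled score differs from the true score."""
    n_cells = round(area_mm2 * CELL_DENSITY)
    p = true_rate / CELL_DENSITY
    true_score = mitotic_score(true_rate)
    pmf = (1.0 - p) ** n_cells  # P(count = 0)
    p_correct = 0.0
    for count in range(n_cells + 1):
        if mitotic_score(count / area_mm2) == true_score:
            p_correct += pmf
        # Binomial recurrence for P(count + 1) from P(count).
        pmf *= (n_cells - count) / (count + 1) * p / (1.0 - p)
    return 1.0 - p_correct


# Misclassification probability across true rates for the three main sample areas.
for rate in (0.5, 2.5, 3.0, 5.5, 7.5, 8.0, 12.0):
    row = ", ".join(
        f"{area:.2f} mm2: {misclassification_probability(rate, area):.3f}"
        for area in (1.26, 3.74, 5.00)
    )
    print(f"true rate {rate:4.1f} -> {row}")
```

Under these assumptions the probability of misclassification is highest for true rates near the cutoffs, and for rates well inside a score band it falls as the sample area increases.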

Limitations

The study did not consider the common practice of beginning the mitotic count with a mitotic figure. The effect of this practice could be calculated; however, it adds another level of complexity and likely does not change the overall conclusions. As well, the study did not systematically assess the impact of cellularity in the simulated breast specimens; however, some smaller calculations suggest it is not a significant factor (data not shown). The findings are not corroborated by a large data set with patient outcomes and sample areas. This is a true shortcoming; however, we are not in possession of such a data set, though we hope this study will spur some data mining by others. These findings regarding sampling theory as it applies to simulated breast specimens are practically self-evident, particularly if examined in the context of the vast experience with similar problems in opinion research (public polling) and manufacturing (statistical process control).

Conclusions

Ten HPF is not a good standard sample area, as the misclassification rate is dependent on the microscope. The reproducibility of the mitotic score is poor, especially when close to the CAP Protocol cut-points of 3 and 8 mitoses/mm2 (see Fig. 2; Additional file 1: Appendix S1). The mitotic count cut-points should be standardized and the sample area standardized; this could be accomplished by varying the number of HPFs counted and may be less complicated than the table in the CAP checklist (see Additional file 2: Appendix S2).
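As an illustration of standardizing the sample area by varying the number of HPFs counted, the following Python sketch computes how many whole fields are needed to cover a target area for a given field diameter. It assumes a circular field and a target of 5 mm2; the function name and the choice to round up to a whole field are assumptions of this sketch and are not taken from Appendix S2.

```python
import math


def hpf_count_for_area(field_diameter_mm, target_area_mm2):
    """Smallest whole number of circular HPFs whose combined area reaches the target."""
    one_field = math.pi * (field_diameter_mm / 2.0) ** 2
    return math.ceil(target_area_mm2 / one_field)


# Fields needed to reach 5 mm2 at the extremes of field diameter discussed in the text.
for fd in (0.40, 0.69):
    n_fields = hpf_count_for_area(fd, 5.0)
    print(f"FD {fd:.2f} mm: count {n_fields} HPF to cover at least 5 mm2")
```

For the two extremes this gives 40 and 14 fields, in line with the equivalents stated in the Methods.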

Reducing misclassification

Generally, reducing the misclassification error requires a larger sample area, as noted by Meyer et al. [3]. However, we believe advocating larger sample areas (>5 mm2) would be impractical and needlessly tedious, as many cases can be assessed with a relatively small area. Also, as we have shown, a number of cases close to the cut-point will, in practice, frequently be misclassified unless one samples the whole tumour, or at least the entirety of its most poorly differentiated component. We believe a more rational approach would be to triage cases into (a) “needs a larger sample area”, and (b) “confident it is correctly classified”. The triage decision would be guided by a count on a (small) standardized sample area and a confidence interval around the cut-points. Cases deemed to need a larger sample area would be classified based on the larger sample area. We believe it is reasonable to draw the line after limited additional sampling, with a statement about the confidence interval to make the clinician aware that a number of the cases will be misclassified by chance, so that it can be taken into account. It is possible that some pathologists already do something similar by performing repeated counts in several areas (Additional file 3).
We strongly believe that the term “high power field” and its cousins (“intermediate power field” and “low power field”) should be completely abandoned as measures of area in pathology. Their continued use is offensive to any person who has given thought to why measures (such as the foot, millimeter and kilogram) are standardized or has some understanding of sampling theory.
Accurately quantifying proliferation will likely remain important for predicting cancer outcomes in the near term. Proliferative activity is used to subclassify breast cancer, has been quantified with Ki-67 labeling, and is central to commercial ancillary tests for breast cancer, e.g. Oncotype Dx [8].
The issue identified in this paper may explain, in part, why alternatives to the mitotic count have been sought; mitotic counts done by humans have limitations and have been done without much attention to sampling theory.

4 in total

1. Mitotic index of invasive breast carcinoma. Achieving clinically meaningful precision and evaluating tertial cutoffs.

Authors: John S Meyer; Eric Cosatto; Hans Peter Graf
Journal: Arch Pathol Lab Med Date: 2009-11 Impact factor: 5.534

2. Prognostic value of histologic grade and proliferative activity in axillary node-positive breast cancer: results from the Eastern Cooperative Oncology Group Companion Study, EST 4189.

Authors: J F Simpson; R Gray; L G Dressler; C D Cobau; C I Falkson; K W Gilchrist; K J Pandya; D L Page; N J Robert
Journal: J Clin Oncol Date: 2000-05 Impact factor: 44.544

3. Variability in diagnostic criteria for eosinophilic esophagitis: a systematic review.

Authors: Evan S Dellon; Ademola Aderoju; John T Woosley; Robert S Sandler; Nicholas J Shaheen
Journal: Am J Gastroenterol Date: 2007-07-07 Impact factor: 10.864

4. Ki-67 is a prognostic parameter in breast cancer patients: results of a large population-based cohort of a cancer registry.

Authors: E C Inwald; M Klinkhammer-Schalke; F Hofstädter; F Zeman; M Koller; M Gerstenhauer; O Ortmann
Journal: Breast Cancer Res Treat Date: 2013-05-16 Impact factor: 4.872
