Joann G Elmore, Raymond L Barnhill, David E Elder, Gary M Longton, Margaret S Pepe, Lisa M Reisch, Patricia A Carney, Linda J Titus, Heidi D Nelson, Tracy Onega, Anna N A Tosteson, Martin A Weinstock, Stevan R Knezevich, Michael W Piepkorn.
Abstract
Objective To quantify the accuracy and reproducibility of pathologists' diagnoses of melanocytic skin lesions.

Design Observer accuracy and reproducibility study.

Setting 10 US states.

Participants Skin biopsy cases (n=240), grouped into sets of 36 or 48. Pathologists from 10 US states were randomized to independently interpret the same set on two occasions (phases 1 and 2), at least eight months apart.

Main outcome measures Pathologists' interpretations were condensed into five classes: I (eg, nevus or mild atypia); II (eg, moderate atypia); III (eg, severe atypia or melanoma in situ); IV (eg, pathologic stage T1a (pT1a) early invasive melanoma); and V (eg, ≥pT1b invasive melanoma). Reproducibility was assessed by intraobserver and interobserver concordance rates, and accuracy by concordance with three reference diagnoses.

Results In phase 1, 187 pathologists completed 8976 independent case interpretations, resulting in an average of 10 (SD 4) different diagnostic terms applied to each case. Among pathologists interpreting the same cases in both phases, when pathologists diagnosed a case as class I or class V during phase 1, they gave the same diagnosis in phase 2 for the majority of cases (class I 76.7%; class V 82.6%). However, the intraobserver reproducibility was lower for cases interpreted as class II (35.2%), class III (59.5%), and class IV (63.2%). Average interobserver concordance rates were lower, but with similar trends. Accuracy using a consensus diagnosis of experienced pathologists as reference varied by class: I, 92% (95% confidence interval 90% to 94%); II, 25% (22% to 28%); III, 40% (37% to 44%); IV, 43% (39% to 46%); and V, 72% (69% to 75%). It is estimated that at a population level, 82.8% (81.0% to 84.5%) of melanocytic skin biopsy diagnoses would have their diagnosis verified if reviewed by a consensus reference panel of experienced pathologists, with 8.0% (6.2% to 9.9%) of cases overinterpreted by the initial pathologist and 9.2% (8.8% to 9.6%) underinterpreted.

Conclusion Diagnoses spanning moderately dysplastic nevi to early stage invasive melanoma were neither reproducible nor accurate in this large study of pathologists in the USA. Efforts to improve clinical practice should include using a standardized classification system, acknowledging uncertainty in pathology reports, and developing tools such as molecular markers to support pathologists' visual assessments.

Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.
Year: 2017 PMID: 28659278 PMCID: PMC5485913 DOI: 10.1136/bmj.j2813
Source DB: PubMed Journal: BMJ ISSN: 0959-8138
Summary of MPATH-Dx reporting schema for classification of melanocytic skin lesions into five diagnostic categories
| MPATH-Dx class | Perceived risk for progression | Suggested intervention | Examples |
|---|---|---|---|
| 0 | Incomplete study due to sampling or technical limitations | Repeat biopsy or short term follow-up | NA |
| I | Very low risk | No further treatment | Common melanocytic nevus; blue nevus; mildly dysplastic nevus |
| II | Low risk | Narrow but complete excision (<5 mm) | Moderately dysplastic nevus; Spitz nevus |
| III | Higher risk. Greater need for intervention | Complete excision with at least 5 mm but <1 cm margins | Severely dysplastic nevus; melanoma in situ; atypical Spitz tumor |
| IV | Substantial risk for local or regional progression | Wide local excision with ≥1 cm margins | Thin, invasive melanomas (eg, pT1a†) |
| V | Greatest risk for regional and/or distant metastases | Wide local excision with ≥1 cm margins. Consideration of staging sentinel lymph node biopsy, adjuvant therapy | Thicker, invasive melanomas (eg, pT1b, stage 2 or greater†) |
*Assuming representative sampling of lesion.
†According to American Joint Committee on Cancer seventh edition cancer staging manual (reference 15).
Self reported characteristics of participating pathologists who completed the baseline survey and phase 1 interpretations (n=187)
| Physician characteristics | No (%) |
|---|---|
| Age (years): | |
| <40 | 31 (17) |
| 40-49 | 56 (30) |
| 50-59 | 63 (34) |
| ≥60 | 37 (20) |
| Sex: | |
| Female | 73 (39) |
| Male | 114 (61) |
| Affiliation with academic medical centre: | |
| No | 134 (72) |
| Yes, adjunct or affiliated | 34 (18) |
| Yes, primary appointment | 19 (10) |
| Residency specialty: | |
| Anatomic or clinical pathology | 168 (90) |
| Dermatology | 15 (8) |
| Both dermatology and anatomic or clinical pathology | 4 (2) |
| Training: | |
| Board certified or fellowship trained in dermatopathology* | 74 (40) |
| Other board certification or fellowship training† | 113 (60) |
| Years interpreting melanocytic skin lesions: | |
| <5 | 29 (16) |
| 5-9 | 45 (24) |
| 10-19 | 57 (30) |
| ≥20 | 56 (30) |
| Per cent of caseload interpreting melanocytic skin lesions: | |
| <10 | 79 (42) |
| 10-24 | 72 (38) |
| 25-49 | 28 (15) |
| ≥50 | 8 (4) |
| Average No of melanoma cases (melanoma in situ and invasive melanoma) interpreted each month: | |
| <5 | 82 (44) |
| 5-9 | 47 (25) |
| ≥10 | 58 (31) |
| Average No of benign melanocytic skin lesions interpreted each month: | |
| <25 | 54 (29) |
| 25-49 | 32 (17) |
| 50-149 | 51 (27) |
| ≥150 | 50 (27) |
| Considered an expert in melanocytic skin lesions by colleagues: | |
| No | 108 (58) |
| Yes | 79 (42) |
| In general, how challenging do you find interpreting melanocytic skin lesions?: | |
| Challenging (somewhat challenging to very challenging) | 179 (96) |
| Easy (very easy to somewhat easy) | 8 (4) |
| Interpreting melanocytic skin lesions makes me more nervous than other types of pathology: | |
| Agree (slightly agree to strongly agree) | 129 (69) |
| Disagree (strongly disagree to slightly disagree) | 58 (31) |
| How confident are you in your assessments of melanocytic skin lesions?: | |
| Confident (extremely confident to moderately confident) | 161 (86) |
| Not confident (somewhat confident to not at all confident) | 26 (14) |
*Consists of physicians with single or multiple fellowships that include dermatopathology, and physicians with single or multiple board certifications that include dermatopathology.
†Includes fellowships or board certifications in surgical pathology, cytopathology, or hematopathology.

Fig 1 Diagnostic terms given to example case by 36 pathologists who each independently interpreted the same glass slide (top image 5× magnification, bottom image 10× magnification)
Intraobserver concordance of 118 pathologists’ interpretations of melanocytic skin biopsy lesions of the same case at phase 1 and phase 2 at least eight months apart*
| Phase 1 diagnosis | Phase 2 diagnosis (No of paired interpretations) | | | | | | Intraobserver concordance† % (95% CI) |
|---|---|---|---|---|---|---|---|
| | Class I | Class II | Class III | Class IV | Class V | Total | |
| Class I | | 188 | 119 | 27 | 17 | | 77 (73 to 80) |
| Class II | 170 | | 182 | 37 | 28 | | 35 (31 to 39) |
| Class III | 91 | 120 | | 169 | 58 | | 60 (56 to 63) |
| Class IV | 20 | 37 | 147 | | 105 | | 63 (59 to 67) |
| Class V | 14 | 16 | 44 | 105 | | | 83 (80 to 85) |
| Total | | | | | | | 67‡ (65 to 69) |
Numbers of diagnostic interpretations with intraobserver agreement are emboldened.
*Concordance rates are influenced by case composition, which included larger proportions of cases in classes II-V than would typically be encountered in practice.
†Denominator is phase 1 interpretations and numerator is phase 2 assessments that agreed with phase 1 interpretation. Does not include participants who reviewed glass slides in phase 1 and digital images in phase 2.
‡Average κ for intraobserver agreement across participants is 0.57.
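The κ statistic in the footnote above corrects the observed agreement for the agreement expected by chance. A minimal sketch of unweighted Cohen's κ for one pathologist's paired phase 1 and phase 2 readings follows. The source does not say whether a weighted or unweighted κ was used, so the unweighted form is an assumption; the function name `cohen_kappa` and the example readings are hypothetical and are not study data.

```python
from collections import Counter


def cohen_kappa(first, second):
    """Unweighted Cohen's kappa for two equal-length lists of class labels."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(first)
    # Observed agreement: share of cases given the same class both times.
    observed = sum(a == b for a, b in zip(first, second)) / n
    # Chance agreement: product of the marginal class frequencies, summed over classes.
    first_freq = Counter(first)
    second_freq = Counter(second)
    expected = sum(first_freq[c] * second_freq[c] for c in first_freq) / (n * n)
    if expected == 1:
        raise ValueError("kappa is undefined when both readings use a single class")
    return (observed - expected) / (1 - expected)


# Hypothetical readings of eight cases by one pathologist (not study data).
phase_1 = ["I", "I", "II", "III", "III", "IV", "V", "V"]
phase_2 = ["I", "II", "II", "III", "IV", "IV", "V", "V"]
print(f"kappa = {cohen_kappa(phase_1, phase_2):.2f}")
```

Averaging such per-pathologist values across participants would give a summary of the kind reported in the footnote, under the stated assumption about the κ variant.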
Interobserver concordance of pathologists’ interpretations of melanocytic skin biopsy lesions. Pairwise comparison of interpretations by 187 participating pathologists in phase 1. Diagnoses for all possible ordered pairs of participants reading the same glass slide are included*
| First pathologist’s interpretation | Second pathologist’s interpretation | | | | | | Interobserver concordance % (95% CI) |
|---|---|---|---|---|---|---|---|
| | Class I | Class II | Class III | Class IV | Class V | Total | |
| Class I | | 15 082 | 11 223 | 2993 | 1773 | | 71 (69 to 73) |
| Class II | 15 082 | | 11 028 | 3974 | 1903 | | 25 (22 to 27) |
| Class III | 11 223 | 11 028 | | 12 675 | 4494 | | 45 (42 to 47) |
| Class IV | 2993 | 3974 | 12 675 | | 8854 | | 46 (43 to 49) |
| Class V | 1773 | 1903 | 4494 | 8854 | | | 77 (75 to 79) |
| Total | | | | | | | 55† (53 to 56) |
Numbers of interpretations with agreement are emboldened.
*Average pairwise agreement is an unweighted average across all participant pairs. The number of first pathologist interpretations for a given diagnostic class varies across pairs. Concordance rates are influenced by case composition, which included larger proportions of cases in classes II-V than would typically be encountered in practice. There are 6814 distinct ordered participant pairs × 48 case interpretations per pair, yielding 327 072 interpretations.
†Average κ for interobserver agreement across all participant pairs is 0.42.
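Because every ordered pair of pathologists is counted, each discordant pair of readings appears twice, once in each direction, which is why the off-diagonal counts in the table above are symmetric. The sketch below illustrates that tally, assuming all pathologists in a set read the same cases. The function name `tally_ordered_pairs` and the three-reader example are hypothetical; only the final arithmetic check uses figures from the footnote.

```python
from collections import Counter
from itertools import permutations


def tally_ordered_pairs(readings):
    """Count (first class, second class) over all ordered pathologist pairs.

    readings maps pathologist id -> {case id: class label}; every
    pathologist is assumed to have read the same set of cases.
    """
    table = Counter()
    for first, second in permutations(readings, 2):
        for case_id, first_class in readings[first].items():
            table[(first_class, readings[second][case_id])] += 1
    return table


# Hypothetical set: three pathologists, two cases (not study data).
readings = {
    "A": {"case1": "I", "case2": "III"},
    "B": {"case1": "II", "case2": "III"},
    "C": {"case1": "I", "case2": "IV"},
}
table = tally_ordered_pairs(readings)
for (first_class, second_class), count in sorted(table.items()):
    print(first_class, second_class, count)

# Footnote arithmetic: ordered pairs times cases per pair.
print(6814 * 48)
```

With three readers there are six ordered pairs, so two cases yield twelve tallied interpretations; the same counting, applied per set of readers, underlies the total given in the footnote.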
Accuracy of 187 participating pathologists when their phase 1 interpretations are compared with the consensus reference diagnoses*
| Consensus reference diagnosis† | Study pathologists’ interpretation | | | | | Total interpretations (No) | % Concordance with reference diagnosis (95% CI) |
|---|---|---|---|---|---|---|---|
| | Class I | Class II | Class III | Class IV | Class V | | |
| Class I | **862** | 50 | 19 | 3 | 1 | 935 | 92 (90 to 94) |
| Class II | 843 | **331** | 131 | 26 | 11 | 1342 | 25 (22 to 28) |
| Class III | 695 | 520 | **908** | 113 | 11 | 2247 | 40 (37 to 44) |
| Class IV | 150 | 176 | 717 | **928** | 198 | 2169 | 43 (39 to 46) |
| Class V | 68 | 87 | 161 | 321 | **1646** | 2283 | 72 (69 to 75) |
| Total | | | | | | 8976 | |
*Concordance in interpretation is emboldened.
†Reference diagnosis was obtained from consensus of three experienced dermatopathologists.
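The concordance percentages in the accuracy table follow directly from the row counts: the concordant count for each reference class equals the row total minus the discordant cells, and dividing by the row total gives the percentage. The sketch below reproduces that arithmetic and also splits the discordant readings into under-interpretations (study class lower than reference) and over-interpretations (study class higher than reference). The variable and function names are hypothetical; the counts are transcribed from the table. It gives point estimates only and does not attempt the confidence intervals or the population level estimates quoted in the abstract.

```python
CLASSES = ["I", "II", "III", "IV", "V"]

# Discordant counts and row totals transcribed from the accuracy table.
# Rows: consensus reference class; columns: study pathologists' class.
# None marks the concordant (diagonal) cell, recovered from the row total.
DISCORDANT = [
    [None, 50, 19, 3, 1],
    [843, None, 131, 26, 11],
    [695, 520, None, 113, 11],
    [150, 176, 717, None, 198],
    [68, 87, 161, 321, None],
]
ROW_TOTALS = [935, 1342, 2247, 2169, 2283]


def summarize_row(index):
    """Return concordant, under- and over-interpretation counts for one reference class."""
    row = DISCORDANT[index]
    total = ROW_TOTALS[index]
    # Columns left of the diagonal: study class lower than the reference class.
    under = sum(v for j, v in enumerate(row) if j < index)
    # Columns right of the diagonal: study class higher than the reference class.
    over = sum(v for j, v in enumerate(row) if j > index)
    concordant = total - under - over
    return {
        "class": CLASSES[index],
        "concordant": concordant,
        "under": under,
        "over": over,
        "total": total,
        "concordance_pct": 100 * concordant / total,
    }


summary = [summarize_row(i) for i in range(len(CLASSES))]
for row in summary:
    print(
        f"Class {row['class']}: {row['concordant']}/{row['total']} concordant "
        f"({row['concordance_pct']:.1f}%), under {row['under']}, over {row['over']}"
    )
```

Rounded to whole numbers, the computed percentages match the concordance column of the table, which is a useful check that the transcribed counts are internally consistent.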

Fig 2 Participant interpretive variation on each of 240 cases, with cases organized based on the MPATH-Dx consensus reference diagnosis class

Fig 3 Comparison of accuracy (over-interpretation and under-interpretation discordance rates) across all three reference diagnoses

Fig 4 Population level predicted proportions of cutaneous melanocytic biopsy interpretations that would be verified by the consensus reference panel or would be classified as over-interpretations or under-interpretations