Cleo Keppens, Elisabeth Mc Dequeker, Patrick Pauwels, Ales Ryska, Nils 't Hart, Jan H von der Thüsen.
Abstract
Programmed death ligand 1 (PD-L1) immunohistochemistry (IHC) is accepted as a predictive biomarker for the selection of immune checkpoint inhibitors. We evaluated the staining quality and estimation of the tumor proportion score (TPS) in non-small-cell lung cancer during two external quality assessment (EQA) schemes by the European Society of Pathology. Participants received two tissue micro-arrays with three (2017) and four (2018) cases for PD-L1 IHC and a positive tonsil control, for staining by their routine protocol. After the participants returned stained slides to the EQA coordination center, three pathologists assessed each slide and awarded an expert staining score from 1 to 5 points based on the staining concordance. Expert scores significantly (p < 0.01) improved between EQA schemes from 3.8 (n = 67) to 4.3 (n = 74) out of 5 points. Participants used 32 different protocols: the majority applied the 22C3 (Dako; 56.7%), SP263 (Ventana; 19.1%), and E1L3N (Cell Signaling; 7.1%) clones. Staining artifacts consisted mainly of very weak or weak antigen demonstration (63.0%) or excessive background staining (19.8%). Participants using CE-IVD kits reached a higher score compared with those using laboratory-developed tests (LDTs) (p < 0.05), mainly attributed to a better concordance of SP263. The TPS was under- and over-estimated in 20/423 (4.7%) and 24/423 (5.7%) cases, respectively, correlating to a lower expert score. Additional research is needed on the concordance of less common protocols, and on reasons for lower LDT concordance. Laboratories should carefully validate all test methods and regularly verify their performance. EQA participation should focus on both staining concordance and interpretation of PD-L1 IHC.
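The misclassification percentages quoted in the abstract follow directly from the reported counts. The sketch below reproduces that arithmetic in Python, assuming only the figures stated above (20 and 24 misclassified cases out of 423); the helper name `pct` is illustrative and not part of the study.

```python
def pct(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to one decimal place."""
    return round(100 * count / total, 1)


TOTAL_CASES = 423  # TPS estimations assessed, as stated in the abstract

under = pct(20, TOTAL_CASES)  # TPS under-estimations
over = pct(24, TOTAL_CASES)   # TPS over-estimations

print(f"under-estimated: {under}%")  # 4.7%
print(f"over-estimated:  {over}%")   # 5.7%
```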
Keywords: External quality assessment; Immunohistochemistry; PD-L1; Tumor proportion score
Year: 2020 PMID: 33275169 PMCID: PMC8099807 DOI: 10.1007/s00428-020-02976-5
Source DB: PubMed Journal: Virchows Arch ISSN: 0945-6317 Impact factor: 4.064
Laboratory characteristics related to average PD-L1 IHC ESS, analysis failures, and TPS misclassifications
| Characteristic | # observations (%) | Average ESS on 5 points+ | # analysis failures (%) | # under-estimations (%) | # over-estimations (%) |
|---|---|---|---|---|---|
| EQA scheme year | | 0.393 (0.217; 0.713) | ND | 0.388 (0.148; 1.015) | 0.647 (0.304; 1.377) |
| 2017 | 67 (47.5) | 3.8 | 0 (0.0) | 14 (70.0) | 14 (58.3) |
| 2018 | 74 (52.5) | 4.3 | 8 (100.0) | 6 (30.0) | 10 (41.7) |
| # EQA participations | | 1.430 (0.741; 2.759) | 0.937 (0.163; 5.389) | ND | 0.402 (0.135; 1.190) |
| 1st participation | 104 (73.8) | 4.0 | 6 (75.0) | 20 (100.0) | 21 (87.5) |
| 2nd participation | 37 (26.2) | 4.2 | 2 (25.0) | 0 (0.0) | 3 (12.5) |
| Laboratory setting†,‡ | | | ND | ND | ND |
| Industry | 4 (2.8) | 4.3 | 0 (0.0) | 0 (0.0) | 3 (12.5) |
| (private) laboratories | 31 (22.0) | 4.0 | 6 (75.0) | 4 (20.0) | 5 (20.8) |
| Hospital laboratories | 36 (25.5) | 4.0 | 0 (0.0) | 8 (40.0) | 6 (25.0) |
| University and research | 70 (49.6) | 4.1 | 2 (25.0) | 8 (40.0) | 10 (41.7) |
| Accreditation status‡ | | 0.609 (0.313; 1.185) | 0.805 (0.133; 4.876) | 1.242 (0.492; 3.135) | 2.481 (1.049; 5.882) |
| No | 62 (44.0) | 3.8 | 4 (50.0) | 10 (50.0) | 16 (66.7) |
| Yes | 77 (54.6) | 4.2 | 4 (50.0) | 10 (50.0) | 8 (33.3) |
| Missing data | 2 (1.4) | 4.5 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| # samples tested in last 12 months for PD-L1 | | 1.214 (0.874; 1.687) | 0.390 (0.150; 1.013) | 0.667 (0.426; 1.046) | 0.882 (0.591; 1.315) |
| No clinical testing | 7 (5.0) | 4.1 | 3 (37.5) | 2 (10.0) | 2 (8.3) |
| < 10 | 7 (5.0) | 3.6 | 1 (12.5) | 4 (20.0) | 0 (0.0) |
| 10-99 | 63 (44.7) | 3.8 | 3 (37.5) | 8 (40.0) | 11 (45.8) |
| 100-249 | 32 (22.7) | 4.4 | 0 (0.0) | 3 (15.0) | 5 (20.8) |
| 250-499 | 21 (14.9) | 4.4 | 1 (12.5) | 3 (15.0) | 4 (16.7) |
| > 500 | 9 (6.4) | 4.0 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Missing data | 2 (1.4) | 4.5 | 0 (0.0) | 0 (0.0) | 2 (8.3) |
| # staff involved in testing | | 1.212 (0.862; 1.704) | 0.637 (0.339; 1.196) | 1.123 (0.701; 1.800) | 0.970 (0.643; 1.461) |
| 1-5 | 53 (37.6) | 4.0 | 3 (37.5) | 6 (30.0) | 8 (33.3) |
| 6-10 | 49 (34.8) | 4.0 | 5 (62.5) | 8 (40.0) | 9 (37.5) |
| 11-20 | 23 (16.3) | 4.1 | 0 (0.0) | 4 (20.0) | 5 (20.8) |
| > 20 | 14 (9.9) | 4.4 | 0 (0.0) | 2 (10.0) | 1 (4.2) |
| Missing data | 2 (1.4) | 5.0 | 0 (0.0) | 0 (0.0) | 1 (4.2) |
| Method type§ | | 1.916 (1.012; 3.629) | 2.716 (0.467; 15.793) | 1.350 (0.532; 3.425) | 0.789 (0.327; 1.905) |
| Approved kit (CDx) | 67 (47.5) | 4.2 | 2 (25.0) | 11 (55.0) | 10 (41.7) |
| LDT | 74 (52.5) | 3.9 | 6 (75.0) | 9 (45.0) | 14 (58.3) |
| Switched protocol between schemes¶ | | 0.899 (0.247; 3.280) | 2.083 (0.142; 30.537) | ND | ND |
| No | 25 (17.7) | 4.2 | 1 (12.5) | 0 (0.0) | 3 (12.5) |
| Yes | 12 (8.5) | 4.3 | 1 (12.5) | 0 (0.0) | 0 (0.0) |
| NA | 104 (73.8) | 4.0 | 6 (75.0) | 20 (100.0) | 21 (87.5) |
| Antibody dilution | | | ND | ND | ND |
| < 1/50 | 17 (12.1) | 4.3 | 0 (0.0) | 1 (5.0) | 2 (8.3) |
| 1/50 - 1/100 | 72 (51.1) | 3.9 | 7 (87.5) | 9 (45.0) | 12 (50.0) |
| > 1/100 | 14 (9.9) | 3.4 | 0 (0.0) | 4 (20.0) | 4 (16.7) |
| RTU | 38 (27.0) | 4.5 | 1 (12.5) | 6 (30.0) | 6 (25.0) |
| Incubation temperature (°C) | | 0.363 (0.193; 0.682) | ND | ND | ND |
| RT | 55 (39.0) | 3.7 | 1 (12.5) | 10 (50.0) | 14 (58.3) |
| 30-37 | 86 (61.0) | 4.3 | 7 (87.5) | 10 (50.0) | 10 (41.7) |
| Incubation time (min) | | | ND | ND | ND |
| 13-30 | 78 (55.3) | 4.1 | 3 (37.5) | 13 (65.0) | 14 (58.3) |
| 31-60 | 48 (34.0) | 4.0 | 5 (62.5) | 6 (30.0) | 8 (33.3) |
| > 60 | 15 (10.6) | 3.9 | 0 (0.0) | 6 (30.0) | 2 (8.3) |
| Use of amplification | | 1.249 (0.659; 2.365) | ND | ND | ND |
| No | 77 (54.6) | 4.1 | 6 (75.0) | 10 (50.0) | 12 (50.0) |
| Yes | 64 (45.4) | 4.0 | 2 (25.0) | 10 (50.0) | 12 (50.0) |
Abbreviations: # number, CDx companion diagnostic, CI confidence interval, EQA external quality assessment, ESS expert staining score, GEE generalized estimating equations, IHC immunohistochemistry, IRR incidence rate ratio, LDT laboratory-developed test, NA not applicable, ND not determined, OR odds ratio, PD-L1 programmed death ligand 1, RT room temperature, RTU ready-to-use, TPS tumor proportion score
+Proportional odds models were used to analyze the difference in ESS. ++Poisson models were used to evaluate the association with analysis failures or under-/over-estimations. Both models applied GEE to account for clustering of the data. Results are presented as ORs and IRRs (with 95% CI), respectively. An OR/IRR > 1 represents a higher ESS/higher incidence for a higher category level; an OR/IRR < 1 represents a lower ESS/lower incidence for a higher category level. *p < .05, **p < .01, ***p < .001, ****p < .0001. ND: statistics not computed due to low power (absence or very few events in one level). For variables with more than two categories (laboratory setting, incubation time, and temperature), overall significance levels are given. ORs for every pairwise comparison between categories are described in the main text
°Analysis failures are defined as the failure to stain or interpret the PD-L1 IHC results on all assessed cases. Under-estimations are calculated on samples validated as a TPS of 1–50% or > 50%. Over-estimations are calculated on the total number of samples with TPS < 1% or 1–50%
†Industry are laboratories involved in the development of diagnostic commercial kits. (Private) Laboratories are not within a hospital’s infrastructure. Hospital laboratories included private and public hospitals. University and research included education and research hospitals, university hospitals, university laboratories, and anti-cancer centers [30]
‡Laboratory setting and accreditation were validated on the websites of the laboratories and national accreditation bodies. Accreditation is defined as compliant to ISO 15189 or relevant national standards
§Approved kits are defined as using the Dako 22C3, Ventana SP142, or Ventana SP263 kits with platform for their intended use. LDTs are defined as these three clones in combination with another platform, or any other antibody clone
¶A switch included either the change in primary antibody, antigen retrieval, or detection method. ‘Not applicable’ included entries from first participations for which no method information from previous years was available
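The footnote definitions of under- and over-estimation can be read as a comparison of ordered TPS categories, with each error type only possible for certain validated categories. The Python sketch below illustrates that reading; the function names and the example result pairs are hypothetical and are not taken from the study data.

```python
from collections import Counter

# Ordered TPS categories as named in the footnotes (lowest to highest).
TPS_LEVELS = ["<1%", "1-50%", ">50%"]


def classify(validated: str, reported: str) -> str:
    """Compare a reported TPS category with the validated one."""
    v = TPS_LEVELS.index(validated)
    r = TPS_LEVELS.index(reported)
    if r < v:
        return "under-estimation"
    if r > v:
        return "over-estimation"
    return "correct"


def can_be_underestimated(validated: str) -> bool:
    """Under-estimation is only defined for validated 1-50% or >50%."""
    return validated in ("1-50%", ">50%")


def can_be_overestimated(validated: str) -> bool:
    """Over-estimation is only defined for validated <1% or 1-50%."""
    return validated in ("<1%", "1-50%")


# Hypothetical (validated, reported) pairs, for illustration only.
results = [
    ("<1%", "<1%"),
    ("1-50%", "<1%"),
    (">50%", "1-50%"),
    ("<1%", "1-50%"),
    (">50%", ">50%"),
]

tally = Counter(classify(v, r) for v, r in results)
under_denominator = sum(can_be_underestimated(v) for v, _ in results)
over_denominator = sum(can_be_overestimated(v) for v, _ in results)

print(tally)
print(under_denominator, over_denominator)
```

Keeping separate denominators matters because a sample validated as < 1% can never be under-estimated, and one validated as > 50% can never be over-estimated.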
Fig. 1 Incidence of analysis failures and TPS under-/over-estimations related to the obtained ESS. Poisson models with GEE were used to analyze the association of the ESS with the number of incorrect TPS classifications (under- or over-estimations) and the number of analysis failures observed in the EQA schemes as count outcome variables. Analysis failures are defined as the failure to stain or interpret the PD-L1 IHC results. Under-estimations are defined only for samples validated as a TPS of 1–50% or > 50%. Over-estimations are defined only for samples with TPS < 1% or 1–50%. Results are presented as IRR (95% CI) taking into account the log of the total number of samples analyzed during the EQA scheme as an offset variable. Bar labels represent the number of cases with correct results/under-estimations/over-estimations/analysis failures observed. IRRs < 1 represent a lower number of incidents for higher ESS. IRRs > 1 represent a higher number of incidents for higher ESS. *p < .05, **p < .01, ***p < .001, ****p < .0001. The IRR for analysis failures in cases with a TPS of 1–50% and > 50% was not computed as only one incident occurred. Abbreviations: CI confidence interval, EQA external quality assessment, ESS expert staining score, GEE generalized estimating equations, IRR incidence rate ratio, N/A not applicable, ND not determined, PD-L1 programmed death ligand 1, TPS tumor proportion score
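To illustrate what the offset does, consider the simplest case of a Poisson model with one two-level predictor and the log of the number of samples as offset: the fitted IRR then equals the ratio of the crude event rates. The sketch below shows that special case only. It is a simplification under stated assumptions: it ignores the GEE clustering used in the study, uses a plain Wald interval, and all counts are hypothetical.

```python
import math


def crude_irr(events_a: int, samples_a: int, events_b: int, samples_b: int):
    """Incidence rate ratio of group A versus group B with a Wald 95% CI.

    Dividing events by samples plays the role of the log(samples) offset.
    The standard error is the model-based one for two independent Poisson
    counts; it does NOT include the GEE correction for clustering.
    """
    rate_a = events_a / samples_a
    rate_b = events_b / samples_b
    irr = rate_a / rate_b
    se_log = math.sqrt(1 / events_a + 1 / events_b)
    low = math.exp(math.log(irr) - 1.96 * se_log)
    high = math.exp(math.log(irr) + 1.96 * se_log)
    return irr, low, high


# Hypothetical counts: 6 incidents in 200 samples versus 12 in 200 samples.
irr, low, high = crude_irr(6, 200, 12, 200)
print(f"IRR = {irr:.3f} ({low:.3f}; {high:.3f})")
```

An IRR below 1 here means fewer incidents per analyzed sample in group A than in group B, which is the same reading the caption gives for higher ESS values.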
Fig. 2 Examples of optimal and suboptimal concordance with PD-L1 IHC reference stains for different protocols during the 2018 EQA scheme. Images represent a matching core with a validated consensus TPS of > 50%. The core was part of a tissue micro-array containing four different FFPE cases for staining by routine antibodies and detection systems of the 2018 EQA participants. Protocols are presented as reported by the participating laboratories. Scale bar 1 mm. Optimal concordance (top row, from left to right): SP263: RTU antibody from Ventana (16-min incubation at 36 °C) in combination with the Ventana CC1 (64 min) and OptiView DAB IHC Detection Kit. 22C3: Antibody from Dako (diluted 1/40, 16-min incubation at 37 °C) in combination with the Ventana CC1 (64 min) and OptiView DAB IHC Detection Kit. 28-8: Antibody from Abcam (diluted 1/100, 32-min incubation at RT) in combination with the Ventana CC1 (64 min) and OptiView DAB IHC Detection Kit. SP142: RTU antibody from Ventana (24-min incubation at 37 °C) in combination with the Ventana CC1 (48 min) and UltraView DAB IHC Detection Kit. E1L3N: Antibody from Cell Signaling (diluted 1/200, 30-min incubation at RT) in combination with Leica Bond Epitope Retrieval 2 (20 min) and Bond polymer refine detection system. Suboptimal concordance (bottom row, from left to right): SP263: RTU antibody from Ventana (60-min incubation at 37 °C) in combination with the Ventana CC1 (64 min) and OptiView DAB IHC Detection Kit; weak demonstration of antigen in the tumor population and cytoplasmic staining. 22C3: Antibody from Dako (diluted 1/50, 30-min incubation at RT) in combination with Dako EnVisionFLEX Target Retrieval Solution (low pH) and the Envision Flex detection system; excessive background staining and cytoplasmic staining. 28-8: Antibody from Abcam (diluted 1/50, 20-min incubation at 32 °C) in combination with Dako EnVisionFLEX Target Retrieval Solution (low pH) and the Envision Flex detection system; background staining. SP142: RTU antibody from Ventana (16-min incubation at 36 °C) in combination with the Ventana CC1 (48 min) and OptiView DAB IHC Detection Kit; weak staining of epithelial cells. E1L3N: Antibody from Cell Signaling (diluted 1/150, 60-min incubation at RT) in combination with laboratory-developed antigen retrieval by TRIS-EDTA and the Vectastain ABC immunoperoxidase staining avidin-biotin complexes; weak demonstration of antigen in the tumor population and cytoplasmic staining. Abbreviations: EQA external quality assessment, FFPE formalin-fixed paraffin embedded, IHC immunohistochemistry, PD-L1 programmed death ligand 1, RT room temperature, RTU ready-to-use, TPS tumor proportion score
Analysis failures, TPS misclassifications, and ESS for the different PD-L1 IHC protocols used in the EQA schemes
| Primary antibody | Antigen retrieval | Detection method | # times used (%) | # analysis failures (%) | # under-estimations (%) | # over-estimations (%) | Average ESS/5 | Method code | OR (95% CI) relative to method a | relative to method b | relative to method c | relative to method d | relative to method e | relative to method f |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 22C3 (Dako) | CC1 (Ventana) | OptiView DAB IHC Detection Kit (Ventana) | 35 (24.8%) | 5 (62.5%) | 3 (15.0%) | 3 (12.5%) | 4.1 | a | / | 1.247 (0.547; 2.843) | | | 1.769 (0.482; 6.494) | 1.211 (0.422; 3.481) |
| 22C3 (Dako) | EnVisionFLEX Target Retrieval Solution, low pH (Dako) | Envision flex (Dako) | 32 (22.7%) | 1 (12.5%) | 3 (15.0%) | 5 (20.8%) | 3.9 | b | 0.802 (0.352; 1.828) | / | 3.269 (0.979; 10.919) | | 1.419 (0.361; 5.579) | 0.972 (0.311; 3.037) |
| 22C3 (Dako) | Bond Epitope Retrieval 1 (Leica) | Bond polymer refine detection system (Leica) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 2 (8.3%) | 1.0 | c | | 0.306 (0.092; 1.022) | / | | 0.434 (0.090; 2.102) | 0.297 (0.075; 1.178) |
| 22C3 (Dako) | Bond Epitope Retrieval 2 (Leica) | Bond polymer refine detection system (Leica) | 2 (1.4%) | 1 (12.5%) | 0 (0.0%) | 1 (4.2%) | 3.5 | | | | | | | |
| 22C3 (Dako) | CC1 (Ventana) | UltraView Universal DAB Detection kit (Ventana) | 4 (2.8%) | 0 (0.0%) | 2 (10.0%) | 0 (0.0%) | 3.3 | | | | | | | |
| 22C3 (Dako) | Omnis Envision FLEX TRS, High pH (Dako) | Envision flex (Dako) | 1 (0.7%) | 0 (0.0%) | 2 (10.0%) | 0 (0.0%) | 2.0 | | | | | | | |
| 22C3 (Dako) | PT module TRS High envision Flex (Dako) | Envision flex (Dako) | 2 (1.4%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 4.0 | | | | | | | |
| 22C3 (Dako) | Homebrew EDTA or TRIS-EDTA (with/without pressure cooker) | Bond polymer refine detection system (Leica) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 3.0 | | | | | | | |
| 22C3 (Dako) | Ultra CC1 (Ventana) | OptiView DAB IHC Detection Kit (Ventana) | 2 (1.4%) | 0 (0.0%) | 0 (0.0%) | 1 (4.2%) | 4.5 | | | | | | | |
| SP263 (Ventana) | CC1 (Ventana) | OptiView DAB IHC Detection Kit (Ventana) | 24 (17.0%) | 1 (12.5%) | 1 (5.0%) | 5 (20.8%) | 4.8 | d | | | | | | |
| SP263 (Ventana) | CC1 (Ventana) | UltraView Universal DAB Detection kit (Ventana) | 2 (1.4%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 5.0 | | ND | ND | ND | ND | ND | ND |
| SP263 (Ventana) | EnVisionFLEX Target Retrieval Solution, low pH (Dako) | OptiView DAB IHC Detection Kit (Ventana) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 5.0 | | ND | ND | ND | ND | ND | ND |
| E1L3N (Cell Signaling) | Bond Epitope Retrieval 2 (Leica) | Bond polymer refine detection system (Leica) | 6 (4.3%) | 0 (0.0%) | 0 (0.0%) | 1 (4.2%) | 4.3 | e | 0.565 (0.154; 2.075) | 0.705 (0.179; 2.770) | 2.304 (0.476; 11.111) | | / | 0.685 (0.149; 3.155) |
| E1L3N (Cell Signaling) | Homebrew EDTA or TRIS-EDTA (with/without pressure cooker) | ABC immunoperoxidase staining avidin-biotin complexes (Vectastain ABC Elite; Vector Laboratories) | 1 (0.7%) | 0 (0.0%) | 1 (5.0%) | 0 (0.0%) | 1.0 | | | | | | | |
| E1L3N (Cell Signaling) | Homebrew EDTA or TRIS-EDTA (with/without pressure cooker) | Bond polymer refine detection system (Leica) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 4.0 | | | | | | | |
| E1L3N (Cell Signaling) | Homebrew EDTA or TRIS-EDTA (with/without pressure cooker) | Brightvision (Immunologic) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 3.0 | | | | | | | |
| E1L3N (Cell Signaling) | Homebrew EDTA or TRIS-EDTA (with/without pressure cooker) | ZytoChem Plus (HRP) Polymer Kit (Zytomed) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 3.0 | | | | | | | |
| 28-8 (Abcam) | CC1 (Ventana) | OptiView DAB IHC Detection Kit (Ventana) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 5.0 | f | 0.826 (0.287; 2.370) | 1.029 (0.329; 3.215) | 3.364 (0.849; 13.321) | | 1.461 (0.317; 6.719) | / |
| 28-8 (Abcam) | Omnis Envision FLEX TRS, High pH (Dako) | Envision flex (Dako) | 2 (1.4%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 3.5 | | | | | | | |
| 28-8 (Abcam) | EnVisionFLEX Target Retrieval Solution, low pH (Dako) | Envision flex (Dako) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 4.0 | | | | | | | |
| 28-8 (Dako) | CC1 (Ventana) | OptiView DAB IHC Detection Kit (Ventana) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 4.0 | | | | | | | |
| 28-8 (Dako) | EnVisionFLEX Target Retrieval Solution, low pH (Dako) | Envision flex (Dako) | 3 (2.1%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 4.7 | | | | | | | |
| CAL10 (Biocare Medical) | Bond Epitope Retrieval 2 (Leica) | Bond polymer refine detection system (Leica) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 4.0 | | | | | | | |
| CAL10 (Biocare Medical) | CC1 (Ventana) | OptiView DAB IHC Detection Kit (Ventana) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 1 (4.2%) | 2.0 | | | | | | | |
| CAL10 (Biocare Medical) | Homebrew EDTA or TRIS-EDTA (with/without pressure cooker) | Master Polymer Plus (Master Diagnóstica SLU) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 1 (4.2%) | 5.0 | | | | | | | |
| CAL10 (Biocare Medical) | Homebrew EDTA or TRIS-EDTA (with/without pressure cooker) | ZytoChem Plus (HRP) Polymer Kit (Zytomed) | 1 (0.7%) | 0 (0.0%) | 0 (0.0%) | 2 (8.3%) | 2.0 | | | | | | | |
| QR1 (Quartett) | Bond Epitope Retrieval 1 (Leica) | Bond polymer refine detection system (Leica) | 2 (1.4%) | 0 (0.0%) | 1 (5.0%) | 0 (0.0%) | 2.0 | | | | | | | |
| QR1 (Quartett) | CC1 (Ventana) | UltraView Universal DAB Detection kit (Ventana) | 3 (2.1%) | 0 (0.0%) | 0 (0.0%) | 0 (0.0%) | 5.0 | | | | | | | |
| QR1 (Quartett) | EnVisionFLEX Target Retrieval Solution, low pH (Dako) | Envision flex (Dako) | 1 (0.7%) | 0 (0.0%) | 1 (5.0%) | 1 (4.2%) | 5.0 | | | | | | | |
| QR1 (Quartett) | Homebrew EDTA or TRIS-EDTA (with/without pressure cooker) | ZytoChem Plus (HRP) Polymer Kit (Zytomed) | 1 (0.7%) | 0 (0.0%) | 1 (5.0%) | 1 (4.2%) | 4.0 | | | | | | | |
| SP142 (Ventana) | CC1 (Ventana) | OptiView DAB IHC Detection Kit (Ventana) | 4 (2.8%) | 0 (0.0%) | 4 (20.0%) | 0 (0.0%) | 2.8 | | | | | | | |
| SP142 (Ventana) | CC1 (Ventana) | UltraView Universal DAB Detection kit (Ventana) | 1 (0.7%) | 0 (0.0%) | 1 (5.0%) | 0 (0.0%) | 5.0 | | | | | | | |
Proportional odds models with GEE for clustering of the data were used to analyze the difference in ESS. Differences in ESS are represented as ORs (95% CI) for every method (row level) relative to other methods used (column level). An OR > 1 represents a higher ESS for a given method (column level) relative to the other method (row level). An OR < 1 represents a lower ESS for a method relative to other methods. Statistics are computed for main method categories. ND: statistics not computed due to low power (low number of users). Significant results are highlighted in italics. *p < .05, **p < .01, ***p < .001, ****p < .0001. Analysis failures are defined as the failure to stain or interpret the PD-L1 IHC results on all assessed cases. Under-estimations are calculated on samples validated as a TPS of 1–50% or > 50%. Over-estimations are calculated on the total number of samples with TPS < 1% or 1–50%
Abbreviations: # number, CI confidence interval, EQA external quality assessment, ESS expert staining score, GEE generalized estimating equations, IHC immunohistochemistry, ND not determined, OR odds ratio, PD-L1 programmed death ligand 1, TPS tumor proportion score
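The pairwise ORs in the protocol table are reciprocal: the OR in the row of method a against method b, 1.247 (0.547; 2.843), mirrors the OR in the row of method b against method a, 0.802 (0.352; 1.828). A small Python check of that relationship, using only those two table entries; the helper name `invert_or` is illustrative.

```python
def invert_or(or_value: float, ci_low: float, ci_high: float):
    """Swap the direction of a pairwise comparison.

    Inverting an odds ratio inverts its confidence limits as well,
    and the lower and upper limits trade places.
    """
    return 1 / or_value, 1 / ci_high, 1 / ci_low


a_vs_b = (1.247, 0.547, 2.843)  # row of method a, column of method b
b_vs_a = (0.802, 0.352, 1.828)  # row of method b, column of method a

inverted = tuple(round(x, 3) for x in invert_or(*a_vs_b))
print(inverted)            # (0.802, 0.352, 1.828)
print(inverted == b_vs_a)  # True
```

The same check can help place a value in the correct column when a table cell is missing or displaced.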