Laura Lasiter1, Olga Tymejczyk2, Elizabeth Garrett-Mayer3, Shrujal Baxi2, Andrew J Belli4, Marley Boyd5, Jennifer B Christian6, Aaron B Cohen2,7, Janet L Espirito5, Eric Hansen4, Connor Sweetnam8, Nicholas J Robert5, Mackenzie Small6, Mark D Stewart1, Monika A Izano8, Joseph Wagner6, Yanina Natanzon9, Donna R Rivera10, Jeff Allen1.
Abstract
In prior work, Friends of Cancer Research convened multiple data partners to establish standardized definitions for oncology real-world end points derived from electronic health records (EHRs) and claims data. Here, we assessed the performance of real-world overall survival (rwOS) from data sets sourced from EHRs by evaluating the ability of the end point to reflect expected differences from a previous randomized controlled trial across five data sources, after applying inclusion/exclusion criteria. The KEYNOTE-189 clinical trial protocol of platinum doublet chemotherapy (chemotherapy) vs. programmed cell death protein 1 (PD-1) inhibitor in combination with platinum doublet chemotherapy (PD-1 combination) in first-line nonsquamous metastatic non-small cell lung cancer guided retrospective cohort selection. The Kaplan-Meier product limit estimator was used to calculate 12-month rwOS with 95% confidence intervals (CIs) in each data source. Cox proportional hazards models estimated hazard ratios (HRs) and associated 95% CIs, controlled for prognostic factors. Once the inclusion/exclusion criteria were applied, the five resulting data sets included 155 to 1,501 patients in the chemotherapy cohort and 36 to 405 patients in the PD-1 combination cohort. Twelve-month rwOS ranged from 45% to 58% in the chemotherapy cohort and 44% to 68% in the PD-1 combination cohort. The adjusted HR for death ranged from 0.80 (95% CI: 0.69, 0.93) to 1.15 (95% CI: 0.71, 1.85), controlling for age, gender, performance status, and smoking status. This study yielded insights regarding data capture, including the ability of real-world data to precisely identify patient populations and the impact of criteria on end points. Sensitivity analyses could elucidate data set-specific factors that drive results.
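The product limit estimate mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy follow-up data are hypothetical, and the 95% CI calculation is omitted for brevity.

```python
from typing import List, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Return [(event_time, S(t))] from the product-limit estimator.

    times  : follow-up in months from the index date (hypothetical input)
    events : True if death was observed, False if the patient was censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group every patient whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            removed += 1
            i += 1
        if deaths > 0:
            # Multiply by the conditional probability of surviving past t.
            surv *= 1.0 - deaths / n_at_risk
            curve.append((t, surv))
        # Deaths and censored patients both leave the risk set after t.
        n_at_risk -= removed
    return curve


def survival_at(curve: List[Tuple[float, float]], t: float) -> float:
    """Read the step function at time t (for example, t = 12 months)."""
    s = 1.0
    for event_time, surv in curve:
        if event_time <= t:
            s = surv
        else:
            break
    return s


# Toy cohort, not data from the study.
toy_times = [3, 5, 5, 8, 12, 14, 20]
toy_events = [True, True, False, True, False, True, False]
toy_curve = kaplan_meier(toy_times, toy_events)
rwos_12 = survival_at(toy_curve, 12)
```

Patients censored before 12 months still contribute to the risk set up to their censor date, which is why the censor-date definition used by each data source can shift the 12-month estimate.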
Year: 2021 PMID: 34655228 PMCID: PMC9298266 DOI: 10.1002/cpt.2443
Source DB: PubMed Journal: Clin Pharmacol Ther ISSN: 0009-9236 Impact factor: 6.903
Variations across data sources and implementation of end‐point definition
| Data set | Description of data source | Population | Derivation of date of death | Censor date |
|---|---|---|---|---|
| A | Structured and unstructured EHR data and commercial obituary data. | Academic and community practice patients in the United States; outpatient; initiated frontline therapy between 2015 and 2018. | An algorithm is used. If dates agree across the three data sources, the date is selected. If discrepancy <7 days exists, EHR data is preferentially captured. If discrepancy >7 days exists, EHR data with accompanying source documentation (e.g., death certificate) is prioritized, otherwise commercial obituary data is captured. | Date of last clinical activity prior to data cutoff, defined as in‐person visit event with healthcare provider such as treatment administration or collected test. |
| B | Structured and unstructured EHR data, commercial obituary data, and Social Security Death Index (SSDI) data. | Academic and community practice patients in the United States; outpatient; initiated frontline therapy between 2015 and 2018. | An algorithm is used. If all dates agree across the three data sources, the date is selected. If any two dates agree, that date is selected. If all three dates disagree, the following hierarchy is applied: SSDI, obituary, EHR data. If a day‐level DoD is available in abstracted EHR data, that date is selected over the consensus structured date. Exact date was used, where available. If only a month‐level date was available, it was generalized to the end of the month. If only a year‐level date was available, it was generalized to the end of the year. | Date of last structured activity, defined as the most recent visit prior to data cutoff. |
| C | Structured and unstructured EHR data; hospital‐based, enterprise‐wide, and national cancer registries; commercial obituary data; and digitized obituaries. | Community practice patients in the United States; inpatient and outpatient; initiated frontline therapy between 2015 and 2018. | An algorithm is used. Tumor registry (hospital, enterprise‐wide, national) dates were preferentially selected, followed by structured EHR, followed by commercial obituary data. | Date of last contact (physical encounter, medication order, or medication administration, from structured or unstructured EHR data) prior to the data cutoff. |
| D | Structured EHR data. | Community practice patients in the United States; outpatient; initiated frontline therapy between 2015 and 2018. | Actual date of death documented from EHR or DMF. | Date of last structured clinical activity prior to data cutoff. |
| E | Structured EHR data, structured claims data. | Academic and community practice patients in the United States; primarily outpatient; initiated frontline therapy between 2017 and 2018. | Mortality algorithm incorporating EHR and Claims. | Date of last clinical encounter with healthcare provider prior to data cutoff. |
DMF, death master file; DoD, date of death; EHR, electronic health record.
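The date-of-death rule described for data set B can be sketched as follows. This is an illustrative reading of the table text, not the data partner's implementation: the function and parameter names are hypothetical, and the handling of missing sources (skipping them) is an assumption the table does not state.

```python
import calendar
from collections import Counter
from datetime import date
from typing import Optional


def end_of_period(year: int, month: Optional[int] = None, day: Optional[int] = None) -> date:
    """Generalize a partial date: month-level to end of month, year-level to end of year."""
    if month is None:
        return date(year, 12, 31)
    if day is None:
        last_day = calendar.monthrange(year, month)[1]
        return date(year, month, last_day)
    return date(year, month, day)


def select_death_date(
    ssdi: Optional[date],
    obituary: Optional[date],
    ehr: Optional[date],
    abstracted_day_level: Optional[date] = None,
) -> Optional[date]:
    """Pick a date of death following the data set B description.

    Assumption (not in the source): a source with no date is simply ignored.
    """
    # A day-level date abstracted from the EHR overrides the consensus date.
    if abstracted_day_level is not None:
        return abstracted_day_level
    available = [d for d in (ssdi, obituary, ehr) if d is not None]
    if not available:
        return None
    # If all dates agree, or any two agree, that date is selected.
    most_common, count = Counter(available).most_common(1)[0]
    if count >= 2:
        return most_common
    # Otherwise apply the stated hierarchy: SSDI, then obituary, then EHR.
    for candidate in (ssdi, obituary, ehr):
        if candidate is not None:
            return candidate
    return None
```

For example, `select_death_date(date(2018, 3, 1), date(2018, 3, 2), date(2018, 3, 2))` returns the date shared by the obituary and EHR sources, while three disagreeing dates fall back to the SSDI value.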
Definitions of inclusion and exclusion criteria for alignment with Keynote‐189
| Data set | Advanced diagnosis | Metastatic status | Histology | Organ function | ECOG PS | EGFR/ALK sensitizing mutations |
|---|---|---|---|---|---|---|
| Overall | Diagnosis with advanced disease defined by American Joint Committee on Cancer stage: stage IIIB, IIIC, or IV NSCLC at initial diagnosis, or early‐stage (stages I, II, and IIIA) NSCLC with a recurrence or progression to metastatic status. | Each group determined metastatic status according to internally consistent protocols. | Patients were placed into one of the following categories: non‐squamous cell carcinoma; squamous cell carcinoma; NSCLC histology not otherwise specified (NOS). | Patient is excluded if there is evidence of inadequate kidney/liver function based on structured laboratory values in the 90 days prior to and including the index date. Inadequate renal function is defined as creatinine clearance of <50 ml/min (creatinine clearance calculated using the Cockcroft‐Gault equation). Inadequate liver function is defined as a serum total bilirubin of ≥1.5× ULN (unless direct bilirubin was measured on the same date and <ULN) and AST or ALT ≥5× ULN. NOTE: If there are multiple lab values of interest in the time window, the value closest to the index date should be used. If there are multiple values for a given lab on the same day, lower creatinine clearance values and higher bilirubin, AST, and ALT values should be used (i.e., prioritize values that exclude patients). | Patient’s ECOG PS at the time of the index date: 0; 1; 2+; Unknown. NOTE: ECOG PS may have been recorded up to 30 days prior to the index date. | Approach 1 to the ALK/EGFR exclusion criterion: test result showing ALK/EGFR rearrangement/mutation at any point before or up to 30 days after the index date. Approach 2 to the ALK/EGFR exclusion criterion (alone or in combination with approach 1): patients who are prescribed EGFR‐ or ALK‐targeting therapies within the first 6 months of the index date are excluded (erlotinib, afatinib, osimertinib, gefitinib, crizotinib, ceritinib, alectinib). |
| A | Data source: Abstracted EHR data. Definition: No deviation. | Data source: Abstracted EHR data. Definition: Defined as at least one of the following: stage IV, TNM M1, and/or metastatic progression. | Data source: Abstracted EHR data. Definition: No deviation. | Data source: Structured EHR data. Definition: No deviation. | Data source: Structured EHR data. Definition: No deviation. | Data source: Abstracted EHR data. Definition: Used a combination of approach 1 and 2. No deviation from definitions. |
| B | Data source: Abstracted EHR data. Definition: No deviation. | Data source: Abstracted EHR data. Definition: Stage IV only. | Data source: Abstracted EHR data. Definition: No deviation. | Data source: Structured EHR data. Definition: No deviation. | Data source: Structured EHR data. Definition: No deviation. | Data source: Abstracted EHR data. Definition: Approach 1 without deviation. Test date was identified as the most recent date across the “specimen collected” date, “specimen received” date, and “result date” variables. If multiple tests were recorded, the one closest to the index date was used. If multiple tests were recorded on the same date, the result showing rearrangement/mutation was used. |
| C | Data source: Unstructured EHR data and structured pathology reports. Definition: Recurrence or progression to metastatic status is the date of medical oncologist stated metastasis or distant recurrence collected from unstructured EHR. | Data source: Unstructured EHR. Definition: Metastatic status is based on a medical oncologist statement. | Data source: Unstructured EHR and structured pathology reports. Definition: Main sources of histology are not codified in the source data but are applied post data collection. Agreed upon vocabulary based on WHO lung cancer histology hierarchy management is utilized to derive NSCLC grouping. | Data source: Structured EHR. Definition: Organ function was not evaluated if value, unit, or range were missing. | Data source: Unstructured and structured EHR. Definition: If Karnofsky performance status (KPS) was available instead of ECOG PS, KPS was converted to ECOG PS. | Data source: Unstructured and structured EHR; commercial laboratory electronic reports. Definition: Used a combination of approach 1 and 2. Definition of ALK rearrangement is based on laboratory report or physician statement. EGFR mutations are defined as any reported mutation on exon 18 through exon 21 regardless of classification. Approach 2 did not deviate. |
| D | Data source: Structured EHR data. Definition: No deviation. | Data source: EHR data. Definition: Metastatic status identified by having at least one of the following: stage IV or M1 disease; documented organ site of metastasis; or disease status as metastatic disease. | Data source: EHR data. Definition: No deviation. | Data source: EHR data. Definition: No deviation. | Data source: EHR data. Definition: No deviation. | Data source: EHR data. Definition: No deviation. |
| E | Data source: Structured EHR. Definition: To identify staging with ICD9/10 used to classify patients with early stage who progressed. | Data source: Structured EHR. Definition: Diagnosis, staging and M values. | Data source: Structured EHR histology codes and descriptions. Definition: For a subset of patients, NSCLC was indicated, but not Squamous vs. Non‐Squamous. These patients were kept since they could not be definitively classified as either. | Data source: Structured EHR. Definition: No deviation. | Data source: Structured EHR. Definition: No deviation. | Data source: Structured and unstructured EHR. Definition: No deviation. |
ALT, alanine aminotransferase; AST, aspartate aminotransferase; ECOG PS, Eastern Cooperative Oncology Group performance status; EGFR/ALK, epidermal growth factor receptor/anaplastic lymphoma kinase; EHR, electronic health record; ICD 9/10, International Classification of Diseases, Ninth and Tenth Revisions; NSCLC, non‐small cell lung cancer; TNM M1, tumor, node, and metastasis metastasis 1; ULN, upper limit of normal; WHO, World Health Organization.
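The organ function exclusion in the "Overall" row can be sketched as below. The Cockcroft-Gault formula is the standard one; the function names, units (mg/dL for creatinine, kg for weight), and the use of a separate ULN for direct bilirubin are assumptions made for illustration, not details given in the table.

```python
from typing import Optional


def cockcroft_gault(age_years: float, weight_kg: float,
                    serum_creatinine_mg_dl: float, female: bool) -> float:
    """Estimated creatinine clearance in ml/min (Cockcroft-Gault equation)."""
    crcl = (140 - age_years) * weight_kg / (72 * serum_creatinine_mg_dl)
    return crcl * 0.85 if female else crcl


def inadequate_renal_function(crcl_ml_min: float) -> bool:
    """Table criterion: creatinine clearance below 50 ml/min."""
    return crcl_ml_min < 50


def inadequate_liver_function(
    total_bilirubin: float, bilirubin_uln: float,
    ast: float, ast_uln: float,
    alt: float, alt_uln: float,
    direct_bilirubin: Optional[float] = None,
    direct_bilirubin_uln: Optional[float] = None,
) -> bool:
    """Table criterion as worded: bilirubin >= 1.5x ULN AND (AST or ALT >= 5x ULN)."""
    bilirubin_high = total_bilirubin >= 1.5 * bilirubin_uln
    # Exception: direct bilirubin measured on the same date and below its ULN
    # (the caller is assumed to pass only same-date direct bilirubin values).
    if (bilirubin_high and direct_bilirubin is not None
            and direct_bilirubin_uln is not None
            and direct_bilirubin < direct_bilirubin_uln):
        bilirubin_high = False
    transaminase_high = ast >= 5 * ast_uln or alt >= 5 * alt_uln
    return bilirubin_high and transaminase_high


def exclude_for_organ_function(crcl_ml_min: Optional[float],
                               liver_inadequate: Optional[bool]) -> bool:
    """Exclude only on evidence of inadequate function; missing labs do not exclude."""
    renal_bad = crcl_ml_min is not None and inadequate_renal_function(crcl_ml_min)
    return renal_bad or bool(liver_inadequate)
```

Treating missing laboratory values as "no evidence" rather than as an exclusion matches the table wording ("excluded if there is evidence of inadequate kidney/liver function"), which is why a cohort can retain patients whose organ function is unknown.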
Characteristics of fully restricted cohorts
| Characteristic | KEYNOTE‐189 | A | B | C | D | E |
|---|---|---|---|---|---|---|
| Chemotherapy/PD‐1 combination treated patient characteristics | ||||||
| Total number of patients/Treatment, no. | 410/206 | 346/54 | 1,501/405 | 232/36 | 748/132 | 155/125 |
| Age | ||||||
| Median, yrs (IQR) | 65, 34, 84/64, 34, 84 | 68, 60, 74/65, 60, 72 | 67, 59, 74/65, 59, 72 | 66, 59, 73/64, 58, 71 | 67, 60, 74/64, 58, 72 | 68, 59, 74/65, 60, 73 |
| <65 yr, % | 48.0/55.8% | 38.7/50.0% | 42.4/47.2% | 46.1/50.5% | 41.7/52.3% | 37.4/43.2% |
| Gender, Male, % | 62.0/52.9% | 45.7/55.6% | 45.7/55.6% | 52.6/63.9% | 46.9/58.3% | 47.7/52.8% |
| ECOG, % | ||||||
| 0 | 45.4/38.8% | 25.4/24.1% | 21.0/31.4% | 14.7/22.2% | 14.7/30.3% | 19.4/24.8% |
| 1 | 53.9/60.7% | 46.2/44.4% | 33.7/37.5% | 29.7/38.9% | 63.2/43.2% | 35.5/31.2% |
| Unknown | 0.5/0.5% | 28.3/31.5% | 45.3/31.1% | 55.6/38.9% | 22.1/26.5% | 45.2/44.0% |
| Smoking status, % | ||||||
| Evidence of smoking | 88.3/87.9% | 88.2/87.0% | 88.3/89.6% | 13.8/36.1% | 84.9/84.8% | 25.2/25.6% |
| No evidence | 11.7/12.1% | 11.8/13.0% | 11.7/10.4% | 86.2/63.9% | 15.1/15.2% | 74.8/74.4% |
| Histology, % | ||||||
| Non‐squamous cell carcinoma | 96.1/96.1% | 100/100% | 100/100% | 100/100% | 100/100% | 100/100% |
| NOS | 2.4/1.9% | Restricted | Restricted | Restricted | Restricted | Restricted |
| Brain metastases, % | ||||||
| Evidence of | 17.8/17.0% | 31.8/29.6% | 18.5/14.1% | 13.8/19.5% | 13.2/9.1% | 16.1/20.0% |
| No evidence | 82.2/83.0% | 68.2/70.4% | 81.5/85.9% | 86.2/80.6% | 86.8/90.9% | 83.9/80.0% |
| PD‐L1 expression status, % | ||||||
| <1% | 31.0/30.6% | 11.9/12.8% | 23.7/16.3% | 50.9/20.8% | 50.5/32.3% | NA |
| ≥1% | 63.4/62.1% | 19.3/42.6% | 51.9/65.4% | 49.1/79.2% | 43.5/64.6% | NA |
| 1–49% | 31.2/28.2% | 14.8/34.0% | 39.0/34.6% | 26.4/41.7% | 39.6/38.5% | NA |
| ≥50% | 32.2/34.0% | 5.9/6.4% | 12.9/30.9% | 22.6/37.5% | 3.8/26.2% | NA |
| Unknown | 5.6/7.3% | 67.4/46.8% | 24.4/18.3% | | 6.0/3.1% | NA |
| Renal function, % | ||||||
| No evidence of inadequate function | | 70.5/64.8% | 89.5/93.1% | 100/100% | 81.0/77.3% | 13.5/16.8% |
| Unknown | | 29.5/35.2% | 10.5/6.9% | 0.0/0.0% | 19.0/22.7% | 86.5/83.2% |
| Hepatic function, % | ||||||
| No evidence of inadequate function | | 70.5/59.3% | 83.3/87.4% | 100/100% | 79.9/76.5% | 49.7/71.2% |
| Unknown | | 29.5/40.7% | 16.7/12.6% | 0.0/0.0% | 20.1/23.5% | 50.3/28.8% |
| Median time from advanced diagnosis to frontline therapy initiation, months (IQR) | | 1.00, 0.57, 1.63/0.97, 0.62, 1.58 | 1.2, 0.8, 1.7/1.1, 0.7, 1.5 | 1.25, 0.90, 1.80/1.17, 0.70, 1.95 | 0.83, 0.37, 1.47/0.72, 0.37, 1.38 | 1.03, 0.53, 1.77/0.93, 0.50, 1.77 |
| Median structured follow‐up time from frontline therapy initiation, months (IQR) | | 10.98, 5.14, 21.28/11.85, 5.60, 14.84 | 7.2, 2.8, 17.6/10.1, 3.5, 15.6 | 12.27, 4.93, 25.04/10.37, 5.36, 17.12 | 10.08, 3.67, 19.73/12.98, 3.55, 16.10 | 13.53, 5.93, 14.06/14.06, 6.10, 21.13 |
| Status at initial diagnosis, % | ||||||
| Advanced at diagnosis | | 87.6/88.9% | 100/100% | 88.8/83.3% | 95.6/97.0% | 96.1/92.0% |
| Progressed after initial diagnosis | | 12.4/11.1% | | 11.2/16.7% | 4.4/3.0% | 3.9/8.0% |
| Stage, % | ||||||
| 0 | | 0.0/0.0% | | 0.0/0.0% | 0.0/0.0% | 1.3/0.8% |
| I | | 6.6/5.6% | | 6.9/5.6% | 0.9/0.0% | 0.6/0.0% |
| II | | 2.6/0.0% | | 0.9/2.8% | 0.0/0.0% | 0.0/0.0% |
| III | | 3.2/5.6% | | 3.4/8.3% | 2.9/0.0% | 0.0/0.8% |
| IV | | 87.6/88.9% | 100/100% | 87.9/83.3% | 95.6/97.0% | 95.5/89.6% |
| Unknown | | 0.0/0.0% | 0.0/0.0% | 0.9/0.0% | 0.0/0.0% | 2.6/8.8% |
| Index year, % | ||||||
| 2015 | | 35.5/1.9% | 36.2/0.0% | 34.5/0.0% | 29.7/0.0% | 0.0/0.0% |
| 2016 | | 36.1/0.0% | 36.8/0.2% | 35.3/0.0% | 33.7/0.0% | 0.0/0.0% |
| 2017 | | 23.7/63.0% | 22.7/71.6% | 23.7/75.0% | 30.5/79.5% | 83.9/69.6% |
| 2018 | | 4.6/35.2% | 4.2/28.1% | 6.5/25.0% | 6.1/17.4% | 16.1/30.4% |
ECOG, Eastern Cooperative Oncology Group; NOS, not otherwise specified; PD‐1, programmed cell death protein 1; PD‐L1, programmed death‐ligand 1.
Data were not provided for this analysis.
Unadjusted associations between use of frontline therapy and rwOS
| | KEYNOTE‐189 | A | B | C | D | E |
|---|---|---|---|---|---|---|
| Number of patients | 616 | 346/54 | 1,501/405 | 232/36 | 748/132 | 155/125 |
| Number of events (Chemotherapy/ PD‐1 combination) | 235 | 203/25 | 1094/216 | 163/22 | 421/61 | 164 |
| 12‐month OS (Chemotherapy/ PD‐1 combination) | 0.49/0.69 | 0.57/0.67 | 0.45/0.53 | 0.53/0.44 | 0.56/0.62 | 0.51/0.56 |
| Unadjusted Hazard ratio (HR) for death (95% CI) | 0.49 (0.38–0.64) | 0.99 (0.65–1.50) | 0.79 (0.68–0.92) | 1.10 (0.70–1.72) | 0.92 (0.70–1.20) | 0.97 (0.71–1.33) |
| Age HR (95% CI) | ||||||
| <65 | 0.43 (0.31–0.61) | 1.25 (0.71–2.20) | 0.75 (0.60–0.94) | 1.30 (0.68–2.49) | 1.26 (0.79–2.0) | |
| ≥65 | 0.64 (0.43–0.95) | 0.80 (0.42–1.53) | 0.84 (0.69–1.03) | 0.94 (0.49–1.81) | 1.05 (0.88–1.26) | 0.74 (0.48–1.15) |
| Sex HR (95% CI) | ||||||
| Male | 0.70 (0.50–0.99) | 0.81 (0.44–1.48) | 0.75 (0.63–0.91) | 1.08 (0.49–2.36) | 0.96 (0.80–1.15) | 1.31 (0.85–2.02) |
| Female | 0.29 (0.19–0.44) | 1.24 (0.70–2.22) | 0.81 (0.63–1.03) | 1.07 (0.62–1.87) | 0.72 (0.45–1.14) | |
| PD‐L1 expression status HR (95% CI) | ||||||
| <1% | 0.59 (0.38–0.92) | 1.84 (0.58–5.81) | 1.02 (0.69–1.51) | 1.17 (0.34–4.03) | NA | |
| ≥1% | 0.47 (0.34–0.66) | 1.45 (0.57–3.67) | 0.70 (0.55–0.90) | 0.76 (0.33–1.74) | 0.80 (0.56–1.14) | NA |
| 1–49% | 0.55 (0.34–0.90) | 1.55 (0.56–4.30) | 0.77 (0.57–1.05) | 1.24 (0.44–3.45) | 0.77 (0.52–1.13) | NA |
| ≥50% | 0.42 (0.26–0.68) | 0.91 (0.08–10.21) | 0.69 (0.44–1.08) | 0.31 (0.06–1.53) | 0.70 (0.37–1.32) | NA |
| Brain metastases HR (95% CI) | ||||||
| Evidence of | 0.36 (0.20–0.62) | 1.63 (0.80–3.32) | 0.96 (0.67–1.39) | 0.95 (0.32–2.81) | 1.44 (0.73–2.84) | |
| No evidence of | 0.53 (0.39–0.71) | 0.78 (0.46–1.32) | 0.76 (0.65–0.90) | 1.12 (0.68–1.83) | 1.04 (0.80–1.34) | 0.86 (0.60–1.23) |
| Platinum‐based drug HR (95% CI) | ||||||
| Carboplatin | 0.52 (0.39–0.71) | 0.94 (0.62–1.44) | 0.78 (0.67–0.90) | NA | 0.94 (0.69–1.29) | |
| Cisplatin | 0.41 (0.24–0.69) | NA | 1.34 (0.19–9.71) | 1.03 (0.66–1.63) | 0.58 (0.38–0.88) | NA |
CI, confidence interval; NA, not available; OS, overall survival; PD‐1, programmed cell death protein 1; PD‐L1, programmed death‐ligand 1; rwOS, real‐world overall survival.
Adjusted associations between frontline therapy and rwOS in the fully restricted cohorts
| Covariate | A | B | C | D | E |
|---|---|---|---|---|---|
| Overall adjusted HR (95% CI) | |||||
| Chemotherapy | Ref. | Ref. | Ref. | Ref. | Ref. |
| PD‐1 combination | 0.99 (0.65–1.51) | 0.80 (0.69–0.93) | 1.15 (0.71–1.85) | 0.96 (0.73–1.26) | 0.95 (0.69–1.30) |
| Age HR (95% CI) | |||||
| <65 | Ref. | Ref. | Ref. | Ref. | Ref. |
| ≥65 | 1.18 (0.90–1.55) | 1.16 (1.04–1.30) | 1.20 (0.90–1.61) | 1.04 (0.86–1.25) | 0.72 (0.53–0.98) |
| Gender HR (95% CI) | |||||
| Male | Ref. | Ref. | Ref. | Ref. | Ref. |
| Female | 0.97 (0.74–1.26) | 0.80 (0.72–0.90) | 0.79 (0.59–1.06) | 0.94 (0.79–1.13) | 0.84 (0.62–1.15) |
| ECOG PS HR (95% CI) | |||||
| 0 | Ref. | Ref. | Ref. | Ref. | Ref. |
| 1 | 1.93 (1.38–2.70) | 1.40 (1.21–1.63) | 1.51 (0.93–2.46) | 1.31 (1.01–1.69) | 1.07 (0.69–1.64) |
| Unknown | 1.09 (0.74–1.59) | 1.25 (1.08–1.45) | 1.48 (0.94–2.34) | 1.29 (0.96–1.75) | 0.90 (0.53–1.52) |
| Smoking status HR (95% CI) | |||||
| Evidence of history of | Ref. | Ref. | Ref. | Ref. | Ref. |
| No evidence of history of | 0.99 (0.67–1.47) | 0.84 (0.70–1.00) | 1.11 (0.73–1.68) | 1.01 (0.78–1.32) | 1.54 (0.61–3.90) |
| Unknown/missing population | NA | NA | NA | 0.76 (0.39–1.48) | 0.78 (0.48–1.28) |
CI, confidence interval; ECOG PS, Eastern Cooperative Oncology Group performance status; HR, hazard ratio; NA, not available; OS, overall survival; PD‐1, programmed cell death protein 1; Ref., reference; rwOS, real‐world overall survival.