Evaluating the efficacy of bronchoscopy for the diagnosis of early stage lung cancer.

David M DiBardino¹, Anil Vachani¹, Lonny Yarmus².

Abstract

Novel diagnostic techniques for lung cancer are rapidly evolving. Specifically, several novel changes to bronchoscopy are reaching clinical evaluation. It is critical to think about historical standards for evaluating new diagnostic testing, and put those concepts into the framework of lung cancer. Often a thorough evaluation of new technology is not performed as a part of regulatory marketing clearance. Therefore, we must consider how to best study novel testing beyond these regulatory minimums. There are several methodological principles that can achieve this goal such as using a control arm, more thorough reporting of enrolled patients, consecutive patient enrollment, and adequate sample size. We hope clinicians, particularly those performing bronchoscopy for lung nodules, will feel empowered to critically appraise the evaluation of new diagnostic testing for lung cancer moving forward. 2020 Journal of Thoracic Disease. All rights reserved.

Keywords: Lung cancer; bronchoscopy; lung nodule

Year: 2020 PMID: 32642247 PMCID: PMC7330761 DOI: 10.21037/jtd.2020.02.35

Source DB: PubMed Journal: J Thorac Dis ISSN: 2072-1439 Impact factor: 2.895

Introduction

New and exciting modalities now exist for the diagnosis of lung cancer in the form of biomarkers and new biopsy devices (1,2). Much of this technology is aimed at the evaluation of indeterminate pulmonary nodules in order to establish a definitive diagnosis of early stage lung cancer or an alternative etiology. As highlighted below, a full evaluation of the clinical efficacy of these medical advances is not necessarily part of the Food and Drug Administration’s (FDA) regulatory approval process. Rather, more than 90% of the devices seeking FDA approval use the 510(k) pathway, which emphasizes a device’s similarities with a legally marketed technology. Therefore, products may come to market before a rigorous and statistically sound evaluation has been performed (3). The number of small peripherally located indeterminate nodules, identified either incidentally or from the expansion of lung cancer screening, is very likely to increase (4). As such, more patients will be identified as candidates for this innovative testing during their diagnostic evaluation. Now is an important time to consider the clinical evaluation of such novel technology and to improve the methodologies used to assess bronchoscopic outcomes for the diagnosis of lung cancer.

Adapting traditional standards to lung cancer

The concepts of sensitivity, specificity, likelihood ratio positive, likelihood ratio negative, post-test probability, and receiver operating characteristic curves with their area under the curve (AUC) have long been described in laboratory testing (5). It is important to think about the application of these standard test characteristics to novel biopsy modalities and biomarkers for lung cancer (Table 1). For example, when considering the sensitivity and specificity for lung cancer we are not truly evaluating “diagnostic yield” or “diagnostic accuracy”. Rather, we are a priori evaluating for the diagnosis of lung cancer alone and need to predetermine our definition of a true negative and false negative. We can largely accept a false positive rate near 0% and a positive predictive value near 100%, as cytology and pathology are rarely misclassified as malignant when diagnosing lung cancer (6). Diagnostic yield is a separately valuable endpoint but requires careful consideration as to what histopathologic findings constitute an accurate diagnosis of benign disease rather than a false negative in a lung cancer patient.

Table 1

Traditional measurements used to evaluate diagnostic tests

Measurement	Definition	Application to lung cancer diagnosis	Practical use
Sensitivity	True positives/(true positives + false negatives)	How many lung cancer diagnoses were confirmed	Requires accurate false negative rate
Specificity	True negatives/(true negatives + false positives)	How many benign lung nodules were correctly reported as negative for cancer	Very rarely important in histopathology-based testing
Likelihood ratio positive	Sensitivity/(1-specificity)	Odds that someone has lung cancer after positive test	Infinite for positive pathology
Likelihood ratio negative	(1-sensitivity)/specificity	Odds that someone has lung cancer after negative test	Requires accurate false negative rate
Post-test probability	PPV or NPV depending on test outcome	The probability someone has lung cancer given a certain test result	Near 100% PPV for histopathology, NPV requires accurate sensitivity
Receiver operating characteristic curve	Plot of true positive rate against false positive rate	May have use in biomarker study where a threshold value is being calculated	No relevance to a bronchoscopy study
Diagnostic yield or accuracy	Variable	Often used in bronchoscopy studies as (true positives + true negatives)/total N	Requires accurate true negative rate

PPV, positive predictive value; NPV, negative predictive value.
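
The definitions in Table 1 can be made concrete with a short calculation. The following is a minimal sketch, not a method taken from the article: the function name `diagnostic_metrics` and the example counts (100 biopsied nodules, 70 of them cancers) are hypothetical, and the code assumes every cell of the 2x2 table has already been adjudicated by clinical follow-up.

```python
import math


def diagnostic_metrics(tp, fp, fn, tn):
    """Compute the Table 1 measurements from adjudicated 2x2 counts.

    tp/fp/fn/tn are counts of true positives, false positives,
    false negatives and true negatives for the diagnosis of lung cancer.
    Assumes at least one patient with cancer, one without, one positive
    test and one negative test (no division-by-zero handling).
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # With no false positives (typical for histopathology) LR+ is infinite.
    lr_pos = math.inf if fp == 0 else sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    ppv = tp / (tp + fp)  # post-test probability after a positive result
    npv = tn / (tn + fn)  # probability of no cancer after a negative result
    accuracy = (tp + tn) / (tp + fp + fn + tn)  # "diagnostic yield or accuracy"
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "lr_pos": lr_pos,
        "lr_neg": lr_neg,
        "ppv": ppv,
        "npv": npv,
        "accuracy": accuracy,
    }


# Hypothetical cohort: 70 cancers (49 found, 21 missed) and 30 benign nodules.
metrics = diagnostic_metrics(tp=49, fp=0, fn=21, tn=30)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

In this hypothetical cohort the sensitivity is 70% and the PPV is 100%, yet the NPV is only about 59%, which illustrates why every measurement that depends on the false negative count hinges on how carefully negative biopsies are followed up.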

With this in mind, we must scrutinize the implication of a negative biopsy result (i.e., no evidence of malignancy) obtained by bronchoscopy and the appropriate approach needed to accurately estimate diagnostic performance in this setting. Given the less specific nature of benign cytology and pathology, a negative lung biopsy is not often specific for a benign alternative diagnosis to lung cancer (7,8). Further complicating matters, granulomatous inflammation can co-exist with cancer and confuse clinicians deciding on the diagnosis of lung cancer (9-11). Therefore, long-term clinical follow-up is essential in deciding which patients had a truly negative diagnostic test for lung cancer. A false negative can be confirmed by further invasive testing such as a subsequent lung biopsy, newly metastatic disease seen on follow-up imaging, or interval worsening of a lung nodule then empirically treated as lung cancer. This type of clinical follow-up may be rigorous, but can realistically be obtained in a clinical trial or observational setting with medical record review extending 6 to 12 months beyond the biopsy. A true negative is even more challenging to confirm. As with false negatives, medical record review is necessary, but confirmation will likely take up to 1–2 years. Confirmation may include stability of a lesion on serial imaging, further invasive testing re-affirming benign pathology, or resolution of the abnormality on subsequent follow-up imaging. As inflammatory conditions can evolve over time, there may be nuance to the clinical record review that requires comparing final diagnoses between multiple investigators to validate the evaluation. Despite the uncertainty created by non-malignant histopathologic findings on biopsy, the use of longitudinal assessment is essential to determine the test characteristics of novel technologies as diagnostic tools for lung cancer.

Moving beyond standard study outcomes measures and regulatory minimums

Traditional study designs intended to describe traditional characteristics of test performance may neglect the opportunity to conduct more patient-centered research (12). There are other potential endpoints that are both clinically relevant and meaningful to patients being evaluated for possible lung cancer. Novel testing should be conducted in a controlled setting where the morbidity required to perform the test can be carefully recorded. This will commonly include procedural complications, but we should also consider the issue of time. The length of time required to plan and execute a novel diagnostic test, and how long it takes to obtain a final diagnosis of lung cancer, are often hugely important to patients. Furthermore, it is plausible that minimizing the time from diagnosis to treatment of lung cancer can improve outcomes, although this association has not been consistently seen in the literature (13). Ultimately, an association between a novel diagnostic test and an earlier stage of lung cancer at the time of treatment initiation could have prognostic implications. It is worth mentioning that diagnostic testing has not traditionally been held to this standard. In fact, outcomes indirect to clinical care and surrogate outcomes are commonly used (14). As it applies to biopsy devices to diagnose lung cancer, U.S. FDA marketing clearance is often obtained via section 510(k) of the Food, Drug, and Cosmetic Act, whereby a device gains marketing clearance after demonstrating safety and efficacy equivalent to those of a legally marketed similar device (15). Therefore, it becomes difficult to imagine the motivation for the commercial developers of these technologies to fund costly, time-consuming prospective controlled studies to generate high quality evidence on diagnostic performance and clinical efficacy when it is unnecessary by regulatory standards. 
Furthermore, this pathway does appropriately hasten the availability of new technology for patients suffering from serious disease states such as lung cancer. This potential disconnect between regulatory minimums, clinically important outcomes, and encouraging medical advances has recently come under scrutiny outside of the lung cancer space (16). Perhaps safety concerns are less relevant for new devices aimed at diagnosing lung cancer, but the motivations presented by 510(k) clearance will clearly affect new developments in bronchoscopy as all recent innovations in this technology have utilized this pathway (17).

Study designs

Novel diagnostic tests are often studied in a prospective single-arm, descriptive design on account of the aforementioned regulatory issues, cost, and simplicity (Table 2). This study design is becoming less useful and less accepted as more options to diagnose lung cancer are developed. The tradition in lung cancer diagnosis has been single-arm studies of transthoracic or bronchoscopic biopsy procedures aimed at evaluating the traditional test characteristics of sensitivity and diagnostic yield (6,8,18-25). This study design must be interpreted with caution and is not the ideal format for studying novel diagnostic testing for lung cancer (26). These descriptive studies generally report test characteristics that are specific to the exact population studied and allow for only an informal comparison to historical controls of diagnostic rates. Single-arm studies also cannot be used to compare a novel test to an established diagnostic approach as there is no ability to avoid bias or control for important confounders that can influence diagnostic yields.

Table 2

Common pitfalls and solutions with studies aimed at diagnosing lung cancer

Common study design	Pitfall	Solution
Single-arm study with new device	No clear comparison arm to judge new device’s efficacy	Parallel trial design with a control arm
No clear power calculation	Unclear if the study can statistically fulfill the aim	Consideration of the study goals and pre-emptive power calculations
Highly selecting patients for novel diagnostic test	Lack of generalizability	Offer trial enrollment to consecutive patients being worked up for lung cancer
Expert centers only	Lack of generalizability	Multi-center design
Limited demographic and descriptive reporting of biopsy procedure	Lack of generalizability	Careful reporting of lung cancer prevalence in the study population and detailed reporting of nodule characteristics
Lack of confirmation for true negative biopsies	Cannot calculate sensitivity for lung cancer for a technology	Adequate clinical follow-up for all non-malignant biopsies

The main concern in this setting is selection bias (27). Many single-arm studies highly select patients that are “ideal” for a diagnostic test rather than enrolling consecutive, unselected patients. This is in contrast to multi-center randomized controlled trials where confounding can at least be equally distributed between study arms and institutions, and data should be rigorously collected (28). Single-arm studies have the potential to produce a favorable result if patients are consciously or unconsciously selected or reported based on factors that lead to a successful procedure or incomplete data collection. While this can be measured to some degree (e.g., nodule size, lobe, bronchus sign, systematic data collection), there are a number of factors relating to nodule characteristics, including size, density, and orientation, that can influence the ability to achieve a high-quality biopsy, and there are many barriers to complete data collection. No prior studies have collected imaging tests and attempted to report a more sophisticated analysis of factors that may have influenced diagnostic yield. As an example, a large-scale meta-analysis of all guided bronchoscopy techniques has aggregated data from such single-arm studies describing diagnostic yield (29). Most patients in these studies had lung cancer, and a diagnostic yield of roughly 70% was seen regardless of the exact technique used. Now contrast that with a large registry study describing less highly selected patients undergoing guided bronchoscopy with a yield of 54%, or a prospective randomized clinical trial of bronchoscopy where both arms had a diagnostic yield of less than 50% (8,30). In fact, this discrepancy was further validated recently by an update of the meta-analysis discussed above, which showed a decrease in diagnostic yield of almost 10% over the past decade despite advances in technologic development (31). 
At least some of this effect is attributed to the field maturing in its definition of nonmalignant disease. Requiring stricter criteria to distinguish between true- and false-negative bronchoscopy results invalidated many of the earlier claims of diagnostic yield. Ideally, new biopsy technologies will be evaluated using rigorous comparative designs that incorporate a relevant control group. Some randomized, experimental studies have in fact compared different bronchoscopy techniques for the diagnosis of lung cancer (30,32,33). A recent study was the first to directly compare three existing advanced bronchoscopic techniques within a cadaver model. This study compared radial probe endobronchial ultrasound (r-EBUS) with an ultrathin scope, electromagnetic navigational bronchoscopy (EMNB), and robotic bronchoscopy in a randomized fashion in human cadavers. The study showed superiority of the robotic approach over either r-EBUS or EMNB (34). The advantages of this trial design are growing as multiple navigational bronchoscopy platforms are now available. The ideal control arm represents a widely available standard of care that clinicians are aiming to improve on. Specifically, when evaluating novel bronchoscopic techniques the ideal control arm varies depending on a clinician’s available options. Unfortunately, there remains equipoise in the guided bronchoscopy literature, and there is no widely agreed upon standard of care. The widely available technologies reported in most registry studies and numerous single-arm descriptive trials are fluoroscopy, EMNB, virtual bronchoscopy (VB), and r-EBUS (8,18-22,24,25,29,30). Making things more complicated, these modalities can be combined to improve success (23,32,35). The exciting potential of guided bronchoscopy was reflected in the 2013 American College of Chest Physicians guideline statement recommending the use of r-EBUS or EMNB to diagnose pulmonary nodules when available (36). 
This background creates several quality options for control arms. For example, an institution currently using an EMNB platform will be most interested to know how novel devices compare to EMNB. Furthermore, they can internally collect data on the diagnostic performance for consecutive patients at their center and compare this number to the control arm of this proposed trial. This, along with the demographics table, will give the reader a sense of how generalizable the study is to their practice. Generally speaking, r-EBUS is the most widely available and well-studied technique that presents itself as an attractive control arm. Another desirable aspect of r-EBUS is the ability to add this intervention to the experimental arm (i.e., the novel device arm) in order to integrate the limited comparative data supporting combining guided modalities (23,32). Bronchoscopy trials also need to account for proficiency bias and maximize generalizability. Single-center studies involving well known procedure experts may not be generalizable to the majority of physicians charged with diagnosing lung cancer. Utilizing a multi-centered design and trying to include diverse practice settings may be of benefit (37). The first large multicentered clinical trial investigating a clinical algorithmic approach (ALL IN ONE trial) in combination with advanced bronchoscopic technologies is underway with anticipated completion in 2019. Final results are expected in 2020 after a robust clinical and radiographic follow-up period with pathologic confirmation to ensure confidence in diagnoses labeled as benign (38).

Reporting of the study population and intervention

The importance of the patient population enrolled in a study designed to evaluate lung cancer cannot be overstated. Physicians struggle to understand when diagnostic tests apply to their clinical practice and must frame all medical testing for lung cancer with this in mind. How exactly patients were chosen for enrollment in a biopsy trial allows for insight into generalizability. Other than well-defined inclusion and exclusion criteria, were these consecutive patients undergoing lung nodule evaluation who needed further investigation? If not consecutive, how did the investigators decide to offer enrollment? If patients declined enrollment, was there any systematic reason that could influence diagnostic yield? Investigators should clearly state the patient population successfully enrolled with a thorough demographics table. Specifically, for a bronchoscopy device, all known factors associated with an increased diagnostic yield for guided bronchoscopy should be reported for each study arm (when applicable), such as nodule size, location of the nodule, distance from pleura, and presence of a bronchus sign on computed tomography (CT) scan. Less well understood factors may also be useful in an effort to communicate how difficult nodules are to biopsy with traditional bronchoscopy, such as apical-medial location in the upper lobes and apical position in the superior segment of the lower lobes. It may also be useful to know what proportion of patients were also eligible for another diagnostic test such as a transthoracic needle biopsy or surgical lung biopsy. This could lead to hypothesis-generating subgroup analyses for patients that may or may not have had other options. Detailed reporting of the biopsy methods is required if the novel diagnostic test is a bronchoscopy procedure. 
Combining traditional sampling methods such as cytology brushes, fine-needle aspiration, transbronchial biopsies, and bronchial lavage will maximize the sensitivity for lung cancer without adding futile maneuvers to the procedure. For studies involving r-EBUS, reporting the r-EBUS view obtained during the procedure (eccentric versus concentric versus no view) can inform proceduralists about the expected diagnostic yield and lead to informative subgroup analyses (19). Other patient safety related outcomes should be reported such as procedure time, pneumothorax, and clinically meaningful bleeding.

Sample size

A statistically sound method should be used to determine the sample size needed to evaluate a novel diagnostic test for lung cancer. For studies with a control arm and traditional experimental design this power calculation will be more straightforward. As above, the sensitivity for lung cancer or diagnostic yield can be compared as proportions using a Chi-square test. The sensitivity for lung cancer or diagnostic yield will need to be assumed depending on the investigators’ interpretation of existing data. For example, if registry and clinical trial data are used to estimate the success in the control arm, one may choose an expected diagnostic yield of 50%. The novel technology’s diagnostic yield can be estimated based on pilot data while ensuring a clear clinical improvement on this baseline rate of 50%. One may consider a diagnostic yield of 75% as a clinically significant goal for novel technology as it starts to approach the data supporting transthoracic needle biopsy in expert hands (6). Assuming a power of 80% to detect a difference and a standard 5% alpha error, this study would require 116 subjects. This enrollment goal will change dramatically if different assumptions are made. This number would rise to 322 if instead a novel technology aimed to improve on an assumed 70% diagnostic yield, as seen in prior single-arm descriptive studies, to an 85% diagnostic yield, with a 90% power to detect a difference. Unfortunately, this type of power calculation is complicated for single-arm studies and can be done using any number of statistical philosophies (39,40). Ultimately these evaluations could use similar principles as controlled trials if a baseline diagnostic yield or sensitivity for lung cancer is assumed based on prior data. This inevitably requires the problematic assumption that the patient population undergoing the novel diagnostic test exactly matches the patient population previously studied. 
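
The enrollment figures quoted above can be reproduced with a standard two-proportion calculation. The sketch below is an assumption about the method, since the article does not state which formula was used: it applies the common normal-approximation formula for comparing two independent proportions with a two-sided alpha, equal allocation, and no continuity correction, and it reads the quoted figures as totals across both arms. The function name `total_sample_size` is hypothetical.

```python
import math
from statistics import NormalDist


def total_sample_size(p_control, p_novel, alpha=0.05, power=0.80):
    """Total subjects (two equal arms) needed to compare two proportions.

    Uses the pooled-variance normal approximation without continuity
    correction; other formulas will give somewhat different numbers.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p_control + p_novel) / 2  # pooled proportion under the null
    term_null = z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
    term_alt = z_beta * math.sqrt(
        p_control * (1 - p_control) + p_novel * (1 - p_novel)
    )
    n_per_arm = math.ceil((term_null + term_alt) ** 2 / (p_control - p_novel) ** 2)
    return 2 * n_per_arm


# Diagnostic yield 50% versus 75%, 80% power, 5% alpha
print(total_sample_size(0.50, 0.75))  # 116

# Diagnostic yield 70% versus 85%, 90% power, 5% alpha
print(total_sample_size(0.70, 0.85, power=0.90))  # 322
```

Under these assumptions the calculation gives 58 and 161 subjects per arm, matching the totals of 116 and 322 in the text. Adding a continuity correction or allowing for loss to follow-up would raise both numbers.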
The sample size required to assess safety depends on the precision desired for an estimated complication rate. An adequate subject number should be chosen in order to narrow the confidence intervals around estimates based on previous data for pneumothorax and bleeding. For example, if an estimated pneumothorax rate is 2%, a study with 100 subjects would estimate that event with a fairly wide 95% confidence interval of 0.24% to 7%. A study with 300 subjects will have a much narrower 95% confidence interval of 0.74% to 4.3%.
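
The confidence intervals in this example are consistent with an exact binomial (Clopper-Pearson) interval, although the article does not name the method, so that choice is an assumption here. The following sketch computes the interval with the standard library only, solving for each bound by bisection; the function names `binom_cdf` and `clopper_pearson` are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def clopper_pearson(events, n, conf=0.95):
    """Exact two-sided binomial confidence interval for events/n."""
    alpha = 1 - conf

    def solve(f):
        # Bisection for the root of f on [0, 1]; f must be increasing in p.
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if f(mid) < 0:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    # Lower bound: the p at which P(X >= events) equals alpha / 2.
    lower = 0.0 if events == 0 else solve(
        lambda p: (1 - binom_cdf(events - 1, n, p)) - alpha / 2
    )
    # Upper bound: the p at which P(X <= events) equals alpha / 2.
    upper = 1.0 if events == n else solve(
        lambda p: alpha / 2 - binom_cdf(events, n, p)
    )
    return lower, upper


# A 2% pneumothorax rate observed in 100 and in 300 subjects
for events, n in [(2, 100), (6, 300)]:
    lo, hi = clopper_pearson(events, n)
    print(f"{events}/{n}: {100 * lo:.2f}% to {100 * hi:.2f}%")
```

Tripling enrollment narrows the interval considerably but does not shrink it threefold, because interval width scales roughly with the inverse square root of the sample size.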

Conclusions

There are many competing interests when evaluating novel technology to diagnose lung cancer. Clinicians, patients consenting to testing, and our industry partners need to find common ground on what pragmatic study designs will help us all achieve our goals. Evaluation of these advancements may take place in steps. Often marketing clearance will be obtained from the FDA before the technology’s clinical efficacy is fully known. In the post-marketing phase, fairly large studies involving diverse practice settings and a carefully considered control arm will go a long way towards understanding the use of novel testing. These results should be carefully reported, include patient-centered outcomes when possible, and focus on clinical follow-up to confirm true and false negative testing. Thorough evaluation of novel diagnostic testing will set the stage for many future possibilities and guide continued innovation. By optimizing diagnostic testing we can further refine lung cancer screening algorithms, better understand the potential for endoscopic ablation of lung cancer, and mitigate the uncertainty patients face when diagnosed with an indeterminate pulmonary nodule.

37 in total

1. Electromagnetic navigation transthoracic needle aspiration for the diagnosis of pulmonary nodules: a safety and feasibility pilot study.

Authors: Lonny B Yarmus; Sixto Arias; David Feller-Kopman; Roy Semaan; Ko Pen Wang; Bernice Frimpong; Karen Oakjones Burgess; Richard Thompson; Alex Chen; Ricardo Ortiz; Hans J Lee
Journal: J Thorac Dis Date: 2016-01 Impact factor: 2.895

2. Sample size estimation in diagnostic test studies of biomedical informatics.

Authors: Karimollah Hajian-Tilaki
Journal: J Biomed Inform Date: 2014-02-26 Impact factor: 6.317

3. Aspiration cytology of malignant neoplasms associated with granulomas and granuloma-like features: diagnostic dilemmas.

Authors: K K Khurana; M W Stanley; C N Powers; M B Pitman
Journal: Cancer Date: 1998-04-25 Impact factor: 6.860

4. Modernizing the FDA's 510(k) Pathway.

Authors: Vinay K Rathi; Joseph S Ross
Journal: N Engl J Med Date: 2019-10-23 Impact factor: 91.245

Review 5. Screening for early stage lung cancer and its correlation with lung nodule detection.

Authors: Fangfei Qian; Wenjia Yang; Qunhui Chen; Xueyan Zhang; Baohui Han
Journal: J Thorac Dis Date: 2018-04 Impact factor: 2.895

6. Diagnosis of lung nodules with peripheral/radial endobronchial ultrasound-guided transbronchial biopsy.

Authors: David W Hsia; Kurt W Jensen; Douglas Curran-Everett; Ali I Musani
Journal: J Bronchology Interv Pulmonol Date: 2012-01

Review 7. Transthoracic needle biopsy of the lung.

Authors: David M DiBardino; Lonny B Yarmus; Roy W Semaan
Journal: J Thorac Dis Date: 2015-12 Impact factor: 2.895

8. Patient-centered medicine and patient-oriented research: improving health outcomes for individual patients.

Authors: José A Sacristán
Journal: BMC Med Inform Decis Mak Date: 2013-01-08 Impact factor: 2.796

9. Electromagnetic navigation bronchoscopy to access lung lesions in 1,000 subjects: first results of the prospective, multicenter NAVIGATE study.

Authors: Sandeep J Khandhar; Mark R Bowling; Javier Flandes; Thomas R Gildea; Kristin L Hood; William S Krimsky; Douglas J Minnich; Septimiu D Murgu; Michael Pritchett; Eric M Toloza; Momen M Wahidi; Jennifer J Wolvers; Erik E Folch
Journal: BMC Pulm Med Date: 2017-04-11 Impact factor: 3.317

10. Improving outcome reporting in clinical trial reports and protocols: study protocol for the Instrument for reporting Planned Endpoints in Clinical Trials (InsPECT).

Authors: Nancy J Butcher; Andrea Monsour; Emma J Mew; Peter Szatmari; Agostino Pierro; Lauren E Kelly; Mufiza Farid-Kapadia; Alyssandra Chee-A-Tow; Leena Saeed; Suneeta Monga; Wendy Ungar; Caroline B Terwee; Sunita Vohra; Dean Fergusson; Lisa M Askie; Paula R Williamson; An-Wen Chan; David Moher; Martin Offringa
Journal: Trials Date: 2019-03-06 Impact factor: 2.279
