
Application of next-generation sequencing technology to precision medicine in cancer: joint consensus of the Tumor Biomarker Committee of the Chinese Society of Clinical Oncology.

Xuchao Zhang¹,², Zhiyong Liang³, Shengyue Wang⁴, Shun Lu⁵, Yong Song⁶, Ying Cheng⁷, Jianming Ying⁸, Weiping Liu⁹, Yingyong Hou¹⁰, Yangqiu Li¹¹, Yi Liu¹², Jun Hou¹³, Xiufeng Liu¹⁴, Jianyong Shao¹⁵, Yanhong Tai¹⁶, Zheng Wang¹⁷, Li Fu¹⁸, Hui Li⁷, Xiaojun Zhou¹⁹, Hua Bai²⁰, Mengzhao Wang²¹, You Lu²², Jinji Yang²³, Wenzhao Zhong²³, Qing Zhou²³, Xuening Yang²³, Jie Wang²⁴, Cheng Huang²⁵, Xiaoqing Liu²⁶, Xiaoyan Zhou²⁷, Shirong Zhang²⁸, Hongxia Tian¹,², Yu Chen¹,², Ruibao Ren²⁹, Ning Liao³⁰, Chunyan Wu³¹, Zhongzheng Zhu³², Hongming Pan³³, Yanhong Gu³⁴, Liwei Wang³⁵, Yunpeng Liu³⁶, Suzhan Zhang³⁷, Tianshu Liu³⁸, Gong Chen³⁹, Zhimin Shao⁴⁰, Binghe Xu²⁴, Qingyuan Zhang⁴¹, Ruihua Xu⁴², Lin Shen⁴³, Yilong Wu¹,²; on behalf of the Chinese Society of Clinical Oncology (CSCO) Tumor Biomarker Committee¹⁻⁴³.

Abstract

Next-generation sequencing (NGS) technology is capable of sequencing millions or billions of DNA molecules simultaneously. Therefore, it represents a promising tool for the analysis of molecular targets for the initial diagnosis of disease, monitoring of disease progression, and identifying the mechanism of drug resistance. On behalf of the Tumor Biomarker Committee of the Chinese Society of Clinical Oncology (CSCO) and the China Actionable Genome Consortium (CAGC), the present expert group hereby proposes advisory guidelines on clinical applications of NGS technology for the analysis of cancer driver genes for precision cancer therapy. This group comprises laboratory cancer geneticists, clinical oncologists, bioinformaticians, pathologists, and other professionals. After multiple rounds of discussions and revisions, the expert group has reached a preliminary consensus on the need for NGS in clinical diagnosis, its regulation, and compliance standards in clinical sample collection. Moreover, it has prepared NGS criteria, the sequencing standard operating procedure (SOP), and guidance on data analysis, reporting, and NGS platform certification and validation.


Keywords: Next-generation sequencing technology; cancer; consensus

Year: 2019 PMID: 31119060 PMCID: PMC6528448 DOI: 10.20892/j.issn.2095-3941.2018.0142

Source DB: PubMed Journal: Cancer Biol Med ISSN: 2095-3941 Impact factor: 4.248

Introduction

Precision medicine, also referred to as personalized medicine, is an emerging concept in healthcare that adapts therapy based on the genome, physiology, and lifestyle of the patient. This method of disease management requires the incorporation of clinical and genetic data, enabling clinicians to provide an efficient treatment course and attain a significant outcome[1,2]. In the present era of personalized oncology therapy, a comprehensive analysis of cancer driver genes and their mutations underlying the pathophysiology of cancer development is crucial for designing the most appropriate treatment strategy for patients using targeted therapies[3]. Although Sanger sequencing[4] is still used as a gold standard in clinical applications, the more recently developed second-generation technology, namely next-generation sequencing (NGS), has largely superseded it owing to its simpler usage and functional advantages[5-7]. Over the last decade, economic growth and changing lifestyles in China have led to an increase in the incidence of cancer and cancer-related mortality[8]. Research is continuing in the area of molecular targeted therapy, which is currently considered a promising treatment. China has supported research initiatives directed towards progress in precision medicine, considering it a new phase in health care. On September 17th, 2015, the Chinese Society of Clinical Oncology (CSCO) convened a group of renowned oncologists, pathologists, biologists, and NGS experts in Xiamen to set up the China Actionable Genome Consortium (CAGC). The CAGC announced the start of the precision oncology initiative (CAGC-POI), and the framework and goals of this organization were laid out. According to the 2015 Chinese cancer report, the most commonly diagnosed cancers in China are carcinomas of the lung and bronchus, stomach, esophagus, liver, colorectum, and breast, in the order listed. 
The overall national incidence of all cancer types in 2015 was approximately 4.29 million, and cancer-related mortality was 2.81 million people[8]. Therefore, there is a significant medical need for precision oncology for diagnosis and treatment in China. Under the preliminary consensus on the clinical application of NGS, the CAGC-POI project stage I aims to utilize NGS technology to profile cancer driver genes for five malignant solid tumors, including lung cancer, breast cancer, hepatocellular carcinoma, gastric cancer, and colorectal cancer. On September 27th 2015, hematopoietic malignancy was included as the 6th tumor type after discussions between CSCO Chairman Professor Yi-long Wu, Academician Zhu Chen (President of Chinese Medical Association), and Academician Saijuan Chen, aiming to perform a more comprehensive mutational atlas study on leukemia. Under the consensus, the CAGC-POI will determine a set of optimized standard operating procedures (SOPs) and mutational atlases of driver gene mutations across these six cancers. This will guide the recruitment of patients with such mutations into precision medicine-related clinical trials. In the final version of the consensus, standards and compliance rules suitable for clinical oncology in China regarding NGS techniques, analysis tools, along with diagnosis and treatment models will be put forward, which will further fill the gap between NGS laboratory and clinical application, ultimately improving the health of cancer patients. The goal of CAGC-POI stage I is to establish a consensus on the application of NGS technology in clinical oncology, based on the advice and opinions from the expert group, which would serve as a guideline for oncology-related clinical and personal testing in the near future. Currently, there is an urgent need for a consensus based on the discussions from expert professionals in all related fields to guide the application of NGS in cancer diagnosis and precision medicine.

Overview of NGS technology

NGS technology, also referred to as massively parallel sequencing (MPS), is a parallel sequencing technology that can sequence billions of DNA base pairs in a single run and can be applied to samples obtained from patients with cancer. NGS for cancer samples can range from targeted gene panels analyzing a few thousand bases to whole-exome sequencing (WES) and whole-genome sequencing (WGS) analyzing 40 to 50 million bases and 3.3 billion bases, respectively. Because they cover a much smaller region, targeted panels achieve a greater depth of coverage of the selected exons than WES and WGS. Targeted panel sequencing detects variants only in the specific cancer-related genes included in the panel, while WES and WGS can identify unknown variants along with known mutations. WGS provides the most unbiased examination of the cancer genome, thereby paving the way for the discovery of previously unrecognized mutations[9]. Presently, NGS of targeted gene panels is very common in clinical practice for the purpose of finding targetable genomic alterations, whereas whole-exome and whole-genome sequencing are mainly used for research purposes. Gene panels larger than one megabase (one million bases) are under extensive validation for tumor mutational burden (TMB) estimation, aiming to serve as a surrogate for WES-based TMB analysis. In the future, WES might be validated for identifying functional neoantigens or other clinical biomarkers. Although NGS technology has the advantage of a high-throughput workflow and can discover hitherto unknown mutations[10,11], it also faces many challenges in technology, data management, data analysis, interpretation, reporting, and genetic counseling. 
Many publications and international consensus have reported the application of NGS in single-gene inheritance diseases and prenatal diagnostics[12-15]; however, there has been no consensus on the application of NGS technology in clinical oncological practice, especially when used in cancer precision diagnosis and treatment to date[16]. With the widespread scope of tumor gene testing, NGS for precision diagnostics has gradually progressed from single-gene analysis to the profiling of several hundreds of genes. It is possible in the future that integrated information from the exome, transcriptome, whole genome, and epigenome will be adopted into clinical practice. Furthermore, as biotechnology is also evolving, it is foreseeable that the application of high-throughput NGS in clinical diagnosis and precision treatment will change perpetually in terms of testing technology, analysis tools, variant interpretation, etc., in the future, which will also bring various challenges to the practical application. This technical consensus describes quality requirements, clinical tumor-related NGS testing content, sample processing, sequencing procedures, data management, informatics analysis, interpretation of reports, and consulting. This consensus also describes patient-informed consent and quality control of NGS tests, and adds a note on the differences in research and diagnostic NGS.
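
The target sizes quoted in this section imply very different sequencing depths for a fixed amount of sequencing data. The following is a rough sketch of that arithmetic, not part of the consensus itself: the function name `mean_depth`, the 1 Mb panel size, and the 9 Gb run output are illustrative assumptions, while the WES and WGS sizes follow the figures given above.

```python
# Rough mean-depth arithmetic: depth ~= usable sequenced bases / target size.
# The panel size and run output below are illustrative assumptions.

def mean_depth(total_sequenced_bases: float,
               target_size_bases: float,
               on_target_fraction: float = 1.0) -> float:
    """Approximate mean depth of coverage over a target region."""
    if target_size_bases <= 0:
        raise ValueError("target size must be positive")
    if not 0.0 <= on_target_fraction <= 1.0:
        raise ValueError("on-target fraction must be between 0 and 1")
    return total_sequenced_bases * on_target_fraction / target_size_bases


TARGET_SIZES = {
    "targeted panel (assumed 1 Mb)": 1e6,
    "WES (about 45 Mb)": 45e6,
    "WGS (about 3.3 Gb)": 3.3e9,
}
RUN_OUTPUT_BASES = 9e9  # hypothetical 9 Gb of sequence for one sample

if __name__ == "__main__":
    for name, size in TARGET_SIZES.items():
        depth = mean_depth(RUN_OUTPUT_BASES, size)
        print(f"{name}: roughly {depth:,.0f}x mean depth")
```

Under these assumptions the same amount of data spread over a small panel yields a depth several orders of magnitude higher than over the whole genome, which is why panels are better suited to detecting low-frequency variants.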

Quality requirements of NGS in cancer diagnosis

With evolving technology, it is of crucial importance to understand the current status of NGS technology in order to ensure a superior survival outcome and guarantee the well-being of the patient. A variety of NGS platforms are being used in different clinical laboratories. All laboratories, regardless of the platform used, should conduct NGS tests in accordance with the recommendations in this consensus and use them to validate the platform and associated analytical tools for clinical use. Quality management (QM) plays a pivotal role in the standardization of NGS workflow by providing basic guidelines to ensure an advanced reproducibility of data and high turnover with reduced cost. Quality documentation containing procedural instructions and verified documents is the preliminary requirement to a good standardization method[17], which improves the transparency and reliability of the results. One of the most important criteria for QM is quality assurance (QA). The QA program provides quality control (QC) methods for the predetermined checkpoints, such as contamination identification including initial sample check, fragmentation, library evaluation, error rate monitoring, and data analysis. These methods assist to confirm the formerly established performance status of a sample and indicate an error in case of any change in the status. The aforementioned QC features ensure that no sequence or sample data is used in the testing without meeting the established laboratory quality standards, and the QA procedure minimizes the risk of errors due to contamination. This is of central importance, especially in the case of a high or unlikely false-positive rate in assays or in detection of unknown agents[18,19]. Many professional organizations have recommended the use of reference standards to minimize errors of inappropriate analysis leading to misdiagnosis by reducing bias associated with any method[18-25]. 
These reference standards reflect a wide range of genomic features assessed by the NGS assay, advancing precision and reducing the systematic sequencing errors[26]. In case of multigene detection of tumors, a specific NGS test needs to be designed to clearly describe the gene status required for clinical diagnosis and treatment. NGS tests should be fully validated in terms of analysis and technological capabilities, before being used in clinical practice. Several factors affect the quality of NGS results: platform, target region enrichment, library preparation, amplification efficiency, sequencing data volume, bioinformatics analysis pipeline, etc. NGS testing results for specific mutation sites that have low data quality should be validated by other methods before clinical application. Experts performing the testing in laboratories and consultants should have adequate communication channels and informed consent from patients regarding the advantages and disadvantages of conducting a tumor NGS test. Simultaneously, clinicians should understand the medical implications of specific NGS tests.

Recommendations

(1) Standardization of the NGS assay with complete validation is necessary for its application in clinical practice for diagnosis of cancer driver gene mutations, in order to meet the clinical diagnostic standards. (2) Laboratories should determine the content of the assay with relevant technical parameters and also specify the purpose and utility of the NGS testing. (3) When used as a reference for deciding on targeted therapy, the NGS test results should identify the gene variants that can be targeted by a particular drug. When used for molecular classification, the analytical model needs to be validated before it is applied to predict treatment efficacy and prognosis.

NGS test content in clinical oncology

In routine clinical practice, along with the differential diagnosis suggested by the physician, a genetic test is recommended so as to obtain a confirmatory diagnosis with the laboratory reporting the wide range of genes analyzed, test methods used, and the performance parameters of every test. Therefore, for higher sensitivity and specificity, the initial testing is done with disease-targeted panels based on the relevant gene regions related to the disease. The referring physicians are recommended to provide detailed phenotypic data, such that the data can aid the laboratory in interpreting the results. In oncology, since a wide range of genomic variants exist within a single tumor type, or similar driver gene mutations are found among different kinds of tumors, the therapeutic strategies could be different for a single tumor, or a similar treatment could be applied to different tumor types[27,28]. Thus, molecular classification of tumors based on specific genomic characteristics may move beyond traditional histopathological categorization and become the key step in cancer diagnosis and treatment[29]. Owing to this genomic phenomenon, elaborate disease panel testing is recommended to track the disease phenotype and correlate it with the genotype. Only those genes with clear scientific evidence of clinical relevance should be included in the disease-targeted gene panel. In case of overlapping phenotype, laboratories should consult with the physicians to restrict the analysis to a subpanel related to sub-phenotype to reduce the number of less relevant variants. Further, during exome and genome sequencing, the laboratories should be aware of the commercially available reagents and refractory areas in the experiment design[19]. In recent years, multi-genotyping of tumors has shown to have a significant impact on drug development strategies with two new designs of targeted-therapy clinical trials: umbrella trials and basket trials. 
The former involves the same histological cancer with different driver gene mutations that could be potentially treated with different targeted drugs; the latter involves the same genomic mutation found in different histological tumors that could potentially be treated with a single targeted drug[30,31]. In both clinical practice and clinical trials, the content of an NGS test for genotyping requires meticulous design and verification of technical reliability. This content, including candidate genes, targeted testing regions, and actionable variants, should be well defined within the NGS test. The establishment and development of the “core gene list” should be patient-oriented, fully integrated with the precision medicine concept from clinical oncologists, and combined with proper application and operability. In lung cancer, the current National Comprehensive Cancer Network (NCCN), Domestic Lung Cancer Clinical Guidelines and National Health and Family Planning Commission Diagnosis and Treatment Norms suggest that some driver gene variants, including EGFR mutations, KRAS mutations, ALK rearrangements (fusion), ROS1 rearrangements (fusion), MET exon 14 skipping (splice) variants, MET copy number amplification, RET rearrangements (fusion), HER2 mutations or amplification, and BRAF mutations, are the essential parts of the “core gene list”[32]. The other core genes include HER2, BRCA1, BRCA2, ESR1, and other genes in breast cancer[33]; HER2, MSI-related genes such as MLH1, MSH2, MSH6, and PMS2, PDL1, SMAD4, STK11, APC, and other genes in gastric cancer[34]; TP53, IDH1, IDH2, FGF, and KRAS in hepatocellular carcinoma[35,36]; and KRAS, NRAS, BRAF, and MSI-related genes such as MLH1, MSH2, MSH6, and PMS2 in colorectal and several other cancers[37]. For hematological malignancies, BCR-ABL, PML-RARA, IDH1, KIT, FLT3, MYC, STAT3, STAT5B, and other potentially actionable or targetable genes should be included in the NGS test. 
From the perspective of clinical oncology in the CAGC-POI project, test content (the gene panel and its genomic region) should include not only tumor-specific gene panels, but also a large and comprehensive gene panel. Moreover, because gene mutation profiles of solid tumors and hematologic malignancies are significantly different, comprehensive gene panels for these two groups of malignancies should be designed separately. “Actionable genes” refer to driver genes and their associated upstream and downstream regulators against which a therapeutic drug may be directed in order to inhibit tumor progression or induce tumor regression. In addition, actionable gene variations can also be used to determine molecular subtypes in diagnosis or prognosis. The first category of “actionable gene” is referred to as Level 1 actionable alterations, which includes clinical genomic variants published in various international and domestic guidelines (CSCO, ASCO, NCCN, ESMO, etc.), printed in drug labels, related to indications approved by the US Food and Drug Administration (FDA) or the China Food and Drug Administration (CFDA), and having definite molecular diagnostic or prognostic value. The second category, described as Level 2 actionable alterations, includes drug-related targets in ongoing phase 1–3 clinical trials, mutations used as inclusion criteria in current or upcoming clinical trials, drug targets of other tumors or indications in international and domestic guidelines, and targets for adjuvant diagnosis or prognosis. In the NGS test, direct analysis of tumor specimens without a leukocyte DNA control is accepted for Level 1, while a matched leukocyte DNA control is strongly recommended for the larger gene sets of Level 2 actionable alterations, or for large groups of targets ranging from hundreds of genes to the whole exome. 

Recommendations

(4) A core gene panel list should be applied that clearly specifies the genes that need to be included in clinical molecular diagnostics; it should be drawn up after joint discussion by multidisciplinary experts, including clinical oncologists from specific subspecialties and laboratory technologists. (5) Further, “actionable gene” variations may affect clinical decisions and thus guide clinicians in diagnosis and in planning treatment regimens based on the genotyping results. As studies on the mechanisms of tumorigenesis continue, it is necessary to discuss and update the gene list regularly based on the progress of important clinical research. Comprehensive NGS tests on more genes will certainly provide more valuable clinical information. (6) Due to the redundancy and complexity of tumor signaling pathways, annotations depicting genomic alterations and potential actionable sites, or signaling pathways, should be clearly described for each gene. In practice, an increase in the number of tested genes should not come at the expense of the quality of core gene testing.

Verification of technical parameters is vital to ensure the accuracy of NGS testing results

All NGS measurement parameters should be explicitly stated. The batch and relevant characteristics of analysis specimens to be assayed should be recorded. Testing laboratories have the responsibility of maintaining a structured database to ensure the integrity of the testing platform, branch projects, and sample processing. Each sample should receive a unique identifier during the testing and analysis. During the verification of platform stability, all instruments and reagents should meet quality standards to ensure high accuracy and precision in the investigation of specific driver genes. The following relevant parameters related to the technical verification process should be recorded: (i) sample-related patient information, unique identifiers, time and date of reception, storage, and processing; (ii) name, manufacturer, and batch number of reagents used in DNA/RNA processing and quality indicators (such as RIN, etc.); (iii) data analysis- and database management-related software (version number, analysis parameters), data-related parameters (data size, QC, etc.), database security, and backup status[18,23,38-43].

Recommendations

(7) The NGS testing process and bioinformatics analysis pipeline should be fully optimized for specific platforms and tests, with pipeline validation providing parameters for their analytical sensitivity and specificity. (8) Analytical sensitivities and specificities should be established for each type of variant during pipeline validation. This is due to the differences in analytical methods among variation types including single-nucleotide variants (SNVs), indels (insertions and deletions), rearrangements (fusions), and copy number variations (CNVs). Moreover, limitations in sampling or probe design may constrain the detection of certain variation types, such as CNVs. (9) In cases where the accuracy of an NGS test may be affected, a reliable single-gene testing approach should be considered to validate the NGS result.
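
The per-variant-type validation described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not a prescribed method: the class `ValidationCounts`, the function names, and every count in `VALIDATION` are hypothetical, and the formulas are the standard definitions of sensitivity (TP / (TP + FN)) and specificity (TN / (TN + FP)) computed against a reference standard.

```python
# Per-variant-type analytical sensitivity and specificity from validation counts.
# All counts below are made-up placeholders, not data from the consensus.
from dataclasses import dataclass
from typing import Dict


@dataclass
class ValidationCounts:
    tp: int  # reference-positive sites called positive
    fp: int  # reference-negative sites called positive
    tn: int  # reference-negative sites called negative
    fn: int  # reference-positive sites called negative


def sensitivity(c: ValidationCounts) -> float:
    positives = c.tp + c.fn
    return c.tp / positives if positives else float("nan")


def specificity(c: ValidationCounts) -> float:
    negatives = c.tn + c.fp
    return c.tn / negatives if negatives else float("nan")


# Hypothetical results of one pipeline validation run, kept separate per
# variant type because detection methods differ between types.
VALIDATION: Dict[str, ValidationCounts] = {
    "SNV": ValidationCounts(tp=98, fp=1, tn=899, fn=2),
    "indel": ValidationCounts(tp=45, fp=2, tn=448, fn=5),
    "fusion": ValidationCounts(tp=18, fp=0, tn=80, fn=2),
    "CNV": ValidationCounts(tp=16, fp=3, tn=77, fn=4),
}

if __name__ == "__main__":
    for variant_type, counts in VALIDATION.items():
        print(f"{variant_type}: sensitivity={sensitivity(counts):.3f}, "
              f"specificity={specificity(counts):.3f}")
```

Reporting the two figures separately for each variant type makes it visible when, for example, CNV detection performs worse than SNV detection on the same platform.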

Sample preparation

A crucial step in the NGS test is library construction with DNA or with cDNA that is reverse-transcribed from RNA. Although fresh tissue specimens are generally preferred for clinical NGS testing, formalin-fixed paraffin-embedded (FFPE) specimens, tumor cytology specimens, and plasma (cell-free DNA/RNA) can also be used in NGS tests[44,45]. NGS tests are performed on clinical specimens collected at critical time points during disease management. A multidisciplinary discussion should be undertaken and fully informed consent should be obtained to allow the use of NGS tests at multiple time points during treatment[44]. The following methods are used for sample collection: (i) Fresh frozen tissues from surgery and biopsy: Snap-freezing in liquid nitrogen is the ideal preservation method for fresh tissues. Fresh tissues should be placed in liquid nitrogen or in a –80 °C freezer within 30 minutes after removal from the body to prevent the degradation of nucleic acids such as RNA. Alternatively, fresh tissues can be stored in preservative reagents and transferred to a –80 °C freezer as soon as possible. The tumor content can be determined by frozen section staining[46]; (ii) FFPE tissues: Handling of tissues should be performed according to standard pathological operating procedures. Tissues should be fixed in 10% neutral formalin solution within 30 minutes after excision, avoiding fixatives that contain acids or heavy metal ions. Large specimens should be trimmed to proper sizes and fixed for 6 to 48 hours, but not longer than 72 hours. Biopsy specimens should be fixed for 6 to 12 hours, and the tumor content should be determined by H&E staining before NGS testing[47-49]; (iii) Cytology samples: the presence of tumor cells in pleural effusion and ascites samples must be confirmed before NGS testing. 
Samples meeting quality requirements by cytopathological examination can undergo nucleic acid extraction directly, or FFPE preparation for future use[50,51]; (iv) Plasma or blood: cell-free/circulating tumor DNA (cf/ctDNA) is often found in plasma, and the proportion of tumor-derived DNA varies dramatically among distinct types and stages of cancers. Evidence indicates that ctDNA assays provide more reliable results at the time of disease progression. However, there is no evidence of clinical utility to suggest that ctDNA assays are useful for diagnosis of early-stage cancer or screening[52]. EDTA anticoagulation vacuum collection tubes can be used in blood sampling for cfDNA preparation. About 8–10 mL of whole blood should be collected and transported under refrigerated conditions, plasma should be isolated within 2 hours after collection, and cfDNA should then be extracted. cfDNA should be stored at –80 °C, avoiding repeated freeze-thaw cycles. If peripheral blood samples need to be transported over a long period, commercial cfDNA collection tubes, which can preserve cfDNA for 3 to 7 days at room temperature, can be used instead. The large amount of genomic DNA (gDNA) released from blood cells dramatically diminishes the relative frequency of ctDNA. Therefore, blood samples with obvious signs of hemolysis are not suitable for ctDNA NGS testing. cfDNA samples with suspected gDNA contamination should first be assessed by size distribution analysis of nucleic acid fragments before NGS testing[45]. The evaluation of sample quality has a profound impact on the interpretation of results. Therefore, it is essential to keep a careful record of sample collection circumstances, transportation details, processing before analysis, and pathological assessment. The evaluation of sample integrity should include the tumor content, the number of tumor cells, and whether processing and transportation were carried out according to SOPs. 
Evaluation methods include visual inspection, microscopic observation, and nucleic acid concentration analysis of plasma samples. If tumor-cell–enriched regions are marked for nucleic acid extraction, then microdissection could be carried out whenever required[45,46].

Recommendations

(10) Prevention of contamination in sample collection and processing: (a) The use of disposable materials for sample collection is recommended. (b) Any residual samples left on instruments from previous operations must be cleared away, and single-use blades must be changed between samples from different patients[45]. (11) Sample transportation and storage: (a) The laboratory that performs the testing must establish detailed sample shipping SOPs, provide a sample collection manual for clinicians, and require logistics personnel to complete relevant records to ensure a safe and trackable shipment. FFPE samples can be transported at room temperature, plasma samples on dry ice, and nucleic acid samples should be transported at 4 °C or under freezing conditions[45]. (b) Because clinical sample quality is crucial to the accuracy of clinical NGS tests for tumor driver genes, we recommend that the collection and processing of all types of biological samples should have SOPs. (c) Each sample undergoing NGS testing should have a unique identifier; this applies to different tissue lesions, sections, and nucleic acid samples from the same patient. (d) Except for DNA/RNA from plasma or blood samples, specimens with assessable morphology should be examined for tumor content under the microscope, and the proportion of morphologically malignant cells should be recorded. (e) In general, tissue samples with tumor contents >20% are suitable for NGS testing; if the tumor content is lower and the patient is still willing to have NGS testing performed after being informed, microdissection techniques should be applied to enrich tumor cells as much as possible, or the NGS sequencing data volume should be increased to improve sequencing depth. 
In this situation, sample limitations should be mentioned in the report. More than 50 ng of DNA obtained from tissue samples, or cfDNA obtained from at least 8 mL of whole blood, are ideal input quantities for NGS testing. (f) QC such as nucleic acid concentration and purity analysis on tissue DNA and plasma cfDNA should be performed before NGS testing. (g) Tissue DNA should be analyzed for nucleic acid integrity to determine its quality. (h) Plasma cfDNA should be analyzed for the distribution of fragment lengths to exclude the presence of blood gDNA contamination. (i) QC for concentration and fragment distribution of the library should be performed after library construction. (j) The testing laboratory should consider storing additional biological samples for result validation by other technical methods. When collecting FFPE sections, it may also consider reserving several additional sections for subsequent validation by other methods such as IHC, FISH, and PCR.
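
The numeric thresholds in the recommendations above (tumor content >20%, more than 50 ng of tissue DNA, at least 8 mL of whole blood, no obvious hemolysis) can be expressed as a simple pre-analytical check. The sketch below is only an illustration: the `Sample` record, its field names, and the wording of the flags are hypothetical, and a real laboratory would encode its own validated SOP.

```python
# Pre-analytical sample check based on the thresholds quoted in this section.
# The Sample structure and flag messages are illustrative assumptions.
from dataclasses import dataclass
from typing import List, Optional

MIN_TUMOR_CONTENT = 0.20   # tissue: tumor content should exceed 20%
MIN_TISSUE_DNA_NG = 50.0   # tissue: more than 50 ng of DNA is ideal
MIN_BLOOD_ML = 8.0         # plasma: at least 8 mL of whole blood is ideal


@dataclass
class Sample:
    sample_id: str                      # unique identifier for this specimen
    kind: str                           # "tissue" or "plasma"
    tumor_content: Optional[float] = None
    dna_ng: Optional[float] = None
    blood_ml: Optional[float] = None
    hemolyzed: bool = False


def qc_flags(sample: Sample) -> List[str]:
    """Return a list of limitations; an empty list means no flag was raised."""
    flags: List[str] = []
    if sample.kind == "tissue":
        if sample.tumor_content is None or sample.tumor_content <= MIN_TUMOR_CONTENT:
            flags.append("tumor content not above 20%: consider microdissection "
                         "or deeper sequencing, and state the limitation in the report")
        if sample.dna_ng is None or sample.dna_ng <= MIN_TISSUE_DNA_NG:
            flags.append("DNA input not above 50 ng")
    elif sample.kind == "plasma":
        if sample.blood_ml is None or sample.blood_ml < MIN_BLOOD_ML:
            flags.append("less than 8 mL of whole blood collected")
        if sample.hemolyzed:
            flags.append("visible hemolysis: not suitable for ctDNA testing")
    else:
        raise ValueError(f"unknown sample kind: {sample.kind!r}")
    return flags


if __name__ == "__main__":
    for s in (Sample("T-001", "tissue", tumor_content=0.35, dna_ng=120.0),
              Sample("P-001", "plasma", blood_ml=6.0, hemolyzed=True)):
        print(s.sample_id, qc_flags(s) or "no flags")
```

Returning a list of flags rather than a single pass/fail value keeps the reasons available for the report, where sample limitations are supposed to be mentioned.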

NGS workflow

NGS testing is at present mainly used for driver gene sequencing in clinical cancer practice, which represents an important aspect of precise diagnosis and treatment of tumors. The advantages of NGS technology include its high-throughput workflow, mutation frequency analysis, and relatively low cost. The disadvantages include the lack of bioinformatics analysis software that meets clinical needs, the dependence of genotype interpretation on the accuracy of bioinformatics results, and the challenge of achieving sufficient sequencing depth due to high overall cost[5,53-56]. In terms of the application of NGS technology in cancer diagnosis and disease monitoring, laboratories should pay attention to reagent management, sample QA/QC, testing environment, personnel qualification and proficiency, and NGS standards for technical parameters (including library construction, sequencing process, and QC). SOPs should be applied to every NGS test to ensure robust and accurate NGS testing for specific tumors. All aspects of sequencing technology should have standard laboratory records to establish a structured database for management, which can track platform environmental parameters, reagent usage and performance, sample tracking, and quality control data. Laboratories must ensure the accuracy of testing by daily and periodic QC procedures[5]. In order to meet clinical needs at the highest quality, the testing laboratory must provide a reasonable backup solution for each step of the workflow that is found to be abnormal or inadequate. Backup solutions could involve adopting another molecular diagnostic platform to detect important sites, validating positive results using other technologies, repeating the high-throughput sequencing process, or requesting new specimens, if this is a possibility. 

Recommendations

(12) NGS testing should follow strict laboratory environment guidelines: all types of sample storage, nucleic acid extraction and processing, and NGS testing should be completed in a certified clinical gene amplification laboratory. (13) Reagents should be stored separately to prevent cross contamination, and the performance of each batch of reagents, including self-made reagents, probes, and commercialized reagents, should be checked for quality before being applied to clinical sample testing. (14) Each NGS test should be validated before its use in clinical tumor testing, and the performance of the diagnostic test must be evaluated in terms of accuracy, sensitivity, specificity, and precision[57]. (15) If any major changes are made in the reagent or procedure, these quality parameters must be re-evaluated. (16) When updated or upgraded tests need re-validation, the testing laboratory should clearly define the specimen types and the number of cases to be assayed. It is highly recommended that laboratories take part in NGS-related external quality assessment (EQA) to evaluate testing performance and quality. (17) The workflow of an NGS test includes sample QC/QA, sample preprocessing, adaptor ligation, pre-amplification, target capturing of a gene panel, target purification, library amplification, library quantification, and sequencing on NGS instruments. The laboratories should establish SOPs to guide the operation and quality of each step. (18) If the amplicon-based library preparation method is used, the laboratory must establish the appropriate SOP for quality control management. The technical standardization of NGS should include the technical standardization of the above-mentioned steps, including sample preprocessing, library construction, sequencing, and quality control, as well as the standardization of generation and management of raw data and the process for bioinformatics analysis. 
Therefore, CAGC will establish a set of NGS SOPs on the basis of feasibility verification in the next phase. (19) Staffing and proficiency should meet practical needs. The laboratory should have dedicated quality control personnel, and NGS experimental technicians should receive adequate training to achieve a certain degree of proficiency. Each platform should be operated by full-time technicians. (20) The core gene list and the corresponding reference standards will be used for quality control comparison among laboratories, and a scoring system should be established based on the results of driver gene mutation detections and coverage of the target region. This scoring system will be used for analysis and certification of NGS testing capabilities between laboratories.

NGS data generation, management, and bioinformatic analysis

Due to the tremendous volumes of NGS data, powerful computing platforms are needed to support data management, storage, and analysis. SOPs for data QC, bioinformatics analysis, I/O format, storage interfaces, and many other fields should be in place. A structured database is needed to manage raw data, quality control data, and results[53,55,58-63]. (21) NGS analysis tools must be validated with an adequate amount of raw data covering different variation types and variant allele frequencies (VAFs) to establish the accuracy and stability of the complete analysis pipeline. (22) The reliability of the analysis pipeline needs validation when more testing genes are included, and the process should include the details of software development and operation records. CSCO CAGC will also organize projects based on proficient validation of software analysis. (23) The testing laboratories must use structured databases to manage diverse kinds of variants including SNPs, indels, rearrangements (fusions), and CNVs, with the data storage adopting common formats such as FASTQ, BAM, and VCF for data exchange and external evaluation. (24) Complete logs should be recorded and saved for distinguishing pipeline version information, tracing sources of abnormal results, and analyzing reproducibility of raw data generated for diagnostic reports. Diagnostic laboratories should keep such databases for at least 15 years. (25) The quality inspection of NGS raw data should be guided by strict operating procedures, with all parameters compared with those of the testing performance evaluation process. The acceptance and rejection criteria should be determined and executed. 
(26) The data analysis process should include pre-analysis, adaptor removal, primer removal, low-quality sequence removal, mapping to a reference genome (alignment), duplicate removal, indel realignment, base quality score recalibration, variant calling, annotation, filtering, and output steps. The CAGC-POI project requires a common procedure across different kinds of tumors. Genetic variants can be germline or somatic. Somatic events, such as acquired mutations, are the most commonly reported variants in cancer genomic studies. However, there is evidence for the role of inherited genetic variations in cancer risk, pharmacogenomics, and gene regulation[64]. The interpretation of variations in germline sequences is based mainly on the pathogenicity of a variant for a specific disease, while the interpretation of somatic variants is based on their impact on clinical care[65]. According to the joint consensus issued by the Association for Molecular Pathology, American Society of Clinical Oncology, and College of American Pathologists, clinical judgment should be used in the absence of normal paired sequencing data to differentiate somatic and germline variants, since many germline pathogenic variants, such as those in TP53, PTEN, and BRCA1/2, also occur as acquired somatic variants. Hence, sequencing of non-tumor tissue should be performed to establish germline status definitively[65]. In routine NGS gene panel testing without matched non-tumor tissue or blood genomic DNA as a control, an alternative method using public SNP databases and/or laboratory baseline cohort SNP data for filtering the germline variants is recommended. We recommend a clear discrimination between somatic and germline variants, with somatic variants being more clinically significant in tumor diagnosis and treatment. 
The biological significance of specific germline variants should be annotated, e.g., BRCA1/2 in breast cancer. Clinically related variants in each tumor type should be highlighted with detailed annotations in test reports. Examples in lung cancer include EGFR mutations in exons 18 to 21, ALK rearrangement, ROS1 rearrangement, MET exon 14 skipping, HER2 (ERBB2) insertion, and copy number amplification. Recent studies regarding molecular heterogeneity of tumors have shown that tumor subclones may contain different molecular variants. The presence of heterogeneous subclones affects the overall evolution of the tumor and its response and resistance to targeted therapies. In order to understand the relationship between heterogeneity and clinical outcomes in metastatic stages, the sequencing depth of NGS should be deep enough for both clinical tumor diagnosis and monitoring of drug resistance[66,67]. CAGC recommends that the “effective depth” of an NGS test should be ≥ 500× for tumor samples and ≥ 2,000× for plasma cfDNA. The data management of CAGC projects should follow standardized processes. The consensus will be verified in subsequent analyses of 100 samples for each tumor type.
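
The depth recommendation above can be expressed as a simple acceptance check. The following is a minimal sketch, not part of the consensus itself: the two thresholds come from the text, while the sample-type labels and the function name are assumptions made for illustration.

```python
# Minimum "effective depth" per specimen type, taken from the recommendation
# above. The dictionary keys are hypothetical labels chosen for this sketch.
MIN_EFFECTIVE_DEPTH = {
    "tumor_tissue": 500,
    "plasma_cfdna": 2000,
}


def meets_depth_requirement(sample_type: str, effective_depth: float) -> bool:
    """Return True if the effective depth reaches the recommended minimum."""
    if sample_type not in MIN_EFFECTIVE_DEPTH:
        raise ValueError(f"unknown sample type: {sample_type!r}")
    return effective_depth >= MIN_EFFECTIVE_DEPTH[sample_type]
```

A laboratory would apply such a check per sample before variant interpretation; how "effective depth" is computed (for example, after duplicate removal) is left to the laboratory's own SOP.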

NGS reporting and consulting

Because NGS generates large and complex volumes of information, the test report should follow these rules: a concise front page with the essential content, clear results, explicit interpretation of genomic alterations, and sufficient supporting evidence[19,20,53,65,68,69]. The target range of driver genes should be clarified before NGS testing and included in the information provided to clinicians in the report. NGS reports should be concise and clear. Results with clinical significance should be prioritized, for example an EGFR L858R mutation with a mutant allele fraction (MAF) of 15.6%. Basic information such as patient identity, clinical diagnosis, and a brief description of the testing technology should be included on the front page of the report. The report may contain supplementary material depending on the situation. The “core gene” testing results for each tumor type must be clearly reported. Other genes may be summarized based on these results, and specific reports may be made for particular mutations (e.g., core mutations found in other tumors). It is recommended that variants be classified into one of the following categories: “pathogenic, likely pathogenic, uncertain significance, likely benign, benign.” Furthermore, it is strongly recommended that variants of unknown significance (VUS) or unclassified variants (UVs) be recorded and shared among peers. 
Genetic variant databases with valid scientific evidence should be used for variant classification, and these databases must meet the requirements of good quality or the latest guidance[70]. This will facilitate future research in pinpointing their clinical value[65]. Following the development of many NGS studies, the FDA has approved several NGS-based multigene diagnostic services, including Oncomine Dx from Thermo Fisher, Integrated Mutation Profiling of Actionable Cancer Targets (IMPACT) from Memorial Sloan Kettering (MSK), and FoundationOne CDx from Foundation Medicine, Inc. (FMI)[71]. FMI reports not only SNPs, indels, rearrangements, and copy number alterations (CNAs) but also microsatellite instability (MSI) and tumor mutation burden (TMB). Further studies and validations are needed to apply these biomarkers to the Chinese population. It is recommended to integrate NGS data from different types of tumors into the same database, which will be helpful for the referencing of variation information in different tumor types for clinical applications. Therefore, we recommend that the format and content of the report be standardized, while providing enough flexibility to adapt to short-term developments (updates regarding actionable genes and testing content). Appendix 1 is an example of a clinical tumor NGS test report. The report should include patient ID, disease information, sample information, testing methodology information, main results, interpretation of results, and the signature of the operator and the lab expert (molecular oncologist or genetics specialist). The reportable range should be based on NGS testing purposes and qualified detection regions. For example, it is not necessary to report all the genomic alterations found by whole-exome sequencing (WES) for lung cancer. However, it is necessary to inform clinicians that there may be potential “non-attended region” variations. 
The NGS test report should be concise, clear, and understandable, and it should provide the complete information that the clinician needs to decide on the current targeted-therapy strategy. The interpretation of results must be objective and unbiased, and the genomic variants associated with the tumor should be clearly described. The report should not only describe the curative or predictive correlations reported in previous studies, but also state when no treatment can be recommended. Only clinicians or the multidisciplinary team (MDT) can make clinical decisions based on the NGS testing results. Multidisciplinary team discussions are required for genetic counseling, and multiparty communication among experts in laboratory medicine, cancer genetics, molecular biology, bioinformatics, and pathology is essential for the final clinical decision.
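
The reporting rules above lend themselves to an automated completeness check before a report is signed out. The sketch below is illustrative only: the required sections and the five classification categories are taken from the text, whereas the field names, the dictionary-based report representation, and both function names are assumptions.

```python
# Sections the text says every report should include. The identifiers are
# hypothetical field names chosen for this sketch.
REQUIRED_SECTIONS = (
    "patient_id",
    "disease_information",
    "sample_information",
    "testing_methodology",
    "main_results",
    "interpretation",
    "operator_signature",
    "lab_expert_signature",
)

# The five recommended variant classification categories.
VARIANT_CLASSES = (
    "pathogenic",
    "likely pathogenic",
    "uncertain significance",
    "likely benign",
    "benign",
)


def missing_sections(report: dict) -> list:
    """List required sections that are absent or empty in a report."""
    return [name for name in REQUIRED_SECTIONS if not report.get(name)]


def is_valid_classification(label: str) -> bool:
    """Check a variant label against the five recommended categories."""
    return label.strip().lower() in VARIANT_CLASSES
```

A report would be released only when `missing_sections` returns an empty list and every reported variant carries a valid classification.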

Informed consent

The informed consent form must contain: (1) The basis for testing, including medical guidelines, expert consensus, treatment norms, and other content, to inform patients of the significance and purpose of the testing; (2) The understanding that the testing process may have limitations, such as a sample volume insufficient for DNA/RNA extraction or test failure due to poor sample quality. If such a situation occurs, the patient and family members should be informed as soon as possible, and next steps should be discussed; (3) Gene sequencing can assist in personalized therapy; however, the final decision must be made by the clinician; (4) Testing results only refer to the corresponding samples submitted; (5) The patient’s ID, clinical information, and test results must be kept confidential by the testing institution. Patients must be aware of the above agreement terms and sign informed consent forms before testing[20,72-74]. (27) NGS tests should be carried out after informed consent, genetic counseling, and explanation to patients of the purpose, advantages, and disadvantages of NGS tests for their disease condition. (28) Clinicians and patients need to be informed of the limitations of the test, including driver gene information, reportable range, and the analytical sensitivity and specificity.

Differences between research and diagnostic NGS tests

Because of rapid technological development and improved experimental capabilities, the boundaries between research NGS and diagnostic NGS have blurred. However, it should be made explicit whether a specific NGS test is used for research or diagnosis in practical applications. The purpose of diagnostic NGS is to answer the question of whether a patient’s tumor DNA contains a specific genetic variation (actionable variation). Normally, only a certified laboratory is qualified to perform diagnostic NGS tests. For hypothesis-driven research purposes, research NGS tests only provide limited or uncertain clinical significance for enrolled patients. The massive quantities of data obtained from WES or WGS not only serve tumor diagnosis, but also generate new research hypotheses[20]. (29) Clinically relevant information from a research NGS test can be recorded in a patient’s medical records only after the results are validated under clinical diagnostic conditions. (30) Both research and diagnostic NGS test results should be managed in databases to facilitate comparison with local, regional, national, or international databases, which will be helpful for clinical classification of variants.

CAGC-POI project NGS test supervision and management

To ensure the same quality and continuity, the CSCO CAGC expert group shall supervise all NGS projects in CAGC-POI clinical practice, clinical trial, and research. This will entail on-site inspections, document management, process monitoring, and external evaluation of testing samples to ensure that every NGS laboratory executing the CAGC-POI projects is at the highest quality standards.

Acknowledgements

This work was supported by grants from Guangdong Provincial Key Lab of Translational Medicine in Lung Cancer (Grant No. 2017B030314120); General Research Project of Guangzhou Science and Technology Bureau (Grant No. 201607010391); National Key Research and Development Program of China (Grant No. 2016YFC1303800); Guangdong Provincial Applied S&T R&D Program (Grant No. 2016B020237006).

Conflict of interest statement

No potential conflicts of interest to disclose.

Original sequences (FASTQ) and aligned sequences (BAM) quality parameters

Parameter	Description
Median base quality for each cycle	Base quality dropped at the end of reads. The average “median base quality” for a batch sequence should not be <20 (Phred quality score)
Duplication rate	Duplication rate reflects the library complexity
Adaptor removal ratio (if applicable)	The ratio of removed adaptor to the reads is an index of sequence quality
Mapping rate	The ratio of reads that are successfully mapped to reference genome
On target rate	The ratio of reads that are mapped to targeted regions
Average sequencing depth on target region	The average sequencing depth for the target regions meeting the clinical needs
Distribution of sequencing depth on target region	Either a distribution plot or a table to indicate the sequencing depth across the target regions meeting the clinical needs
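
The acceptance and rejection criteria for these parameters can be encoded as explicit bounds and checked for every run. The sketch below assumes that the metrics have already been computed by upstream tools. Only the base-quality minimum (Phred 20) comes from the table; every other bound, as well as the metric names, is a placeholder that a laboratory would replace with limits derived from its own validation.

```python
# (minimum, maximum) acceptance bounds per metric; None means "no bound".
# Only the Phred 20 minimum is taken from the table above. All other values
# are placeholders for illustration, not recommendations.
ACCEPTANCE_CRITERIA = {
    "mean_median_base_quality": (20.0, None),
    "duplication_rate": (None, 0.30),  # placeholder
    "mapping_rate": (0.95, None),      # placeholder
    "on_target_rate": (0.60, None),    # placeholder
}


def evaluate_run(metrics: dict) -> list:
    """Return failure messages for a run; an empty list means accepted."""
    failures = []
    for name, (minimum, maximum) in ACCEPTANCE_CRITERIA.items():
        if name not in metrics:
            failures.append(f"{name}: missing")
            continue
        value = metrics[name]
        if minimum is not None and value < minimum:
            failures.append(f"{name}: {value} is below minimum {minimum}")
        if maximum is not None and value > maximum:
            failures.append(f"{name}: {value} is above maximum {maximum}")
    return failures
```

Returning the list of failures rather than a single boolean keeps a record of why a run was rejected, which fits the logging requirement in recommendation (24).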

Variant detection quality parameters

Parameter	Description
Total variant count	Total variant count in target regions meeting the clinical needs should be similar to the same patient population by using the same gene test with the same target regions
Known SNP ratio	In general, the ratio of known SNPs to the total variant count should be >90%
Insertion/deletion (Indel) ratio	The ratio of insertion/deletion to the total variant count
Homozygous variant ratio	The ratio of homozygous variants to the total variant count
Nonsense mutation ratio	The ratio of nonsense mutations to the total variant count
Transition to transversion ratio	The ratio of transition to transversion
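
Several of these ratios follow directly from the list of called variants. As a minimal sketch, assuming each variant is available as a `(ref, alt, is_known_snp)` tuple (a simplified representation chosen for this example, not a VCF parser), the known SNP ratio, indel ratio, and transition to transversion ratio could be computed as follows.

```python
# Purine<->purine and pyrimidine<->pyrimidine substitutions are transitions;
# every other single-base substitution is a transversion.
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def variant_summary(variants):
    """Summarize an iterable of (ref, alt, is_known_snp) tuples.

    Ratios are None when their denominator would be zero.
    """
    total = known = indels = transitions = transversions = 0
    for ref, alt, is_known_snp in variants:
        total += 1
        if is_known_snp:
            known += 1
        if len(ref) == 1 and len(alt) == 1:
            if (ref.upper(), alt.upper()) in TRANSITIONS:
                transitions += 1
            else:
                transversions += 1
        elif len(ref) != len(alt):
            indels += 1
        # Equal-length multi-base substitutions are counted in the total only.
    return {
        "total": total,
        "known_snp_ratio": known / total if total else None,
        "indel_ratio": indels / total if total else None,
        "ti_tv_ratio": transitions / transversions if transversions else None,
    }
```

Comparing these values between runs of the same panel on similar patient populations gives a quick signal of systematic calling problems.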

A brief description of the procedure for clinical tumor NGS testing

Step	Description	Tools and database	Output
Base calling and duplicate removal	Base calling and duplicate removal, also known as initial analysis	Sequencing platform configuration software	FASTQ format
Primer removal	Primer sequences for amplicon sequencing must be removed from the reads	CutAdapt, BWA, etc.	FASTQ or BAM format
Adaptor removal	Remove the adaptor sequences from the end of reads. It may interfere with the alignment and cause false-positive/false-negative variant calling if not being trimmed	CutAdapt, BWA, Trimmomatic, SeqPrep, etc.	FASTQ or BAM format
Low-quality base removal	Low-quality bases may also interfere with the alignment and cause false results. These bases should usually be trimmed from the ends of read	CutAdapt, BWA, Trimmomatic, SeqPrep, etc.	FASTQ or BAM format
Alignment	In the alignment step, paired-/single-end reads are aligned to the reference genome. SNVs and small indels could be recognized in this step	BWA, Novoalign, Stampy, SOAP2, LifeScope, Bowtie, etc.	BAM format
Duplicate removal (optional)	Duplicates can be introduced by PCR amplifications in the library construction and sequencing steps. Duplicates that are implausible in the original DNA decrease the accuracy of variant calling and should be removed. Probe hybridization capture sequencing generates fewer duplicates, because DNA is randomly fragmented during library construction. Amplicon sequencing does not require deduplication if there are no allele barcodes, and requires it if there are	Picard Mark Duplicates, SAMtools, etc.	BAM format
Indel realignment (optional)	Misalignment is usually seen around indels which can cause false results, especially at the beginning or end of the reads. Local realignment method can determine these locations, minimize this error, and increase accuracy	GATK RealignerTargetCreator and IndelRealigner, SRMA, etc.	BAM format
Base quality score recalibration (optional)	The base quality score could be recalibrated after the alignment/realignment to decrease the false-positive rate	GATK BaseRecalibrator and PrintReads, ReQON, etc.	BAM format
Variant calling	Variant calling refers to the detection and description of variations (including SNVs and small indels) based on differences between sequencing data and reference genomes	GATK UnifiedGenotyper, GATK HaplotypeCaller, SAMtools, MuTect, Varscan, Platypus, etc.	VCF format
Annotation	The variant interpretation relies on detailed annotation. The basic annotation includes gene name, gene structure areas (exon, splicing region, intron, intragenic region, etc.), and coding information. SNP information, pathogenicity, and other references could also be included	ANNOVAR, SnpEff, Cartagenia Bench Lab NGS, dbSNP, 1000 Genomes, ESP6500, SIFT, PhyloP, MutationTaster, COSMIC, OMIM, ClinVar, HGMD, etc.	CSV, TSV, TXT, Excel, etc.
Filtering	Disease-related variants can be identified by strictly filtering the large number of annotated variant calling results. Typical filtering criteria remove low-quality variants, non-coding regions (e.g., intron and intragenic region), synonymous SNVs, and known low-frequency SNPs in healthy populations. Labs should set up an internal database to analyze the false positives that often occur on their own platforms and perform rigorous filtering of these false positives	Cartagenia Bench Lab NGS, SnpSift, etc.	CSV, TSV, TXT, Excel, database, etc.
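
The filtering step in the last row can be sketched as a chain of exclusion rules. This is an illustration under stated assumptions rather than a reference implementation: the dictionary keys, the region and effect labels, and the quality threshold are hypothetical, and the laboratory-specific false-positive list stands in for the internal database mentioned in the table.

```python
MIN_QUALITY = 30.0  # placeholder threshold; set from the lab's validation data
NON_CODING_REGIONS = frozenset({"intron", "intergenic"})  # illustrative labels


def filter_variants(variants, lab_false_positives=frozenset()):
    """Keep annotated variants that pass the typical filtering criteria.

    Each variant is a dict with the (hypothetical) keys "key", "quality",
    "region", "effect", and "known_population_snp".
    """
    kept = []
    for variant in variants:
        if variant["quality"] < MIN_QUALITY:
            continue  # low-quality call
        if variant["region"] in NON_CODING_REGIONS:
            continue  # non-coding region
        if variant["effect"] == "synonymous":
            continue  # synonymous SNV
        if variant["known_population_snp"]:
            continue  # known SNP in healthy populations
        if variant["key"] in lab_false_positives:
            continue  # recurrent platform-specific false positive
        kept.append(variant)
    return kept
```

Keeping each rule on its own line makes it straightforward to log which criterion removed a variant, should traceability be required.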

Appendix 1

Project name: Multigene detection of lung cancer
Patient information:
Name	XXX	Gender	Male	Pathological diagnosis	Lung adenocarcinoma
Hospital number	XXX	Age	X yrs
Specimen information:
Specimen number	XXX
Specimen type	Fresh tissue specimen	Tumor cell content	Frozen section assessment 80%	DNA content and quality	XXX
Specimen acceptance date	2018-3-5	Result report date	2018-3-9
Technical brief:	NGS technology was used to detect mutations in the exons of 286 lung cancer-related genes (see report Appendix 1). The exons and the sequences near the splice sites of these genes were analyzed by massively parallel sequencing. Sample processing, library construction, sequencing, and analysis were all performed at the GLCI Central Laboratory. Testing platform is XXX, and analysis software is XXX. Reference genome is GRCh38.
Content of the test items: Lung cancer-related 286 gene mutations
Test purpose and reason: Pathological diagnosis of patients with stage IV lung adenocarcinoma. Detection of gene targets provides a basis for clinical targeted therapy in patients.
1: List of testing gene omitted. 2: NGS quality parameters omitted. 3: List of all mutations and variations in this test (generic descriptions of signaling pathways where major variant genes are located, other polymorphisms, or general descriptions of mutations with uncertain clinical targeting may be listed at the end of the table) omitted.
Test results:
Data parameters: XX ng of nucleic acid was used to construct the sequencing library; X% of the target region was effectively sequenced to a depth of XX.
Analysis results (can also be described in tabular form):
1. The status of eight actionable genes (EGFR, ALK, ROS1, RET, BRAF, HER2, KRAS, MET) listed in multiple international and domestic guidelines such as the NCCN/IASLC/CSCO guidelines.
(1) SNPs or indels in gene sequence: EGFR, located at Chr7 (GRCh38). EGFR c.2573T>G (p.Leu858Arg), missense mutation, mutant allele frequency 19.0%; EGFR c.2369C>T (p.Thr790Met), missense mutation, mutant allele frequency 16.9%.
(2) Gene copy number variation: Example: MET amplification, 2.1-fold.
(3) Gene rearrangements or fusions: none.
2. The status of other major driver genes: TP53 c.734G>T (p.Gly245Val), missense mutation, mutant allele frequency 15.9%.
3. Tumor mutation burden (TMB): The TMB value for this examination was XX mutations/Mb.
Note: Tumor mutation burden (TMB) indicates the number of somatic mutations that occur per million bases in the genome. According to existing clinical studies, TMB can be divided into three levels: TMB>=20 as TMB-High; 10<TMB<20 as TMB-Medium; TMB<=10 as TMB-Low. TMB obtained by testing plasma cfDNA could be affected by ctDNA abundance in plasma. In the case of very low ctDNA abundance (<0.5%), the ctDNA mutation burden in blood may not fully represent the mutation burden of tumor lesions. High tumor mutation burden might be associated with the efficacy of immune checkpoint inhibitors.
4. Variation of MMR and other DNA damage related genes: omitted.
Note: MSI stands for microsatellite instability. Microsatellites are tandem repeats of short DNA sequences that are widely present in the genome. Across the tumor genome, many small mutations can occur in the microsatellites, making some microsatellites unstable; this is called MSI. Because of mismatch repair deficiency (dMMR), tumors can develop further through the MSI pathway. MSI-high or MMR deficiency might be correlated with the efficacy of immune checkpoint inhibitors.
5. Variation of other potentially actionable genes: omitted.
Interpretation of results: This test detected the activating mutation EGFR c.2573T>G (p.Leu858Arg) coexisting with EGFR c.2369C>T (p.Thr790Met), which may be related to resistance to first-generation EGFR-targeted drugs and to sensitivity to the third-generation drug osimertinib (AZD9291). TP53 mutations are very common in many cancers and might influence the efficacy of drug treatments such as EGFR TKIs.
Note: [1] The EGFR reference sequence is RefSeq NM_005228.3; the mutation nomenclature is written according to the HGVS nomenclature guidelines (www.hgvs.org), with the A of the start codon counted as the first nucleotide. [2] Demonstration depth of this test: 300X. [3] Appendices: see Appendix 1 for the list of testing genes; Appendix 2 for NGS quality parameters; Appendix 3 for the list of all mutations and variations in this test.
Lung cancer genetic diagnosis test background: The occurrence, development, and treatment of lung cancer are closely related to gene mutations. Patients whose lung cancer samples carry mutations or rearrangements of target genes can benefit from targeted drug therapies, although most patients will eventually become resistant to targeted tyrosine kinase inhibitors (TKIs). The method can simultaneously detect multiple oncogenes and genes related to targeted drug sensitivity and resistance, and provides a basis for clinical targeted therapy of lung cancer patients. In general, the major driver gene mutations in lung cancer have clinical significance in predicting the efficacy or prognosis of targeted drugs. Mutations or copy number variations of genes such as EGFR, ALK, ROS1, RET, BRAF, HER2, and MET have a certain relationship with the efficacy of the corresponding targeted therapy. KRAS mutations may be associated with poor therapeutic efficacy of other molecular targeted therapies. Targeted drugs for EGFR mutations include gefitinib, erlotinib, icotinib, afatinib, osimertinib, etc. EGFR T790M mutations are associated with resistance to first-generation drugs and sensitivity to third-generation drugs, and EGFR C797S point mutations are associated with third-generation drug resistance. Targeted drugs for ALK variations include crizotinib, alectinib, etc. Secondary mutations such as ALK L1196M and C1156Y are associated with first-generation drug resistance, and ALK L1198F is associated with third-generation drug resistance. Targeted drugs for ROS1 include crizotinib, etc. Targeted drugs for RET include cabozantinib, etc. Targeted drugs for HER2 include trastuzumab, afatinib, etc. Targeted drugs for MET include crizotinib, etc. Targeted drugs for BRAF include vemurafenib, dabrafenib, etc. Please consult your attending physician about treatment decisions and the implications of driver gene variants in your specific clinical setting.
Note: The results of this analysis apply only to the specimen tested.
Technician (Signature): XXX Date: YYYY MM DD
Reviewer (Signature): XXX Date: YYYY MM DD
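
The TMB definition and the three levels quoted in the sample report above can be written as a short calculation. This is a minimal sketch: the cut-offs are those stated in the report note, while the function names and the use of the sequenced region size as the denominator are assumptions made for illustration.

```python
def tumor_mutation_burden(somatic_mutations: int, covered_bases: int) -> float:
    """Somatic mutations per million bases of sequenced region."""
    if covered_bases <= 0:
        raise ValueError("covered_bases must be positive")
    return somatic_mutations / (covered_bases / 1_000_000)


def classify_tmb(mutations_per_mb: float) -> str:
    """Map a TMB value to the three levels given in the report note."""
    if mutations_per_mb < 0:
        raise ValueError("TMB cannot be negative")
    if mutations_per_mb >= 20:
        return "TMB-High"
    if mutations_per_mb > 10:
        return "TMB-Medium"
    return "TMB-Low"
```

For example, 30 somatic mutations over a 1.5 Mb panel give 20 mutations/Mb, which falls in the TMB-High level under these cut-offs.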
