Characteristics of genomic alterations of lung adenocarcinoma in young never-smokers.

Wenxin Luo1, Panwen Tian1, Yue Wang2, Heng Xu3, Lu Chen3, Chao Tang3, Yang Shu3, Shouyue Zhang3, Zhoufeng Wang1, Jun Zhang3, Li Zhang4, Lili Jiang5, Lunxu Liu6, Guowei Che6, Chenglin Guo6, Hong Zhang2, Jiali Wang2, Weimin Li1.   

Abstract

Non-small-cell lung cancer (NSCLC) has been recognized as a highly heterogeneous disease with phenotypic and genotypic diversity in each subgroup. While never-smoker patients with NSCLC have been well studied through next generation sequencing, the potentially unique molecular features of young never-smoker patients with NSCLC have yet to be characterized. In this study, we conducted whole genome sequencing (WGS) to characterize the genomic alterations of 36 never-smoker Chinese patients who were diagnosed with lung adenocarcinoma (LUAD) at 45 years or younger. Besides the well-known gene mutations (e.g., TP53 and EGFR), our study identified several potential lung cancer-associated gene mutations that have rarely been reported (e.g., HOXA4 and MST1). The lung cancer-related copy number variations (e.g., EGFR and CDKN2A) were enriched in our cohort (41.7%, 15/36) and the lung cancer-related structural variations (e.g., EML4-ALK and KIF5B-RET) were commonly observed (22.2%, 8/36). Notably, new fusion partners of ALK (SMG6-ALK) and RET (JMJD1C-RET) were found. Furthermore, we observed a high prevalence (63.9%, 23/36) of potentially targetable genomic alterations in our cohort. Finally, we identified germline mutations in BPIFB1 (rs6141383, p.V284M), CHD4 (rs74790047, p.D140E), PARP1 (rs3219145, p.K940R), NUDT1 (rs4866, p.V83M), RAD52 (rs4987207, p.S346*), and MFI2 (rs17129219, p.A559T) that were significantly enriched in the young never-smoker patients with LUAD when compared with the in-house noncancer database (p < 0.05). Our study provides a detailed mutational portrait of LUAD occurring in young never-smokers and gives insights into the molecular pathogenesis of this distinct subgroup of NSCLC.
© 2018 The Authors International Journal of Cancer published by John Wiley & Sons Ltd on behalf of UICC.

Keywords:  genetic predisposition; genomics; lung adenocarcinoma; next generation sequencing; young age

Year:  2018        PMID: 29667179      PMCID: PMC6175072          DOI: 10.1002/ijc.31542

Source DB:  PubMed          Journal:  Int J Cancer        ISSN: 0020-7136            Impact factor:   7.396


Abbreviations:  ACMG: American College of Medical Genetics and Genomics; BMR: background mutation rate; BWA: Burrows‐Wheeler Aligner; CNVs: copy number variations; ExAC: Exome Aggregation Consortium; FDR: false discovery rate; IGV: Integrative Genomics Viewer; InDels: insertions and deletions; kb: kilobase; lncRNA: long noncoding RNA; LUAD: lung adenocarcinoma; Mb: megabase; miRNA: microRNA; NGS: next generation sequencing; NSCLC: non‐small‐cell lung cancer; snoRNA: small nucleolar RNA; snRNA: small nuclear RNA; SNVs: single nucleotide variations; SVs: structural variations; WGS: whole genome sequencing

Lung cancer is the most common cancer and the leading cause of cancer‐related death worldwide, accounting for over 1 million deaths per year. Among all lung cancer cases, 85% are non‐small‐cell lung cancer (NSCLC), which has two main histological types, adenocarcinoma and squamous cell carcinoma.1, 2 With the development of precision medicine, NSCLC has been increasingly recognized as a highly heterogeneous disease with phenotypic and genotypic diversity in each subgroup.3, 4, 5, 6

While tobacco smoking is the most important risk factor for lung cancer, there is a distinct subset of patients (∼10–40%) who develop the disease with no history of smoking.6 Previous studies have characterized the genomic alterations of NSCLC in never‐smoker patients using next generation sequencing (NGS). Never‐smoker patients with lung adenocarcinoma (LUAD) harbor a significantly lower somatic mutation burden than smoker patients with the same disease.7 In addition, C > T transitions are more common in never‐smoker patients, while C > A transversions occur more often in smoker patients.8 Moreover, EGFR activating mutations and EML4‐ALK fusions have been found to be more frequent in never‐smoker patients than in smoker patients. Thanks to targeted tyrosine kinase inhibitors, patients with these two genetic alterations have experienced better survival.9

Aging is another fundamental factor in the development of lung cancer. Recently, it has been demonstrated that young patients have unique disease biology in a number of cancers. For instance, colon cancer arising at a young age is characterized by a high frequency of microsatellite instability.10 Breast cancer diagnosed at a young age has a higher proportion of BRCA1/BRCA2 mutations and ERBB2 overexpression than breast cancer diagnosed at an older age.11 Although only 1.3–5.3% of patients with lung cancer are 45 years or younger at diagnosis, there is a trend of increasing incidence of lung cancer among young adults.12, 13, 14, 15 Many recent studies have suggested that NSCLC occurring in young patients constitutes a disease entity with distinct clinicopathologic characteristics.4, 5, 16, 17 Early‐onset NSCLC occurs more often in women and never‐smokers and presents with a predominance of LUAD. However, only a few studies have investigated the genomic alterations of NSCLC occurring in young patients, and all of them focused on the mutational frequency of a few selected driver events involved in lung cancer. Compared with older patients with NSCLC, younger patients have a higher incidence of ALK, ROS1 and RET fusions.4, 5, 16, 17 Despite this progress, the landscape of genomic alterations of LUAD in young never‐smoker patients remains to be characterized.

In this study, we characterized both the somatic and germline alterations of 36 never‐smoker patients with LUAD aged 45 years or younger through whole genome sequencing (WGS). Our aim was to identify the molecular features of LUAD in young never‐smoker patients and to explore their clinical implications.

Material and Methods

Study population and sample collection

Thirty‐six never‐smoker patients (never‐smoker defined as <100 cigarettes in a lifetime) who were diagnosed with LUAD at 45 years or younger at West China Hospital between 2011 and 2016 were included in this study. None of them underwent neoadjuvant therapy before surgery. Tumors and matched distal normal lung tissues were obtained during surgery, snap‐frozen in liquid nitrogen and stored at −80°C until sequencing. All samples were reviewed by two pathologists to determine the histological subtype and tumor cellularity. Only tumor tissues containing at least 60% tumor cells were included. All patients provided informed consent, and this study was approved by the Institutional Review Board of West China Hospital, Sichuan University, Chengdu, China. The retrospective study of 1,296 patients with LUAD who received ALK‐Ventana immunohistochemistry testing was also approved by the Institutional Review Board of West China Hospital, Sichuan University, Chengdu, China.

Genomic DNA preparation and whole genome sequencing

Genomic DNA was extracted from frozen tissues using the DNeasy blood and tissue kit (Qiagen, USA) following the manufacturer's protocol. Degradation and contamination were monitored on 1% agarose gel, while the concentration was measured with the Qubit DNA Assay Kit on a Qubit 2.0 Fluorometer (Life Technologies, USA). For WGS, 0.5 μg of high‐molecular‐weight genomic DNA (>20 kb single band) per sample was used for DNA library preparation. Sequencing libraries were generated using the TruSeq Nano DNA HT Sample Prep Kit (Illumina, USA) following the manufacturer's recommendations, and index codes were added to each sample. Briefly, each genomic DNA sample was fragmented with a Covaris sonication system to a size of ∼350 bp. The DNA fragments were then end‐polished, A‐tailed and ligated with the full‐length adapter for Illumina sequencing, followed by further PCR amplification. After the PCR products were purified (AMPure XP system), libraries were analyzed for size distribution with an Agilent 2100 Bioanalyzer and quantified by real‐time PCR (3 nmol/L). Clustering of the index‐coded samples was performed on a cBot Cluster Generation System using the HiSeq X PE Cluster Kit V2.5 (Illumina, USA) according to the manufacturer's instructions. After cluster generation, the DNA libraries were sequenced on the Illumina HiSeq X platform and 150 bp paired‐end reads were generated. The original fluorescence image files obtained from the HiSeq platform were transformed to short reads (raw data) by base calling and recorded in FASTQ format, which contains the sequence information and the corresponding sequencing quality information. After excluding reads containing adapter contamination and low‐quality/unrecognizable nucleotides, the clean data were used for downstream bioinformatic analyses. The total number of reads, sequencing error rate, percentage of reads with average quality >20 and with average quality >30, and GC content distribution were also calculated (Supporting Information, Table 1).

Reads mapping and somatic genetic alteration detection

Valid sequencing data were mapped to the reference human genome (UCSC hg19) with the Burrows‐Wheeler Aligner (BWA) software, and the original mapping results were stored in BAM format.18 Then, SAMtools,19 Picard (http://broadinstitute.github.io/picard/) and GATK20 were used to sort the BAM files and to perform duplicate marking, local realignment and base quality recalibration, generating the final BAM files used for computing sequence coverage and depth. To call somatic single nucleotide variations (SNVs) and small insertions and deletions (InDels) from paired tumor‐normal samples, MuTect and Strelka were used, respectively.21, 22 In addition to the default filters, somatic SNVs and InDels listed as polymorphisms in the 1000 Genomes Project23 or the Exome Aggregation Consortium (ExAC)24 with a minor allele frequency over 1% were removed. Subsequently, the VCF (Variant Call Format) files were annotated by ANNOVAR.25 Somatic copy number variations (CNVs) were identified by Control‐FREEC,26 while the GISTIC27 algorithm was used to infer recurrently amplified or deleted genomic regions. G‐scores were calculated for genomic and gene‐coding regions on the basis of the frequency and amplitude of the amplification or deletion affecting each gene. A significant CNV region was defined as having an amplification or deletion with a G‐score >0.1, corresponding to a p value threshold of 0.05 from the permutation‐derived null distribution. For detection of somatic structural variations (SVs) based on soft‐clipped reads, CREST28 was used to map SVs directly at nucleotide‐level resolution. Only breakpoint pairs with at least three supporting clipped reads spanning the breakpoint were selected for further analysis.
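The two post-calling filters described above (removing polymorphisms with a population minor allele frequency over 1%, and keeping only breakpoint pairs with at least three supporting clipped reads) can be illustrated with a minimal Python sketch. The record layout, the field names (`maf_1000g`, `maf_exac`, `clipped_reads`) and the function names are hypothetical and chosen only for illustration; they are not the output formats of MuTect, Strelka, ANNOVAR or CREST.

```python
from dataclasses import dataclass
from typing import Optional

MAF_CUTOFF = 0.01        # 1% minor allele frequency, as stated in the text
MIN_CLIPPED_READS = 3    # minimum supporting clipped reads per breakpoint pair


@dataclass
class SomaticCall:
    """Hypothetical annotated somatic SNV/InDel record (illustration only)."""
    gene: str
    maf_1000g: Optional[float]  # None when the variant is absent from 1000 Genomes
    maf_exac: Optional[float]   # None when the variant is absent from ExAC


def is_common_polymorphism(call: SomaticCall, cutoff: float = MAF_CUTOFF) -> bool:
    """True if either population database reports a MAF above the cutoff."""
    return any(maf is not None and maf > cutoff
               for maf in (call.maf_1000g, call.maf_exac))


def filter_somatic_calls(calls):
    """Drop calls that look like common germline polymorphisms."""
    return [c for c in calls if not is_common_polymorphism(c)]


def filter_breakpoints(breakpoints, min_reads: int = MIN_CLIPPED_READS):
    """Keep breakpoint pairs (dicts with a hypothetical 'clipped_reads' key)
    supported by at least `min_reads` soft-clipped reads."""
    return [bp for bp in breakpoints if bp["clipped_reads"] >= min_reads]
```

Variants absent from both databases are kept in this sketch, which is one reasonable reading of the filter; the text does not say how missing frequencies were handled.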

PCR and Sanger sequencing

To validate the somatic SNVs, InDels and SVs identified from the WGS data, we used PCR with specific primers to amplify the genomic DNA spanning the mutation sites. PCR products were electrophoresed through 1.0% agarose gel and sequenced by the Sanger method. For ALK and RET fusions detected by WGS, chimeric reads covering the breakpoints were visualized and carefully evaluated using the Integrative Genomics Viewer (IGV).26 A total of 29 somatic nonsynonymous SNVs/InDels were successfully verified (93.5%, 29/31) (Supporting Information, Table 2) and 9 SVs were verified (Supporting Information, Table 3).

Identification of significantly mutated genes and pathways

Significantly mutated genes were identified using MuSiC and MutSigCV,29, 30 which estimate the background mutation rate (BMR) for each gene‐patient‐category combination based on the observed silent mutations in the gene and noncoding mutations in the surrounding regions. Significance levels (p values) were determined by testing whether the observed mutations in a gene occurred more frequently than expected by chance under the background model. False discovery rates (FDR, q value) were then calculated, and candidate driver genes with q value <0.1 were reported after the elimination of apparent false‐positive findings and of genes encoding proteins with >4,000 amino acids.30 Pathway enrichment analysis was carried out using the PathScan algorithm to identify known cellular pathways with a significant accumulation of somatic mutations in lung tumors.31 Regardless of the mutation frequency of specific genes, all nonsynonymous mutations were examined to determine the distribution of the mutated genes across pathways in the KEGG database.
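As an illustration of how per-gene p values can be converted to q values and thresholded at 0.1, the following Python sketch implements the standard Benjamini–Hochberg procedure. The text does not state which FDR procedure was applied, so Benjamini–Hochberg is an assumption here, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg q values in the same order as `pvalues`.

    Assumption: BH is used purely for illustration; the paper only says
    that FDR q values were calculated.
    """
    m = len(pvalues)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotone q values.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        qvalues[idx] = running_min
    return qvalues


def significant_genes(gene_pvalues, q_cutoff=0.1):
    """Given {gene: p value}, return the genes with q value below the cutoff."""
    genes = list(gene_pvalues)
    qvals = benjamini_hochberg([gene_pvalues[g] for g in genes])
    return [g for g, q in zip(genes, qvals) if q < q_cutoff]
```

For example, `significant_genes({"GENE_A": 0.001, "GENE_B": 0.5})` keeps only `GENE_A`; the gene names and p values are made up.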

Interpretation of germline variants

Genetic predisposition was assessed using variants classified as “Pathogenic” or “Likely Pathogenic” according to the American College of Medical Genetics and Genomics (ACMG) guidelines.32 Germline SNVs/InDels were detected by SAMtools,19 and each variant was then assigned an ACMG classification. The mutation frequencies of the genes carrying variants classified as “Pathogenic” in our cohort were compared with those in the in‐house noncancer WGS database (Novo‐Zhonghua Non‐Cancer Genomes Database from Novogene Co., Ltd, Beijing, China), which records germline variants found in 568 Chinese noncancer individuals. The mutation frequency in the controls included not only the candidate variants identified in cases but also the other pathogenic loci located in the gene of interest.

Statistical analysis

The difference in somatic mutation rate between two groups was tested by the Wilcoxon test. Other statistical comparisons between two groups were performed with Fisher's exact test. The difference in log10 of the number of SNVs per kilobase (kb) among different gene types was tested by Fisher's least significant difference (LSD) test. Overall survival was estimated using the Kaplan–Meier method with the log‐rank test. All statistical analyses were performed using R software.
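To make the two-group frequency comparison concrete, here is a minimal Python sketch of a two-sided Fisher's exact test on a 2×2 table, written with the standard library only. The analyses in this study were run in R; this sketch is not the authors' code. The example counts (0 of 36 versus 64 of 412) are back-calculated from the STK11 percentages reported in the Results and are therefore an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2
    denom = comb(total, col1)

    def pmf(x):
        # Probability that the top-left cell equals x given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = pmf(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(pmf(x) for x in range(lo, hi + 1) if pmf(x) <= p_obs * (1 + 1e-7))
    return min(1.0, p)


# Assumed counts: mutated / not mutated in this cohort vs. the TCGA cohort.
p_stk11 = fisher_exact_two_sided(0, 36, 64, 348)
print(f"STK11, cohort vs. TCGA: p = {p_stk11:.4f}")
```

With these assumed counts the p value falls well below 0.05, which agrees in direction with the reported comparison.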

Results

Clinical and whole‐genome sequencing data

Among the 36 young never‐smoker patients with LUAD, 30 patients (83.3%) were female; the median age was 40 years (range, 28–45). Detailed clinicopathologic data are provided in Supporting Information, Table 4. The mean sequencing coverage was 53× (range, 50.47–57.04×) for the tumors and 34× (range, 30.28–45.88×) for the matched normal tissues. Overall, we detected 4,344–60,259 somatic mutations per tumor (Supporting Information, Table 5). Among them, 2,739 nonsynonymous mutations were identified, including 2,412 missense mutations, 138 nonsense mutations, 1 nonstop mutation, 8 in‐frame InDels, 27 frame‐shift InDels and 153 splicing mutations (Fig. 1a; Supporting Information, Table 2).
Figure 1

Mutation landscape of lung adenocarcinoma in young never‐smoker patients. (a) Nonsynonymous mutation rates (number of mutations per Mb) in 36 tumor samples. (b) Genes predicted to be significantly mutated by MuSiC. Asterisks indicate genes identified by both MuSiC and MutSigCV. Genes are sorted by FDR value (right panel); the frequency is indicated by the number of mutated samples (left panel). (c) Focal CNVs in known lung cancer genes in 36 tumor samples. (d) SVs previously implicated in lung cancer in 36 tumor samples. (e) Percentage of the six types of single nucleotide substitutions in each tumor sample. The samples are ordered on the horizontal axis based on the clustering of their mutated genes. The colors denote different types of somatic events. [Color figure can be viewed at http://wileyonlinelibrary.com]

The mean somatic mutation rate in our cohort was 4.7 per megabase (Mb), lower than that of LUAD in the TCGA cohort (8.87 per Mb).8 Patients with a somatic TP53 mutation harbored a higher somatic mutation rate than patients with wild‐type TP53 (p < 0.001), patients older than 40 years carried a higher somatic mutation rate than patients aged 40 years or younger (p = 0.035), and patients with larger tumors (maximum tumor diameter >3 cm) carried a significantly heavier mutational burden than patients with smaller tumors (p < 0.001). Mutational spectrum analysis revealed that, except for five patients (Y5, Y34, Y30, Y26 and Y25) with predominant C > A transversions, the most common somatic nucleotide substitutions in our cohort were C > T transitions, which have been implicated in LUAD in never‐smokers (Fig. 1e).8
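The six substitution types used in the mutational spectrum follow the usual convention of reporting each SNV from the pyrimidine (C or T) of the reference base pair, so that, for example, G > A is counted as C > T. A minimal Python sketch of this collapsing step is given below; the function names and the input format (a list of reference/alternate base pairs) are hypothetical.

```python
from collections import Counter

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}
SIX_CLASSES = ("C>A", "C>G", "C>T", "T>A", "T>C", "T>G")


def substitution_class(ref: str, alt: str) -> str:
    """Collapse one of the 12 possible SNVs into its pyrimidine-based class."""
    if ref in ("A", "G"):
        # Purine reference: report the change on the complementary strand.
        ref, alt = COMPLEMENT[ref], COMPLEMENT[alt]
    return f"{ref}>{alt}"


def substitution_spectrum(snvs):
    """Percentage of each of the six classes for a list of (ref, alt) pairs."""
    counts = Counter(substitution_class(ref, alt) for ref, alt in snvs)
    total = sum(counts.values())
    return {cls: (100.0 * counts[cls] / total if total else 0.0)
            for cls in SIX_CLASSES}
```

Applied per tumor, the class with the highest percentage identifies samples dominated by C > A transversions versus C > T transitions.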

Recurrent somatic mutations in protein‐coding genes

To identify potential driver mutations, we characterized somatic mutations and identified 35 significantly mutated genes (q < 0.1) (Fig. 1b). Apart from well‐known gene mutations (e.g., TP53, EGFR, MTOR, KRAS and SETD2),33 our analysis also identified several potential lung cancer‐associated gene mutations that have rarely been reported (e.g., HOXA4, MST1 and CD209).34, 35, 36, 37 Compared with the mutation frequencies of the 50 most frequently mutated genes reported in 412 LUAD cases from the TCGA cohort,8 our cohort exhibited a lower prevalence of recurrent mutations, especially in TP53 (27.78% vs. 47.57%, p = 0.024), KRAS (11.11% vs. 27.91%, p = 0.030), STK11 (0.00% vs. 15.53%, p = 0.005) and KEAP1 (0.00% vs. 14.81%, p = 0.009) (Supporting Information, Table 6). Conversely, EGFR and RB1 mutations were relatively enriched in our study, although the differences did not reach statistical significance (p = 0.16 and p = 0.19, respectively). Given the genetic heterogeneity between ethnic populations, the mutation patterns of our cohort were further compared with those of a previous study on Asians.38 Fewer EGFR mutations were observed in our cohort than in the previous data from Asians with a similar genetic background, although the difference was not statistically significant (25.00% vs. 39.40%, p = 0.1). Consistent with previous studies,8, 38 EGFR activating mutations in our cohort occurred recurrently at the most common sites, including exon 19 deletions (3/9, 33.3%) and exon 21 L858R mutations (6/9, 66.7%). As to the CNVs detected in our cohort, only the deletion of CDKN2A, located on 9p21.3, was statistically significant (p < 0.05). Additionally, we identified four substantially amplified or deleted genes (EGFR, TERT, NKX2‐1 and MET) by examining the copy number of reported CNV hotspots in our sequencing data.33 A total of 15 patients (41.7%) carried lung cancer‐related CNVs (Fig. 1c; Supporting Information, Table 7).
Among the SVs detected in our cohort, the lung cancer‐related structural variations (e.g., EML4‐ALK and KIF5B‐RET) were commonly observed (22.2%, 8/36). Apart from the two well‐known gene fusions, EML4‐ALK and KIF5B‐RET, we also identified two novel gene fusions, SMG6‐ALK and JMJD1C‐RET (Fig. 1d; Supporting Information, Table 3). Interestingly, different fusion partners of ALK co‐occurred in a single patient, and the same was true of RET. In addition, the frequency of ALK fusions in our cohort was higher than that in previous studies on never‐smoker patients with LUAD from China (16.7% vs. 7.0%, p = 0.039).39 In the six EML4‐ALK fusions, all ALK breakpoints were located in intron 19 and three of the EML4 breakpoints in intron 12, suggesting that these two segments might be hotspots. To validate the high frequency of ALK fusions in young never‐smokers with LUAD, we reviewed the records of the 1,296 patients with LUAD who received ALK‐Ventana immunohistochemistry testing between January 1, 2016 and January 1, 2017 at West China Hospital. Among the 839 patients with no history of smoking, the positive rate of ALK fusions in patients aged 45 years or younger was significantly higher than that in older patients (17.1% vs. 5.8%, p < 0.001). Likewise, among the 457 patients with a smoking history, the positive rate of ALK fusions in patients aged 45 years or younger was significantly higher than that in older patients (15.9% vs. 3.4%, p < 0.001).

Recurrent somatic mutations in noncoding regions

To investigate the role of noncoding RNA in LUAD occurring in young never‐smokers, we compared the mutation frequencies of different gene types in our samples. First, we calculated the number of SNVs per kilobase (kb). Our results indicated that the number of SNVs per kb in long noncoding RNAs (lncRNAs) and other types of noncoding genomic regions was higher than that in protein‐coding genes (Fig. 2a). Among the gene types with >100 mutated genes, lncRNA had the highest average mutation frequency (32.65 SNVs/kb), while the average frequency for protein‐coding genes was 6.16 SNVs/kb.
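The length normalization described here (SNVs per kb, averaged per gene type and shown on a log10 scale in Fig. 2) can be sketched in Python as follows. The input layout, a list of (gene type, SNV count, gene length in bp) tuples, is hypothetical, and the text does not specify how gene lengths were defined, so the sketch simply takes the length as given.

```python
import math
from collections import defaultdict


def snvs_per_kb(n_snvs: int, length_bp: int) -> float:
    """Number of SNVs normalized to gene length in kilobases."""
    return n_snvs / (length_bp / 1000.0)


def mean_rate_by_gene_type(records):
    """Average SNVs/kb per gene type.

    `records` is an iterable of (gene_type, n_snvs, length_bp) tuples;
    this layout is an assumption made for illustration.
    """
    rates = defaultdict(list)
    for gene_type, n_snvs, length_bp in records:
        rates[gene_type].append(snvs_per_kb(n_snvs, length_bp))
    return {gene_type: sum(vals) / len(vals) for gene_type, vals in rates.items()}


def log10_rate(rate: float) -> float:
    """log10 transform used for plotting; defined only for positive rates."""
    return math.log10(rate)
```

Averaging per-gene rates, rather than pooling all SNVs and all bases of a gene type, gives short genes the same weight as long ones; which of the two the authors used is not stated.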
Figure 2

Mutations of noncoding regions in lung adenocarcinoma occurring in young never‐smoker patients. (a) The log10 of the number of SNVs per kb in the six noncoding regions and protein‐coding genes; the number of mutated genes is shown in parentheses after each gene type. Items with different letters are significantly different (LSD test, p < 0.01). (b) Mutation frequency heatmap of the 10 genes with the highest mutation frequency in 36 tumor samples. The legend shows the meaning of the heatmap colors; the depth of the color represents the size of the log10 (number of SNVs per kb). (c) RP11–774D14.1 and RP11–435B5.4 mutations in each tumor. The x‐axis indicates the absolute position in the gene and the dotted lines show the recurrent mutations, which have a high mutation percentage (>10% of all samples). The mutation percentage of each recurrent mutation is shown at the top of the figure. [Color figure can be viewed at http://wileyonlinelibrary.com]

We next assessed whether the SNVs in these lncRNA genes were recurrent in our patients. Patients Y25, Y26 and Y27, who carried a large number of somatic mutations across the whole genome (Fig. 1a), also harbored a high level of variants within the lncRNA genes. When plotting the 10 lncRNA genes with the most mutations in our study, we found that RP11–435B5.4 and RP11–774D14.1 were mutated in the majority of the patients (Fig. 2b). Among the loci in the RP11–774D14.1 gene, five SNVs were found in >10% of the patients (Fig. 2c), and one SNV in RP11–435B5.4 was found in 11% of the patients, suggesting that these recurrent SNVs in the two lncRNAs may be associated with biological function or could be used as potential biomarkers.

Somatically altered pathways and clinical implications

An integrative analysis of the key pathways affected by SNVs/InDels, CNVs and SVs was performed to construct a comprehensive view of the genomic characteristics of LUAD in young never‐smokers (Fig. 3). The most frequently altered pathway was the RTK/RAS/PI3K pathway, which was affected in 32 patients (88.9%), in accordance with the TCGA study on LUAD.8 The p53 pathway was also frequently altered, with 21 patients (58.3%) harboring genomic alterations involved in this pathway.
Figure 3

Somatically altered pathways in lung adenocarcinoma in young never‐smoker patients. Components and inferred functions of p53 signaling/cell cycle process, RTK/Ras/PI3K pathway and histone/chromatin modification were summarized in the main text. Percentage presented alteration frequencies in 36 tumor samples. Pathway alterations including somatic SNVs, CNVs and SVs were shown. Activated and inactivated pathways/genes and activating or inhibitory symbols were based on predicted effects of genome alterations and/or pathway function. [Color figure can be viewed at http://wileyonlinelibrary.com]

To comprehensively identify potentially targetable genomic alterations in our cohort, we matched the SNVs/InDels, CNVs and SVs with previous data from published clinical trials.40, 41 As a result, we identified five genes (EGFR, ALK, RET, MET and MTOR) with potentially targetable alterations that were responsive to specific kinase inhibitors or antibodies in 23 patients (63.9%) (Fig. 4). The most frequent potentially targetable genomic alterations were EGFR activating mutations and ALK fusions.
Figure 4

Therapeutic targets in young never‐smoker patients with lung adenocarcinoma. Missense mutations, in‐frame indel, copy number amplifications and gene fusions that were regarded as potential targets of specific kinase inhibitors or antibodies were investigated thoroughly. Tumors with at least one alteration were shown. [Color figure can be viewed at http://wileyonlinelibrary.com]

Genetic predisposition

To explore the genetic factors underlying early‐onset LUAD, germline variants from the current 36 cases and 28 additional unpublished LUAD cases occurring in never‐smokers were evaluated according to the ACMG guidelines (Fig. 5 and Supporting Information, Table 8). These cases were classified into two groups: the young group (patients aged 45 years or younger, n = 46) and the old group (patients older than 55 years, n = 18). Pathogenic or likely pathogenic germline mutations in 35 cancer genes were identified in 36 patients of the young group (36/46, 78.3%) (Fig. 5 and Supporting Information, Table 9). Notably, germline mutations in BPIFB1 (rs6141383, p.V284M), CHD4 (rs74790047, p.D140E), PARP1 (rs3219145, p.K940R), NUDT1 (rs4866, p.V83M), RAD52 (rs4987207, p.S346*) and MFI2 (rs17129219, p.A559T) were significantly enriched in the young group when compared with the in‐house noncancer database (p < 0.05). Among them, the BPIFB1, CHD4 and RAD52 susceptibility loci were not detected in the old group.
Figure 5

Pathogenic and likely pathogenic germline alterations classified by ACMG. Genes are categorized into “lung cancer genes” (top) and “other type cancer genes” (bottom). The square frames indicate genes related to DNA repair. The colors denote different tiers of germline events. Patients with a presumed predisposition to cancer are indicated by * (multiple primary cancers) or # (immediate family member diagnosed with cancer). [Color figure can be viewed at http://wileyonlinelibrary.com]

The patient (Y25) who had a germline TP53 missense mutation (rs121912664, p.R205H) was found to be hypermutated when compared with the others. However, germline defects did not appear to serve as a predictor of prognosis, as neither the identified lung cancer susceptibility genes nor the genes significantly enriched in cases showed obvious effects on clinical outcome according to Kaplan–Meier curves for overall survival (p = 0.318 and p = 0.827).

Discussion

WGS provides a unique opportunity to perform an integrated analysis covering not only point mutations but also structural alterations. This, combined with the strict inclusion criteria, enabled us to identify recurrent somatic mutations with tumorigenic potential in a special group of lung cancer patients (young never‐smoker patients with LUAD). In contrast to the general view that LUAD carries a high mutation burden,8, 42 young never‐smoker patients had a lower mutation load and fewer classic driver substitutions. Nevertheless, oncogenic fusions occurred more frequently, emphasizing the importance of further study and special consideration of non‐SNV aberrations in the carcinogenic processes of this distinct subgroup of lung cancer. It is well known that the prevalence of EGFR mutations varies widely in different settings, according to age, smoking status and ethnicity. Previous studies agree that EGFR activating mutations are most commonly found in patients of Asian descent and in never‐smokers, and the mutation rate is particularly high (60.7%, 462/761) in Asian never‐smokers.43 In striking contrast to the driver landscape of Asian never‐smokers, which is dominated by EGFR mutations, only 25.0% of the patients in our cohort carried activating EGFR mutations. The discrepancy might be ascribed to younger age; however, previous studies on the difference in the proportion of EGFR mutations between young and old patient groups were inconclusive.4, 5, 13, 17 In addition, other fairly frequent events in the TCGA cohort (e.g., TP53, KRAS and KEAP1 mutations) were not frequent in our cohort, which also demonstrates that heterogeneity exists in this specific subgroup of NSCLC. Furthermore, among the recurrent somatic mutations in protein‐coding genes in this study, HOXA4 and MST1 are labeled as lung cancer‐associated genes. HOXA4 belongs to the HOX family of transcription factors, which has been implicated in regulating gene expression.
Previous studies have found that HOX overexpression exists in LUAD and results in enhanced motile and invasive properties.36 MST1 encodes a serine/threonine kinase that has been shown to act as a tumor suppressor involved in cell growth, proliferation and apoptosis. A recent study has shown that MST1 overexpression inhibits the growth of NSCLC A549 cells both in vitro and in vivo.37 Apart from identifying novel fusions of ALK and RET (i.e., SMG6‐ALK and JMJD1C‐RET), we also found that the prevalence of EML4‐ALK in young LUAD patients was significantly higher than that in older ones regardless of smoking status. Intriguingly, other malignancies that harbor ALK fusions, including anaplastic large cell lymphoma, neuroblastoma and inflammatory myofibroblastic tumor, mainly occur in young adults and children.44, 45, 46 In addition, other gene fusions in LUAD, such as ROS1 and RET fusions, are also reported to be associated with earlier onset.4, 5 These findings suggest that SVs result in more aggressive tumors that require less time to become clinically overt, and highlight the need for SV testing in young patients with LUAD. Another major finding of this study was that young never‐smoker patients with LUAD harbored a high frequency (63.9%) of potentially targetable genomic alterations in EGFR, ALK, RET, MET and MTOR. Consistent with our finding, Sacher et al. evaluated the molecular features of 2,237 patients with NSCLC and found that patients aged 50 years or younger were significantly more likely to carry a targetable genomic alteration than older patients (78% vs. 49%, p < 0.001).5 These data suggest that young never‐smoker patients with LUAD represent a distinct subgroup of NSCLC that is enriched for targetable genotypes, deserves extensive screening for targetable genomic alterations and may subsequently benefit from personalized medicine strategies with specific targeted therapy.
Moreover, the stringent enrollment requirements of the study gave us the opportunity to better understand the genetic predisposition to LUAD. We found that young never‐smoker patients with LUAD harbored a unique pattern of germline mutations in cancer predisposition genes, including BPIFB1 (rs6141383, p.V284M), CHD4 (rs74790047, p.D140E), PARP1 (rs3219145, p.K940R), NUDT1 (rs4866, p.V83M), RAD52 (rs4987207, p.S346*) and MFI2 (rs17129219, p.A559T). Among them, the BPIFB1 and CHD4 susceptibility loci have been associated with lung cancer risk (p = 1.79 × 10⁻⁷ and p < 0.001, respectively), and the former was also linked to age of onset of lung cancer (p = 0.006) [47, 48]. Meanwhile, studies have shown that PARP1, NUDT1, RAD52 and MFI2 susceptibility loci increase the risk of other cancers, for example, PARP1 in gastric and breast cancer and MFI2 in colorectal cancer [49, 50, 51]. These findings suggest that mutations in cancer predisposition genes may be a possible reason for the early onset of LUAD in patients without a smoking history, and that a better understanding of lung cancer risk will depend on evaluation of cancer predisposition genes. We did not find considerable differences in clinical presentation or overall survival between patients with and without these genetic defects, which suggests that predisposition genes may affect the initiation rather than the development of the tumor. However, it would be arbitrary to rule out a link between germline mutations and the clinical features of the disease. More patients and a longer follow‐up period would help to explore the relative contribution of inherited genetic factors to prognosis.

This study is the first to characterize the genomic alterations of LUAD in young never‐smokers through WGS. Our study provides insights into the genomic landscape and molecular basis of this specific subgroup of NSCLC. A limitation of this study is its small sample size.
Future studies should validate these findings in a larger sample.

Supplementary Tables 1 to 9 are available as additional data files.
  49 in total

Review 1.  Lung cancer.

Authors:  Roy S Herbst; John V Heymach; Scott M Lippman
Journal:  N Engl J Med       Date:  2008-09-25       Impact factor: 91.245

2.  Dendritic cells recognize tumor-specific glycosylation of carcinoembryonic antigen on colorectal cancer cells through dendritic cell-specific intercellular adhesion molecule-3-grabbing nonintegrin.

Authors:  Klaas P J M van Gisbergen; Corlien A Aarnoudse; Gerrit A Meijer; Teunis B H Geijtenbeek; Yvette van Kooyk
Journal:  Cancer Res       Date:  2005-07-01       Impact factor: 12.701

3.  [Overexpression of human homeobox gene in lung cancer A549 cells results in enhanced motile and invasive properties].

Authors:  T Omatu
Journal:  Hokkaido Igaku Zasshi       Date:  1999-09

Review 4.  New and emerging targeted treatments in advanced non-small-cell lung cancer.

Authors:  Fred R Hirsch; Kenichi Suda; Jacinta Wiens; Paul A Bunn
Journal:  Lancet       Date:  2016-09-01       Impact factor: 79.321

5.  Mapping the hallmarks of lung adenocarcinoma with massively parallel sequencing.

Authors:  Marcin Imielinski; Alice H Berger; Peter S Hammerman; Bryan Hernandez; Trevor J Pugh; Eran Hodis; Jeonghee Cho; James Suh; Marzia Capelletti; Andrey Sivachenko; Carrie Sougnez; Daniel Auclair; Michael S Lawrence; Petar Stojanov; Kristian Cibulskis; Kyusam Choi; Luc de Waal; Tanaz Sharifnia; Angela Brooks; Heidi Greulich; Shantanu Banerji; Thomas Zander; Danila Seidel; Frauke Leenders; Sascha Ansén; Corinna Ludwig; Walburga Engel-Riedel; Erich Stoelben; Jürgen Wolf; Chandra Goparju; Kristin Thompson; Wendy Winckler; David Kwiatkowski; Bruce E Johnson; Pasi A Jänne; Vincent A Miller; William Pao; William D Travis; Harvey I Pass; Stacey B Gabriel; Eric S Lander; Roman K Thomas; Levi A Garraway; Gad Getz; Matthew Meyerson
Journal:  Cell       Date:  2012-09-14       Impact factor: 41.582

6.  Molecular alterations in non-small cell lung carcinomas of the young.

Authors:  Christopher J VandenBussche; Peter B Illei; Ming-Tseh Lin; David S Ettinger; Zahra Maleki
Journal:  Hum Pathol       Date:  2014-09-02       Impact factor: 3.466

7.  Advanced non-small cell lung cancer in patients aged 45 years or younger: outcomes and prognostic factors.

Authors:  Chia-Lin Hsu; Kuan-Yu Chen; Jin-Yuan Shih; Chao-Chi Ho; Chih-Hsin Yang; Chong-Jen Yu; Pan-Chyr Yang
Journal:  BMC Cancer       Date:  2012-06-13       Impact factor: 4.430

8.  Control-FREEC: a tool for assessing copy number and allelic content using next-generation sequencing data.

Authors:  Valentina Boeva; Tatiana Popova; Kevin Bleakley; Pierre Chiche; Julie Cappo; Gudrun Schleiermacher; Isabelle Janoueix-Lerosey; Olivier Delattre; Emmanuel Barillot
Journal:  Bioinformatics       Date:  2011-12-06       Impact factor: 6.937

Review 9.  Biology of breast cancer in young women.

Authors:  Hatem A Azim; Ann H Partridge
Journal:  Breast Cancer Res       Date:  2014-08-27       Impact factor: 6.466

10.  An integrated map of genetic variation from 1,092 human genomes.

Authors:  Goncalo R Abecasis; Adam Auton; Lisa D Brooks; Mark A DePristo; Richard M Durbin; Robert E Handsaker; Hyun Min Kang; Gabor T Marth; Gil A McVean
Journal:  Nature       Date:  2012-11-01       Impact factor: 49.962
