Deep sequencing reveals the genomic characteristics of lung adenocarcinoma presenting as ground-glass nodules (GGNs).

Nan Wu1, Sixue Liu2,3, Jingjing Li4, Zhenyu Hu2,3, Shi Yan1, Hongwei Duan2,3, Dafei Wu2,5, Yuanyuan Ma1, Shaolei Li1, Xing Wang1, Yaqi Wang1, Xiang Li1, Xuemei Lu2,3,6.   

Abstract

BACKGROUND: The concept of multi-step progression from atypical adenomatous hyperplasia (AAH) to invasive adenocarcinoma (ADC) has been proposed, and ground-glass nodules (GGNs) may play a critical role during early lung tumorigenesis. We present the first comprehensive description of the genomic architecture of GGNs to unravel their genetic basis.
METHODS: We investigated 30 lung ADCs presenting as GGNs by >1,000× whole-exome sequencing (WES), characterized their genomic variations, and evaluated the relationship between clinicopathologic and molecular characteristics in this disease.
RESULTS: Despite a low somatic mutation burden, GGNs exhibited high intratumor heterogeneity (ITH), measured as the proportion of subclonal mutations. Different mutational processes shaped the genomes of GGNs during cancer evolution: molecular clock-like signatures predominated in clonal mutations, and defective DNA mismatch repair signatures predominated in subclonal mutations. Moreover, 10.7–67.1% of clonal mutations occurred after whole-genome doubling (WGD), indicating that WGD could be a frequent truncal event in GGNs. Samples with WGD showed higher genomic instability but lower ITH. These GGNs were characterized by recurrent focal copy-number changes that are highly associated with tumorigenesis, with only two genes (EGFR and RBM10) recurrently mutated. Additionally, GGNs with different pathological subtypes or computed tomography (CT) features exhibited distinct genetic characteristics. Lepidic-predominant GGNs and GGNs that were pure on CT images carried a lower mutation burden and had a more stable genome than nonlepidic or mixed GGNs. GGNs with RBM10 mutations tended to show a pathologically lepidic pattern, indicating that RBM10 may drive a distinct subtype of lung cancer with better prognosis.
CONCLUSIONS: These findings facilitated interpreting the genomic characteristics of GGNs, provided insight into the early stages of lung cancer evolution, and possessed potential clinical significance. 2021 Translational Lung Cancer Research. All rights reserved.

Keywords:  Ground-glass nodules (GGNs); deep sequencing; genomic analysis; lung adenocarcinoma (LUAD)

Year:  2021        PMID: 33889506      PMCID: PMC8044491          DOI: 10.21037/tlcr-20-1086

Source DB:  PubMed          Journal:  Transl Lung Cancer Res        ISSN: 2218-6751


Introduction

Lung cancer is a leading cause of cancer-related deaths worldwide (1). Currently, the majority of lung cancer cases are diagnosed at an advanced stage, and despite improvements in molecular diagnosis and targeted therapies, the average 5-year survival rate for lung cancer remains less than 20% (2). These advances are limited by poor knowledge of the earliest events that underlie lung cancer development and that would constitute markers and targets for early detection and prevention. The genomic variations of early-stage lung cancer may reflect the initial features of tumorigenesis, and understanding them is crucial for the proper management of lung cancer. Ground-glass nodules (GGNs) are nodules with ground-glass opacity (GGO) in the lung parenchyma, described on computed tomography (CT) as haziness with increased lung attenuation and preserved bronchial and vascular margins (3-5). Several studies have shown that persistent GGNs on CT are associated with lung adenocarcinoma (LUAD) and should be regarded as carrying a high risk of malignancy (6,7). With recent advances in diagnostic imaging modalities and the widespread use of chest CT screening, the detection rate of lung ADC presenting as GGNs is increasing. GGNs generally grow slowly, have a good prognosis (8), and are considered an early stage of tumorigenesis (9,10). A growing number of studies have analyzed the characteristics of GGNs from various aspects, including radiology, pathology, surgery, and molecular biology, providing information on emerging and rapidly progressing aspects of surgical treatments for GGNs (11-14). However, the distinct genomic profiles involved in GGN progression and their potential for guiding therapeutic strategies have not yet been defined.
GGNs can be divided into categories according to their clinical characteristics. Radiologically, they fall into two categories: pure GGNs, which contain no solid components, and mixed GGNs, which contain both a pure GGO region and a consolidated region (15). GGNs are also categorized by number, i.e., solitary or multiple, and by lepidic component, i.e., lepidic or invasive ADC (16). Whether genetic alterations are associated with these clinical characteristics remains unresolved. Here, we present the first comprehensive description of the genomic architecture of GGNs by whole-exome deep sequencing (>1,000×) and amplicon deep sequencing (~30,000×). A comprehensive analysis of genetic alterations, intratumor heterogeneity (ITH), frequent events, and mutational signatures was performed, and the genetic alterations associated with the clinical characteristics of GGNs were analyzed. We present the study in accordance with the MDAR reporting checklist (available at http://dx.doi.org/10.21037/tlcr-20-1086).

Methods

The study was conducted in accordance with the Declaration of Helsinki (as revised in 2013). This study was approved by the Human Research Ethics Committee of Beijing Cancer Hospital and Beijing Institute of Genomics (2015KT71), and informed consent was obtained from the patients.

Patient and samples

Consecutive patients who had been diagnosed with primary lung cancer and underwent surgical resection at Peking University Cancer Hospital (Beijing, China) between 2012 and 2017 were recruited for this retrospective study. Of these, 28 patients were selected because they had GGNs on preoperative CT examination. Primary tumors at stage IIA or greater and tumors without GGN features on CT examination were excluded. None of these patients had received neoadjuvant chemotherapy or radiation therapy, and all nodules from these patients were diagnosed as early-stage LUAD (stage I). Cases were also excluded if pathologists judged that the specimen contained less than 20% tumor cells or if no adjacent normal tissue specimen was available. The series comprised 30 lesions from 28 patients, two of whom contributed two samples each (P17, R31; see Table S1).

Clinical features

Histopathologic diagnoses of GGNs followed the new IASLC/ATS/ERS multidisciplinary classification. Lepidic-predominant adenocarcinomas (LEPs) and non-LEPs with predominant invasive components, such as acinar, papillary, and micropapillary ADC, were defined (Table S1 and Figure S1). CT images were interpreted by two experienced thoracic radiologists at Beijing Cancer Hospital. A GGN was defined as hazy increased attenuation of the lung tissue with preservation of the bronchovascular structures. The lesions were classified as (I) pure GGO (pGGO), with no solid part in the nodule; or (II) mixed GGO (mGGO), GGO with a solid part occupying less than 50% of the nodule (Table S1 and Figure S1). Clinical information, including gender, age at diagnosis, smoking status, pathological TNM stage, tumor location, lymphatic invasion, and visceral pleural invasion, was collected for further analysis.

Whole-exome sequencing (WES)

Genomic DNA was extracted from the tumor and adjacent normal tissues using the QIAamp DNA Mini Kit (Qiagen 51304). Purified 100 ng–1 µg genomic DNA from each sample was sonicated using Covaris S220. Libraries were constructed from each sample with the Agilent SureSelectXT2 (Illumina) according to the manufacturer’s instructions and were further captured using the Agilent SureSelect Target Enrichment System (Human All Exon V6 Kit). Paired-end sequencing of 2×150 bp was performed on the Illumina HiSeq X Ten platform at Novogene. The sequencing depths of each sample are listed in Table S2.

WES data processing

Paired-end data were aligned to the human reference sequence (UCSC hg19) using the Burrows-Wheeler Aligner (17,18). The aligned reads were further processed using Picard tools and the Genome Analysis Toolkit (GATK) (19), including deduplication, base quality recalibration, and multiple-sequence realignment, prior to mutation detection.

Identification of single-nucleotide variations (SNVs) and indels

To identify somatic variations, adjacent normal tissues were used as the normal control. Somatic SNVs were identified using MuTect (19). Variants that passed were further filtered using the following criteria to obtain a higher-confidence set of SNVs: (I) SNVs present in the corresponding normal sample were filtered out; (II) SNVs with a mutation frequency of more than 10% in a tumor sample (with double-strand support) were retained; (III) SNVs with a frequency of less than 10% in the tumor were filtered by Shearwater (20), an algorithm for detecting high-confidence variants at low frequency, and only variants that were significantly mutated over the error model were kept, using a q value cutoff of 0.05 after multiple-testing correction; and (IV) dbSNP germline mutations (dbSNP138 version) were filtered out from the SNV list. VarScan2 (21) was employed to identify somatic indels. Indels with more than four variant reads (with double-strand support) in a tumor sample and none in the corresponding normal tissue sample were kept for further analysis. Both somatic SNVs and indels were subsequently annotated against multiple databases using the ANNOVAR tool (22).
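
The four SNV filtering criteria can be sketched as a single decision function. This is an illustration only, not the authors' pipeline: the record fields, the function name, and the handling of a frequency of exactly 10% are assumptions, and "present in the normal sample" is read here as any variant-supporting read in the matched normal.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CandidateSNV:
    # All field names are hypothetical; they are not taken from MuTect or Shearwater output.
    tumor_vaf: float                      # variant allele frequency in the tumor (0..1)
    fwd_alt_reads: int                    # variant-supporting reads on the forward strand
    rev_alt_reads: int                    # variant-supporting reads on the reverse strand
    normal_alt_reads: int                 # variant-supporting reads in the matched normal
    in_dbsnp: bool                        # listed as a germline variant in dbSNP
    shearwater_q: Optional[float] = None  # q value from the low-frequency test, if it was run


def keep_snv(v: CandidateSNV, vaf_cutoff: float = 0.10, q_cutoff: float = 0.05) -> bool:
    """Return True if the candidate passes criteria (I) to (IV) as read from the text."""
    # (I) drop variants seen in the matched normal (assumption: any supporting read)
    if v.normal_alt_reads > 0:
        return False
    # (IV) drop known germline variants
    if v.in_dbsnp:
        return False
    # (II) above the frequency cutoff: require support on both strands
    if v.tumor_vaf > vaf_cutoff:
        return v.fwd_alt_reads > 0 and v.rev_alt_reads > 0
    # (III) at or below the cutoff: require a significant low-frequency call
    return v.shearwater_q is not None and v.shearwater_q < q_cutoff
```

For example, `keep_snv(CandidateSNV(0.02, 5, 4, 0, False, shearwater_q=0.01))` keeps a 2% variant only because its q value falls below the cutoff.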

Targeted amplicon deep sequencing

We performed targeted amplicon deep sequencing to validate SNV calling from the WES data. We randomly selected 130 SNVs called from p9, p11, p14 and p17t2. Multiple PCR primers for amplicons containing the 130 target SNVs were designed with Ion AmpliSeq Designer and implemented using the Ion AmpliSeq™ Library Kit 2.0. The targeted amplification products were used to construct the next-generation sequencing libraries. The libraries were constructed using the NEBNext® Ultra™ End Repair/dA-Tailing Module and the NEBNext® Ultra™ Ligation Module according to the manufacturer’s instructions. The ligation products were purified with AMPure XP Beads and amplified with KAPA HiFi HotStart ReadyMix. The final sequencing libraries were obtained after PCR purification. Subsequently, paired-end sequencing of 2×150 bp was performed on the Illumina HiSeq X Ten platform to obtain a depth of ~30,000×.

Variants calling from targeted amplicon deep sequencing

Sequence reads were mapped against the human reference genome hg19 using BWA (17). The BAM files were realigned and recalibrated using GATK (19). Samtools mpileup (23) was used to extract the frequency of the selected SNVs from both tumor and normal tissue samples.

Somatic copy number analysis

We used Sequenza software (24) to estimate CNAs, cellularity and ploidy in GGNs. We determined genomic instability by using the methods according to Nahar et al. (25). In brief, the genomic instability index (GII) was calculated as the fraction of the total genome which was altered with a copy change ≥1 relative to the median integer ploidy. The amplification and deletion-based genomic instability index (adGII) was defined as the fraction of the total genome affected by high-copy gains and losses (or amplification and deletions with copy change ≥2 relative to ploidy).
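
As a minimal sketch of the two indices defined above, the function below assumes that copy-number segments are available as (start, end, total copy number) tuples and that the median integer ploidy has already been determined. The function and parameter names are illustrative and are not taken from Sequenza or from Nahar et al.

```python
def instability_indices(segments, baseline_cn):
    """Return (GII, adGII) as fractions of the total segmented length.

    segments    : iterable of (start, end, total_copy_number) tuples
    baseline_cn : median integer ploidy of the sample (assumed to be given)
    """
    total = altered = high_level = 0
    for start, end, copy_number in segments:
        length = end - start
        total += length
        change = abs(copy_number - baseline_cn)
        if change >= 1:          # any gain or loss counts toward GII
            altered += length
        if change >= 2:          # only high-copy gains and losses count toward adGII
            high_level += length
    if total == 0:
        raise ValueError("no segmented sequence to evaluate")
    return altered / total, high_level / total
```

With three toy segments of 50, 30 and 20 units at copy numbers 2, 3 and 4 and a baseline of 2, half of the length counts toward GII and one fifth toward adGII.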

Determination of genome-doubling status

The genome-doubling status for each GGN was estimated by a previously published algorithm (26). In brief, the P value was obtained using 10,000 simulations with observed probabilities of copy-number events. For samples with a ploidy ≤3, a P value threshold of 0.001 was used. To avoid underestimating genome doubling in high-ploidy samples, a P value threshold of 0.05 was used for samples with a ploidy =4, and all samples were classified as being genome doubled if the ploidy exceeded 4.
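
The ploidy-dependent thresholds can be written as a small decision rule. The sketch below assumes that the simulation P value has already been computed and that ploidy is supplied as an integer; how a fractional ploidy is rounded, and whether the comparison is strict, are not stated in the text and are assumptions here.

```python
def is_genome_doubled(integer_ploidy: int, p_value: float) -> bool:
    """Apply the ploidy-dependent P value thresholds described in the text.

    integer_ploidy : sample ploidy as an integer (rounding is left to the caller)
    p_value        : P value from the copy-number simulations (computed elsewhere)
    """
    if integer_ploidy > 4:
        # samples above ploidy 4 are classified as doubled regardless of the P value
        return True
    # a looser threshold at ploidy 4 avoids under-calling in high-ploidy samples
    threshold = 0.05 if integer_ploidy == 4 else 0.001
    return p_value < threshold  # assumption: strict inequality
```

A sample with ploidy 4 and P=0.01 is therefore called doubled, whereas the same P value at ploidy 3 is not.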

Determination of the cancer cell fraction (CCF) and timing of mutations

The CCF and mutant allele copy number for a given SNV were calculated by an algorithm previously described (27). Mutations were classified as “clonal” if the 95% confidence interval of CCF exceeded 1; otherwise, mutations were classified as “subclonal”. As previously described (27), the timing of mutations relative to copy-number alteration or WGD was defined on the integer mutant allele copy-number. Briefly, in samples with WGD, mutations were classified as “pre-WGD” when the integer mutant allele copy number was ≥2, while any mutations with a mutation copy number of 1 were classified as “post-WGD”.
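
A minimal sketch of these classification rules follows. It assumes that the CCF confidence interval and the integer mutant allele copy number have already been estimated by the cited algorithm, and it reads "exceeded 1" as the upper bound of the 95% confidence interval reaching 1; that reading and all function names are assumptions. The last helper computes the proportion of subclonal mutations, which the Results use as the measure of ITH.

```python
def classify_clonality(ccf_ci_upper: float) -> str:
    """Label a mutation from the upper bound of its 95% CCF confidence interval."""
    return "clonal" if ccf_ci_upper >= 1.0 else "subclonal"


def time_relative_to_wgd(mutant_copy_number: int) -> str:
    """Time a mutation in a genome-doubled sample from its mutant allele copy number."""
    if mutant_copy_number < 1:
        raise ValueError("a called mutation needs at least one mutant copy")
    # two or more mutant copies imply the mutation was duplicated by the doubling
    return "pre-WGD" if mutant_copy_number >= 2 else "post-WGD"


def subclonal_percentage(labels) -> float:
    """Percentage of mutations labelled 'subclonal' in one sample."""
    labels = list(labels)
    if not labels:
        raise ValueError("no mutations to summarise")
    return 100.0 * labels.count("subclonal") / len(labels)
```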

Mutation signature analysis

All the SNVs were categorized into six types: C > A, C > G, C > T, T > A, T > C, and T > G. Classification of the substitutions was further refined by including the flanking 5' and 3' bases of each mutated site. For example, T > A could be characterized as ATG > AAG when the 5' base was an A and the 3' base was a G. Considering all possible 5' and 3' bases, there are 96 types of substitutions. Subsequently, the profiles of the 96 tri-nucleotide mutational contexts for each sample were used to detect the mutational signatures of GGNs with the R package “MutationalPatterns” (28). The correlation coefficients were calculated between the estimated signatures and the known COSMIC signatures. The known signatures showing the maximum cosine similarity were defined as the mutational signatures in GGNs.
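
The count of 96 follows from 6 substitution types × 4 possible 5' bases × 4 possible 3' bases. The sketch below enumerates these contexts and shows the cosine-similarity matching step in plain Python; it is not the MutationalPatterns implementation, the bracketed context labels are a common convention chosen here for illustration, and any reference profiles passed in are assumed to be supplied by the caller.

```python
import math
from itertools import product

BASES = "ACGT"
SUBSTITUTIONS = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]


def trinucleotide_contexts():
    """Enumerate all contexts, e.g. 'A[T>A]G' for ATG > AAG."""
    return [f"{five}[{sub}]{three}"
            for sub in SUBSTITUTIONS
            for five, three in product(BASES, repeat=2)]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length count or frequency vectors."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for an all-zero profile")
    return dot / (norm_a * norm_b)


def best_matching_signature(profile, reference):
    """Return the name of the reference profile most similar to `profile`.

    reference : dict mapping a signature name to its profile vector
    """
    return max(reference, key=lambda name: cosine_similarity(profile, reference[name]))
```

`len(trinucleotide_contexts())` evaluates to 96, matching the number of substitution types stated above.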

Driver genes and regions analysis

We defined potential LUAD driver genes (n=78) as previously described (25). Significantly mutated genes in GGNs were identified using both MutSigCV2.0 (29) and dNdScv algorithms (30) with a q value cutoff of 0.1. The significance of broad and focal CNAs was assessed from the segmented data using the GISTIC 2.0 algorithm (31). We performed functional enrichment for the genes located in the recurrent CNA regions using DAVID (32).

Neoantigen prediction

POLYSOLVER (33) was employed for HLA typing. We used nonsynonymous mutations to generate a list of peptides 9–11 amino acids in length that contained the mutated residues. The binding affinity of every mutant peptide and its corresponding wild-type peptide to the patient’s germline HLA alleles was predicted using the NetMHCpan 4.0 algorithm (34). Candidate neoantigens were identified as those with a predicted strong or weak binding affinity for the mutant peptide and no binding affinity for the corresponding wild-type peptide. The clonality of neoantigens was defined according to the clonal status of the corresponding mutations.
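
The peptide-generation step can be illustrated for the simplest case, a single amino acid substitution. The sketch below slides windows of 9 to 11 residues across the mutated position and pairs each mutant peptide with its wild-type counterpart; the 0-based position, the function names, and the binding labels ("strong", "weak", "none") are assumptions, and the binding prediction itself is not reproduced.

```python
def mutant_peptides(protein, position, alt_residue, lengths=(9, 10, 11)):
    """Return (mutant, wild_type) peptide pairs covering a substituted residue.

    protein     : wild-type protein sequence
    position    : 0-based index of the substituted residue (assumption)
    alt_residue : amino acid introduced by the mutation
    """
    mutated = protein[:position] + alt_residue + protein[position + 1:]
    pairs = []
    for k in lengths:
        first = max(0, position - k + 1)       # leftmost window that still covers the site
        last = min(position, len(protein) - k)  # rightmost window that fits in the protein
        for start in range(first, last + 1):
            pairs.append((mutated[start:start + k], protein[start:start + k]))
    return pairs


def is_candidate_neoantigen(mutant_binding, wild_type_binding):
    """Mutant peptide binds (strongly or weakly) and the wild-type peptide does not."""
    return mutant_binding in {"strong", "weak"} and wild_type_binding == "none"
```

For a substitution far from either end of the protein this yields 9 + 10 + 11 = 30 peptide pairs.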

Statistical analysis

Pearson correlation coefficients were used to evaluate the association between the mutant allele frequencies called from WES and from amplicon deep sequencing, as well as between the neoantigen burden and the exonic mutation burden among GGNs. R² denotes the squared Pearson correlation coefficient. For comparisons of pathological subtypes between genotypes, P values were based on the Wilcoxon test for categorical variables and the two-sample t-test for continuous variables.
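
For reference, the squared Pearson correlation coefficient used for the validation comparison can be computed as in the sketch below. It is a plain-Python illustration with hypothetical function names, not the authors' code, and the inputs stand for paired allele frequencies of the same SNVs measured by the two methods.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two paired sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least two points")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_x * var_y)


def r_squared(x, y):
    """Squared Pearson correlation coefficient."""
    return pearson_r(x, y) ** 2
```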

Results

Histological and radiological examination

We retrospectively collected 30 lung cancer samples from 28 patients presenting with GGNs on computed tomography (CT) scans. All patients underwent surgical resection of the GGNs and were determined to have early-stage LUAD (stage I). Based on chest CT results, these GGNs were classified into 8 pure GGNs and 22 part-solid nodules (mixed GGNs). According to the histological subtype of adenocarcinoma present in the sample, the GGNs were divided into two groups: 6 lepidic-predominant GGNs (LEPs) and 24 non-LEPs with predominant invasive components, such as acinar, papillary, and micropapillary adenocarcinoma. Comprehensive clinical information and images are provided in Table S1 and Figure S1. We examined the potential relationship between radiological and histologic patterns. Notably, the mixed GGNs tended to show a pathologically nonlepidic growth pattern (Fisher’s exact test, P<0.05), indicating a high correlation between radiological and histological examinations. This finding was consistent with previous observations in LUAD (35,36), which reported an association between histological subtypes and GGO features. Because of the low tumor purity found on pathological examination (Table S1), deep sequencing was required to achieve high sensitivity and accuracy in mutation calling.

Deep WES

WES was performed for 30 surgically resected GGNs from 28 patients (two of whom had multifocal nodules) to a depth of ~1,000× for tumor samples and ~300× for adjacent tissue (Table S2). The adjacent tissue in each patient was used as a matched normal control for somatic variation calling (see Methods section). A total of 130 somatic single-nucleotide variants (SNVs, frequency: 0.5–21.4%, Table S3) randomly selected from p9, p11, p14 and p17t2 were subjected to targeted deep sequencing (mean depth of 30,000×). The results from targeted deep sequencing were highly correlated with the results from WES (R²=0.964, Pearson’s correlation coefficients, Figure S2), confirming the reliability of the SNV calling. In total, we identified 4,230 SNVs and 340 indels in GGNs (see Methods section, Tables S4,S5 and Figure 1A). The average number of somatic variations per sample was 152 (range, 77–429 variations), corresponding to a median of 2.33 variations/MB and a mean of 2.54 variations/MB (range: 1.28–7.15 variations/MB), a relatively low mutational burden compared with the results of the TCGA LUAD sequencing study (37), but quite close to the burden in nonsmoking LUADs (25,38).
Figure 1

Somatic mutations in GGNs. (A) The number of diverse mutation types in each GGN. (B) The percentages of somatic mutations that were found to be clonal or subclonal in each GGN. (C) OncoPrint heatmap for mutations in LUAD-associated genes depicting the presence (color legend) or absence (gray box), clonal (thick bar) or subclonal (thin bar) status, and the type of mutation in each GGN. GGN, ground-glass nodule.

The clonality of functionally significant somatic mutations

We estimated the CCF (see Methods section) of the mutations for each sample and identified 1,243 (27.2%) clonal mutations and 3,327 (72.8%) subclonal mutations (Tables S4,S5 and Figure 1B). The percentage of subclonal mutations was used as the measure of ITH, and a mean ITH of 74.2% (range, 33.8–97.3%) was observed in GGNs, which was higher than previous findings of ITH in lung cancer sequencing studies (~30% branch mutations) (39-41), but comparable to findings in EGFR-mutant LUADs (25). We surveyed known cancer genes (see Methods section) for potential driver mutations and found that 25 samples (83.3%) contained at least one variant in a gene known to be involved in LUAD (Figure 1C). Alterations of 18 LUAD-associated genes were subclonal in certain samples, including targetable mutations in EGFR (Figure 1C). This result may inform therapeutic strategies, since targeting subclonal alterations that are present in only a proportion of cells may result in reduced treatment efficacy. We then applied both MutSigCV and dNdScv to identify significantly mutated genes (SMGs) with a statistically higher-than-expected mutation prevalence across the entire cohort (q <0.1). Both methods identified only two SMGs: EGFR (frequency: 46.7%, 14/30 samples; 11 clonal/3 subclonal; 4 male/9 females) and RBM10 (frequency: 30.0%, 9/30 samples; 6 clonal/3 subclonal; 1 male/8 females). Epidermal growth factor receptor (EGFR) is the most common therapeutically targetable driver in LUADs. In this dataset, 11 samples harbored the EGFR L858R mutation (10 clonal/1 subclonal), a recurrent activating mutation within the EGFR kinase domain (42). RBM10 encodes an RNA-binding protein and is subject to recurrent inactivating mutations in LUADs (43). Notably, the GGNs with RBM10 mutations tended to show a pathologically lepidic pattern (Fisher’s exact test, P<0.05), suggesting a link to the better outcomes of patients with lepidic tumors than of those with nonlepidic tumors.

Mutational signatures during progression of GGNs

The clonal (trunk) and subclonal (branch) mutations (44) distinguished in each sample were further used to detect the mutational signatures in GGNs. The distributions of the identified mutational signatures in clonal and subclonal mutations were heterogeneous among patients, and also between different nodules within the same patient, i.e., p17t1 and p17t2 (Figure 2A).
Figure 2

Signature analysis in GGNs. (A) Pie charts representing the contributions of the three mutation signatures in clonal and subclonal mutations in each GGN. (B) The percentage of signature 5 is compared between clonal (n=30) and subclonal mutations (n=30) in each GGN. (C) The percentage of signature 15 is compared between clonal (n=30) and subclonal (n=30) mutations in each GGN. (D) The percentage of signature 4 is compared between clonal (n=30) and subclonal mutations (n=30) in each GGN. All P values were calculated using the Wilcoxon test. GGN, ground-glass nodule. **** indicate P<0.0001.

Among the three signatures identified in our cohort, molecular clock-like signature 5 accounted for a significantly higher proportion of clonal mutations (Figure 2B, P value =1.6E-07), while defective DNA mismatch repair-associated signature 15 became more dominant in subclonal mutations (Figure 2C, P value =2E-06). For signature 4, no significant difference was observed between trunk and branch mutations (Figure 2D, P value =0.09). These results revealed that different mutational processes were operative during the progression of GGNs. Notably, an incongruous mutational signature pattern was observed in one patient, l44, in which signature 15 dominated the mutational landscape and tended to be more dominant in clonal mutations than in subclonal mutations (Figure 2A).

Copy number alterations in GGNs

We identified a total of 7,001 CNAs, with a median of 216 CNAs per sample (Figure S3), which is fewer than in a previous large-scale lung cancer study (41). We also examined the known recurrent copy number alterations in LUADs as previously described (25), and almost all the known gains and deletions were observed in at least one sample (Figure S4). We assessed the GII (defined as the fraction of the genome altered by CNAs with a copy change ≥1 relative to ploidy; see Methods section) and observed that the majority of tumors showed low-to-moderate genomic instability (median of 19.6% per tumor, Figure 3A). A median of only 1.8% of the genome was affected by high-copy gains and losses (copy change ≥2 relative to ploidy, defined as adGII; see Methods section; Figure 3B). Among all the GGNs, non-LEPs had significantly higher GII scores than LEPs (Figure 3C, P=0.041, Wilcoxon test), indicating a higher degree of malignancy in non-LEPs (45,46). GII scores were also higher in mixed GGNs than in pure GGNs, although this difference was not statistically significant (Figure 3D, P=0.5, Wilcoxon test).
Figure 3

Genomic instability and copy number landscape of GGNs. (A) A bar plot representing the fraction of the total genome altered with copy change ≥1 relative to median integer ploidy, which is termed as the genomic instability index (GII). (B) A bar plot representing the fraction of the total genome affected by high-copy gains and losses (amplification and deletions with copy change ≥2 relative to ploidy), which is termed as the amplification- and deletion-based genomic instability index (adGII). (C) The GIIs are compared between LEPs (n=6) and non-LEPs (n=24). (D) The GIIs are compared between mixed (n=8) and pure GGNs (n=22). All P values were calculated using the Wilcoxon test. (E) Recurrent focal copy-number amplifications in the GGNs by GISTIC 2.0 analysis. The green line indicates the significance threshold (FDR ≤0.25). (F) Recurrent focal copy-number deletions in the GGNs by GISTIC 2.0 analysis. The green line indicates the significance threshold (FDR ≤0.25). GGN, ground-glass nodule. * indicate P<0.05.

GISTIC 2.0 analysis (with a threshold of q <0.25) revealed 3 focal amplifications and 4 focal deletions recurrently altered in GGNs, along with 14 recurrently altered whole arms (Figure 3E,F and Table S6). Among the recurrent focal regions, 12q15 (23.30%), 17q25.3 (50.00%), and 20q13.33 (36.70%) amplifications and 1p36.13 (66.70%) and 11p15.5 (70.00%) deletions have been previously found in LUAD (37,47-49). Deletion events occurring at 11p15.5 and 3q29 were previously reported to be associated with asbestos-related lung cancer and lung cancer susceptibility in Koreans, respectively (50,51). Notably, 12q15 harbors the oncogene MDM2, which has been reported as a frequently amplified gene in LUAD (43). In addition to the regions previously reported in LUAD, two novel recurrent deletions were identified at 3q29 (43.30%) and 9q13 (66.70%).
To assess the potential functional effect of the recurrent focal events, we performed Gene Ontology (GO) analysis separately for the genes located in the amplified and deleted regions (Table S7). Among the significant terms for amplified genes (Table S8), regulation of cell proliferation, regulation of growth, and negative regulation of apoptotic process are highly associated with tumorigenesis. The deleted genes, on the other hand, were enriched in certain metabolic processes, i.e., arachidonic acid secretion and lipid catabolic process (Table S8). These results highlight the potential functional roles of recurrent CNAs in the formation of GGNs.

Early emergence of genome doubling in GGNs

In total, 90% of the samples were estimated to be aneuploid, with ploidy ranging from 1.5 to 6.6 (Table S9). We then applied an algorithm to identify tumors that were likely to have undergone a genome-doubling event, even if they were no longer polyploid (see Methods section). Of the 30 samples, 15 GGNs (50%) showed genome-doubling status (Table S9), indicating that whole-genome doubling (WGD) is a frequent event in GGNs. The inferred CCFs and timing of the mutations relative to WGD showed that 10.7–67.1% of clonal mutations occurred after the WGD (defined as post-WGD mutations, see Methods section; Figure 4A), indicating that WGD, wherever present, was a truncal event that occurred during tumorigenesis.
Figure 4

Genome doubling event in GGNs. (A) The percentages of pre- or post-WGD clonal mutations in samples with WGD events. (B) The GIIs are compared between WGD (n=15) and non-WGD samples (n=15). (C) The adGIIs are compared between WGD (n=15) and non-WGD samples (n=15). (D) The degree of mutational ITH is compared between WGD (n=15) and non-WGD samples (n=15). All the P values were calculated using the Wilcoxon test. GGN, ground-glass nodule; WGD, whole-genome doubling. *** indicates P<0.001; **** indicates P<0.0001.

Significantly higher GII and adGII scores were observed in genome-doubled samples than in nondoubled samples (Wilcoxon test, P value <0.05, Figure 4B,C), indicating that WGD was associated with significantly higher genomic instability. This result was in agreement with the accumulating evidence showing that genome-doubling events are associated with the propagation of genome instability (26,52,53). There was no evidence that WGD was associated with clinical characteristics (solid portions, Fisher’s exact test, P=0.94; non-lepidic growth pattern, Fisher’s exact test, P=0.073), RBM10 mutation (Fisher’s exact test, P=1), or EGFR mutation (Fisher’s exact test, P=0.27). However, WGD events were significantly associated with mutational ITH: the proportion of subclonal mutations was significantly lower in the WGD samples than in the non-WGD samples (Figure 4D, P=9.7E-05, Wilcoxon test), which suggests a relatively longer trunk in the WGD samples.

Association between mutation burden and clinical characteristics in GGNs

To investigate whether the mutation burden differs between GGNs with distinct clinical features, we classified the samples into groups based on radiological and histologic patterns (Table S1). Mixed GGNs harbored significantly more exonic mutations than pure GGNs (Figure 5A, P=0.017, Wilcoxon test). Previous observations in clinical cases showed progression from nonsolid to part-solid nodules (10,54), which is consistent with the accumulation of more mutations in mixed GGNs than in pure GGNs. This result may reflect the fact that pure GGNs are found at an earlier stage of carcinogenesis. Likewise, a significantly higher exonic mutation burden was observed in nonlepidic GGNs than in lepidic GGNs (Figure 5B, P=0.0064, Wilcoxon test), indicating that a lower mutation burden was highly associated with the lepidic growth pattern. Moreover, when we divided the pure and mixed samples into subgroups with lepidic or nonlepidic growth patterns, the exonic mutation burden differed significantly between mixed GGNs with non-LEP patterns and pure GGNs with LEP patterns (Figure 5C, P<0.05, Wilcoxon test). Previous studies have shown a positive association between mutation burden and patient survival in the setting of anti-PD-1 therapy (55,56). This association highlights the potential of immunotherapeutic strategies for a subset of lung cancers.
Figure 5

Mutation burden in GGNs. (A) The exonic mutation burdens are compared between mixed (n=8) and pure GGNs (n=22). (B) The exonic mutation burdens are compared between lepidic (n=6) and nonlepidic growth GGNs (n=24). P values are calculated using the Wilcoxon test. (C) The exonic mutation burdens are compared among four groups: pure GGNs with lepidic growth (n=4), pure GGNs with nonlepidic growth (n=3), mixed GGNs with lepidic growth (n=2), and mixed GGNs with nonlepidic growth (n=21). P values are calculated using the Kruskal-Wallis test. (D) The numbers of clonal neoantigens are compared between LEPs (n=6) and non-LEPs (n=24). (E) The numbers of clonal neoantigens are compared between mixed (n=8) and pure GGNs (n=22). GGN, ground-glass nodule; LEPs, lepidic-predominant adenocarcinomas. * indicates P<0.05; ** indicates P<0.01.

Some of the mutations could create neoantigens, which are foreign to the immune system and capable of inducing antitumor immune responses. To investigate the neoantigen landscape in GGNs, we predicted neoantigens among the patients (Figure S5). The neoantigen burden was strongly associated with the exonic mutation burden (Figure S6), consistent with previous studies (57,58). Likewise, we observed a significantly higher clonal neoantigen burden in non-LEPs than in LEPs (P=0.036, Wilcoxon test). We also observed a trend toward a higher clonal neoantigen burden in mixed GGNs than in pure GGNs, although the difference between the two groups did not reach statistical significance (P=0.17, Wilcoxon test). Previous research suggested that neoantigen heterogeneity influences immune surveillance and supports therapeutic developments targeting clonal neoantigens (58). Our findings suggest that a subset of lung cancers may carry more neoantigens that could be targeted effectively.

Common or independent origins among GGNs within patients

Whether multiple GGNs in a patient arise independently may influence treatment and prognosis. We examined the evolutionary relationships between the multifocal GGNs in our dataset. Patient r31 presented with two GGNs: t1 in the right middle lobe and t2 in the right upper lobe. Both r31t1 and r31t2 were mixed GGNs with different major histological subtypes, non-LEP and LEP dominant, respectively. These two lesions had no mutations in common, indicating that they originated independently. Three known LUAD-associated genes, EGFR, RBM10, and EPHA2, were mutated solely in r31t2 (Table S4). Patient p17 harbored two pure GGNs with different major histological subtypes, non-LEP and LEP dominant, respectively. The p17t1 (in the superior segment of the left lower lobe) and p17t2 (in the basal segment of the left lower lobe) lesions shared 16 exonic mutations (Table S4), including the EGFR L858R mutation, indicating that they originated from a common ancestor and that intrapulmonary metastasis can occur even in early-stage lung cancer, in agreement with the previous report by Li et al. (59). Furthermore, amplification of the oncogene MYCL1 and a WGD event were also detected in both lesions.
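The inference of common versus independent origin rests on whether two lesions from the same patient share somatic mutations. A minimal sketch of that comparison is shown below, representing each mutation as a (chromosome, position, reference allele, alternate allele) tuple. The type alias, function names, and coordinates are hypothetical placeholders rather than study data, and a real analysis would first need to exclude germline variants and sequencing artifacts.

```python
from typing import Iterable, Set, Tuple

# (chromosome, position, reference allele, alternate allele)
Mutation = Tuple[str, int, str, str]

def shared_mutations(lesion_a: Iterable[Mutation], lesion_b: Iterable[Mutation]) -> Set[Mutation]:
    """Return the somatic mutations present in both lesions."""
    return set(lesion_a) & set(lesion_b)

def shared_fraction(lesion_a: Iterable[Mutation], lesion_b: Iterable[Mutation]) -> float:
    """Jaccard index: shared mutations divided by all distinct mutations."""
    a, b = set(lesion_a), set(lesion_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

# Hypothetical placeholder calls for three lesions
t1 = [("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T"), ("chr3", 3000, "G", "A")]
t2 = [("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T"), ("chr4", 4000, "T", "C")]
t3 = [("chr5", 5000, "G", "C")]

print(len(shared_mutations(t1, t2)), "shared between t1 and t2")  # overlap suggests a common ancestor
print(len(shared_mutations(t1, t3)), "shared between t1 and t3")  # no overlap suggests independent origin
```

Matching on the full tuple rather than on gene name matters here: two lesions can both carry an EGFR mutation yet differ in the exact nucleotide change, which would not support a shared ancestor.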

Discussion

Through >1,000× WES and ~30,000× amplicon deep sequencing, we have, for the first time, characterized the genomic landscape of GGNs. Despite the relatively low somatic mutation burden (a median of 2.33 variations/Mb and a mean of 2.54 variations/Mb) compared with the TCGA LUAD sequencing study (37), GGNs exhibited high ITH, characterized by the proportion of subclonal mutations (range, 33.8–97.3%) in each sample. Subclonal mutations were found in certain LUAD-associated genes, including the targetable mutation in EGFR, which points to a limitation of targeted therapy for lung cancer. Moreover, 10.7–67.1% of clonal mutations occurred after WGD, illustrating that WGD was a frequent truncal event in GGNs, consistent with findings from a non-small cell lung cancer study (41). We identified two significantly mutated genes, EGFR and RBM10, across the GGNs, both of which were previously identified as potential drivers in LUAD (43). EGFR mutations were also the most frequently observed alterations in other studies of ground-glass nodular LUAD (60-62). RBM10 encodes an RNA-binding protein and is subject to recurrent inactivating mutations in LUADs (43). A previous study in pancreatic ductal adenocarcinoma found that RBM10 mutations were associated with longer survival despite histological features of aggressive disease (63); accordingly, the high frequency of RBM10 mutations might be related to the good prognosis of patients with GGNs. Notably, the GGNs with RBM10 mutations tended to show a pathologically lepidic pattern (Fisher’s exact test, P<0.05), suggesting better outcomes for patients with lepidic tumors than for those with nonlepidic tumors. In agreement with our finding, patients with lepidic-predominant tumors showed better overall survival than patients with nonlepidic-predominant tumors (64-66). Our analysis of CNAs revealed significantly altered regions, including five known regions in LUAD as well as two novel recurrent deletions at 3q29 and 9q13.
Significantly amplified genes were overrepresented in functional terms associated with tumorigenesis, indicating the oncogenic potential of CNAs in the formation of GGNs. These putative drivers could serve as potential therapeutic targets. Mutations arising during the carcinogenesis of GGNs tended to accumulate in a clock-like manner, whereas defective DNA mismatch repair was largely associated with genetic heterogeneity within GGNs. In agreement with recent LUAD and melanoma sequencing studies (39,40,67), we also detected diverse mutational signatures during GGN progression. The clonal (trunk) mutations accumulated in a clock-like manner, whereas a reduction in clock-like signature 5 was observed in branch mutations. Among the subclonal mutations, the signature associated with defective DNA mismatch repair was significantly dominant. None of the patients in this study received any systemic treatment before surgical removal of the tumors; therefore, the switch in mutational processes was probably due to evolutionary changes occurring during GGN progression. An unexpected observation was the high ITH within GGNs, given that all the patients in this study were diagnosed with early-stage LUAD. In addition, relatively few putative driver mutations were identified in individual tumors. Although our research cannot show all the intermediate states preceding the observed ITH, it does suggest an evolutionary trajectory for GGNs. That is, against a background of low mutation rates, a tumor-initiating cell population acquired mutations in a clock-like manner, and once a driver occurred, such as an EGFR or RBM10 mutation, it was sufficient to allow a rapid expansion that produced high ITH and numerous intermixed subclones, as suggested by the big bang model in colorectal cancer (68).
The relatively short trunks (a mean of 25.8% clonal mutations) and early diversification observed in GGNs were more likely due to a single expansion than to selective sweeps, which may reflect the early stage of carcinogenesis in LUAD. However, long-term progression of GGNs might lead to the acquisition of new driver mutations followed by selective sweeps and large clonal expansions. In this scenario, the tumor population could exhibit a decrease in ITH, as ~30% ITH was observed in other lung cancer studies (39-41). Associations between genetic variations and clinical characteristics were observed in this study. Mutation burden was strongly associated with both solid portions and lepidic growth pattern, and a significant difference in mutation burden was observed between mixed GGNs with a non-LEP pattern and pure GGNs with an LEP pattern. We also observed a significantly higher clonal neoantigen burden in non-LEPs than in LEPs. Both mutation burden and clonal neoantigen burden could influence the response of patients to immune checkpoint inhibitors (55,56). Our results suggest that nonlepidic-predominant lung cancer might be more sensitive to checkpoint blockade immunotherapy than lepidic-predominant lung cancer. Moreover, non-LEPs harbored significantly higher genomic instability than LEPs, as measured by GII scores (P=0.041, Wilcoxon test), indicating a higher degree of malignancy in non-LEPs than in LEPs (45,46). Previous studies suggested that nonlepidic tumors were more malignant than lepidic tumors because patients with lepidic-predominant tumors showed better overall survival than patients with nonlepidic-predominant tumors (64-66), which agrees with the higher genomic instability observed in non-LEPs than in LEPs.
Although the GGNs with RBM10 mutations tended to show a pathologically lepidic pattern (Fisher’s exact test, P=0.049), there was no evidence that RBM10 mutation was associated with solid portions (Fisher’s exact test, P=1). Previous studies reported that patients with lepidic-predominant tumors showed better overall survival than patients with nonlepidic-predominant tumors (64-66), which suggests that RBM10 may drive a distinct subtype of lung cancer with better prognosis. Moreover, we did not observe an association between EGFR mutation and either solid portions (Fisher’s exact test, P=0.69) or lepidic pattern (Fisher’s exact test, P=1). The main limitations of this study are (I) the small sample size and (II) the relatively low tumor purity of the GGN samples. The purpose of this study was to interrogate subtle genetic changes in very early-stage lung cancer, so we deliberately selected GGN samples in which the solid part occupied less than 50% of the nodule; consequently, these pure and mixed GGNs inevitably had low tumor purity. Only 28 patients who met our sample criteria were selected from our lung cancer cohort, and the small sample size reduced the statistical power of the stratified analyses, so more samples are required to validate the results of this study. In summary, we performed the first comprehensive genomic analysis of GGNs, providing insights into early events, frequent alterations, and mutational processes during GGN evolution; we also determined genetic differences among clinical subtypes. Future studies should move toward systematically integrating the many aspects of the GGN genome, including the interrelationships among multiple molecular levels and the interplay between somatic and germline variations, the tumor microenvironment, and the immune system.

Data policy

The sequence data reported in this paper have been deposited in the Genome Sequence Archive (69) in the BIG Data Center (70), Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, under accession number HRA000044, and are publicly accessible at http://bigd.big.ac.cn/gsa-human. The article’s supplementary files as