Guiyuan Li, Yunqing Mei, Fan Yang, Shengming Yi, Lemin Wang.
Abstract
Lung adenocarcinoma is a type of non‑small cell lung carcinoma that tends to be treated with surgery rather than radiation therapy. It occurs in smokers and non‑smokers, and is the most common form of lung cancer among non‑smokers and women. Gene rearrangements, including ALK, ROS1 and RET, and gene mutations, including epidermal growth factor receptor (EGFR), HER2, Kirsten rat sarcoma viral oncogene homolog (KRAS), BRAF, phosphoinositide‑3‑kinase catalytic α polypeptide (PIK3CA) and MET, have been identified in lung adenocarcinoma. These alterations enable targeted therapy, for example with the EGFR inhibitors erlotinib, gefitinib and afatinib. The aim of the present study was to further investigate genome variations in lung adenocarcinoma. Single nucleotide polymorphisms (SNPs), insertions and deletions (InDels), structural variations (SVs) and copy number variations (CNVs) were identified across the whole genome of four patients with lung adenocarcinoma by whole genome re‑sequencing on the Illumina HiSeq X Ten platform. In total, ~415 GB of clean reads were obtained, the average sequencing depth was 31.10‑fold, and 99.29% of the reference genome was covered by the clean reads. An average of 3,364,270 SNPs was identified per sample, 98.76% of which matched entries in the SNP database (dbSNP), and an average of 453,547 InDels was identified per sample, 28.28% of which were in dbSNP. The present study also identified a total of 13,050 SVs and 886 CNVs. The majority of the SVs were deletions (74.25%), and most CNVs were located in intergenic regions and coding sequence regions. In conclusion, the present study produced a catalog of genome alterations in lung adenocarcinoma and provides a foundation for further investigation of its pathogenesis.
Year: 2017 PMID: 29039585 PMCID: PMC5780004 DOI: 10.3892/mmr.2017.7805
Source DB: PubMed Journal: Mol Med Rep ISSN: 1791-2997 Impact factor: 2.952
Patient information.
| Patient | Sex | Age (years) | Cancer | Stage | Tumor sample |
|---|---|---|---|---|---|
| YJY | Female | 67 | Lung adenocarcinoma | IV | Primary tumor |
| GMY | Female | 75 | Lung adenocarcinoma | IV | Primary tumor |
| ZCG | Male | 65 | Lung adenocarcinoma | IV | Primary tumor |
| JLY2 | Male | 52 | Lung adenocarcinoma | IV | Primary tumor |
Quality of sequencing data.
| Sample | Raw reads (n) | Raw data (Gb) | Clean reads (n) | Effective (%) | Q20 (%) | Q30 (%) | GC (%) |
|---|---|---|---|---|---|---|---|
| YJY | 621,362,127 | 93.20 | 620,492,220 | 99.86 | 96.27; 93.27 | 90.88; 85.32 | 42.05; 41.99 |
| GMY | 695,467,557 | 104.32 | 694,563,450 | 99.87 | 96.53; 93.07 | 91.35; 86.04 | 40.93; 40.93 |
| ZCG | 680,810,250 | 102.12 | 679,993,278 | 99.88 | 96.43; 94.08 | 91.26; 86.94 | 43.71; 43.66 |
| JLY2 | 775,624,875 | 116.34 | 774,771,688 | 99.89 | 96.55; 93.63 | 91.38; 86.22 | 40.95; 40.92 |
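The effective rate in this table appears to be the share of raw reads retained as clean reads. The following minimal sketch recomputes it from the counts above under that assumed definition, and also sums the raw data column, which comes to about 416 and is close to the ~415 figure quoted in the abstract. The names `READS`, `RAW_DATA` and `effective_rate` are illustrative and do not come from the study.

```python
# Read counts copied from the sequencing-quality table:
# sample -> (raw reads, clean reads).
READS = {
    "YJY": (621_362_127, 620_492_220),
    "GMY": (695_467_557, 694_563_450),
    "ZCG": (680_810_250, 679_993_278),
    "JLY2": (775_624_875, 774_771_688),
}

# Raw data per sample, in the units of the table's raw data column.
RAW_DATA = {"YJY": 93.20, "GMY": 104.32, "ZCG": 102.12, "JLY2": 116.34}


def effective_rate(raw: int, clean: int) -> float:
    """Percentage of raw reads kept as clean reads (assumed definition)."""
    return round(100 * clean / raw, 2)


rates = {sample: effective_rate(raw, clean) for sample, (raw, clean) in READS.items()}
total_raw = round(sum(RAW_DATA.values()), 2)

print(rates)
print(total_raw)
```

Under this definition the recomputed rates agree with the Effective (%) column for all four samples.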
Summary of sequenced reads aligned to the reference genome of hg19.
| Sample | YJY | GMY | ZCG | JLY2 |
|---|---|---|---|---|
| Total | 620,492,220 (100%) | 694,563,450 (100%) | 679,993,278 (100%) | 774,771,688 (100%) |
| Duplicate | 62,418,141 (10.10%) | 83,850,807 (12.12%) | 78,946,241 (11.66%) | 98,027,674 (12.70%) |
| Mapped | 617,787,854 (99.56%) | 691,799,013 (99.60%) | 677,069,702 (99.57%) | 771,860,901 (99.62%) |
| Properly mapped | 598,003,088 (96.38%) | 675,101,840 (97.20%) | 664,436,740 (97.71%) | 753,442,342 (97.25%) |
| PE mapped | 615,682,036 (99.22%) | 689,578,896 (99.28%) | 674,885,134 (99.25%) | 769,555,296 (99.33%) |
| SE mapped | 4,211,636 (0.68%) | 4,440,234 (0.64%) | 4,369,136 (0.64%) | 4,611,210 (0.60%) |
| With mate mapped to a different chromosome | 3,958,804 (0.64%) | 2,711,738 (0.39%) | 3,465,530 (0.51%) | 2,738,078 (0.35%) |
| With mate mapped to a different chromosome (mapQ ≥5) | 2,839,756 (0.46%) | 1,788,585 (0.26%) | 2,347,772 (0.35%) | 1,691,563 (0.22%) |
| Average sequencing depth | 28.31 | 31.08 | 30.55 | 34.47 |
| Coverage | 99.66% | 98.96% | 99.62% | 98.93% |
| Coverage at least 4X | 99.32% | 98.69% | 99.19% | 98.70% |
| Coverage at least 10X | 97.00% | 97.93% | 96.07% | 98.15% |
| Coverage at least 20X | 79.91% | 91.11% | 76.60% | 94.19% |
PE, paired-end reads; SE, single-end reads; mapQ, mapping quality, used as the criterion for an effective read.
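The abstract's average sequencing depth (31.10-fold) and genome coverage (99.29%) can be reproduced as simple means over the four samples in the alignment table. The sketch below does that and also recomputes the mapping rate as mapped reads divided by total clean reads. It assumes unweighted means, and the variable names are illustrative only.

```python
from statistics import mean

# Values copied from the alignment summary table.
DEPTH = {"YJY": 28.31, "GMY": 31.08, "ZCG": 30.55, "JLY2": 34.47}
COVERAGE = {"YJY": 99.66, "GMY": 98.96, "ZCG": 99.62, "JLY2": 98.93}

# sample -> (total clean reads, mapped reads)
MAPPED = {
    "YJY": (620_492_220, 617_787_854),
    "GMY": (694_563_450, 691_799_013),
    "ZCG": (679_993_278, 677_069_702),
    "JLY2": (774_771_688, 771_860_901),
}

# Unweighted means across samples (assumed to be how the abstract averaged).
mean_depth = round(mean(DEPTH.values()), 2)
mean_coverage = round(mean(COVERAGE.values()), 2)

# Mapping rate per sample, as a percentage of total clean reads.
mapping_rate = {
    sample: round(100 * mapped / total, 2)
    for sample, (total, mapped) in MAPPED.items()
}

print(mean_depth, mean_coverage)
print(mapping_rate)
```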
Figure 1. Statistical results of sequence depth and coverage of bases in each sample. (A) Statistical results of sequence depths; the proportion of bases at different sequence depths is shown on the left and the cumulative proportion of bases at different sequence depths is shown on the right. (B) Mean depth (left) and proportion of covered bases (right) of each chromosome. Gray line represents the corresponding autosomal and sex chromosomes in the reference genome of UCSC hg19.
Figure 2. Detection results of single nucleotide polymorphisms in each sample. (A) YJY; (B) GMY; (C) ZCG; and (D) JLY2.
Figure 3. Detection results of insertions and deletions in each sample. (A) YJY; (B) GMY; (C) ZCG; and (D) JLY2.
Statistics of single nucleotide polymorphisms for high quality reads from YJY, GMY, ZCG and JLY2 mapped onto the reference genome of hg19.
| Sample | YJY | GMY | ZCG | JLY2 |
|---|---|---|---|---|
| Total | 3,346,792 | 3,387,147 | 3,334,068 | 3,389,071 |
| Heterozygote | 1,890,012 | 1,951,091 | 1,881,111 | 1,961,901 |
| Homozygote | 1,456,780 | 1,436,056 | 1,452,957 | 1,427,170 |
| Transition | 2,269,999 | 2,296,037 | 2,263,437 | 2,297,128 |
| Transversion | 1,076,793 | 1,091,110 | 1,070,631 | 1,091,943 |
| ts/tv | 2.11 | 2.10 | 2.11 | 2.10 |
| dbSNP percentage | 3,306,030 (98.78%) | 3,345,528 (98.77%) | 3,292,374 (98.75%) | 3,346,895 (98.76%) |
| Novel | 40,762 | 41,619 | 41,694 | 42,176 |
| Novel ts | 26,861 | 27,531 | 27,471 | 27,834 |
| Novel tv | 13,901 | 14,088 | 14,223 | 14,342 |
| Novel ts/tv | 1.93 | 1.95 | 1.93 | 1.94 |
ts, transition; tv, transversion; dbSNP, single nucleotide polymorphism database.
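The derived rows of the SNP table follow from the raw counts: the total is transitions plus transversions, ts/tv is their ratio, and the dbSNP percentage is the number of known SNPs divided by the total. The following sketch recomputes these values as a consistency check. The `SNPS` dictionary and `summarize` helper are illustrative names, not part of the study's pipeline.

```python
# Counts copied from the SNP statistics table.
SNPS = {
    "YJY": {"ts": 2_269_999, "tv": 1_076_793, "dbsnp": 3_306_030},
    "GMY": {"ts": 2_296_037, "tv": 1_091_110, "dbsnp": 3_345_528},
    "ZCG": {"ts": 2_263_437, "tv": 1_070_631, "dbsnp": 3_292_374},
    "JLY2": {"ts": 2_297_128, "tv": 1_091_943, "dbsnp": 3_346_895},
}


def summarize(counts: dict) -> dict:
    """Derive total, ts/tv ratio, dbSNP share and novel count for one sample."""
    total = counts["ts"] + counts["tv"]
    return {
        "total": total,
        "ts_tv": round(counts["ts"] / counts["tv"], 2),
        "dbsnp_pct": round(100 * counts["dbsnp"] / total, 2),
        "novel": total - counts["dbsnp"],
    }


summary = {sample: summarize(counts) for sample, counts in SNPS.items()}
# Unweighted mean of the per-sample totals; the abstract reports 3,364,270.
mean_total = sum(s["total"] for s in summary.values()) / len(summary)

for sample, stats in summary.items():
    print(sample, stats)
print(mean_total)
```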
Statistics of insertions and deletions for high quality reads from YJY, GMY, ZCG and JLY2 mapped onto the reference genome of hg19.
| Sample | YJY | GMY | ZCG | JLY2 |
|---|---|---|---|---|
| Total | 443,118 | 461,393 | 436,058 | 473,617 |
| Heterozygote | 191,092 | 202,768 | 187,294 | 207,839 |
| Homozygote | 252,026 | 258,625 | 248,764 | 265,778 |
| dbSNP percentage | 126,222 (28.48%) | 130,260 (28.23%) | 124,274 (28.50%) | 132,201 (27.91%) |
| Novel | 316,896 | 331,133 | 311,784 | 341,416 |
dbSNP, single nucleotide polymorphism database.
Statistics of structural variations for high quality reads from YJY, GMY, ZCG and JLY2 mapped onto the reference genome of hg19.
| Sample | VarType | Total | CDS | Splicing | UTR5 | UTR3 | Intron | Upstream | Downstream | ncRNA | Intergenic | Unknown |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YJY | Insertion | 295 | 5 | 0 | 1 | 1 | 114 | 2 | 3 | 16 | 153 | 0 |
| YJY | Inversion | 150 | 33 | 0 | 0 | 0 | 32 | 1 | 0 | 12 | 72 | 0 |
| YJY | Deletion | 2,193 | 49 | 2 | 2 | 5 | 748 | 12 | 20 | 79 | 1,276 | 0 |
| YJY | Translocation | 298 | 3 | 0 | 1 | 4 | 83 | 3 | 0 | 14 | 190 | 0 |
| GMY | Inversion | 130 | 30 | 0 | 0 | 0 | 24 | 1 | 0 | 13 | 62 | 0 |
| GMY | Deletion | 2,390 | 51 | 0 | 2 | 4 | 809 | 14 | 15 | 92 | 1,403 | 0 |
| GMY | Insertion | 176 | 4 | 0 | 0 | 1 | 71 | 3 | 3 | 8 | 86 | 0 |
| GMY | Translocation | 300 | 1 | 0 | 0 | 10 | 92 | 3 | 0 | 17 | 177 | 0 |
| ZCG | Deletion | 2,220 | 52 | 2 | 3 | 5 | 807 | 24 | 20 | 77 | 1,230 | 0 |
| ZCG | Inversion | 133 | 49 | 0 | 0 | 0 | 29 | 1 | 1 | 11 | 42 | 0 |
| ZCG | Insertion | 376 | 7 | 0 | 1 | 0 | 156 | 1 | 4 | 19 | 188 | 0 |
| ZCG | Translocation | 378 | 4 | 0 | 1 | 5 | 110 | 3 | 0 | 19 | 236 | 0 |
| JLY2 | Deletion | 2,886 | 57 | 1 | 3 | 8 | 1,002 | 13 | 24 | 94 | 1,684 | 0 |
| JLY2 | Insertion | 631 | 9 | 0 | 1 | 2 | 243 | 6 | 7 | 28 | 335 | 0 |
| JLY2 | Inversion | 138 | 29 | 0 | 1 | 0 | 36 | 1 | 0 | 10 | 61 | 0 |
| JLY2 | Translocation | 356 | 3 | 0 | 1 | 6 | 101 | 2 | 0 | 18 | 225 | 0 |
VarType, type of variation; CDS, coding sequence; UTR, untranslated region; ncRNA, non-coding RNA.
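The abstract's figures of 13,050 SVs in total, 74.25% of them deletions, can be checked by summing the Total column of the structural variation table by variant type. The sketch below is one way to do that. `SV_TOTALS` simply restates the table, and the other names are illustrative.

```python
from collections import Counter

# "Total" column of the structural variation table, per sample and type.
SV_TOTALS = {
    "YJY": {"Insertion": 295, "Inversion": 150, "Deletion": 2_193, "Translocation": 298},
    "GMY": {"Insertion": 176, "Inversion": 130, "Deletion": 2_390, "Translocation": 300},
    "ZCG": {"Insertion": 376, "Inversion": 133, "Deletion": 2_220, "Translocation": 378},
    "JLY2": {"Insertion": 631, "Inversion": 138, "Deletion": 2_886, "Translocation": 356},
}

# Pool the counts across the four samples, keyed by variant type.
by_type = Counter()
for counts in SV_TOTALS.values():
    by_type.update(counts)

total_svs = sum(by_type.values())
deletion_pct = round(100 * by_type["Deletion"] / total_svs, 2)

print(dict(by_type))
print(total_svs, deletion_pct)
```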
Statistics of copy number variations for high quality reads from YJY, GMY, ZCG and JLY2 mapped onto the reference genome of hg19.
| Sample | VarType | Total | CDS | Splicing | UTR5 | UTR3 | Intron | Upstream | Downstream | ncRNA | Intergenic | Unknown |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| YJY | Loss | 96 | 17 | 0 | 2 | 1 | 16 | 0 | 0 | 4 | 56 | 0 |
| YJY | Gain | 67 | 24 | 0 | 0 | 0 | 2 | 0 | 1 | 4 | 36 | 0 |
| GMY | Gain | 82 | 22 | 0 | 0 | 0 | 7 | 3 | 1 | 8 | 41 | 0 |
| GMY | Loss | 195 | 13 | 0 | 1 | 1 | 35 | 2 | 1 | 9 | 133 | 0 |
| ZCG | Gain | 74 | 21 | 0 | 1 | 1 | 4 | 1 | 0 | 9 | 37 | 0 |
| ZCG | Loss | 106 | 22 | 0 | 1 | 1 | 13 | 2 | 1 | 5 | 61 | 0 |
| JLY2 | Gain | 88 | 22 | 0 | 0 | 0 | 7 | 2 | 1 | 9 | 47 | 0 |
| JLY2 | Loss | 178 | 20 | 0 | 1 | 1 | 33 | 1 | 3 | 7 | 112 | 0 |
VarType, type of variation; CDS, coding sequence; UTR, untranslated region; ncRNA, non-coding RNA.
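The abstract states that most CNVs fall in intergenic and coding sequence regions. The following sketch pools the per-region counts of the copy number variation table across all samples and ranks the regions, which is one way to check that statement against the 886 CNVs reported. The data structure and names are illustrative.

```python
# Region columns of the copy number variation table, in table order.
REGIONS = ("CDS", "Splicing", "UTR5", "UTR3", "Intron",
           "Upstream", "Downstream", "ncRNA", "Intergenic", "Unknown")

# (sample, variation type) -> counts per region, copied from the table.
CNV_ROWS = {
    ("YJY", "Loss"): (17, 0, 2, 1, 16, 0, 0, 4, 56, 0),
    ("YJY", "Gain"): (24, 0, 0, 0, 2, 0, 1, 4, 36, 0),
    ("GMY", "Gain"): (22, 0, 0, 0, 7, 3, 1, 8, 41, 0),
    ("GMY", "Loss"): (13, 0, 1, 1, 35, 2, 1, 9, 133, 0),
    ("ZCG", "Gain"): (21, 0, 1, 1, 4, 1, 0, 9, 37, 0),
    ("ZCG", "Loss"): (22, 0, 1, 1, 13, 2, 1, 5, 61, 0),
    ("JLY2", "Gain"): (22, 0, 0, 0, 7, 2, 1, 9, 47, 0),
    ("JLY2", "Loss"): (20, 0, 1, 1, 33, 1, 3, 7, 112, 0),
}

# Sum each region column over all eight rows.
region_totals = {
    region: sum(row[i] for row in CNV_ROWS.values())
    for i, region in enumerate(REGIONS)
}
total_cnvs = sum(region_totals.values())

# Rank regions by pooled count and express each as a share of all CNVs.
ranked = sorted(region_totals.items(), key=lambda item: item[1], reverse=True)
shares = {region: round(100 * count / total_cnvs, 2) for region, count in ranked}

print(total_cnvs)
print(ranked[:3])
print(shares)
```

On the pooled counts, intergenic regions rank first and coding sequence regions second, consistent with the abstract.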