Eun-Hye Kim, Sunghoon Lee, Jongsun Park, Kyusang Lee, Jong Bhak, Byung Chul Kim.
Abstract
We present a new next-generation sequencing-based method to identify somatic mutations of lung cancer. It is a comprehensive mutation profiling protocol to detect somatic mutations in 30 genes found frequently in lung adenocarcinoma. The total length of the target regions is 107 kb, and a capture assay was designed to cover 99% of it. This method exhibited about 97% mean coverage at 30× sequencing depth and 42% average specificity when sequencing of more than 3.25 Gb was carried out for the normal sample. We discovered 513 variations from targeted exome sequencing of lung cancer cells, which is 3.9-fold higher than in the normal sample. The variations in cancer cells included previously reported somatic mutations in the COSMIC database, such as variations in TP53, KRAS, and STK11 of sample H-23 and in EGFR of sample H-1650, especially with more than 1,000× coverage. Among the somatic mutations, up to 91% of single nucleotide polymorphisms from the two cancer samples were validated by DNA microarray-based genotyping. Our results demonstrated the feasibility of high-throughput mutation profiling with lung adenocarcinoma samples, and the profiling method can be used as a robust and effective protocol for somatic variant screening.
Keywords: high-throughput nucleotide sequencing; lung neoplasms; next-generation sequencing; selector technology; somatic mutation screening; target enrichment
Year: 2014 PMID: 25031567 PMCID: PMC4099348 DOI: 10.5808/GI.2014.12.2.50
Source DB: PubMed Journal: Genomics Inform ISSN: 1598-866X
30× coverage of individual target genes after deep sequencing
(a) Percentage of the sequenced bases in each target gene covered at 30×.
Target capture specificity analyzed by qPCR
qPCR, real-time quantitative PCR.
(a) Proportion of target DNA after enrichment, estimated by measuring the relative amounts of target and non-target DNA in qPCR reactions; (b) Standard deviation of the estimated specificity (n = 3); (c) Standard deviation expressed as a percentage of the estimated specificity.
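The specificity summary described in the footnote above can be illustrated with a minimal Python sketch. It assumes the relative amounts of target and non-target DNA have already been derived from the qPCR measurements (the record does not say how), and it assumes the sample standard deviation (n - 1 denominator) was used. The function names and the replicate values are hypothetical and are not data from the study.

```python
import statistics
from typing import Sequence, Tuple


def specificity(target_amount: float, non_target_amount: float) -> float:
    """Proportion of target DNA among all DNA measured after enrichment."""
    total = target_amount + non_target_amount
    if total <= 0:
        raise ValueError("total DNA amount must be positive")
    return target_amount / total


def summarize(replicates: Sequence[float]) -> Tuple[float, float, float]:
    """Return (mean, standard deviation, SD as a percentage of the mean)."""
    mean = statistics.mean(replicates)
    sd = statistics.stdev(replicates)  # sample SD; an assumption, see lead-in
    return mean, sd, 100.0 * sd / mean


# Hypothetical triplicate specificities (n = 3), for illustration only.
toy_replicates = [0.40, 0.42, 0.44]
mean_spec, sd_spec, cv_percent = summarize(toy_replicates)
```

The third value corresponds to footnote (c): the standard deviation divided by the estimated specificity, expressed as a percentage.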
Mapping statistics of next-generation sequencing experiments
(a) Percentage of the total number of reads aligned to the human reference genome; (b) Percentage of the uniquely aligned reads that map to the region of interest (ROI).
Fig. 1. Target coverage in cancer and normal samples. The cumulative coverage of targeted bases (i.e., the fraction of all sequenced bases in the target regions covered at more than a given read depth) was plotted after sequencing 4.34 Gb of H-1650 (black), 6.03 Gb of H-23 (blue), and 3.25 Gb of NA17022 (red). These sequencing yields resulted in 30× coverage of 92% (H-1650), 95% (H-23), and 97% (NA17022) of all target regions.
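A cumulative coverage curve of the kind described in this caption can be computed from per-base read depths. The sketch below is an illustration only, not the authors' pipeline: it assumes depths are already available as a list of integers (one per target base), treats "30× coverage" as a depth of at least 30, and uses made-up depth values.

```python
from typing import Dict, Iterable, Sequence


def fraction_covered(depths: Sequence[int], min_depth: int) -> float:
    """Fraction of target bases whose read depth is at least min_depth."""
    if not depths:
        raise ValueError("depths must not be empty")
    return sum(1 for d in depths if d >= min_depth) / len(depths)


def cumulative_coverage(depths: Sequence[int],
                        thresholds: Iterable[int]) -> Dict[int, float]:
    """Map each depth threshold to the fraction of bases covered at it."""
    return {t: fraction_covered(depths, t) for t in thresholds}


# Hypothetical per-base depths for ten target bases (not study data).
toy_depths = [0, 12, 30, 45, 1000, 29, 31, 60, 5, 200]
curve = cumulative_coverage(toy_depths, [1, 30, 100])
```

Plotting the threshold on the x-axis against the returned fraction on the y-axis gives one curve per sample; the value at threshold 30 is the figure quoted as "30× coverage".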
Somatic variation candidates from the target gene regions
SNV, single nucleotide variation; nsSNV, non-synonymous single nucleotide variation.
Comparison of SNV calls with DNA microarray genotyping results
Values are presented as number (%).
SNV, single nucleotide variation.
(a) Total number of genetic loci in the target region that the DNA microarray can genotype; (b) Total number of sequenced bases overlapping with DNA microarray genotyping data; (c) Homozygous genotypes concordant with the microarray genotyping results; (d) Homozygous genotypes different from the microarray genotyping results; (e) Heterozygous genotypes concordant with the microarray genotyping results; (f) Heterozygous genotypes different from the microarray genotyping results.
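The four concordance categories in the footnote above can be tallied as in the following sketch. It makes several assumptions that the record does not confirm: genotypes are unphased two-letter strings such as "AG", zygosity is taken from the sequencing call, and only loci present in both data sets are compared. The locus names and genotypes are invented for illustration.

```python
from collections import Counter
from typing import Dict


def classify(seq_gt: str, array_gt: str) -> str:
    """Label a locus by sequencing zygosity and agreement with the array."""
    zygosity = "hom" if seq_gt[0] == seq_gt[1] else "het"
    # Genotypes are unphased, so "AG" and "GA" count as the same call.
    agrees = sorted(seq_gt) == sorted(array_gt)
    return f"{zygosity}_{'concordant' if agrees else 'discordant'}"


def concordance_table(seq_calls: Dict[str, str],
                      array_calls: Dict[str, str]) -> Counter:
    """Count loci per category, skipping loci the array did not genotype."""
    counts: Counter = Counter()
    for locus, seq_gt in seq_calls.items():
        array_gt = array_calls.get(locus)
        if array_gt is None:
            continue
        counts[classify(seq_gt, array_gt)] += 1
    return counts


# Hypothetical calls (not study data); "locus5" has no array genotype.
seq_calls = {"locus1": "AA", "locus2": "AG", "locus3": "CC",
             "locus4": "GT", "locus5": "TT"}
array_calls = {"locus1": "AA", "locus2": "GA", "locus3": "CT",
               "locus4": "GG"}
table = concordance_table(seq_calls, array_calls)
```

Dividing the concordant counts by the number of overlapping loci gives a validation rate comparable in kind to the percentage reported in the abstract.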