Jamie K Teer1, Yonghong Zhang2, Lu Chen3, Eric A Welsh2, W Douglas Cress3, Steven A Eschrich2, Anders E Berglund2.
Abstract
BACKGROUND: Observations of recurrent somatic mutations in tumors have led to identification and definition of signaling and other pathways that are important for cancer progression and therapeutic targeting. As tumor cells contain both an individual's inherited genetic variants and somatic mutations, challenges arise in distinguishing these events in massively parallel sequencing datasets. Typically, both a tumor sample and a "normal" sample from the same individual are sequenced and compared; variants observed only in the tumor are considered to be somatic mutations. However, this approach requires two samples for each individual.
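The matched tumor-normal comparison described above amounts to a set difference. The following is a minimal sketch, assuming each variant is represented as a (chromosome, position, reference allele, alternate allele) tuple; the `Variant` type, the `somatic_candidates` function, and the example coordinates are hypothetical illustrations and are not taken from the authors' pipeline.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    """Hypothetical minimal variant record (an assumption for illustration)."""
    chrom: str
    pos: int
    ref: str
    alt: str


def somatic_candidates(tumor, normal):
    """Return variants observed in the tumor but not in the matched normal."""
    normal_set = set(normal)
    return [v for v in tumor if v not in normal_set]


# Illustrative data only; these are not real variant calls.
tumor_calls = [Variant("chr1", 100, "A", "T"), Variant("chr2", 200, "G", "C")]
normal_calls = [Variant("chr1", 100, "A", "T")]

# The variant shared with the normal sample is treated as inherited and dropped.
somatic = somatic_candidates(tumor_calls, normal_calls)
```

Real callers also weigh read depth, allele fraction, and sequencing error, so an exact-match set difference is only the conceptual core of the comparison.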
Keywords: Cancer genomics; Next-generation sequencing; Precision medicine; Somatic mutation
Year: 2017 PMID: 28870239 PMCID: PMC5584341 DOI: 10.1186/s40246-017-0118-2
Source DB: PubMed Journal: Hum Genomics ISSN: 1473-9542 Impact factor: 4.639
Fig. 1 Overview of cohorts. Cohort description and sample counts for the TGS (a) and WES (b) cohorts
Fig. 2 Tumor-only mutation counts with filtering. (a) Boxplot showing numbers of mutations detected in the TGS cohort using tumor-only methods after each filtering step (left) and using matched tumor-normal methods on 182 sample pairs (right). (b) Boxplot showing numbers of mutations detected in the WES cohort using tumor-only methods after each filtering step (left) and using matched tumor-normal methods (right). (c) Boxplot demonstrating that, in the TGS cohort, analyzing the normal samples independently of the tumor samples reduces the ability to remove potential artifacts. GATK variant detection on all tumor and normal samples together, followed by isolation of the normal subset to annotate the tumor samples, removes more potential artifacts. Median counts are indicated by the dark line in the middle of the box. The bottom and top of the box are the first and third quartiles, respectively. The whiskers represent the most extreme points within 1.5 times the interquartile range. The y-axes are in a log scale
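The boxplot elements described in this caption (median, quartiles, and whiskers at the most extreme points within 1.5 times the interquartile range) can be computed as in the sketch below. It assumes linear interpolation between data points for the quartiles, which is one of several common conventions and may differ from the one used to draw the figure; the function name and the example counts are hypothetical.

```python
from statistics import quantiles


def boxplot_stats(values):
    """Compute median, quartiles, and 1.5 * IQR whisker ends for a sample."""
    data = sorted(values)
    # method="inclusive" interpolates linearly between observed data points.
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr
    high_fence = q3 + 1.5 * iqr
    # Whiskers end at the most extreme observations inside the fences.
    inside = [x for x in data if low_fence <= x <= high_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_low": min(inside),
        "whisker_high": max(inside),
    }


# Illustrative per-sample mutation counts (not data from the study).
stats = boxplot_stats([10, 12, 15, 18, 20, 25, 400])
```

In this example the value 400 lies beyond the upper fence, so it would be drawn as an outlier rather than extending the whisker.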
Fig. 3 Recall and precision of tumor-only and matched tumor-normal mutation detection. Recall and precision of methods compared to MuTect in TGS (a) and WES (b). Distributions are represented with box plots, and individual data points are plotted as asterisks. Fraction of COSMIC mutations detected (c) by matched tumor-normal and tumor-only methods within 182 TGS samples (left) and in all TGS samples using tumor-only methods (right), and (d) in 250 WES samples. Shading indicates the number of times the mutation was observed in the COSMIC v61 database. (e) TGS alternate allele fraction and accuracy of KRAS G12/G13/Q61 mutations initially discovered by capillary sequencing. Not shown are the seven mutations detected by TGS but not by capillary sequencing
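Recall and precision against a reference call set, as reported in this figure with MuTect as the comparator, can be sketched as follows. This is a minimal illustration that treats the reference calls as the standard of comparison and assumes mutations are compared by exact identity; the function name and the mutation identifiers are hypothetical.

```python
def recall_precision(test_calls, reference_calls):
    """Return (recall, precision) of test_calls relative to reference_calls."""
    test, reference = set(test_calls), set(reference_calls)
    shared = len(test & reference)
    # Recall: fraction of reference mutations that the test method recovered.
    recall = shared / len(reference) if reference else float("nan")
    # Precision: fraction of test mutations that the reference also reported.
    precision = shared / len(test) if test else float("nan")
    return recall, precision


# Illustrative identifiers only; these are not calls from the study.
reference = {"m1", "m2", "m3", "m4"}
tumor_only = {"m1", "m2", "m3", "m5", "m6"}
recall, precision = recall_precision(tumor_only, reference)
```

Here three of the four reference mutations are recovered (recall 0.75), and three of the five tumor-only calls are confirmed by the reference (precision 0.6).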
Fig. 4 Somatic mutation rates across different tissue types using the tumor-only method. Boxplot of mutation rates for tissue sites of origin in (a) TGS (sites with more than 50 samples, a total of 3035 samples) and (b) WES. (c) The colored dots identify the samples with the indicated POLE exonuclease domain mutation. (d) Homopolymer run mutations (the presence of ACVR2A and TGFBR2 mutations side by side implies MSI status). The y-axes are in a log scale
Fig. 5 Schematic of tumor-only mutation calling pipeline. Analytical pipeline overview for tumor mutation calling with a subset of matched normal samples. See Additional file 5 for details of commands, options, and settings