Zeynep Kalender Atak, Kim De Keersmaecker, Valentina Gianfelici, Ellen Geerdens, Roel Vandepoel, Daphnie Pauwels, Michaël Porcu, Idoya Lahortiga, Vanessa Brys, Willy G Dirks, Hilmar Quentmeier, Jacqueline Cloos, Harry Cuppens, Anne Uyttebroeck, Peter Vandenberghe, Jan Cools, Stein Aerts.
Abstract
With the advent of whole-genome and whole-exome sequencing, high-quality catalogs of recurrently mutated cancer genes are becoming available for many cancer types. Increasing access to sequencing technology, including bench-top sequencers, provides the opportunity to re-sequence a limited set of cancer genes across a patient cohort with limited processing time. Here, we re-sequenced a set of cancer genes in T-cell acute lymphoblastic leukemia (T-ALL) using Nimblegen sequence capture coupled with Roche/454 technology. First, we investigated through a benchmark study how maximal sensitivity and specificity of mutation detection can be achieved. We tested nine combinations of different mapping and variant-calling methods, varied the variant-calling parameters, and compared the predicted mutations with a large independent validation set obtained by capillary re-sequencing. We found that the combination of two mapping algorithms, namely BWA-SW and SSAHA2, coupled with the variant-calling algorithm Atlas-SNP2 yields the highest sensitivity (95%) and the highest specificity (93%). Next, we applied this analysis pipeline to identify mutations in a set of 58 cancer genes, in a panel of 18 T-ALL cell lines and 15 T-ALL patient samples. We confirmed mutations in known T-ALL drivers, including PHF6, NF1, FBXW7, NOTCH1, KRAS, NRAS, PIK3CA, and PTEN. Interestingly, we also found mutations in several cancer genes that had not been linked to T-ALL before, including JAK3. Finally, we re-sequenced a small set of 39 candidate genes and identified recurrent mutations in TET1, SPRY3 and SPRY4. In conclusion, we established an optimized analysis pipeline for Roche/454 data that can be applied to accurately detect gene mutations in cancer, which led to the identification of several new candidate T-ALL driver mutations.
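The benchmark described in the abstract scores a pipeline's calls against sites validated by capillary re-sequencing. The sketch below shows one way such a comparison could be computed. The function name, the representation of sites as plain strings, and the toy data are assumptions made for illustration; they are not taken from the study.

```python
def sensitivity_specificity(predicted, validated_variant, validated_reference):
    """Score a set of predicted variant sites against a validation set.

    predicted           -- sites called as variant by a pipeline
    validated_variant   -- sites confirmed as variant by the validation method
    validated_reference -- sites confirmed as non-variant by the validation method
    All three arguments are sets of hashable site identifiers (hypothetical format).
    """
    tp = len(predicted & validated_variant)    # true positives
    fn = len(validated_variant - predicted)    # missed variants
    fp = len(predicted & validated_reference)  # false calls
    tn = len(validated_reference - predicted)  # correctly not called

    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Toy data, invented for illustration only.
validated_variant = {"v1", "v2", "v3", "v4"}
validated_reference = {"n1", "n2", "n3", "n4", "n5"}
predicted = {"v1", "v2", "v3", "n1"}

sens, spec = sensitivity_specificity(predicted, validated_variant, validated_reference)
print(sens, spec)
```

Sites that a pipeline calls but that were never validated are ignored here, since their true status is unknown.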
Year: 2012 PMID: 22675565 PMCID: PMC3366948 DOI: 10.1371/journal.pone.0038463
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Performance comparison and parameter optimization.
(A) Different pipelines show different sensitivity and specificity. Varying the DoC and VAF thresholds in the variant-calling process has an additional effect on the predictions in terms of sensitivity and specificity, respectively. Each pipeline is represented with a different symbol, and the performance of each pipeline (in terms of sensitivity and specificity) is plotted under varying DoC and VAF thresholds. Note that the X-axis represents the false positive rate (1-specificity). In this ROC plot, the closer a point lies to the upper-left corner of the graph, the better the sensitivity and the specificity. Different colors of the symbols indicate the performance of the pipeline under changing VAF thresholds, and the two shaded boxes indicate the performance under changing DoC thresholds. The plot shows that (i) decreasing the DoC threshold increases the sensitivity of all pipelines, as indicated with the blue dotted line; (ii) increasing the VAF threshold increases the specificity with a slight decrease in sensitivity, as indicated (in the example of the BLAT+VarScan pipeline) with the red dotted line; (iii) the BWA-SW+SSAHA2+Atlas-SNP2 pipeline has the best performance among all pipelines under the DoC = 3 & VAF = 0.20 thresholds, as indicated with the yellow arrow. The Roche pipeline is indicated with a black diamond shape since no parameter changes were performed on it, and the SSAHA2+SAMTools and BWA-SW+SAMTools pipelines are colored grey since no VAF threshold changes were performed on them. (B) The Matthews correlation coefficient for each pipeline is shown for the optimal performance of that pipeline. It is interesting to note that the optimal performance of all the pipelines, except Roche gsMapper, was observed for a DoC threshold of 3.
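The two tunable parameters in this caption, and the Matthews correlation coefficient used in panel B, can be sketched as follows. This is a minimal illustration under stated assumptions: DoC is taken to be the total read depth at a site, VAF the fraction of reads supporting the variant, and both thresholds are treated as inclusive lower bounds. The caption does not specify these details, and the `Call` class, function names, and toy values are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    site: str
    depth: int      # reads covering the site (assumed meaning of DoC)
    alt_reads: int  # reads supporting the variant allele

    @property
    def vaf(self) -> float:
        # Variant allele fraction; 0.0 when the site has no coverage.
        return self.alt_reads / self.depth if self.depth else 0.0


def filter_calls(calls, min_doc=3, min_vaf=0.20):
    """Keep sites meeting both thresholds (inclusive bounds are an assumption)."""
    return {c.site for c in calls if c.depth >= min_doc and c.vaf >= min_vaf}


def matthews_cc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Toy calls, invented for illustration only.
calls = [
    Call("s1", depth=10, alt_reads=5),  # passes both thresholds
    Call("s2", depth=2, alt_reads=2),   # fails DoC >= 3
    Call("s3", depth=20, alt_reads=2),  # fails VAF >= 0.20
    Call("s4", depth=3, alt_reads=1),   # passes both thresholds
]

print(sorted(filter_calls(calls)))             # default thresholds
print(sorted(filter_calls(calls, min_doc=2)))  # lower DoC admits more sites
print(matthews_cc(tp=3, fp=1, tn=4, fn=1))
```

Lowering `min_doc` can only add sites to the result and raising `min_vaf` can only remove them, which is consistent with the sensitivity and specificity trends the caption describes.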
Figure 2. Mutations in the 97 genes.
Coding mutations in known cancer genes (A) and candidate genes (B) are indicated with different color codes. Panel A is further subdivided into (I) genes that are known to be drivers in T-ALL, and (II) the genes that have recurrent somatic mutations in various human cancers. The cell lines are located to the left of the table, and the patient samples are located to the right. Genes are ranked according to the frequency of protein altering mutations in the patient samples.
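The gene ordering described in this caption can be sketched as a simple count-and-sort. The caption does not define "frequency" precisely; the sketch assumes it means the number of distinct patient samples carrying at least one protein-altering mutation in the gene. The tuple layout, gene labels, and sample labels are placeholders, not data from the study.

```python
from collections import defaultdict


def rank_genes(mutations, patient_samples):
    """Rank genes by the number of patient samples with a protein-altering mutation.

    mutations       -- iterable of (sample, gene, protein_altering) tuples
    patient_samples -- set of sample identifiers that are patient samples
    Ties are broken alphabetically by gene name (an arbitrary choice).
    """
    samples_by_gene = defaultdict(set)
    for sample, gene, protein_altering in mutations:
        if protein_altering and sample in patient_samples:
            samples_by_gene[gene].add(sample)
    return sorted(samples_by_gene, key=lambda g: (-len(samples_by_gene[g]), g))


# Placeholder records, invented for illustration only.
patients = {"P1", "P2", "P3"}
records = [
    ("P1", "GENE_A", True),
    ("P2", "GENE_A", True),
    ("P1", "GENE_A", True),   # second hit in the same sample counts once
    ("P1", "GENE_B", True),
    ("CL1", "GENE_B", True),  # cell line, not counted for the ranking
    ("P3", "GENE_C", False),  # not protein altering
]

print(rank_genes(records, patients))
```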
Figure 3. JAK kinase mutations.
(A) Sanger sequencing chromatograms corresponding to confirmed JAK2/JAK3 variants. (B) Domain structure of JAK2 and JAK3 proteins with indication of novel detected variants. Non-somatic variants are indicated with an asterisk. (C) Sanger sequences showing examples of TYK2 variants detected in T-ALL cell lines or in leukemia patient samples. (D) Schematic representation of TYK2 protein structure with indication of all novel TYK2 variants detected in this study. Non-somatic variants are indicated with an asterisk.
Figure 4. TET1 mutations in T-ALL.
(A) Sanger sequencing chromatograms representing confirmed TET1 variants. (B) Schematic representation of TET1 protein structure with indication of all novel TET1 variants detected in this study. Variants detected in cell lines are depicted above the TET1 protein; variants detected in leukemia patient samples are depicted below it. Non-somatic variants are indicated with an asterisk.
Figure 5. SPRY4 mutations.
(A) Sanger sequencing chromatograms showing confirmed SPRY4 variants. (B) Domain structure of the SPRY4 protein with indication of novel detected variants.
Analysis of TYK2 variants in cell lines over time and in different subclones.
| Cell line | Tested variant | Result |
| --- | --- | --- |
| CCRF-CEM Cools lab | R1027H | present |
| CCRF-CEM 2011 DSMZ (ACC 240) | R1027H | present |
| CCRF-CEM subclone 1 DSMZ | R1027H | present |
| CCRF-CEM subclone 2 DSMZ | R1027H | present |
| CCRF-CEM subclone 3 DSMZ | R1027H | present |
| CCRF-CEM subclone 4 DSMZ | R1027H | present |
| CCRF-CEM subclone 5 DSMZ | R1027H | present |
| CCRF-CEM Cools lab | A35V | present |
| CCRF-CEM 2011 DSMZ (ACC 240) | A35V | present |
| CCRF-CEM subclone 1 DSMZ | A35V | absent |
| CCRF-CEM subclone 2 DSMZ | A35V | absent |
| CCRF-CEM subclone 3 DSMZ | A35V | absent |
| CCRF-CEM subclone 4 DSMZ | A35V | absent |
| CCRF-CEM subclone 5 DSMZ | A35V | absent |
| KARPAS-45 Cools lab | Q830* | present |
| KARPAS-45 2011 DSMZ (ACC 105) | Q830* | present |
| KARPAS-45 1994 DSMZ (ACC 105) | Q830* | present |
| JURKAT Cools lab | C192Y | present |
| JURKAT 2011 DSMZ (ACC 282) | C192Y | absent |
| JURKAT 1992 DSMZ (ACC 282) | C192Y | absent |
Presence of the TYK2 R1027H and A35V variants was tested in the CCRF-CEM cell line from our group (“CCRF-CEM Cools lab”), in the CCRF-CEM cell line as it is currently sold by DSMZ (“CCRF-CEM 2011 DSMZ (ACC 240)”), and in 5 different CCRF-CEM subclones that DSMZ collected over the years. Similarly, KARPAS-45 from the Cools lab and the KARPAS-45 lines obtained from DSMZ in 2011 and in 1994 were screened for presence of the TYK2 Q830* variant. JURKAT cells from the Cools lab as well as JURKAT cells provided by DSMZ in 2011 and 1992 were tested for the TYK2 C192Y variant.
The KARPAS-45 cell line has 4 copies of chromosome 19, which contains TYK2. The height of the variant peak on the chromatogram suggests that only 1 copy of TYK2 carries the Q830* variant.
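The reasoning in this footnote rests on simple arithmetic: with 4 copies of the locus, a variant on a single copy should account for roughly a quarter of the signal. The sketch below makes that explicit. It assumes a clonal cell population and equal representation of every copy in the sequenced product, which is an idealization, and the function name is hypothetical.

```python
def expected_variant_fraction(mutant_copies: int, total_copies: int) -> float:
    """Idealized fraction of signal expected from the variant allele.

    Assumes a clonal population in which every copy of the locus
    contributes equally to the chromatogram signal.
    """
    if total_copies <= 0 or not 0 <= mutant_copies <= total_copies:
        raise ValueError("need 0 <= mutant_copies <= total_copies and total_copies > 0")
    return mutant_copies / total_copies


# Expected fractions for a locus present in 4 copies.
for mutant in range(1, 5):
    print(mutant, expected_variant_fraction(mutant, 4))
```

For comparison, a heterozygous variant at a two-copy locus would be expected at about 0.5 under the same assumptions, so a variant peak near 0.25 fits the single-copy interpretation better.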