Carolina Jaramillo Oquendo, Helen Parker, David Oscier, Sarah Ennis, Jane Gibson, Jonathan C Strefford.
Abstract
The aims of this systematic review are to refine the catalogue of somatic variants in splenic marginal zone lymphoma (SMZL) and to provide a well-annotated, manually curated database of high-confidence somatic mutations to facilitate variant interpretation for further biological studies and future clinical implementation. Two independent reviewers systematically searched PubMed and Ovid in January 2019 and included studies that sequenced SMZL cases with a confirmed diagnosis. The database included fourteen studies, comprising 2817 variants in over 1000 genes from 475 cases. We confirmed the high prevalence of NOTCH2, KLF2 and TP53 mutations, and analysis of targeted genes further implicated TNFAIP3, KMT2D and TRAF3 as recurrent targets of somatic mutation based on their high incidence across studies. The major limitations we encountered were the low number of patients with whole-genome, unbiased analysis and the differing sensitivities of the sequencing approaches used. Overall, we showed that there is little concordance between whole exome sequencing (WES) studies of SMZL. We strongly support the continuing unbiased analysis of the SMZL genome for mutations in all protein-coding genes and provide a valuable database resource to facilitate this endeavour, which will ultimately improve our understanding of SMZL pathobiology.
Year: 2019 PMID: 31320741 PMCID: PMC6639539 DOI: 10.1038/s41598-019-46906-1
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Methods summary. (a) Flowchart of manuscript selection. The figure follows the search strategy, starting with the combination of search terms used in the databases. Numbers denote the number of records or manuscripts at each step. Once all entries were compiled into a single list, duplicate manuscripts were removed and the remaining ones were screened to identify those to be taken forward to full-text review. The numbers of manuscripts excluded and retained are stated at each step. (b) Flowchart of database compilation and variant filtering. The flowchart begins at the data collation step, where all the lists of variants from the published manuscripts and Supplementary Information were collated into a single list. Subsequent filtering strategies and data manipulation tools are described. Numbers denote the number of variants at each step.
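The duplicate-removal step in panel (a) can be illustrated with a minimal sketch. This is not the reviewers' actual procedure, which the caption does not specify: the record fields (`title`, `doi`), the matching rule (DOI if present, otherwise a normalised title) and the placeholder records are all assumptions made for illustration.

```python
import re

def record_key(record):
    """Return a key identifying a manuscript: its DOI if present,
    otherwise its title reduced to lowercase letters and digits."""
    doi = (record.get("doi") or "").strip().lower()
    if doi:
        return ("doi", doi)
    title = re.sub(r"[^a-z0-9]+", " ", record["title"].lower()).strip()
    return ("title", title)

def deduplicate(records):
    """Keep the first occurrence of each manuscript, preserving order."""
    seen = set()
    unique = []
    for record in records:
        key = record_key(record)
        if key not in seen:
            seen.add(key)
            unique.append(record)
    return unique

# Placeholder search results (hypothetical, not real records).
pubmed_hits = [
    {"title": "Example study A", "doi": "10.0000/example.1"},
    {"title": "Example study B", "doi": None},
]
ovid_hits = [
    {"title": "Example Study A.", "doi": "10.0000/EXAMPLE.1"},  # same DOI, different case
    {"title": "example study b", "doi": ""},                    # same title, no DOI
    {"title": "Example study C", "doi": None},
]

unique_records = deduplicate(pubmed_hits + ovid_hits)
print(len(unique_records))
```

A record that carries a DOI in one database but not in the other would not be matched by this rule, so a manual check of the remaining list, as performed by the two reviewers, would still be needed.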
Figure 2. Database analysis. (a) Venn diagram of gene overlap in WES studies. The figure shows the common and unique genes reported in each study; no gene was shared by all five. Where genes were identified by more than three studies, white circles emphasise the number and name of the gene(s). The gene list was obtained from the WES subset of the final filtered and annotated list. (b) Mutation frequency (%) of the top 20 genes. The graph displays the frequency of mutations in each gene as well as an overview of the types of mutation (missense, nonsense, frameshift and splicing). Genes are listed in descending order of frequency. (c) Word cloud of the gene symbols present in the database, drawn using WordArt (https://wordart.com). The size of each gene symbol is proportional to the number of mutations in that gene (range: 1–123 mutations). NOTCH2 (n = 123) and KLF2 (n = 121) had the highest number of mutations, followed by TNFAIP3 (n = 75), TP53 (n = 60) and MYD88 (n = 43). (d) DISCOVER mutual exclusivity test results. Heat map displaying the corrected p-value for the gene pairs tested for mutual exclusivity with the DISCOVER algorithm. Dark green boxes indicate a low p-value; a plus sign (+) marks gene pairs with a p-value < 0.05 and an asterisk (*) marks those with a p-value < 0.01. (e) Waterfall plot of mutations found in KLF2, NOTCH2, TP53 and IGLL5. Each column represents a sample and each row a gene. Each cell is coloured according to the mutation type present in the sample, or grey if no mutation is present.
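As a rough illustration of the per-gene frequency shown in panel (b), the sketch below counts the cases carrying at least one mutation in each gene and divides by the number of cases. It assumes every case was sequenced for every gene, which does not hold for the mixed targeted and exome data described in the abstract, and the function name and toy input are hypothetical.

```python
from collections import defaultdict

def gene_frequencies(mutations, n_cases):
    """Return (gene, percent of cases mutated) pairs, highest first.

    `mutations` is an iterable of (case_id, gene) pairs; a case with several
    mutations in the same gene is counted once for that gene.
    """
    cases_by_gene = defaultdict(set)
    for case_id, gene in mutations:
        cases_by_gene[gene].add(case_id)
    freqs = {gene: 100.0 * len(cases) / n_cases
             for gene, cases in cases_by_gene.items()}
    # Sort by descending frequency, breaking ties alphabetically.
    return sorted(freqs.items(), key=lambda item: (-item[1], item[0]))

# Toy input (made up for illustration, not taken from the database).
toy_mutations = [
    ("case1", "NOTCH2"), ("case1", "NOTCH2"),  # two hits in one case count once
    ("case2", "NOTCH2"), ("case2", "KLF2"),
    ("case3", "TP53"),
    ("case4", "KLF2"),
]

frequencies = gene_frequencies(toy_mutations, n_cases=4)
for gene, percent in frequencies:
    print(f"{gene}\t{percent:.1f}%")
```

With heterogeneous studies, a per-gene denominator (the number of cases in which that gene was actually sequenced) would be the more careful choice; `n_cases` is a single value here only to keep the sketch short.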
Figure 3. Mutations in key recurrently mutated genes in the database, drawn using cBioPortal (http://www.cbioportal.org/mutation_mapper.jsp). The figure shows a linear representation of the protein encoded by each gene, with its respective domains. The height of each marker represents the number of variants reported (the y-axis scale is not the same for all panels) and the circle colour identifies the type of mutation. The transcript used for each protein is stated under the gene name, and the colours of the domains were randomly assigned.