Israel T Silva1, Rafael A Rosales2, Adriano J Holanda2, Michel C Nussenzweig2, Mila Jankovic2. 1. Laboratory of Molecular Immunology, The Rockefeller University, 1230 York Avenue, New York, NY 10065, USA, Departamento de Computação e Matemática, Universidade de São Paulo. Av. Bandeirantes, 3900, Ribeirão Preto, CEP 14049-901 and National Institute of Science and Technology in Stem Cell and Cell Therapy and Center for Cell Based Therapy. Rua Catão Roxo, 2501, Ribeirão Preto, CEP 14051-140, SP, Brazil. 2. Laboratory of Molecular Immunology, The Rockefeller University, 1230 York Avenue, New York, NY 10065, USA, Departamento de Computação e Matemática, Universidade de São Paulo. Av. Bandeirantes, 3900, Ribeirão Preto, CEP 14049-901 and National Institute of Science and Technology in Stem Cell and Cell Therapy and Center for Cell Based Therapy. Rua Catão Roxo, 2501, Ribeirão Preto, CEP 14051-140, SP, Brazil.
Abstract
MOTIVATION: The detection of genomic regions unusually rich in a given pattern is an important undertaking in the analysis of next-generation sequencing data. Recent studies of chromosomal translocations in activated B lymphocytes have identified regions that are frequently translocated to the c-myc oncogene. A quantitative method for the identification of translocation hotspots was crucial to this study. Here we improve this analysis by using a simple probabilistic model and the framework provided by scan statistics to define the number and location of translocation breakpoint hotspots. A key feature of our method is that it provides a global, chromosome-wide nominal control level for clustering, as opposed to previous methods based on local criteria. Although motivated by a specific application, the detection of unusual clusters is a widespread problem in bioinformatics. We expect our method to be useful in the analysis of data from other experimental approaches such as ChIP-seq and 4C-seq.
RESULTS: The analysis of translocations from B lymphocytes with the method described here reveals the presence of longer hotspots when compared with those defined previously. Further, we show that the hotspot size changes substantially in the absence of the DNA repair protein 53BP1. When 53BP1 deficiency is combined with overexpression of activation-induced cytidine deaminase, the hotspot length increases even further. These changes are not detected by previous methods that use local significance criteria for clustering. Our method is also able to identify several exclusive translocation hotspots located in genes of known tumor suppressors.
AVAILABILITY AND IMPLEMENTATION: The detection of translocation hotspots is done with hot_scan, a program implemented in R and Perl. Source code and documentation are freely available for download at https://github.com/itojal/hot_scan.
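To make the idea of a chromosome-wide significance level concrete, the following is a minimal generic sketch of a scan statistic in Python. It is not the hot_scan algorithm and does not reproduce its probabilistic model: it assumes breakpoints fall uniformly at random on a chromosome under the null hypothesis, takes the largest count in any fixed-length window as the statistic, and estimates a global p-value by Monte Carlo simulation. The function names, the window size, and the uniform null are illustrative assumptions.

```python
import bisect
import random


def max_window_count(positions, window):
    """Largest number of points in any half-open interval [x, x + window).

    The maximum is always attained by a window that starts at a point,
    so it is enough to try each sorted position as a left edge.
    """
    pts = sorted(positions)
    best = 0
    for i, start in enumerate(pts):
        # Index of the first point at or beyond the right edge.
        j = bisect.bisect_left(pts, start + window)
        best = max(best, j - i)
    return best


def scan_p_value(positions, length, window, n_sim=200, seed=0):
    """Monte Carlo p-value for the maximum window count.

    Assumed null (hypothetical, for illustration only): the same number
    of breakpoints placed uniformly at random on [0, length).
    The p-value refers to the whole chromosome, not to one window.
    """
    rng = random.Random(seed)
    observed = max_window_count(positions, window)
    n = len(positions)
    hits = 0
    for _ in range(n_sim):
        sim = [rng.uniform(0, length) for _ in range(n)]
        if max_window_count(sim, window) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (hits + 1) / (n_sim + 1)


if __name__ == "__main__":
    # Synthetic data: 20 clustered breakpoints plus 30 background points.
    bg = random.Random(1)
    cluster = [1000 + 5 * i for i in range(20)]
    background = [bg.uniform(0, 1_000_000) for _ in range(30)]
    obs, p = scan_p_value(cluster + background, 1_000_000, 200)
    print(f"max count in a 200 bp window: {obs}, global p-value: {p:.4f}")
```

Because the simulated statistic is the maximum over all windows on the chromosome, the resulting p-value controls the error rate globally, which is the contrast with local, per-window criteria drawn in the abstract.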