Vito M R Muggeo (1), Giada Adelfio. (1) Dipartimento di Scienze Statistiche e Matematiche Vianelli, Università di Palermo, Palermo, Italy. vito.muggeo@unipa.it
Abstract
MOTIVATION: Knowing the exact locations of multiple change points in genomic sequences serves several biological needs, for instance when the data represent aCGH profiles and it is of interest to identify possibly damaged genes involved in cancer and other diseases. Only a few of the currently available methods deal explicitly with estimating the number and location of change points, and these methods may be somewhat vulnerable to deviations from the model assumptions usually employed. RESULTS: We present a computationally efficient method to estimate the number and location of the change points. The method is based on a simple transformation of the data and provides results that are quite robust to model misspecification. The efficiency of the method guarantees moderate computational times regardless of the series length and the number of change points. AVAILABILITY: The methods described in this article are implemented in the new R package cumSeg, available from the Comprehensive R Archive Network at http://CRAN.R-project.org/package=cumSeg.
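The abstract does not say which transformation is used. The package name cumSeg suggests cumulative sums, and the sketch below rests on that assumption. It is not the authors' estimator: it only illustrates, on a noise-free toy series, why such a transformation could help, since a jump in the level of the original series becomes a change in slope of its cumulative sum. The function names `cumulative_sum` and `slope_change_points` are hypothetical and are not part of cumSeg.

```python
from itertools import accumulate


def cumulative_sum(y):
    """Return the running total of y (the assumed 'simple transformation')."""
    return list(accumulate(y))


def slope_change_points(z, tol=1e-9):
    """Return the 0-based indices at which the slope of z changes.

    The slope at index i is z[i] - z[i-1], with z[-1] taken as 0.
    This exact-difference rule is only meaningful for noise-free data;
    real data would need a statistical estimator instead.
    """
    slopes = [z[0]] + [z[i] - z[i - 1] for i in range(1, len(z))]
    return [i for i in range(1, len(slopes)) if abs(slopes[i] - slopes[i - 1]) > tol]


# Toy piecewise-constant series: level 0 for five points, then level 2.
y = [0.0] * 5 + [2.0] * 5
z = cumulative_sum(y)              # flat at 0, then rising by 2 per step
breaks = slope_change_points(z)    # index of the first point in the new segment
print(z)
print(breaks)
```

With noisy measurements the slope change would have to be estimated rather than read off exact differences; the abstract leaves those details to the article and to the cumSeg package.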