Alberto Magi, Lorenzo Tattini, Flavia Palombo, Matteo Benelli, Alessandro Gialluisi, Betti Giusti, Rosanna Abbate, Marco Seri, Gian Franco Gensini, Giovanni Romeo, Tommaso Pippucci. Affiliations: Department of Experimental and Clinical Medicine, University of Florence, Florence 50019; Medical Genetics Unit, Polyclinic Sant'Orsola-Malpighi, Department of Medical and Surgical Sciences, University of Bologna, Bologna 40138; Diagnostic Genetic Unit, Careggi Hospital, Florence 50019, Italy; and Language and Genetics Department, Max Planck Institute for Psycholinguistics, Nijmegen 6525 EN, The Netherlands.
Abstract
MOTIVATION: Runs of homozygosity (ROHs) are sizable chromosomal stretches of homozygous genotypes, ranging in length from tens of kilobases to megabases. ROHs are relevant to population and medical genetics because they play a role in predisposition to both rare and common disorders. ROHs are commonly detected with single nucleotide polymorphism (SNP) microarrays, but attempts have been made to use whole-exome sequencing (WES) data. Currently available methods were developed for uniformly spaced SNP-array maps and are poorly suited to the sparse, non-uniform distribution of WES targets.
RESULTS: To meet the need for an approach specifically tailored to WES data, we developed H3M2, an original algorithm based on a heterogeneous hidden Markov model that incorporates inter-marker distances to detect ROHs from WES data. We evaluated how well H3M2 identifies ROHs on synthetic chromosomes and examined its accuracy in detecting ROHs of different lengths (short, medium and long) in real 1000 Genomes Project data. H3M2 was more accurate than GERMLINE and PLINK, two state-of-the-art algorithms, especially in the detection of short and medium ROHs.
AVAILABILITY AND IMPLEMENTATION: H3M2 is a collection of bash, R and Fortran scripts and code and is freely available at https://sourceforge.net/projects/h3m2/.
CONTACT: albertomagi@gmail.com
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
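To make the idea of a heterogeneous hidden Markov model concrete, the following Python sketch decodes a two-state chain (ROH, non-ROH) whose switch probability depends on the distance between consecutive markers. It is an illustration only and is not the published implementation, which is distributed as bash, R and Fortran code. The function name `viterbi_roh`, the emission probabilities, the exponential form of the distance dependence, and the parameters `scale_bp` and `p_switch_max` are all assumptions chosen for readability.

```python
import math


def viterbi_roh(positions, is_het, p_het_roh=0.01, p_het_normal=0.3,
                scale_bp=100_000.0, p_switch_max=0.5):
    """Label each marker as inside (1) or outside (0) a run of homozygosity.

    positions : sorted marker coordinates in base pairs
    is_het    : booleans, True where the genotype call is heterozygous
    All probabilities and the distance scale are hypothetical defaults.
    """
    n = len(positions)
    if n == 0:
        return []

    def log_emit(state, het):
        # Heterozygous calls are rare inside an ROH (state 1).
        p_het = p_het_roh if state == 1 else p_het_normal
        return math.log(p_het if het else 1.0 - p_het)

    # Uniform prior over the two states at the first marker.
    score = [math.log(0.5) + log_emit(s, is_het[0]) for s in (0, 1)]
    back = []

    for i in range(1, n):
        d = positions[i] - positions[i - 1]
        # Heterogeneous transition: the switch probability grows with the
        # inter-marker distance and saturates at p_switch_max (assumed form).
        p_switch = p_switch_max * (1.0 - math.exp(-d / scale_bp))
        p_switch = max(p_switch, 1e-12)  # guard against log(0) when d == 0
        log_stay = math.log(1.0 - p_switch)
        log_switch = math.log(p_switch)

        new_score, ptr = [], []
        for s in (0, 1):
            stay = score[s] + log_stay
            switch = score[1 - s] + log_switch
            if stay >= switch:
                best, prev = stay, s
            else:
                best, prev = switch, 1 - s
            new_score.append(best + log_emit(s, is_het[i]))
            ptr.append(prev)
        score = new_score
        back.append(ptr)

    # Backtrace from the best final state.
    state = 0 if score[0] >= score[1] else 1
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


# Toy input: 10 heterozygous markers, 60 homozygous markers, 10 heterozygous
# markers, all spaced 1 kb apart.
positions = [1000 * i for i in range(80)]
is_het = [True] * 10 + [False] * 60 + [True] * 10
path = viterbi_roh(positions, is_het)
```

Because the switch probability is recomputed from each gap, closely spaced markers inside one exon are strongly tied to the same state, whereas markers separated by a long untargeted interval are treated as nearly independent. A homogeneous model with a single fixed transition matrix cannot express this, which is the limitation the abstract attributes to methods designed for uniformly spaced SNP arrays.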