Accuracy of short tandem repeats genotyping tools in whole exome sequencing data.

Andreas Halman, Alicia Oshlack.

Abstract

Background: Short tandem repeats are an important source of genetic variation. They are highly mutable, and repeat expansions are associated with dozens of human disorders, such as Huntington's disease and spinocerebellar ataxias. Technical advances in sequencing technology have made it possible to analyse these repeats at large scale; however, accurate genotyping is still a challenging task. We compared four different short tandem repeats genotyping tools on whole exome sequencing data to determine their genotyping performance and limits, which will aid other researchers in choosing a suitable tool and parameters for analysis.
Methods: The analysis was performed on the Simons Simplex Collection dataset, where we used a novel method of evaluation with accuracy determined by the rate of homozygous calls on the X chromosome of male samples. In total we analysed 433 samples and around a million genotypes for evaluating tools on whole exome sequencing data.
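The evaluation idea in the Methods can be illustrated with a short sketch. Outside the pseudoautosomal regions, males carry a single X chromosome, so any heterozygous genotype call there can be counted as an error. The record layout (`Call`), the function name, and the threshold parameters below are hypothetical and are not taken from the paper or from any of the four tools; the sketch also assumes that pseudoautosomal loci have already been excluded.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class Call:
    """One STR genotype call on chrX of a male sample (hypothetical layout)."""
    allele_a: int    # length of the first called allele, in bp
    allele_b: int    # length of the second called allele, in bp
    coverage: int    # number of reads supporting the call
    quality: float   # tool-reported quality score, assumed to lie in [0, 1]


def heterozygous_error_rate(
    calls: Iterable[Call],
    min_coverage: int = 0,
    min_quality: float = 0.0,
) -> Optional[float]:
    """Fraction of retained calls that are heterozygous.

    Calls below either threshold are discarded first. Returns None when
    no call survives filtering, so the caller can tell "no data" apart
    from "no errors".
    """
    kept = [
        c for c in calls
        if c.coverage >= min_coverage and c.quality >= min_quality
    ]
    if not kept:
        return None
    # On a hemizygous chromosome, two different alleles imply a miscall.
    errors = sum(1 for c in kept if c.allele_a != c.allele_b)
    return errors / len(kept)


# Toy data for illustration only; these are not values from the study.
example_calls = [
    Call(allele_a=12, allele_b=12, coverage=30, quality=0.90),
    Call(allele_a=12, allele_b=15, coverage=5, quality=0.40),
    Call(allele_a=20, allele_b=20, coverage=10, quality=0.95),
    Call(allele_a=8, allele_b=9, coverage=25, quality=0.99),
]

unfiltered = heterozygous_error_rate(example_calls)
filtered = heterozygous_error_rate(example_calls, min_coverage=10, min_quality=0.9)
```

Raising the thresholds discards more calls, so the number of retained calls would normally be reported alongside the error rate when comparing settings.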
Results: We determined a relatively good performance of all tools when genotyping repeats of 3-6 bp in length, which could be improved with coverage and quality score filtering. However, genotyping homopolymers was challenging for all tools and a high error rate was present across different thresholds of coverage and quality scores. Interestingly, dinucleotide repeats displayed a high error rate as well, which was found to be mainly caused by the AC/TG repeats. Overall, LobSTR was able to make the most calls and was also the fastest tool, while RepeatSeq and HipSTR exhibited the lowest heterozygous error rate at low coverage.
Conclusions: All tools have different strengths and weaknesses and the choice may depend on the application. In this analysis we demonstrated the effect of using different filtering parameters and offered recommendations based on the trade-off between the best accuracy of genotyping and the highest number of calls.
Copyright: © 2020 Halman A and Oshlack A.

Keywords:  gangstr; hipstr; lobstr; microsatellites; repeatseq; short tandem repeats

Year:  2020        PMID: 32665844      PMCID: PMC7327730          DOI: 10.12688/f1000research.22639.1

Source DB:  PubMed          Journal:  F1000Res        ISSN: 2046-1402


