MOTIVATION: Microsatellites, also known as short tandem repeats (STRs), are tracts of repetitive DNA sequence containing motifs ranging from two to six bases. Microsatellites are one of the most abundant types of variation in the human genome, after single nucleotide polymorphisms (SNPs) and indels. Microsatellite analysis has a wide range of applications, including medical genetics, forensics and construction of genetic genealogy. However, microsatellite variations are rarely considered in whole-genome sequencing (WGS) studies, in large part due to a lack of tools capable of analyzing them. RESULTS: Here we present a microsatellite genotyper, optimized for Illumina WGS data, which is both faster and more accurate than previously presented methods. Two main ingredients account for these improvements. First, we reduce the amount of sequencing data necessary for creating microsatellite profiles by using previously aligned sequencing data. Second, we use population information to train microsatellite-specific and individual-specific error profiles. By comparing our genotyping results to genotypes generated by capillary electrophoresis, we show that our error rates are 50% lower than those of lobSTR, another program specifically developed to determine microsatellite genotypes. AVAILABILITY AND IMPLEMENTATION: Source code is available on GitHub: https://github.com/DecodeGenetics/popSTR. CONTACT: snaedis.kristmundsdottir@decode.is or bjarni.halldorsson@decode.is.