MOTIVATION: Several recent studies have demonstrated the effectiveness of resequencing and single nucleotide variant (SNV) detection by deep short-read sequencing platforms. While several reliable algorithms are available for automated SNV detection, the automated detection of microindels in deep short-read data presents a new bioinformatics challenge.

RESULTS: We systematically analyzed how the short-read mapping tools MAQ, Bowtie, Burrows-Wheeler alignment tool (BWA), Novoalign and RazerS perform on simulated datasets that contain indels, and evaluated how indels affect error rates in SNV detection. We implemented a simple algorithm to compute the equivalent indel region (eir), which can be used to process the alignments produced by the mapping tools in order to perform indel calling. Using simulated data containing indels, we demonstrate that indel detection works well on short-read data: the detection rate for microindels (<4 bp) is >90%. Our study provides insights into the systematic errors that arise when SNV detection is based on ungapped short sequence read alignments. Gapped alignments of short sequence reads can be used to reduce these errors and to detect microindels in simulated short-read data. A comparison with microindels automatically identified on the ABI Sanger and Roche 454 platforms indicates that microindel detection from short sequence reads identifies both overlapping and distinct indels.

CONTACT: peter.krawitz@googlemail.com; peter.robinson@charite.de

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
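The abstract does not spell out how the equivalent indel region is computed. The following is a minimal sketch, assuming the eir is the range of positions at which the same insertion or deletion can be placed while yielding an identical resulting sequence (as happens inside repeats and homopolymer runs). The function names `deletion_eir` and `insertion_eir`, the 0-based coordinates and the example sequences are illustrative assumptions, not the authors' implementation.

```python
def deletion_eir(ref: str, start: int, length: int) -> tuple[int, int]:
    """Return (leftmost, rightmost) 0-based start positions at which deleting
    `length` bases from `ref` gives the same sequence as deleting at `start`.
    Hypothetical helper; coordinates and naming are assumptions."""
    if length <= 0 or start < 0 or start + length > len(ref):
        raise ValueError("deletion does not fit inside the reference")
    left = start
    # Shift left while the base entering the window equals the base leaving it.
    while left > 0 and ref[left - 1] == ref[left + length - 1]:
        left -= 1
    right = start
    # Shift right under the same condition, mirrored.
    while right + length < len(ref) and ref[right] == ref[right + length]:
        right += 1
    return left, right


def insertion_eir(ref: str, pos: int, seq: str) -> tuple[int, int]:
    """Return (leftmost, rightmost) 0-based positions before which inserting a
    rotation of `seq` gives the same sequence as inserting `seq` before `pos`.
    Hypothetical helper; coordinates and naming are assumptions."""
    if not seq or pos < 0 or pos > len(ref):
        raise ValueError("insertion position is outside the reference")
    left, s = pos, seq
    # Moving left by one base rotates the inserted sequence to the right.
    while left > 0 and ref[left - 1] == s[-1]:
        s = s[-1] + s[:-1]
        left -= 1
    right, s = pos, seq
    # Moving right by one base rotates the inserted sequence to the left.
    while right < len(ref) and ref[right] == s[0]:
        s = s[1:] + s[0]
        right += 1
    return left, right


if __name__ == "__main__":
    # Deleting "CA" inside the CACACA repeat is ambiguous over several positions.
    print(deletion_eir("GATCACACAGT", 5, 2))
    # Inserting "T" into the TTT run is ambiguous over the whole run.
    print(insertion_eir("GATTTC", 3, "T"))
```

Under this assumption, two indel calls reported at different coordinates by different reads or mapping tools can be treated as the same event when their equivalent indel regions coincide, which is the kind of normalization indel calling from gapped alignments requires.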