Pasi Rastas (1,2)
1. Department of Zoology, Butterfly Genetics Group, University of Cambridge, Cambridge, UK.
2. Department of Biosciences, Ecological Genetics Research Unit, University of Helsinki, Helsinki, Finland.
Abstract
MOTIVATION: Accurate and dense linkage maps are useful in family-based linkage and association studies, quantitative trait locus mapping, analysis of genome synteny and other genomic data analyses. Moreover, linkage mapping is one of the best ways to detect errors in de novo genome assemblies, as well as to orient and place assembly contigs within chromosomes. A small mapping cross of tens of individuals will detect many errors in which distant parts of the genome are erroneously joined together. With more individuals and markers, more local errors can also be detected and more contigs can be oriented. However, the tools currently available for constructing linkage maps are not well suited to large, possibly low-coverage, whole-genome sequencing datasets.
RESULTS: Here we present Lep-MAP3, linkage mapping software capable of mapping high-throughput whole-genome sequencing datasets. Such data allow cost-efficient genotyping of millions of single nucleotide polymorphisms (SNPs) for thousands of individual samples, enabling, among other analyses, comprehensive validation and refinement of de novo genome assemblies. The algorithms of Lep-MAP3 can analyse low-coverage datasets and reduce the need for data filtering and curation on any data. This yields more markers in the final maps with less manual work, even on problematic datasets. We demonstrate that Lep-MAP3 performs very well already at 5x sequencing coverage and, on simulated data, outperforms the fastest available software in accuracy and often in speed. We also construct de novo linkage maps from 7-12x whole-genome data for the red postman butterfly (Heliconius erato) with almost 3 million markers.
AVAILABILITY AND IMPLEMENTATION: Lep-MAP3 is available with the source code under the GNU General Public License from http://sourceforge.net/projects/lep-map3.
CONTACT: pasi.rastas@helsinki.fi.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.