PolyMarker: A fast polyploid primer design pipeline.

Ricardo H Ramirez-Gonzalez¹, Cristobal Uauy², Mario Caccamo¹.

Abstract

The design of genetic markers is of particular relevance in crop breeding programs. Although many economically important crops are polyploid, current primer design tools are tailored for diploid species. Bread wheat, for instance, is a hexaploid comprising three related genomes, and the performance of genetic markers is diminished if the primers are not genome-specific. PolyMarker is a pipeline that generates SNP markers by selecting candidate primers for a specified genome using local alignments, and uses standard primer design tools to test the viability of the primers. A command line tool and a web interface are available to the community.
AVAILABILITY AND IMPLEMENTATION: PolyMarker is available as a Ruby BioGem: bio-polyploid-tools. Web interface: http://polymarker.tgac.ac.uk.

Year: 2015 PMID: 25649618 PMCID: PMC4765872 DOI: 10.1093/bioinformatics/btv069

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

1 Introduction

Breeding programs rely on dense genetic maps with markers (e.g. SNPs) that can be used to identify the presence or absence of specific alleles in homozygous or heterozygous state. Standard primer design tools are built for diploids, where genome duplications are rare. Wheat is a polyploid composed of three genomes (A, B and D; referred to as homoeologues) that are related (between 96 and 98% sequence identity), yet distinct. This creates a problem for the design of PCR primers specific to an individual homoeologue. A common approach to circumvent this issue is to manually design primers with a genome-specific variant at the 3′ end of the primer to increase specificity. We introduce PolyMarker, a tool to automate the design of genome-specific primers, thereby reducing the time invested in this process. To make PolyMarker accessible to scientists and breeders, we developed a web server where custom SNPs can be submitted for the design of genome-specific assays.

2 Description of the tool

First, PolyMarker converts the input marker information (chromosome arm, sequence adjacent to the SNP and parental alleles) into a FASTA file with the sequences to be queried against the genomic reference. The search is performed with Exonerate (Slater and Birney, 2005), using the option --ryo (roll your own format) to facilitate parsing. The region flanking the SNP, twice the size of the maximum amplification product (200 bp for amplification products up to 100 bp), is extracted from the best hit on each chromosome. The local alignment between homoeologues and paralogues is refined with MAFFT (Katoh and Standley, 2013), executed using the binder provided in BioRuby. The local alignment is used to produce a mask containing all the variations across homoeologues and the input sequence (Supplementary Fig. S1). The mask indicates the type of variation at each position: (i) Specific: a homoeologous polymorphism that is only present in the target genome; (ii) Semi-specific: a homoeologous polymorphism that is found in more than one genome but discriminates against one of the off-target genomes, or a polymorphism at a position where not all the homoeologous sequences were found; (iii) Non-specific: no variation is found across homoeologues; (iv) Homoeologous: the target SNP is present across different chromosomes; and (v) Non-homoeologous: the target SNP is not present across chromosomes. By default, PolyMarker designs a three-primer assay for KASP genotyping (LGC Genomics, 2013) that comprises a common primer and two allele-specific primers. Since the allele-specific primers are restricted in position, the common primer is used to incorporate the chromosome-specific variants when possible. To test whether the primer candidates are viable, Primer3 (Rozen and Skaletsky, 2000) is invoked using the genomic reference of the target chromosome. The starting positions of the primers that distinguish between alleles are selected with the SEQUENCE_FORCE_LEFT_END option of Primer3.
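The first three mask categories described above can be illustrated with a minimal Python sketch. It is not PolyMarker's code (the tool itself is a Ruby BioGem). The function names, the one-character mask symbols (S, s, .) and the use of '-' for a missing homoeologue are assumptions made for this example, and categories (iv) and (v), which concern the target SNP itself, are left out.

```python
def classify_position(target_base, other_bases):
    """Classify one alignment column relative to the target genome.

    target_base: base in the target homoeologue (uppercase).
    other_bases: bases in the off-target homoeologues; '-' marks a
                 homoeologue with no aligned base at this column.
    """
    present = [b for b in other_bases if b != "-"]
    differing = [b for b in present if b != target_base]
    if not differing:
        # No variation observed across the homoeologues that were found.
        return "non-specific"
    if len(differing) == len(other_bases):
        # Every off-target homoeologue is present and differs from the target.
        return "specific"
    # The target differs from only some off-target genomes, or some
    # homoeologous sequence was not found at this column.
    return "semi-specific"


# Hypothetical one-character symbols; the real mask encoding may differ.
MASK_SYMBOLS = {"specific": "S", "semi-specific": "s", "non-specific": "."}


def build_mask(target, others):
    """Build a mask string for equal-length, pre-aligned sequences."""
    return "".join(
        MASK_SYMBOLS[classify_position(base, [seq[i] for seq in others])]
        for i, base in enumerate(target)
    )


if __name__ == "__main__":
    # Toy alignment: target genome first, then two off-target homoeologues.
    print(build_mask("ACGTAC", ["ACGAAC", "ACGAAT"]))  # ...S.s
```

In the toy alignment, position 4 differs from both off-target sequences and is marked specific, while position 6 differs from only one of them and is marked semi-specific.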
To design the common primer, the option SEQUENCE_FORCE_RIGHT_END is used in two runs of Primer3, one for the chromosome-specific and one for the semi-specific primers. A final run of Primer3 is executed without the SEQUENCE_FORCE_RIGHT_END option to find viable primers. After the primers are tested with Primer3, PolyMarker selects the primer pair with the highest specificity. PolyMarker is a pipeline (Fig. 1) written as a Biogem (Bonnal et al., 2012), extending BioRuby (Goto et al., 2010) to extract regions from FASTA files and to support operations on nucleotide sequences with IUPAC ambiguity codes (Cornish-Bowden, 1985).
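To illustrate how the forced-end options might be passed to Primer3, the sketch below assembles an input record in Primer3's Boulder-IO format (one TAG=VALUE per line, terminated by a line containing only "="). It is written in Python and is not taken from PolyMarker. The helper name primer3_record, the 50-100 product size range, the example coordinates and the record IDs are assumptions for this example. The exact coordinate semantics of SEQUENCE_FORCE_LEFT_END and SEQUENCE_FORCE_RIGHT_END should be checked against the Primer3 manual.

```python
def primer3_record(seq_id, template, force_left_end=None,
                   force_right_end=None, product_size_range="50-100"):
    """Return a Boulder-IO record as a string (hypothetical helper).

    force_left_end / force_right_end: template positions passed to
    SEQUENCE_FORCE_LEFT_END / SEQUENCE_FORCE_RIGHT_END; None omits the tag.
    product_size_range: assumed range; the text only states that products
    of up to 100 bp are considered.
    """
    tags = [
        ("SEQUENCE_ID", seq_id),
        ("SEQUENCE_TEMPLATE", template),
        ("PRIMER_PRODUCT_SIZE_RANGE", product_size_range),
    ]
    if force_left_end is not None:
        tags.append(("SEQUENCE_FORCE_LEFT_END", force_left_end))
    if force_right_end is not None:
        tags.append(("SEQUENCE_FORCE_RIGHT_END", force_right_end))
    lines = [f"{tag}={value}" for tag, value in tags]
    lines.append("=")  # Boulder-IO record terminator
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    template = "ACGT" * 50  # placeholder 200 bp flanking region
    snp_pos = 100           # assumed position of the SNP in the template

    # Run with the common primer pinned to a (hypothetical) specific variant.
    print(primer3_record("snp1_specific", template,
                         force_left_end=snp_pos, force_right_end=160), end="")
    # Final run without SEQUENCE_FORCE_RIGHT_END, leaving the common
    # primer position free.
    print(primer3_record("snp1_free", template,
                         force_left_end=snp_pos), end="")
```

The printed records could be written to a file and supplied to primer3_core as input; the semi-specific run would use the same record shape with a different forced right end.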

Fig. 1.

Implementation of PolyMarker. External programs are in squares, trapezoids represent the intermediate files and the document symbols represent inputs and outputs

2.1 Web interface

Our objective was to make PolyMarker accessible via a web interface where breeders and researchers could submit their markers to custom-design genome-specific SNP assays. A typical output is described in the supplemental material. Batch submission of several markers is possible. The web interface is implemented in Java, with a MySQL database storing the marker information. The source code of the web interface and the daemon is available for the community to set up a private server.

2.2 Example results

The pipeline was originally developed to design KASP assays to validate putative SNPs from RNA-Seq data [28 out of 35 assays polymorphic (80%; Ramirez-Gonzalez et al., 2014)]. PolyMarker was also used to generate KASP assays for the 81 587 markers in the iSelect array from Wang et al. (2014) (Supplemental Material).

3 Summary

PolyMarker is a pipeline that facilitates the design of primers in polyploid organisms. A web interface is available to design primers for hexaploid wheat. The use of PolyMarker reduces the time spent designing genome-specific assays and highlighting homoeologous SNPs.

8 in total

1. Primer3 on the WWW for general users and for biologist programmers.

Authors: S Rozen; H Skaletsky
Journal: Methods Mol Biol Date: 2000

2. Nomenclature for incompletely specified bases in nucleic acid sequences: recommendations 1984.

Authors: A Cornish-Bowden
Journal: Nucleic Acids Res Date: 1985-05-10 Impact factor: 16.971

3. MAFFT multiple sequence alignment software version 7: improvements in performance and usability.

Authors: Kazutaka Katoh; Daron M Standley
Journal: Mol Biol Evol Date: 2013-01-16 Impact factor: 16.240

4. BioRuby: bioinformatics software for the Ruby programming language.

Authors: Naohisa Goto; Pjotr Prins; Mitsuteru Nakao; Raoul Bonnal; Jan Aerts; Toshiaki Katayama
Journal: Bioinformatics Date: 2010-08-25 Impact factor: 6.937

5. RNA-Seq bulked segregant analysis enables the identification of high-resolution genetic markers for breeding in hexaploid wheat.

Authors: Ricardo H Ramirez-Gonzalez; Vanesa Segovia; Nicholas Bird; Paul Fenwick; Sarah Holdgate; Simon Berry; Peter Jack; Mario Caccamo; Cristobal Uauy
Journal: Plant Biotechnol J Date: 2014-11-08 Impact factor: 9.803

6. Biogem: an effective tool-based approach for scaling up open source software development in bioinformatics.

Authors: Raoul J P Bonnal; Jan Aerts; George Githinji; Naohisa Goto; Dan MacLean; Chase A Miller; Hiroyuki Mishima; Massimiliano Pagani; Ricardo Ramirez-Gonzalez; Geert Smant; Francesco Strozzi; Rob Syme; Rutger Vos; Trevor J Wennblom; Ben J Woodcroft; Toshiaki Katayama; Pjotr Prins
Journal: Bioinformatics Date: 2012-02-12 Impact factor: 6.937

7. Automated generation of heuristics for biological sequence comparison.

Authors: Guy St C Slater; Ewan Birney
Journal: BMC Bioinformatics Date: 2005-02-15 Impact factor: 3.169

8. Characterization of polyploid wheat genomic diversity using a high-density 90,000 single nucleotide polymorphism array.

Authors: Shichen Wang; Debbie Wong; Kerrie Forrest; Alexandra Allen; Shiaoman Chao; Bevan E Huang; Marco Maccaferri; Silvio Salvi; Sara G Milner; Luigi Cattivelli; Anna M Mastrangelo; Alex Whan; Stuart Stephen; Gary Barker; Ralf Wieseke; Joerg Plieske; Morten Lillemo; Diane Mather; Rudi Appels; Rudy Dolferus; Gina Brown-Guedira; Abraham Korol; Alina R Akhunova; Catherine Feuillet; Jerome Salse; Michele Morgante; Curtis Pozniak; Ming-Cheng Luo; Jan Dvorak; Matthew Morell; Jorge Dubcovsky; Martin Ganal; Roberto Tuberosa; Cindy Lawley; Ivan Mikoulitch; Colin Cavanagh; Keith J Edwards; Matthew Hayden; Eduard Akhunov
Journal: Plant Biotechnol J Date: 2014-03-20 Impact factor: 9.803

57 in total

1. Indel Group in Genomes (IGG) Molecular Genetic Markers.

Authors: Ted W Toal; Diana Burkart-Waco; Tyson Howell; Mily Ron; Sundaram Kuppu; Anne Britt; Roger Chetelat; Siobhan M Brady
Journal: Plant Physiol Date: 2016-07-19 Impact factor: 8.340

2. Development and validation of KASP assays for genes underpinning key economic traits in bread wheat.

Authors: Awais Rasheed; Weie Wen; Fengmei Gao; Shengnan Zhai; Hui Jin; Jindong Liu; Qi Guo; Yingjun Zhang; Susanne Dreisigacker; Xianchun Xia; Zhonghu He
Journal: Theor Appl Genet Date: 2016-06-15 Impact factor: 5.699

3. TEOSINTE BRANCHED1 Regulates Inflorescence Architecture and Development in Bread Wheat (Triticum aestivum).

Authors: Laura E Dixon; Julian R Greenwood; Stefano Bencivenga; Peng Zhang; James Cockram; Gregory Mellers; Kerrie Ramm; Colin Cavanagh; Steve M Swain; Scott A Boden
Journal: Plant Cell Date: 2018-02-14 Impact factor: 11.277

4. Genome sequence and genetic diversity of European ash trees.

Authors: Elizabeth S A Sollars; Andrea L Harper; Laura J Kelly; Christine M Sambles; Ricardo H Ramirez-Gonzalez; David Swarbreck; Gemy Kaithakottil; Endymion D Cooper; Cristobal Uauy; Lenka Havlickova; Gemma Worswick; David J Studholme; Jasmin Zohren; Deborah L Salmon; Bernardo J Clavijo; Yi Li; Zhesi He; Alison Fellgett; Lea Vig McKinney; Lene Rostgaard Nielsen; Gerry C Douglas; Erik Dahl Kjær; J Allan Downie; David Boshier; Steve Lee; Jo Clark; Murray Grant; Ian Bancroft; Mario Caccamo; Richard J A Buggs
Journal: Nature Date: 2016-12-26 Impact factor: 49.962

Review 5. From markers to genome-based breeding in wheat.

Authors: Awais Rasheed; Xianchun Xia
Journal: Theor Appl Genet Date: 2019-01-23 Impact factor: 5.699

6. Genetic dissection of a major QTL for kernel weight spanning the Rht-B1 locus in bread wheat.

Authors: Dengan Xu; Weie Wen; Luping Fu; Faji Li; Jihu Li; Li Xie; Xianchun Xia; Zhongfu Ni; Zhonghu He; Shuanghe Cao
Journal: Theor Appl Genet Date: 2019-09-12 Impact factor: 5.699

7. Grain protein content and thousand kernel weight QTLs identified in a durum × wild emmer wheat mapping population tested in five environments.

Authors: Andrii Fatiukha; Naveh Filler; Itamar Lupo; Gabriel Lidzbarsky; Valentyna Klymiuk; Abraham B Korol; Curtis Pozniak; Tzion Fahima; Tamar Krugman
Journal: Theor Appl Genet Date: 2019-09-27 Impact factor: 5.699

8. SNP-based pool genotyping and haplotype analysis accelerate fine-mapping of the wheat genomic region containing stripe rust resistance gene Yr26.

Authors: Jianhui Wu; Qingdong Zeng; Qilin Wang; Shengjie Liu; Shizhou Yu; Jingmei Mu; Shuo Huang; Hanan Sela; Assaf Distelfeld; Lili Huang; Dejun Han; Zhensheng Kang
Journal: Theor Appl Genet Date: 2018-04-17 Impact factor: 5.699

9. A major QTL co-localized on chromosome 6BL and its epistatic interaction for enhanced wheat stripe rust resistance.

Authors: Qingdong Zeng; Jianhui Wu; Shengjie Liu; Shuo Huang; Qilin Wang; Jingmei Mu; Shizhou Yu; Dejun Han; Zhensheng Kang
Journal: Theor Appl Genet Date: 2019-02-01 Impact factor: 5.699

10. A roadmap for gene functional characterisation in crops with large genomes: Lessons from polyploid wheat.

Authors: Nikolai M Adamski; Philippa Borrill; Jemima Brinton; Sophie A Harrington; Clémence Marchal; Alison R Bentley; William D Bovill; Luigi Cattivelli; James Cockram; Bruno Contreras-Moreira; Brett Ford; Sreya Ghosh; Wendy Harwood; Keywan Hassani-Pak; Sadiye Hayta; Lee T Hickey; Kostya Kanyuka; Julie King; Marco Maccaferrri; Guy Naamati; Curtis J Pozniak; Ricardo H Ramirez-Gonzalez; Carolina Sansaloni; Ben Trevaskis; Luzie U Wingen; Brande Bh Wulff; Cristobal Uauy
Journal: Elife Date: 2020-03-24 Impact factor: 8.140