KASPspoon: an in vitro and in silico PCR analysis tool for high-throughput SNP genotyping.

Alsamman M Alsamman1, Shafik D Ibrahim1, Aladdin Hamwieh2.   

Abstract

MOTIVATION: Fine mapping becomes a routine trial following quantitative trait loci (QTL) mapping studies to shrink the size of genomic segments underlying causal variants. The availability of whole genome sequences can facilitate the development of high marker density and predict gene content in genomic segments of interest. Correlations between genetic and physical positions of these loci require handling of different experimental genetic data types, and ultimately converting them into positioning markers using a routine and efficient tool.
RESULTS: To convert classical QTL markers into KASP assay primers, KASPspoon simulates a PCR by running an approximate-match searching analysis on user-entered primer pairs against the provided sequences, and then comparing in vitro and in silico PCR results. KASPspoon reports amplimers close to or adjoining genes/SNPs/simple sequence repeats and those that are shared between in vitro and in silico PCR results to select the most appropriate amplimers for gene discovery. KASPspoon compares physical and genetic maps, and reports the primer set genome coverage for PCR-walking. KASPspoon could be used to design KASP assay primers to convert QTL acquired by classical molecular markers into high-throughput genotyping assays and to provide major SNP resource for the dissection of genotypic and phenotypic variation. In addition to human-readable output files, KASPspoon creates Circos configurations that illustrate different in silico and in vitro results.
AVAILABILITY AND IMPLEMENTATION: Code available under GNU GPL at http://www.ageri.sci.eg/index.php/facilities-services/ageri-softwares/kaspspoon.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2019. Published by Oxford University Press.

Year:  2019        PMID: 30624621      PMCID: PMC6735863          DOI: 10.1093/bioinformatics/btz004

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Classically acquired quantitative trait loci (QTL) have been identified through numerous field trials and laboratory experiments and show significant correlations with economically useful traits. The expansion of genome sequencing has provided public genome repositories with massive amounts of biological information on genome sequences and single nucleotide variations (Doddamani ). Simultaneously, advanced molecular marker technologies provide extremely high levels of assay robustness and accuracy with significant cost savings. One of these technologies, the KASP assay (He ), has been used efficiently to detect and validate nucleotide variations, such as single nucleotide polymorphisms (SNPs) and InDels, related to important traits across different organisms (Kolmer ). As a result, gaining more from published QTL across different organisms requires correlating the genetic and physical positions of these loci, handling different experimental genetic data types, and ultimately converting them into positioning markers using a routine and fast computational tool. In silico PCR is one of the most widely used techniques to determine the physical position of these loci. It is a computational procedure that predicts PCR results theoretically, using a given set of primers to amplify DNA sequences from a sequenced genome or transcriptome (Lexa ). In recent years, a plethora of software programs have been developed to aid in silico PCR analysis, including Primer-BLAST (Ye ), SNPCheck (http://ngrl.man.ac.uk), FastPCR (Kalendar ), Primersearch-EMBOSS (Rice ) and PUNS (Boutros and Okey, 2004). Nevertheless, researchers still face a long process of filtering and correlating prior or subsequent genetic data on these markers before they obtain conclusive results they can use. This study presents a PCR primer test application, called ‘KASPspoon,’ for routine manipulation and analysis of PCR primers.
The main goal of KASPspoon is to convert classically acquired QTL information into more comprehensive, routine and accurate molecular marker technologies such as the KASP assay. To reach this goal, KASPspoon compares in vitro (laboratory-observed) and in silico (predicted) PCR results, and reports SNPs that are close to or adjoined by PCR amplimers by integrating a database of known SNPs. KASPspoon uses this information to design KASP assay primers that convert QTL markers into high-throughput genotyping assays, providing major SNP markers for the dissection of genotypic and phenotypic variation. It also reports chromosome-specific (anchored) primers and compares physical and genetic maps, which could be useful for linkage and association mapping analysis. In addition, primer set genome coverage can support genome-walking PCR procedures that cover gene-rich genomic regions. By processing simple sequence repeat (SSR) markers, KASPspoon can also report amplimers that adjoin SSRs, so that observed and predicted motifs can be compared and their abundance in the genome used for more selective primer choice. Furthermore, KASPspoon produces different Circos configurations to illustrate in silico results, which can be easily handled through the Circos software package (Krzywinski ). This helps users to visualize results, unify results from multiple analysis tools, magnify or exclude any part of the results, and develop further analysis techniques using simple programming (i.e. Circos scripting) without handling the original source code.

2 Design and implementation

KASPspoon handles different experimental genetic data types, correlates the genetic and physical positions of loci, and generates the final positioning markers for the KASP assay. Snapshots of KASPspoon outputs are shown in Supplementary File S1. KASPspoon was developed as a stand-alone package for PCR primer analysis using both the C and Perl programming languages. The Boyer–Moore–Horspool (Horspool, 1980) and Baeza-Yates–Perleberg (Baeza-Yates and Perleberg, 1992) approximate string-matching algorithms were implemented in C to search the provided genomic sequences, with the PCR primer-pair sequences used as queries. For primer genome coverage statistics and the PCR-walking procedure, the overlap layout consensus algorithm with a user-defined gap between amplimers is used to report the area covered by the primer(s) (Supplementary File S1). KASPspoon accepts common biological data formats as input. Only amplimers that do not exceed the user-defined maximum primer mismatch or maximum total mismatch (forward primer mismatch + reverse primer mismatch), or those that have no mismatch in the first user-defined number of 3′ nucleotides, are reported (Supplementary File S1). KASPspoon can compare in silico (predicted PCR product size) and in vitro (observed PCR product size) PCR results by defining the maximum molecular weight mismatch between the in silico and in vitro amplimers, provided that the approximate in vitro PCR product length (in base pairs) is supplied. For this comparison, KASPspoon generates a text file containing amplimers that exist in both result sets and that contain (or do not contain) genes. In addition, the MISA Perl script (pgrc.ipk-gatersleben.de/misa/) is integrated into KASPspoon to report all SSRs that lie between the PCR-amplified regions.
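The mismatch filters and the in silico/in vitro size comparison described above can be illustrated with a minimal Python sketch. It is not the KASPspoon implementation: it uses a naive sliding-window scan rather than the Boyer–Moore–Horspool or Baeza-Yates–Perleberg algorithms, it assumes all three filters (per-primer mismatch, total mismatch, 3′ clamp) are applied together, and every function and parameter name here is hypothetical.

```python
from typing import List, NamedTuple, Tuple

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def primer_hits(strand: str, primer: str, max_mismatch: int,
                clamp_3p: int) -> List[Tuple[int, int]]:
    """Naive scan: (start, mismatches) for each window that the primer
    (written 5'->3') matches within max_mismatch substitutions and with
    no mismatch in its last clamp_3p (3'-end) bases."""
    n = len(primer)
    hits = []
    for start in range(len(strand) - n + 1):
        window = strand[start:start + n]
        bad = [i for i in range(n) if primer[i] != window[i]]
        if len(bad) > max_mismatch:
            continue
        if any(i >= n - clamp_3p for i in bad):
            continue
        hits.append((start, len(bad)))
    return hits


class Amplimer(NamedTuple):
    start: int          # 0-based start on the forward strand
    end: int            # exclusive end on the forward strand
    size: int           # predicted product length in bp
    fwd_mismatch: int
    rev_mismatch: int


def in_silico_pcr(template: str, fwd: str, rev: str,
                  max_primer_mm: int = 2, max_total_mm: int = 3,
                  clamp_3p: int = 3, max_size: int = 3000) -> List[Amplimer]:
    """Predict amplimers for one primer pair on one template sequence."""
    length = len(template)
    fwd_hits = primer_hits(template, fwd, max_primer_mm, clamp_3p)
    # The reverse primer anneals to the opposite strand, so scan the
    # reverse complement and map coordinates back to the forward strand.
    rev_hits = primer_hits(reverse_complement(template), rev,
                           max_primer_mm, clamp_3p)
    amplimers = []
    for f_start, f_mm in fwd_hits:
        for r_start, r_mm in rev_hits:
            end = length - r_start
            size = end - f_start
            if size < len(fwd) + len(rev) or size > max_size:
                continue
            if f_mm + r_mm > max_total_mm:
                continue
            amplimers.append(Amplimer(f_start, end, size, f_mm, r_mm))
    return amplimers


def matches_observed(amplimer: Amplimer, observed_bp: int,
                     tolerance_bp: int) -> bool:
    """True if the predicted size is within tolerance of the gel band size."""
    return abs(amplimer.size - observed_bp) <= tolerance_bp
```

A forward primer with a single 5′ mismatch would still be reported by this sketch, whereas the same primer with a mismatch in its last three bases would be rejected, which mirrors the 3′ constraint described in the text.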
When a genetic linkage map is provided, KASPspoon can compare the physical (bp) and genetic (cM) positions of PCR markers; the information provided by in silico PCR analysis is then used to assign genetic linkage group(s) to physical chromosome(s). If a list of SNP variations is provided, KASPspoon can generate KASP assay primers that can be used for SNP genotyping. These KASP primers are designed to target all SNPs that are near PCR-amplified chromosomal regions. Report files that contain all KASP-targeted genes and marker loci are generated. The KASP sequences are designed according to a KASP primer design manual published by LGC (www.lgcgroup.com). The Primer3 tool (Untergasser ) was used to design two allele-specific forward primers and a common reverse primer for allele-specific assays such as the KASP assay. KASPspoon uses the user-provided SNP database to create degenerate PCR primers with minimal mismatches, where the target nucleotide is marked by ‘[]’ and untargeted nucleotides are masked according to IUPAC codes. KASPspoon generates different Circos configurations for in silico PCR statistical results, comparison between in silico and in vitro PCR data and linkages, and in silico (physical) maps. The search returns a sequence output file in FASTA format, containing all sequences in the database that lie between, and include, the primer pair. The FASTA header describes the region in the database and the primer name(s).
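To make the ‘[]’ marking and IUPAC masking concrete, the following Python sketch builds a masked flanking sequence around a target SNP and derives two allele-specific forward primers whose 3′ base is the SNP allele. It is only an illustration under stated assumptions: it does not call Primer3, it ignores melting temperature and the common reverse primer, and the function names, the 0-based coordinates and the `other_snps` dictionary layout are hypothetical rather than taken from KASPspoon.

```python
from itertools import combinations
from typing import Dict, Tuple

# Standard IUPAC nucleotide ambiguity codes, keyed by the set of bases.
IUPAC = {frozenset(b): b for b in "ACGT"}
IUPAC.update({
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
})
# Sanity check: every two-base combination has a code.
assert all(frozenset(p) in IUPAC for p in combinations("ACGT", 2))


def mask_flank(ref: str, target_pos: int, target_alt: str,
               other_snps: Dict[int, str], flank: int = 10) -> str:
    """Return the region around target_pos with the target SNP written as
    [ref/alt] and every other known SNP replaced by its IUPAC code.
    other_snps maps a 0-based position to its alternative allele(s)."""
    lo = max(0, target_pos - flank)
    hi = min(len(ref), target_pos + flank + 1)
    out = []
    for i in range(lo, hi):
        base = ref[i]
        if i == target_pos:
            out.append(f"[{base}/{target_alt}]")
        elif i in other_snps:
            out.append(IUPAC[frozenset(base + other_snps[i])])
        else:
            out.append(base)
    return "".join(out)


def allele_specific_forward(ref: str, target_pos: int, target_alt: str,
                            length: int = 20) -> Tuple[str, str]:
    """Two forward primers sharing the same 5' part and differing only in
    the 3'-terminal base (reference allele vs. alternative allele)."""
    prefix = ref[max(0, target_pos - length + 1):target_pos]
    return prefix + ref[target_pos], prefix + target_alt


ref = "ACGTTGCAAGCTTAGGCTA"
print(mask_flank(ref, 9, "A", {6: "T", 12: "G"}, flank=4))
print(allele_specific_forward(ref, 9, "A", length=6))
```

In a real assay the masked positions inside a primer would also have to be resolved or avoided; the sketch only shows how the annotated template string might be assembled.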
Comma-separated output files generated by KASPspoon include: in silico PCR-generated amplimer information; in silico amplimers near/adjoining genes; SNPs near in silico amplimers (in VCF); in silico amplimers adjoining SSRs; in vitro and in silico amplimers acquired by the same PCR primer that share the same approximate band size; location of genes adjoining or close to PCR primer regions; genomic areas that are covered using this primer set; primer set coverage statistics with both base-pair and percentage scales (compared to length of total genome sequence covered and chromosomal sequence length); chromosomal assignment for linkage genetic groups; KASP primer(s) sequence and information; and final report files containing different information about this run in an abbreviated form.
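The primer set coverage statistics listed above (base pairs and percentage) can be approximated by merging amplimer intervals. The Python sketch below uses plain interval merging with a user-defined gap; it stands in for, and is not, the overlap layout consensus procedure named in the text. It assumes half-open 0-based intervals on a single sequence and counts bridged gap bases as covered; both choices, and all names, are assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[int, int]  # (start, end), half-open, 0-based


def merge_amplimers(intervals: Iterable[Interval],
                    max_gap: int = 0) -> List[Interval]:
    """Merge amplimers that overlap or lie within max_gap bp of each other."""
    merged: List[List[int]] = []
    for start, end in sorted(intervals):
        if merged and start - merged[-1][1] <= max_gap:
            # Extend the current block; the bridged gap counts as covered.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def coverage(intervals: Iterable[Interval], sequence_length: int,
             max_gap: int = 0) -> Tuple[int, float]:
    """Return (covered bp, covered percentage of sequence_length)."""
    blocks = merge_amplimers(intervals, max_gap)
    covered = sum(end - start for start, end in blocks)
    return covered, 100.0 * covered / sequence_length


amplimers = [(100, 300), (250, 400), (450, 600), (1000, 1200)]
print(merge_amplimers(amplimers))
print(coverage(amplimers, 10_000))
print(coverage(amplimers, 10_000, max_gap=50))
```

Raising `max_gap` joins neighbouring amplimers into longer walkable blocks, which is the behaviour a PCR-walking design would care about.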

3 Results

Although KASPspoon showed a medium processing speed compared to tools that use BLAST as a search engine, such as Primer-BLAST and SNPCheck, it provides several advantages that are lacking in some or all published tools: it is free, its source code is available, and it offers primer coverage, in silico and in vitro PCR comparison, SNP, SSR and anchored primer reporting, physical and linkage map comparison, graphical illustration of outputs, KASP primer design, and human-readable, easily manipulated outputs (Table 1).
Table 1. Comparison between KASPspoon and published in silico PCR tools

Comparison                            KASPspoon  Primer-BLAST  SNPCheck  FastPCR  Primersearch
Anchored primer reporting             Yes        No            No        Yes      No
Unrestricted list of genomes          Yes        No            No        Yes      Yes
Noncommercial                         Yes        Yes           No        No       Yes
Gene reporting                        Yes        Yes           No        Yes      No
In silico/in vitro PCR comparison     Yes        No            No        No       No
In silico PCR speed (primer/s)a       7529012
KASP primer design                    Yes        No            No        No       No
Position-dependent mispairing weight  Yes        No            No        No       No
Linkage/physical maps comparison      Yes        No            No        No       No
Local installation                    Yes        Yes           No        Yes      Yes
Primer batch analysis                 Yes        No            Yes       Yes      Yes
Primer mismatch parameters            Yes        Yes           No        Yes      Yes
Primer coverage                       Yes        No            No        No       No
Output graphical illustration         Yes        No            Yes       No       No
SNP reporting                         Yes        No            Yes       No       No
Source code availability              Yes        No            No        No       Yes
SSR reporting                         Yes        No            No        Yes      No
Table-formatted outputs               Yes        No            Yes       Yes      No
GUI                                   Yes        Yes           Yes       Yes      Yes
Multiple operating systems            Yes        Yes           Yes       No       Yes

a Speed calculated by processing human Chr1 Mohajeri .

The 462 previously published chickpea SSR markers (Supplementary File S2) produced 892 bands, covering 168 082 bp (0.048%) of the chromosomal genome, of which 89.6% were chromosome-specific and 24% had genes nearby (Supplementary Files S1 and S3). A published chickpea linkage map (Nayak ) was used to compare the genetic and in silico map positions of 241 primers. Most of these markers were successfully assigned to the corresponding chromosomes as assumed by Nayak , while others had a high number of markers belonging to other chromosomes. The chickpea SNP database (Doddamani ) was used to detect nearby SNPs, to design KASP primers (Supplementary File S4) and to investigate SNP effects using the SnpEff tool (Cingolani ). About 99.41% of the detected SNPs were ‘MODIFIER.’ Two had a ‘HIGH’ impact on two uncharacterized chickpea proteins.
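One simple way to picture the assignment of linkage groups to chromosomes is a majority vote over chromosome-specific (anchored) markers. The Python sketch below is a hedged illustration of that idea, not the procedure KASPspoon actually uses; the marker names, linkage group labels, chromosome labels and input layout are invented for the example and are not data from this study.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def assign_linkage_groups(
    marker_lg: Dict[str, str],
    marker_hits: Dict[str, List[str]],
) -> Dict[str, Tuple[str, float]]:
    """Map each linkage group to the chromosome hit by most of its
    anchored markers, with the fraction of supporting markers.

    marker_lg:   marker name -> linkage group (from the genetic map)
    marker_hits: marker name -> chromosomes with an in silico amplimer
    """
    votes: Dict[str, Counter] = defaultdict(Counter)
    for marker, group in marker_lg.items():
        chromosomes = set(marker_hits.get(marker, []))
        if len(chromosomes) == 1:  # keep only chromosome-specific markers
            votes[group][next(iter(chromosomes))] += 1
    assignment = {}
    for group, counter in votes.items():
        chromosome, count = counter.most_common(1)[0]
        assignment[group] = (chromosome, count / sum(counter.values()))
    return assignment


# Hypothetical toy input.
marker_lg = {"m1": "LG1", "m2": "LG1", "m3": "LG1", "m4": "LG2", "m5": "LG2"}
marker_hits = {"m1": ["Ca1"], "m2": ["Ca1"], "m3": ["Ca4"],
               "m4": ["Ca2"], "m5": ["Ca2", "Ca3"]}
print(assign_linkage_groups(marker_lg, marker_hits))
```

A support fraction well below 1.0 flags a linkage group in which many markers map to other chromosomes, the situation the paragraph above reports for some groups.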

4 Conclusion

KASPspoon could be successfully integrated into different genomics procedures such as primer design, genome mapping, QTL fine mapping, genome-wide association analysis, PCR-walking and SNP genotyping. Combining KASPspoon with SNP selection programs such as SnpEff could decrease the number of SNPs used for KASP assay primer design. Running in silico PCR on more than one genome at a time could help to select potentially polymorphic markers across genomes. Circos configurations help to give a broad overview of QTL chromosomal positions and their closeness to genes or SNPs.

4.1 Availability

Source code (Linux installer), Microsoft Windows installer, manual, sample data, and sample output are available for non-commercial purposes and can be downloaded from http://www.ageri.sci.eg/index.php/facilities-services/ageri-softwares/kaspspoon.

References

1.  EMBOSS: the European Molecular Biology Open Software Suite.

Authors:  P Rice; I Longden; A Bleasby
Journal:  Trends Genet       Date:  2000-06       Impact factor: 11.639

2.  Virtual PCR.

Authors:  M Lexa; J Horak; B Brzobohaty
Journal:  Bioinformatics       Date:  2001-02       Impact factor: 6.937

3.  PUNS: transcriptomic- and genomic-in silico PCR for enhanced primer design.

Authors:  Paul C Boutros; Allan B Okey
Journal:  Bioinformatics       Date:  2004-04-08       Impact factor: 6.937

4.  A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w1118; iso-2; iso-3.

Authors:  Pablo Cingolani; Adrian Platts; Le Lily Wang; Melissa Coon; Tung Nguyen; Luan Wang; Susan J Land; Xiangyi Lu; Douglas M Ruden
Journal:  Fly (Austin)       Date:  2012 Apr-Jun       Impact factor: 2.160

5.  Circos: an information aesthetic for comparative genomics.

Authors:  Martin Krzywinski; Jacqueline Schein; Inanç Birol; Joseph Connors; Randy Gascoyne; Doug Horsman; Steven J Jones; Marco A Marra
Journal:  Genome Res       Date:  2009-06-18       Impact factor: 9.043

6.  Java web tools for PCR, in silico PCR, and oligonucleotide assembly and analysis.

Authors:  Ruslan Kalendar; David Lee; Alan H Schulman
Journal:  Genomics       Date:  2011-05-03       Impact factor: 5.736

7.  Integration of novel SSR and gene-based SNP marker loci in the chickpea genetic map and establishment of new anchor points with Medicago truncatula genome.

Authors:  Spurthi N Nayak; Hongyan Zhu; Nicy Varghese; Subhojit Datta; Hong-Kyu Choi; Ralf Horres; Ruth Jüngling; Jagbir Singh; P B Kavi Kishor; S Sivaramakrishnan; Dave A Hoisington; Günter Kahl; Peter Winter; Douglas R Cook; Rajeev K Varshney
Journal:  Theor Appl Genet       Date:  2010-01-23       Impact factor: 5.699

8.  Primer-BLAST: a tool to design target-specific primers for polymerase chain reaction.

Authors:  Jian Ye; George Coulouris; Irena Zaretskaya; Ioana Cutcutache; Steve Rozen; Thomas L Madden
Journal:  BMC Bioinformatics       Date:  2012-06-18       Impact factor: 3.169

9.  Primer3--new capabilities and interfaces.

Authors:  Andreas Untergasser; Ioana Cutcutache; Triinu Koressaar; Jian Ye; Brant C Faircloth; Maido Remm; Steven G Rozen
Journal:  Nucleic Acids Res       Date:  2012-06-22       Impact factor: 16.971

10.  CicArVarDB: SNP and InDel database for advancing genetics research and breeding applications in chickpea.

Authors:  Dadakhalandar Doddamani; Aamir W Khan; Mohan A V S K Katta; Gaurav Agarwal; Mahendar Thudi; Pradeep Ruperao; David Edwards; Rajeev K Varshney
Journal:  Database (Oxford)       Date:  2015-08-19       Impact factor: 3.451
