ASSIsT: an automatic SNP scoring tool for in- and outbreeding species.

Mario Di Guardo, Diego Micheletti, Luca Bianco, Herma J J Koehorst-van Putten, Sara Longhi, Fabrizio Costa, Maria J Aranzana, Riccardo Velasco, Pere Arús, Michela Troggio, Eric W van de Weg.

Abstract

ASSIsT (Automatic SNP ScorIng Tool) is a user-friendly customized pipeline for efficient calling and filtering of SNPs from Illumina Infinium arrays, specifically devised for custom genotyping arrays. Illumina has developed integrated software for SNP data visualization and inspection called GenomeStudio (GS). ASSIsT builds on GS-derived data and identifies those markers that follow a bi-allelic genetic model and show reliable genotype calls. Moreover, ASSIsT re-edits SNP calls for markers with null alleles or additional SNPs in the probe annealing site. ASSIsT can be employed in the analysis of different population types, such as full-sib families, mating schemes used in the plant kingdom (backcross, F1, F2) and unrelated individuals. The final result can be directly exported in the format required by the most common software for genetic mapping and marker-trait association analysis. ASSIsT is developed in Python and runs on Windows and Linux.
AVAILABILITY AND IMPLEMENTATION: The software, example data sets and tutorials are freely available at http://compbiotoolbox.fmach.it/assist/. CONTACT: eric.vandeweg@wur.nl.
© The Author 2015. Published by Oxford University Press.

Year:  2015        PMID: 26249809      PMCID: PMC4653386          DOI: 10.1093/bioinformatics/btv446

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

Advances in whole-genome genotyping technologies have enabled the simultaneous investigation of several hundred thousand SNP markers on a genome-wide scale. To date, Illumina (GoldenGate® and Infinium®) and Affymetrix (Axiom®) are the most widely used array-based genotyping platforms worldwide. Illumina has developed GenomeStudio® (GS), proprietary software with a graphical user interface (GUI) for SNP data visualization and filtering that enables the selection of high-quality markers showing robust performance across the examined germplasm. However, the actual filtering of such SNPs requires a deep understanding of SNP marker performance and genetic segregation patterns, as well as familiarity with the many tools and parameters in GS. ASSIsT addresses this by offering a user-friendly, automated pipeline that builds on the results of Illumina’s GenCall algorithm (Kermani, 2006) as incorporated in GS. In addition to filtering, ASSIsT re-edits GS calls to make better use of the information available for SNPs showing null alleles or additional SNP clusters due to additional polymorphisms at the probe annealing site. This re-editing improves the accuracy of SNP calling and reduces unnecessary removal of potentially valuable markers.

2 Methods

The analysis and selection of SNPs performed by ASSIsT is based on the calls produced by Illumina’s GenCall algorithm (Kermani, 2006). A two-tier approach, which applies first a bi-allelic genetic model and then a tri-allelic model, is used to classify SNPs on the basis of their actual performance on the examined germplasm. The tri-allelic model describes more complex segregation patterns caused by null alleles or by alleles with variable signal intensity due to additional SNPs, as the bi-allelic genetic model used by GS cannot account for such polymorphisms (Bassil et al., 2015; Gardner et al., 2014; Pikunova et al.; Troggio et al., 2013). In these cases, ASSIsT may re-edit GS calls by applying de novo filters using the original light intensity data and the segregation patterns in the germplasm.

3 Results

ASSIsT supports the analysis of different population types, such as full-sib families (e.g. human, livestock, cross-pollinating plants), mating schemes common in plants (backcross, F1, F2) and individuals with unknown genetic relationships. ASSIsT’s GUI allows easy parameter setting and provides a visual output of the SNP clustering analysis. The results produced by ASSIsT can be directly exported to the input format of the most widely used software for genetic mapping and marker–trait association analysis (FlexQTL™, GAPIT, JoinMap, PLINK, Structure and Tassel). This straightforward integration will improve marker performance in association and QTL mapping studies.

ASSIsT is developed in Python (www.python.org). Its source code is released under the GNU General Public Licence (GNU-GPLv3) to allow its integration into bioinformatic pipelines. ASSIsT requires three input files: a pedigree file in which the parents of each sample are reported and two standard report files from GS (Final Report and DNA Report). The two GS reports are standard output of commercial service companies; therefore, ASSIsT does not necessarily require access to GS. A map file with the genetic or physical position of the markers may also be included. This information is mandatory for exporting results in Structure or PLINK formats.

ASSIsT allows the stringency of the filtering procedure to be set in advance by customizing the following parameters: (1) proportion of missing data, (2) call rate threshold, (3) segregation distortion (χ² P-value), (4) frequency of not allowed genotypes (structured germplasm) and (5) minor allele frequency.

The first step of the filtering analysis is a quality check of the individuals: samples with a high proportion of unexpected marker genotypes due to outcrossing, different ploidy levels or DNA admixture, among other causes, are considered deviating germplasm and are excluded from further analysis.
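As an illustration of what the marker-level parameters listed above measure, the following minimal Python sketch computes a segregation-distortion P-value, the frequency of not allowed genotypes, the minor allele frequency and the proportion of missing data from genotype counts. It is not taken from the ASSIsT source code: all function names are hypothetical, the genotype classes are assumed to be ordered AA, AB, BB, and the χ² tail probability is only implemented for one or two degrees of freedom, which is enough for 1:1 and 1:2:1 expectations.

```python
import math


def chi2_sf(stat, df):
    """Upper-tail chi-square probability (closed forms for df 1 and 2 only)."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2.0))
    if df == 2:
        return math.exp(-stat / 2.0)
    raise ValueError("this sketch only handles df = 1 or df = 2")


def segregation_p_value(observed, expected_ratio):
    """Chi-square goodness-of-fit P-value over the classes the cross allows.

    Classes with an expected ratio of 0 are left out of the test; they are
    counted separately by not_allowed_frequency(). Assumes at least one
    called genotype in the allowed classes.
    """
    pairs = [(o, r) for o, r in zip(observed, expected_ratio) if r > 0]
    n = sum(o for o, _ in pairs)
    ratio_sum = sum(r for _, r in pairs)
    stat = 0.0
    for o, r in pairs:
        expected = n * r / ratio_sum
        stat += (o - expected) ** 2 / expected
    return chi2_sf(stat, len(pairs) - 1)


def not_allowed_frequency(observed, expected_ratio):
    """Share of called genotypes in classes the cross cannot produce."""
    forbidden = sum(o for o, r in zip(observed, expected_ratio) if r == 0)
    return forbidden / sum(observed)


def minor_allele_frequency(aa, ab, bb):
    """Frequency of the rarer allele, computed from genotype counts."""
    p = (2 * aa + ab) / (2 * (aa + ab + bb))
    return min(p, 1 - p)


def missing_proportion(no_calls, n_samples):
    """Fraction of samples without a genotype call for this marker."""
    return no_calls / n_samples


if __name__ == "__main__":
    # Hypothetical backcross marker (AA x AB), expected 1 AA : 1 AB : 0 BB.
    counts, ratio = [48, 50, 2], [1, 1, 0]
    print("segregation P-value:", segregation_p_value(counts, ratio))
    print("not allowed genotypes:", not_allowed_frequency(counts, ratio))
    print("minor allele frequency:", minor_allele_frequency(*counts))
    print("missing data:", missing_proportion(3, 103))
```

A marker would then be kept or discarded by comparing each value with a user-chosen threshold; the thresholds themselves are left out here because the text does not state any default values.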
Samples with poor DNA quality (call rate significantly lower than the average of the dataset) are not considered in the analysis either. All discarded samples are listed in the ‘summary’ output file.

Only ‘robust’ markers (i.e. those showing a clear cluster separation and few No Calls) are allowed through the initial filtering. These markers can show two clusters (one homozygous and one heterozygous) or three clusters (two homozygous and one heterozygous). For some markers, the AB cluster may split into two distinct sub-clusters because additional SNPs at the probe site lead to differential hybridization efficiency and thus to distinct classes of signal intensity within a marker allele. This variation in signal intensity, generally ignored by GS, is taken into account by ASSIsT. For instance, a cross between two heterozygous parents generates three genotype clusters at a single locus (e.g. CT × CT produces ¼CC + ½CT + ¼TT). When one allele (say T) shows two distinct intensity classes (T and t), the cross may be interpreted as CT × Ct, which gives ¼CC + ¼CT + ¼Ct + ¼Tt. The ability to distinguish the two heterozygous classes (CT and Ct) makes this marker fully informative in inheritance studies, whereas ‘classical’ heterozygotes are not informative in the generation of genetic linkage maps because the parental origin of their alleles cannot be determined.

Additional SNPs in the probe, as well as indels (Pikunova et al.), may also give rise to null alleles, owing to the lack of signal from one of the DNA templates, which results in additional clusters. GS cannot currently account for this scenario; thus, informative markers are lost. Conversely, ASSIsT succeeds in the analysis of the majority of such markers (A0 × A0, A0 × 00 and A0 × B0), allowing more efficient marker calling. All the above-mentioned SNP classes are suitable for the generation of genetic linkage maps or for marker–trait association studies.
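The segregation ratios quoted above, and those expected for the null-allele crosses, can be checked by enumerating the parental gametes. The short Python sketch below does this for a diploid bi-parental cross. It is only an illustration of the genetic model, not ASSIsT code: the function names are hypothetical, genotypes are written as two-character strings, "t" stands for the low-intensity variant of allele T and "0" for a null allele.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def _ordered(a, b):
    """Join two alleles with the null allele "0" last, e.g. "A0" not "0A"."""
    return "".join(sorted((a, b), key=lambda allele: (allele == "0", allele)))


def cross(parent1, parent2):
    """Expected offspring genotype fractions for a diploid cross.

    Each parent is a two-character genotype string; every gamete
    combination is assumed to be equally likely.
    """
    counts = Counter(_ordered(a, b) for a, b in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


if __name__ == "__main__":
    for p1, p2 in [("CT", "CT"), ("CT", "Ct"),
                   ("A0", "A0"), ("A0", "00"), ("A0", "B0")]:
        classes = ", ".join(f"{frac} {g}" for g, frac in cross(p1, p2).items())
        print(f"{p1} x {p2}: {classes}")
```

Note that the sketch lists genotypes, not observed clusters: whether two genotypes such as AA and A0 can actually be told apart depends on the signal intensities measured on the array.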
Discarded markers are grouped according to the reason for their rejection: absence of segregation or severe segregation distortion, presence of not allowed genotypes in segregating families, and number of No Calls. ASSIsT has been used to analyze SNP markers of several bi-parental full-sib families and germplasm of apple (Bianco et al., 2014), peach, melon and grape. For each family, ∼99% of the ‘approved’ SNPs (those that passed the filtering procedure) proved to have high-quality data, as they integrated smoothly in the generation of high-quality genetic linkage maps. The remaining 1% presented several types of issues, largely related to the presence of paralogous loci where the AB cluster was too close to, or even merged with, one of the two homozygous clusters. ASSIsT thus proved to be an effective tool for genotyping studies: it makes it easy to filter informative and well-performing SNPs and to recover potentially useful SNPs from indels or regions of high sequence divergence, feeding them directly to the most common downstream analysis tools through its easy interface.
References

1.  Evaluation of SNP Data from the Malus Infinium Array Identifies Challenges for Genetic Analysis of Complex Genomes of Polyploid Origin.

Authors:  Michela Troggio; Nada Surbanovski; Luca Bianco; Marco Moretto; Lara Giongo; Elisa Banchi; Roberto Viola; Felicidad Fernández Fernández; Fabrizio Costa; Riccardo Velasco; Alessandro Cestaro; Daniel James Sargent
Journal:  PLoS One       Date:  2013-06-27       Impact factor: 3.240

2.  Development and preliminary evaluation of a 90 K Axiom® SNP array for the allo-octoploid cultivated strawberry Fragaria × ananassa.

Authors:  Nahla V Bassil; Thomas M Davis; Hailong Zhang; Stephen Ficklin; Mike Mittmann; Teresa Webster; Lise Mahoney; David Wood; Elisabeth S Alperin; Umesh R Rosyara; Herma Koehorst-van Putten; Amparo Monfort; Daniel J Sargent; Iraida Amaya; Beatrice Denoyes; Luca Bianco; Thijs van Dijk; Ali Pirani; Amy Iezzoni; Dorrie Main; Cameron Peace; Yilong Yang; Vance Whitaker; Sujeet Verma; Laurent Bellon; Fiona Brew; Raul Herrera; Eric van de Weg
Journal:  BMC Genomics       Date:  2015-03-07       Impact factor: 3.969

3.  Fast and cost-effective genetic mapping in apple using next-generation sequencing.

Authors:  Kyle M Gardner; Patrick Brown; Thomas F Cooke; Scott Cann; Fabrizio Costa; Carlos Bustamante; Riccardo Velasco; Michela Troggio; Sean Myles
Journal:  G3 (Bethesda)       Date:  2014-07-16       Impact factor: 3.154

4.  Development and validation of a 20K single nucleotide polymorphism (SNP) whole genome genotyping array for apple (Malus × domestica Borkh).

Authors:  Luca Bianco; Alessandro Cestaro; Daniel James Sargent; Elisa Banchi; Sophia Derdak; Mario Di Guardo; Silvio Salvi; Johannes Jansen; Roberto Viola; Ivo Gut; Francois Laurens; David Chagné; Riccardo Velasco; Eric van de Weg; Michela Troggio
Journal:  PLoS One       Date:  2014-10-10       Impact factor: 3.240
