Literature DB >> 29440129

Genomic Predictions and Genome-Wide Association Study of Resistance Against Piscirickettsia salmonis in Coho Salmon (Oncorhynchus kisutch) Using ddRAD Sequencing.

Agustín Barría1,2, Kris A Christensen3, Grazyella M Yoshida1,4, Katharina Correa1,5, Ana Jedlicki1, Jean P Lhorente5, William S Davidson3, José M Yáñez6,5,7.   

Abstract

Piscirickettsia salmonis is one of the main infectious diseases affecting coho salmon (Oncorhynchus kisutch) farming, and current treatments have been ineffective for the control of this disease. Genetic improvement for P. salmonis resistance has been proposed as a feasible alternative for the control of this infectious disease in farmed fish. Genotyping by sequencing (GBS) strategies allow genotyping of hundreds of individuals with thousands of single nucleotide polymorphisms (SNPs), which can be used to perform genome wide association studies (GWAS) and predict genetic values using genome-wide information. We used double-digest restriction-site associated DNA (ddRAD) sequencing to dissect the genetic architecture of resistance against P. salmonis in a farmed coho salmon population and to identify molecular markers associated with the trait. We also evaluated genomic selection (GS) models in order to determine the potential to accelerate the genetic improvement of this trait by means of using genome-wide molecular information. A total of 764 individuals from 33 full-sib families (17 highly resistant and 16 highly susceptible) were experimentally challenged against P. salmonis and their genotypes were assayed using ddRAD sequencing. A total of 9,389 SNPs markers were identified in the population. These markers were used to test genomic selection models and compare different GWAS methodologies for resistance measured as day of death (DD) and binary survival (BIN). Genomic selection models showed higher accuracies than the traditional pedigree-based best linear unbiased prediction (PBLUP) method, for both DD and BIN. The models showed an improvement of up to 95% and 155% respectively over PBLUP. One SNP related with B-cell development was identified as a potential functional candidate associated with resistance to P. salmonis defined as DD.
Copyright © 2018 Barria et al.

Entities:  

Keywords:  GWAS; GenPred; Genomic Selection; Oncorhynchus kisutch; Shared Data Resources; disease resistance; genotyping by sequencing; selective breeding

Mesh:

Substances:

Year:  2018        PMID: 29440129      PMCID: PMC5873909          DOI: 10.1534/g3.118.200053

Source DB:  PubMed          Journal:  G3 (Bethesda)        ISSN: 2160-1836            Impact factor:   3.154


Chile is the largest producer of coho salmon (Oncorhynchus kisutch) globally, reaching about 160,000 tons in 2014, representing more than 90% of total production (FAO 2016). However, the success and sustainability of this industry is constantly threatened by infectious diseases, including Salmon Rickettsial Syndrome (SRS). This disease is caused by Piscirickettsia salmonis, a gram-negative and facultative intracellular bacteria, which was isolated for the first time in Chile in coho salmon (Cvitanich ). Data from the Chilean National Fisheries and Aquaculture Service (Sernapesca) indicates that during the first half of 2016, 53% of the moralities ascribed to infectious diseases in coho salmon were associated with SRS (Sernapesca 2016). To date, control measures and treatments for SRS are based on antibiotics and vaccines. However, both strategies have not had the expected effectiveness under field conditions (Rozas and Enríquez 2014). Because of this, it is necessary to develop alternative strategies for the control of this disease (Yáñez ). In this regard, breeding for enhanced disease resistance is a feasible and sustainable option to improve animal health, welfare and productivity (Stear ). A primary requisite for including disease resistance into a breeding program is the presence of significant additive genetic variation for the trait (Falconer and Mackay 1996). Commonly, data to evaluate resistance comes from experimental challenges carried out using siblings of the selection candidates (Ødegård ; Yáñez ). Quantitative studies have estimated significant genetic variation for resistance against different pathogens in salmonid species (Ødegård ; Yáñez ). For instance, low to moderate heritabilities for resistance against P. salmonis in Atlantic salmon (Salmo salar) (h2 = 0.11 to 0.41) (Yáñez ; Yáñez ) and coho salmon (h2 = 0.16) (Yáñez ) have been estimated. Marker assisted selection (MAS) can improve production traits in cases where the phenotypes are difficult to measure in the selected candidates (e.g., disease resistance traits) and the total additive genetic variance explained by genetic markers is high (Hayes and Goddard 2010). This methodology has been successfully applied for the improvement of resistance against the Infectious Pancreatic Necrosis Virus (IPNV) in Atlantic salmon, which is controlled by a major quantitative trait locus (QTL) (Houston ; Moen ). In the case of polygenic traits, genomic selection (GS) (Meuwissen ) can significantly improve selection accuracy of breeding values compared to traditional selection, and therefore enhance the response of selection for disease resistance in salmonid species (Tsai ; Vallejo , 2017a; Bangera ; Correa ; Yoshida ). Genotyping by sequencing (GBS) is an alternative for genotyping in cases when SNP panels are not available. This approach reduces the complexity of the genome, and can be used to identify thousands of markers without prior marker discovery efforts or a reference genome. Currently, several approaches of GBS have been developed, significantly reducing the cost and labor (Baird ; Elshire ; Peterson ). These methodologies have been widely used in salmonid species, to generate dense linkage maps (Brieuc ; Gonen ), perform association studies to identify genomic regions involved in the resistance against pathogens (Campbell ; Liu ; Palti ) and generate SNPs resources (Houston ). Double-digest restriction-site associated DNA (ddRAD) reduces DNA complexity by digesting DNA with two restriction enzymes (REs) simultaneously, without random shearing (Peterson ). This approach has been widely used in genetic studies in aquaculture species (reviewed in Robledo ). In the present study, we used ddRAD sequencing to dissect the genetic architecture of resistance against P. salmonis in a farmed coho salmon population and identify molecular markers associated with the trait. Furthermore, GS models were used to evaluate the potential to accelerate the genetic improvement of resistance against P. salmonis in this coho salmon population by means of using genome-wide molecular information.

Materials and Methods

Coho salmon breeding population

The coho salmon population used in the present study came from a unique 2012 year-class population. This population belongs to a genetic improvement program that was established in 1997 and is owned by Pesquera Antares and managed by Aquainnovo (Puerto Montt, Chile). Further details about this breeding population, in terms of reproductive management, rearing conditions, fish tagging and breeding objectives are described by Yáñez ; 2016a) and Dufflocq .

Experimental challenge

The experimental challenge against P. salmonis was performed as it is described in detail by Correa and Yáñez . Briefly, 2,606 individuals, belonging to 107 maternal full-sib families and 60 paternal half-sib families, were challenged against P. salmonis. Prior to the experiment, each fish was tagged with a passive integrated transponder (PIT-tag), placed in the abdominal cavity for genealogy traceability during the challenge test. The P. salmonis challenge was performed at Aquainnovo´s research station, located in Lenca River, X Region, Chile. For the lethal dose 50 (LD50) calculation, a random sample of 80 fish were selected from the population. Four different dilutions from the P. salmonis inoculum were evaluated (1/10, 1/100, 1/1000 and 1/10000). Twenty fish were challenged at each dilution. The dilutions were intraperitoneally (IP) injected with a volume of 0.2ml/fish. Daily mortality was recorded. This preliminary test spanned 26 days and a dilution of 1:680 was estimated as the LD50. For the main challenge, fish were distributed into three tanks (7m3) with a salt water concentration of 31 ppt. An average of eight individuals (ranging from 1 to 18) from each of the 107 families were distributed into each tank. The experimental challenge was performed through an intraperitoneal (IP) injection with 0.2ml/fish of the LD50 inoculum. The average weight of the fish at the inoculation was 279g (SD = 138g). To ensure that these fish were free from other pathogens, qRT-PCR was previously performed in order to control for the presence of Infectious Salmon Anemia Virus (ISAV), IPNV and Flavobacterium spp. The P. salmonis challenge continued for up to 50 days post IP injection. Throughout the challenge, environmental parameters (pH, temperature, salinity and oxygen) were measured and controlled. Fish were removed from the tanks after death, and a sample of the anterior kidney was taken and stored at -80° in RNALater. A necropsy assay was performed in conjunction with qRT-PCR to confirm the cause of death and the presence of P. salmonis. This was also done to control for the presence of other pathogens, such as Vibrio ordalii, Renibacterium salmoninarum and IPNV.

ddRAD library preparation and sequencing

Ten ddRAD libraries were produced by multiplexing 828 individuals following the protocol described by Peterson . For this, 64 parents (males and females) and 764 offspring representing the 17 most resistant and 16 most susceptible families were selected. An average of 23 (ranging from 11 to 43 individuals) offspring per family were chosen. Briefly, total DNA was extracted using the commercial kit Wizard SV Genomic DNA purification System (Promega) according to the manufacturer’s protocol. Between 80 and 200 ng of DNA, from each individual was digested with two restriction enzymes (New England Biolabs, UK; NEB); 10 U of SbfI (specific for the CCTGCA|GG motif)) and MseI (specific for the T|TAA motif) in a 12 µl reaction volume, including 1 µl of SbfI and MseI adapter (8.3 pM), for 90 min at 37°. The ligation reaction was carried out by adding 1 µl of T4 ligase (NEB) diluted 1:100 in T4 buffer and incubating for 150 min at 37° and subsequently at 16° overnight. Each ligation mix was diluted with 189 µl of dilute TE buffer (1:10). Kodak DNA Polymerase (ABM), a high-fidelity polymerase, was used to amplify DNA fragments with the correct adapters. PCR reactions (20 µl) were prepared containing 10 µl of PCR mix 2x, 1 µl of primer mix (10 µM each), 6 µl of diluted ligation mix and 3 µl of nuclease-free water. Each sample was PCR amplified using the following conditions: 95° for 2 min, followed by 17 cycles of 95° for 20s, 66° for 30s and 68° for 40s. After PCR, amplicon quality was checked by loading 5 µl on a 2% agarose gel. Subsequently, samples were pooled, so that the final concentration was similar among them within each library. Each library was concentrated through an evaporation step for 80 min in a Centrivap Mobile Console Centrifugal Evaporator (Labconco). This step was conducted until 300 µl of the generated library was obtained. Final volume of each library was loaded on a 1% agarose gel. Size of the bands selected for sequencing ranged from 750 and 1,500 bp and between 1,800 and 2,500 bp. DNA was purified through the QIAquick gel extraction kit (Qiagen) following manufacturer’s instructions. Finally, libraries were sequenced on an Illumina Hiseq2500 platform, using 150 base single-end.

SNP identification

Raw sequence reads obtained from Illumina sequencing were analyzed using STACKS v. 1.41 (Catchen , 2013). This software was specifically developed to analyze short-read data generated through next generation sequencing (NGS) (Davey ). Sample reads were trimmed to 134 bp for all subsequent analyses, demultiplexed and filtered using process_radtags. Rad-tags which passed the quality filter were aligned to the Oncorhynchus kisutch reference genome (GenBank: MPKV00000000.1) using BWA v. 0.7.12 (Li and Durbin 2009). The reference genome was indexed (using the index function) and alignments were performed using the mem algorithm; all parameters were set as default. Loci were then built using pstacks with a minimum depth of coverage of three to build a locus (-m 3). A catalog of loci was constructed using the cstacks program using only the parents’ loci from pstacks. To build the catalog, the maximum number of mismatches allowed between sample tags was set to three (-n 3), and the matching was based on genomic location (g). After catalog construction, the sstacks program was used in order to match rad-tags against the catalog based again on genomic location (g), followed by the populations software, using defaults parameters. Loci were considered as valid if they were present in at least 75% of the individuals of the population. As a quality control step, the following parameters were used to filter low-confidence SNPs: Minor Allele Frequency (MAF) ≤ 0.05, Hardy-Weinberg Equilibrium (HWE) P < 1x10-6 and genotyping call rate < 0.80. Individuals were removed from the analyses if their genotyping call rates were below 0.70.

Trait definitions

Resistance against P. salmonis was defined as the day of death (DD) with values ranging from 1 to 50 depending on the time of death. Additionally, resistance was also evaluated as a binary (BIN) trait, either dead or alive at the end of the challenge. Values for this trait were 1 in cases where the fish died during the challenge, or 0 if the fish survived until the end of the challenge. Initial Body Weight (IW) for each fish, was measured prior to the IP injection.

Pedigree-based BLUP

All challenged individuals (n = 2,606) were used for the pedigree-based approach, PBLUP, as a control for the performance evaluation of genomic predictions. A linear univariate animal model was used to estimate variance components and predict Estimated Breeding Values (EBVs) for DD, while for BIN a univariate threshold animal model was fitted (Table 1). The model used was as follows:In the previous equation, is a vector of phenotypes (BIN or DD), is a vector of fixed effects (sex and tank as factors, and initial weight as covariate), is a vector of random additive polygenic genetic effects that follows a normal distribution ∼N(0,A), and are incidence matrices, is the additive relationship matrix, and is the random residual (Lynch and Walsh 1998). Both models were fitted using the BLUPF90 set of programs (Misztal ) by means of the AIREMLF90 and THRGIBBS1F90 modules to analyze DD and BIN, respectively. The MCMC Gibbs sampling scheme set for running THRGIBBS1F90, included a total of 200,000 iterations. The first 20,000 were discarded as burn-in iterations, and then from the remaining 180,000 samples, one from every 50 samples was saved for analysis. This Gibbs sampling scheme collected 3,600 independent samples for analysis.
Table 1

Estimated genetic parameters and accuracy of breeding values (EBV) estimation for resistance against P. salmonis using a pedigree-based model

Phenotypeaσa2σe2h2(SE)Accuracy (R)
DD12.5577.600.14(0.034)0.271
BIN0.381.000.27 (0.043)0.316

The BLUP analysis included the phenotype of all the progeny of 107 families challenged against Piscirickettsia salmonis (n = 2,606)

The BLUP analysis included the phenotype of all the progeny of 107 families challenged against Piscirickettsia salmonis (n = 2,606) Heritabilities for PBLUP models were computed as follows:where and are the additive genetic and residual variances for each trait. In the case of BIN, the residual variance was set to 1.

Genomic BLUP

The SNP based variance components and GEBVs were estimated using genomic BLUP (GBLUP), similar to the PBLUP model described above, and implemented in the BLUPF90 software. The GBLUP is a modification of the PBLUP method, where g is a vector of random additive genetic polygenic effects with a distribution ∼N(0, G) and the numerator relationship matrix is replaced by a genomic relationship matrix , as described by (VanRaden 2008). Only genotyped animals, which passed all quality controls (n = 580) were analyzed.

Single step genomic GBLUP

The single-step GBLUP (ssGBLUP) and weighted single-step GBLUP (wssGBLUP) models were similar to the PBLUP model except for a combined genomic and pedigree relationship. The kinship matrix used was H (Aguilar ), in which genotype and pedigree data are combined. The inverse of the matrix H is:where is the inverse numerator relationship matrix for all animals, is the inverse of a pedigree-based relationship matrix for genotyped animals only, and is the inverse genomic relationship matrix. SNPs were equally weighted and given an initial value of one in the ssGBLUP method. In the wssGBLUP method, the marker variances were used as weights. The marker variances were estimated from allele frequencies, and from marker effects that were calculated in the ssGBLUP method (Wang ). The DD trait was analyzed as a linear trait using AIREMLF90 and BLUPF90, whereas, BIN was analyzed as a threshold trait with THRGIBBS1F90 in the BLUPF90 family of programs (Misztal ). The MCMC Gibbs scheme for the estimation of the genetic parameters for BIN for ssGBLUP and wssGBLUP, were estimated identically as described above. The ssGBLUP and wssGBLUP models included all the genotyped animals which passed quality control (n = 580), and all the phenotyped fish (n = 2,606) from 107 families.

Bayes C

The Bayes C (Habier ) analyses were performed using GS3 software. A total of 200,000 iterations were used in the Gibbs sampling, with a burn-in period of 20,000 cycles. The results were saved every 50 cycles. The number of samples in this analysis totaled 4,000. Convergence and autocorrelation were assessed by visual inspection of trace plots of the posterior variance components. The adjusted model can be described, in matrix notation, as follows:where y is the vector of phenotypic records (DD or BIN), X is an incidence matrix of fixed effects (sex and tank as factors and IW as covariate), β is the vector of fixed effects, T is an incidence matrix of polygenic effects, p is a random vector of polygenic effects of all individuals in the pedigree, is the vector of the genotypes for the i SNP for each animal, is the random allele substitution effect of the i SNP, is an indicator variable (0, 1) sampled from a binomial distribution with parameters determined such that 1% of the markers were included in the model, and e is a vector of residual effects. The following prior distributions were assumed for the genetic random effects: Independent and identical mixture distributions for the SNP effects; each SNP has a point mass at zero having a probability π and a univariate normal distribution with a probability of 1 – π with null mean and variance ; which in turn has a scaled inverse chi-squared prior with degrees of freedom and scale parameter (or ) (Fernando and Garrick 2013). The scale parameter was estimated as a function of the genetic variance population, based on the mean SNP allele frequency and number of markers assumed with nonzero effects (Fernando ). Only genotyped animals, which passed quality control (n = 580) were used.

Genomic prediction accuracy

The different models were compared using a fivefold cross validation scheme. To reduce stochastic effects of sampling, the cross-validation analysis was replicated ten times. Briefly, all challenged individuals (genotyped, phenotyped, or both), were randomly separated into five validations sets. For each set, predictions were made by masking the animals’ phenotypes and using the remaining fish as a training set to estimate the marker effects. Thus, for each cross-validation run, the dataset was split into a training set (80%) and a validation set (20%). Accuracy was used to assess the performance of each model and was estimated as follows:where is the correlation between the EBV of a given model (predicted for the validation set using information from the training set) and the actual phenotype, while is the square root of the pedigree-based estimate of heritability (Correa ; Legarra 2008; Palaiokostas ; Tsai ). Finally, accuracies were calculated for each model and compared to those obtained with the PBLUP model.

Genome-wide association study

In order to identify associations between genetic markers and P. salmonis resistance, as DD or BIN, four genome-wide association methodologies were performed using the BLUPF90 set of programs (Misztal ). GBLUP, ssGBLUP, wssGBLUP and w3ssGBLUP models were used to analyze the DD and BIN traits, using a linear and a threshold model respectively (as described above in model 1). For the GBLUP model, the pedigree-based relationship matrix (A) was replaced by a genomic matrix (G). For the ssGBLUP, wssGBLUP and w3ssGBLUP GWAS models the H matrix was used as described above. SNPs were weighted equally (and given a weight of one), for the ssGBLUP model. For the wssGBLUP and w3ssGBLUP methods, weights were determined based on individual marker variances that were estimated using both the marker effects, which were calculated previously for the ssGBLUP model, and marker allele frequencies (Wang ). For the GBLUP GWAS model, only genotyped animals, which passed quality control (n = 580), were analyzed. For the ssGBLUP, wssGBLUP and w3ssGBLUP GWAS models, all the genotyped animals passing quality control (n = 580), and all the phenotyped fish (n = 2,606) were used. Due to practical reasons, the molecular markers anchored to the different scaffolds not placed in chromosomes, were assigned as chromosome 31. The parents of the challenged individuals were not included in the GWAS analysis because they did not have associated phenotype information, as they were not submitted to the challenge experiment.

Data availability

Table S1 contains genotypic data (available at the public dryad digital repository https://doi.org/10.5061/dryad.b273q6p), Table S2 contains phenotypic data, and Table S3 contains the pedigree information. Table S4 contains the full list of genes located within the top ten 1-Mbp windows proximate to each SNP associated with P. salmonis resistance for DD and BIN identified through wssGBLUP. Table S5 contains information of the top ten markers which explain the highest percentage of the genetic variance for each method and trait. Table S6 contains results from the 10 replicated CV.

Results

Challenge test

Mortality began on the 10th day after the P. salmonis challenge, with evident symptoms of SRS and pathological lesions typical of SRS. These signs include swollen kidney, splenomegaly and yellowish liver tone and coloration (Rozas and Enríquez 2014). Challenged families showed considerable phenotypic variation for P. salmonis resistance. Average mortality of all the 107 families reached 38.53% during the 50-day challenge. The average cumulative mortality rate among the 17 best and 16 worst families, selected for genotyping, reached 19% and 63%, respectively (Figure 1).
Figure 1

Kaplan-Meier curves for Piscirickettsia salmonis experimental challenge in coho salmon. Average mortality curves for the 107 full-sib families, and the 17 best and 16 worst families.

Kaplan-Meier curves for Piscirickettsia salmonis experimental challenge in coho salmon. Average mortality curves for the 107 full-sib families, and the 17 best and 16 worst families.

ddRAD sequencing

Prior to quality control (QC), per base quality (Phred score) was evaluated. The average quality score ranged from 36 to 38 among libraries, indicating high quality of data. Illumina sequencing, including parents, yielded an average of 156,058,078 (± 16 million) of raw sequences per library. After initial QC, which included the removal of low quality sequences and reads with either missing or ambiguous barcodes, an average of 31,660,024 of the reads were removed. In parallel with QC, reads were trimmed to 134 bp. Thus, 79% of the raw reads were retained for subsequent analysis. To create a set of all possible alleles in the population, data sets of the parental samples were used to create a STACKS catalog. This catalog consisted of 106,309 unique ddRAD loci from which 20,068 markers from 757 individuals were identified. Quality filtering reduced this to 9,389 putative bi allelic SNPs (see Table S1) with an average sequencing depth of 38x ranging from 11x to 501x. These markers were identified segregating along the genome of 580 individuals (see Table S1).

Genome-wide association analysis

Four genome-wide association methodologies were performed either for DD and BIN. These approximations include GBLUP, ssGBLUP, wssGBLUP and w3ssGBLUP. For both traits, all the models showed a similar association pattern. In the case of the ssGBLUP methodology, the GWAS plots become less noisy as the iterations progress, and the peaks associated with the traits become more distinct (Figure 2 and Figure 3). For DD, a marker potentially associated with P. salmonis resistance was located on chromosome 11 (Figure 2). This marker was identified in all of the four models and was within the top ten markers explaining most of the percentage of the genetic variance (Table 2).
Figure 2

Genomic association analyses for resistance against Piscirickettsia salmonis in a coho salmon population (defined as day of death) for four different models; GBLUP (A), ssGBLUP (B), wssGBLUP (C) and w3ssGBLUP (D). The gray area highlights the SNPs (in green), which are among the top ten markers explaining a high percentage of the genetic variance in the four models.

Figure 3

Genomic association analyses for resistance against Piscirickettsia salmonis in a coho salmon population (for survival as a binary trait) for four different models; GBLUP (A), ssGBLUP (B), wssGBLUP (C) and w3ssGBLUP (D). The gray area highlights SNPs (in green) that were among the top ten markers explaining a high percentage of the genetic variance in the four models.

Table 2

Top ten markers associated with Piscirickettsia salmonis resistance defined as DD and BIN in coho salmon, using wssGBLUP method

RankingNameChrcPos (BP)dPEVeGenesf
DD
124987_12711a305259215.302PIK3AP1, TIAL1, PCBD1
26135_833371367383.225NOXA1, UBAC1, KLHL20
37914_474182816193.056LRP5, NTRK3, KLHL25
425096_12011330477882.744VA, INPP5A, CSAD
534697_4315322612312.081CCDC153, KMT2A, LXN
652922_94256567731.802NHLRC2, NRG3, LRRC4
741979_1818618182851.750ROBO2, KCNJ1, LCE
822393_11410249967551.451SLC34A2, SH3RF1, FYB
924553_7011196051731.294VOS41, CHMP5, FAM49B
1058185_4129223632921.211TSC1, GFI1B, STOM
BIN
158185_4129a223632924.270PHPT1, GSN, GS
266451_6531ab124062.076PSMD14, CLDN10, CTSM
368326_7931b2116762.073ROBO1, PIK3CB, KCNJ1
445949_12721172890662.073FLVCR1, VTA1, HIVEP2
536367_1516150055551.807SEC24D, KACNIP4, MYOZ2
66135_833371367381.291NOXA1, UBAC1, KLHL20
737641_8617199827351.184NR0B2, HIVEP3, EDN2
823665_6110554798971.170HDAC5, CADM1, ICAM1
947149_11222172155621.133PPARA, CDKN1B, KCNQ1
1018750_958267511491.094GRID2, SMARCAD1, TSPAN3,

Markers in common within the top ten along the four models.

Salmo salar used as reference specie.

Chromosome.

Position in coho salmon reference genome.

Percentage of Phenotypic variance.

Summary of the genes located within 1-Mb window are in supporting information Table S4.

Genomic association analyses for resistance against Piscirickettsia salmonis in a coho salmon population (defined as day of death) for four different models; GBLUP (A), ssGBLUP (B), wssGBLUP (C) and w3ssGBLUP (D). The gray area highlights the SNPs (in green), which are among the top ten markers explaining a high percentage of the genetic variance in the four models. Genomic association analyses for resistance against Piscirickettsia salmonis in a coho salmon population (for survival as a binary trait) for four different models; GBLUP (A), ssGBLUP (B), wssGBLUP (C) and w3ssGBLUP (D). The gray area highlights SNPs (in green) that were among the top ten markers explaining a high percentage of the genetic variance in the four models. Markers in common within the top ten along the four models. Salmo salar used as reference specie. Chromosome. Position in coho salmon reference genome. Percentage of Phenotypic variance. Summary of the genes located within 1-Mb window are in supporting information Table S4. The availability of a high quality coho salmon reference genome (GenBank accession number MPVK00000000.1), made it possible to identify genes near this marker. Within ∼55 Kbp of this marker is the phosphoinositide-3-kinase adaptor protein 1 (pik3ap1) gene; a gene related with innate host defense through B-cells development (Aiba ; Herzog ). In the case of BIN, two molecular markers, explaining most of the genetic variance, were identified in all of the four models (Table 2). One of these markers is located on chromosome 29, while the second marker was identified only at a scaffold level (Scaffold04124) (Figure 3). For BIN, host immune response related genes were not found proximate to any of the suggestive molecular markers. However, some genes within a 1-Mb window of these markers have been suggested to be involved with P. salmonis infection. The 14 KDa Phosphohistidine phosphatase-like (PHPT1), gelsolin-like (GSN), and glutamine synthase-like (GS) genes are located on chromosome 29 and near associated markers. Claudin-10 was found on scaffold04124. These genes have been previously identified as being up-regulated in Salmo salar individuals with low susceptibility to P. salmonis (). Moreover, the retinoic acid receptor RXR-alpha-A-like (RXRA) gene, located on chromosome 29, has previously been identified as a molecular biomarker for P. salmonis infections, and has been found to be down-regulated in macrophages during infection (Rise ). A full list of genes that are located within a 1-Mbp window proximate to the suggestive markers associated with P. salmonis resistance for Okis11, Okis29 and at a scaffold level, identified through wssGBLUP model, is shown in Table S4. In the case of the marker located on the scaffold, the surrounding sequence was blasted against the Salmo salar reference genome (NC_027300.1).

Genetic parameters and predictions

Significant additive genetic variation was estimated for both DD and BIN when using all the data from challenged individuals from the 107 maternal, full-sib families (Table 1). Using the pedigree-based model without genomic data, estimates of the narrow sense heritability for DD and BIN were equal to 0.14 (± 0.034) and 0.27 (± 0.043), respectively. Based on a fivefold cross validation, the accuracy of the PBLUP model was slightly lower for DD (0.271) than for BIN (0.316) (Figure 4). When genomic data were included, accuracies for DD and BIN were higher than those achieved using only phenotypic data. However, there is considerable variation between models and trait definitions. The accuracies for the different models ranged from 0.299 (ssGBLUP) to 0.529 (GBLUP) for the DD trait, and from 0.314 (ssGBLUP) to 0.807 (GBLUP) for the BIN. For DD, all the models with genomic data outperformed the pedigree-based model. The relative increase in accuracy ranged from 10 (ssGBLUP) to 95% (GBLUP) (Figure 4). In the case of BIN, the relative increase in accuracy ranged from 20 (wssGBLUP) to 155% (GBLUP). However, one of the genomic models (ssGBLUP) had a similar accuracy to the PBLUP model, with a relative accuracy 1% lower than the pedigree-based model.
Figure 4

Comparison of predicted accuracies (R) for Piscirickettsia salmonis resistance in a coho salmon population comparing between PBLUP and models with genomic data for DD (red bars) and BIN (blue bars).

Comparison of predicted accuracies (R) for Piscirickettsia salmonis resistance in a coho salmon population comparing between PBLUP and models with genomic data for DD (red bars) and BIN (blue bars). Interestingly, the accuracies obtained with GBLUP were higher than ssGBLUP and wssGBLUP for both traits. These accuracy values reached 0.529, 0.299 and 0.417 for DD and 0.807, 0.314 and 0.3797 for BIN. For both traits, the models with better performance were, GBLUP > Bayes C > wssGBLUP > ssGBLUP.

Discussion

Significant genetic variation for P. salmonis resistance was detected in the present study. Moderate heritabilities were estimated using different trait definitions, either for DD or BIN. Estimated heritabilities were higher for resistance as a binary trait when compared to DD. Previously, Yáñez , estimated a heritability of 0.16 for resistance against P. salmonis, defined as day of death (from the same coho salmon population) through a bivariate linear model. Our study estimated a similar heritability value (0.14) in the same population. Differences in the estimations are likely due to the univariate model we used instead of a bivariate model. We also estimated heritability for P. salmonis resistance using a threshold model for the binary trait, which was higher (0.27) than the value reported for DD. These results are consistent with previous findings for resistance against P. salmonis in Atlantic salmon using pedigree information (Yáñez ). When resistance, defined as day of death, was analyzed using a linear model, heritability was estimated as 0.18 (0.03). When using a threshold model to analyze resistance as a binary trait, a heritability of 0.24 (0.04) was calculated (Yáñez , 2014b). The genetic variation and heritability values for P. salmonis resistance are in accord with different studies that have also found significant genetic resistance to other bacterial diseases in salmonid species (Gjøen ; Ødegård ; Vallejo ). Bacterial disease resistance has been suggested to be a polygenic trait in aquaculture species. For example, Palaiokostas suggested that resistance against Photobacterium damselae subsp. has a polygenic architecture in Gilthead Sea Bream (Sparus aurata). Using a 50K SNP genotyping array, it was possible to elucidate a moderately polygenic architecture of P. salmonis resistance in Atlantic salmon (Correa ). In the current study, and using 9K SNPs, a similar genetic architecture for resistance against P. salmonis in coho salmon population was found. However, it is likely that the moderate number of individuals could limit the power to detect QTL of larger effect controlling P. salmonis resistance in coho salmon. Among the top ten genetic markers for DD and BIN, one and two markers were identified among all four models, respectively. The availability of an annotated, coho salmon genome made it possible to identify the phosphoinositide-3-kinase adaptor protein 1 (pik3ap1) (also known as the adaptor protein B-cell PI3K adaptor (BCAP)), a gene that is related with B-cell development (Herzog ), proximate to the genetic marker found to be associated with resistance. The role of B-cells, through the humoral response, has been widely investigated and elucidated (reviewed in Janeway ). B-cells may also have phagocytic activity in both rainbow trout (Li ) and Atlantic salmon (Øverland ). These cells are capable of ingesting large particles and bacteria; killing them through phagolysosome fusion (Li ). Studies in vitro, showed that from the total phagocytic leukocytes isolated from Atlantic salmon head kidney, 37% were B-cells. Additionally, 77% were B-cells when leukocytes were isolated from peripheral blood (PB). The phagocytic ability of B-cells was three times higher than those observed in neutrophils in head kidney, while in PB no differences were observed (Øverland ). We hypothesize that B-cell development could help in the immune response against P. salmonis in coho salmon population through its phagocytic activity. Further studies are needed in order to have a better understanding of the role of these cells in the resistance against P. salmonis, and its ability to digest and kill bacteria. For resistance defined as the BIN trait, no genes related with immune response were identified near genetic markers associated with this trait. A possible explanation of this observation, is that the region near the genetic marker acts as a regulatory sequence. Also, it could be due to the relative small sample size for the GWAS, which ideally should be over 1000 animals (limiting the resolution of the GWAS). The underlying genetic basis for this trait may have an important impact on the accuracy of genomic predictions. Thus, performance comparisons of different algorithms is required for either GS and GWAS when a complex trait is studied for the first time within a population (Vallejo ). The current study is the first to evaluate the genetic architecture of resistance against Piscirickettsia salmonis in coho salmon through different algorithms (to our knowledge). We used a genomic model that assumes that genetic variances are controlled by an infinite number of markers with minimum effects on the trait (i.e., GBLUP). The GBLUP method calculates genomic relationship (G matrix) using all the genotyped markers for this reason. An extension of this method, which combines genomic (G) and pedigree-based (A) relationship information into the H relationship matrix (Aguilar ; Legarra ) was also evaluated. This method, called single-step GBLUP (ssGBLUP), mimics a Bayesian selection model in that it only fits SNPs that explain moderate to large genetic variance of the trait (Wang ). Results of these different genomic-wide association analyses suggest that resistance against Piscirickettsia salmonis has a polygenic architecture, with no major QTL (i.e., explaining >= 10% of the genetic variation) segregating in the current population (Figure 2 and Figure 3). When resistance was defined as DD, the accuracy for the PBLUP model was slightly lower than resistance measured as BIN (0.271 and 0.316, respectively). These values are consistent with the results obtained for resistance against the bacterial disease pasteurellosis in S. aurata defined as day of death, authors reached an accuracy up to 0.30 through a pedigree-based model (Palaiokostas ). However, both accuracy values are slightly lower when compared with values obtained for resistance against sea lice in Atlantic salmon. In this regard, Correa reached an accuracy of 0.41 for resistance against Caligus rogercresseyi, while Tsai reached a prediction accuracy ∼0.5 for resistance against Lepeophtheirus salmonis using PBLUP. The results from our genomic predictions are in agreement with previous studies that showed higher estimated accuracies using GS than with PBLUP in the same half/full-sib family structure in salmon breeding programs (Nielsen ; Lillehammer ; Bangera ). The GS model that showed the best performance in terms of accuracy of predictions for DD was GBLUP. This was followed by Bayes C, wssGBLUP and ssGBLUP. The addition of genomic information allowed an improvement up to 95% in accuracy for this trait. The current study showed a relative accuracy prediction improvement of 54% when comparing PBLUP to wssGBLUP for this trait. This improvement is different to that obtained by Vallejo for resistance against Bacterial Cold Water Disease (BCWD), defined as day of death, in rainbow trout (Oncorhynchus mykiss). The authors reported a reduction in the predictive ability (PA) of 20 and 26% using wssGBLUP, either with a chip array (40K) or through RAD sequencing (10K) respectively. The authors attributed this pronounced reduction to stochastic fluctuations due to a small training group of their study (n = 583). However, when the number of individuals was increased, this model outperformed the pedigree-based model, reaching a relative increase in accuracy up to 108%, for the same trait definition (Vallejo ). In the case of BIN, the use of the ssGBLUP model, did not show an improvement in the accuracy prediction (with a reduction of 1% in comparison to the accuracy obtained though the pedigree-based model). However, there was an estimated relative increase in accuracy ranging from 20 to 155% when comparing PBLUP to the others genomic models. In this regard, Bayes C and GBLUP showed an increase in accuracy of 140% and 155%, respectively. These high values are similar to the improvements seen with resistance against BCWD defined as a binary trait, which reached a relative increase up to 97% (Vallejo ). The relative improvements ranged in accuracy from -1 to 155% for BIN and from 10 to 95% for DD are greater to those obtained for other diseases resistance studies in Atlantic salmon; even with lower marker numbers. Using an identical random selection design as in the current study, sea lice resistance showed a relative improvement in accuracy of 22% relative to PBLUP when using 37K SNPs (Correa ). For reliability, an improvement up to 52% with 220K SNPs was reached (Ødegård ). Tsai reported that when a non-full siblings design was used, an improvement in accuracy of 250% and 500% was reached in two different populations compared to PBLUP. However, when the methodology was changed to a random selection scheme this improvement only reached up to 27% using 33K SNPs. In case of Piscirickettsia salmonis resistance, and using the same cross-validation scheme in our study, the relative reliability was increased by 25% and 30% for resistance, defined as day of death or as a binary trait, respectively with 50K SNPs (Bangera ). In the current study we evaluated a wide range of different models of GS for their potential implementation in aquaculture. The performance of each implemented model varied according to the underlying genetic architecture of the trait (Meuwissen ; Daetwyler ). Thus, it is valuable to perform these comparisons to identify the best performing method using real data. Similar accuracies among PBLUP and ssGLBUP models could be due to the predicted GEBV as both models are overrepresented by polygenic EBV (Bangera ). GBLUP estimated genetic relationships using genotype and pedigree data rather than just average relationship as PBLUP (Habier ). This allows a more accurate genetic relationship matrix and provides an increase in performance, as seen in the current data, due to the close family relationship. Moreover, GBLUP had significantly better performance when resistance was defined as a linear or binary trait compared to the other evaluated genomic models. We suggest that this could be an effect of only genotyping families from the opposite sides of the mortality distribution (i.e., most resistant and most susceptible), and not all the challenged individuals. Bangera , reported similar accuracies among GBLUP and Bayesian methods between GBLUP and Bayes C using 10K SNPs in Atlantic salmon. Similar results that have been seen in dairy cattle for most traits (de Roos ; Goddard 2009) The greater improvement in accuracy for the binary trait BIN, compared with the linear trait DD, could be due a better fit of the threshold model for BIN than the fit of the linear model for DD. It could also be due to the higher estimated heritability. We hypothesize that the large improvement values seen in the current study are likely due to an increased level of linkage disequilibrium (LD) found within this farmed coho salmon population. Additional studies are needed to elucidate the minimum number of markers necessary for GS. Our results, using ddRAD sequencing, are in agreement with other genomic studies, which utilize GBS techniques with aquaculture species. Using RAD sequencing, some authors have previously performed genomic-wide association studies in rainbow trout looking for associations with disease resistance. From a total of 4K identified SNPs, 31 markers were significantly associated with either BCWD or Infectious Hematopoietic Necrosis Virus (IHNV) resistance as a binary trait (Campbell ). These authors also showed the potential of these RAD markers to predict an animal’s phenotype. In the case of BCWD resistance, defined as a linear and binary trait, Palti identified suggestive and significant SNPs in two different families and candidate genes associated with this trait using ∼5K markers per family. Similar numbers of SNPs were used by Liu to significantly associate 18 SNPs. Genomic selection predictions are in accordance with studies aimed to evaluate GS using others GBS techniques, in both, relative increase in accuracy and number of discovered SNPs. Dou predicted higher accuracies for shell height and shell width using GBLUP and Bayes methods in Yesso scallop (Patinopecten yessoensis) using 2K SNPs identified by 2b-RAD. Identical methodology allowed Palaiokostas to reach a relative increase in accuracy up to 53% with 12K SNPs using Bayesian methods compared to PBLUP. A study in rainbow trout using RAD sequencing identified 10K SNPs. Even then, the accuracies were similar with GS models compared to PBLUP, and the authors predicted that increasing the number of individuals could lead to a relative increase in accuracy up to 69% (Vallejo ). The genotyping strategy was aimed at the i) evaluation of genomic selection methods; and ii) allowing the identification of molecular markers associated with the trait by means of GWAS. The aim was maximizing the phenotypic variance within the sample while keeping a balanced representation of fish per family. Thus, genotyping strategy was not totally random, but specifically focused on the most extreme families; 17 resistant and 16 susceptible families. We aimed at genotyping all the fish belonging to each selected family. Thus, each family was represented within the sample with an average of 23 (ranging from 11 to 43) fish/family. The availability of dense SNP arrays for coho salmon, as it is already the case for Atlantic salmon (Houston ; Yáñez ) and rainbow trout (Palti ), may increase the accuracy for predicting genomic breeding values and the power for the determination of the genetic factors involved in economically-important traits. It is also expected that in the near future, further functional studies for a better understanding of P. salmonis resistance and other complex traits in salmonids will be facilitated by the international initiative on the Functional Annotation of All Salmonid Genomes, FAASG (Macqueen ). We have evaluated different GS models, and demonstrated that the use of genomic prediction is a feasible strategy for the improvement of breeding value prediction. This information could be used for the implementation of genomic information in genetic programs for Piscirickettsia salmonis resistance in farmed coho salmon populations.

Conclusions

Moderate significant genetic variation was estimated for resistance against Piscirickettsia salmonis in coho salmon, using either pedigree or genomic information. These results highlight the feasibility of including this trait into genetic improvement programs. Our study shows that genomic prediction methods, using ddRAD genotypes (including 9K SNPs), has a substantial advantage in terms of accuracy when compared to pedigree-based model for either DD or BIN. The improvement was up to 95 and 155% respectively in the current population. The association analyses were used to identify a gene related with B-cell development, which could also be involved in resistance against P. salmonis. To our knowledge, this is the first study aimed at dissecting the genetic architecture of P. salmonis resistance in a coho salmon population.

Supplementary Material

Supplemental Material is available online at www.g3journal.org/lookup/suppl/doi:10.1534/g3.118.200053/-/DC1 Click here for additional data file. Click here for additional data file. Click here for additional data file. Click here for additional data file. Click here for additional data file.
  50 in total

1.  The impact of genetic architecture on genome-wide evaluation methods.

Authors:  Hans D Daetwyler; Ricardo Pong-Wong; Beatriz Villanueva; John A Woolliams
Journal:  Genetics       Date:  2010-04-20       Impact factor: 4.562

2.  Genome-wide association and genomic selection in animal breeding.

Authors:  Ben Hayes; Mike Goddard
Journal:  Genome       Date:  2010-11       Impact factor: 2.166

3.  Regulation of B-cell development by BCAP and CD19 through their binding to phosphoinositide 3-kinase.

Authors:  Yuichi Aiba; Megumi Kameyama; Tetsuo Yamazaki; Thomas F Tedder; Tomohiro Kurosaki
Journal:  Blood       Date:  2007-11-19       Impact factor: 22.113

4.  Efficient methods to compute genomic predictions.

Authors:  P M VanRaden
Journal:  J Dairy Sci       Date:  2008-11       Impact factor: 4.034

5.  Performance of genomic selection in mice.

Authors:  Andrés Legarra; Christèle Robert-Granié; Eduardo Manfredi; Jean-Michel Elsen
Journal:  Genetics       Date:  2008-08-30       Impact factor: 4.562

6.  Characterisation of QTL-linked and genome-wide restriction site-associated DNA (RAD) markers in farmed Atlantic salmon.

Authors:  Ross D Houston; John W Davey; Stephen C Bishop; Natalie R Lowe; Jose C Mota-Velasco; Alastair Hamilton; Derrick R Guy; Alan E Tinch; Marian L Thomson; Mark L Blaxter; Karim Gharbi; James E Bron; John B Taggart
Journal:  BMC Genomics       Date:  2012-06-15       Impact factor: 3.969

7.  Genomic selection models double the accuracy of predicted breeding values for bacterial cold water disease resistance compared to a traditional pedigree-based model in rainbow trout aquaculture.

Authors:  Roger L Vallejo; Timothy D Leeds; Guangtu Gao; James E Parsons; Kyle E Martin; Jason P Evenhuis; Breno O Fragomeni; Gregory D Wiens; Yniv Palti
Journal:  Genet Sel Evol       Date:  2017-02-01       Impact factor: 4.297

8.  Functional Annotation of All Salmonid Genomes (FAASG): an international initiative supporting future salmonid research, conservation and aquaculture.

Authors:  Daniel J Macqueen; Craig R Primmer; Ross D Houston; Barbara F Nowak; Louis Bernatchez; Steinar Bergseth; William S Davidson; Cristian Gallardo-Escárate; Tom Goldammer; Yann Guiguen; Patricia Iturra; James W Kijas; Ben F Koop; Sigbjørn Lien; Alejandro Maass; Samuel A M Martin; Philip McGinnity; Martin Montecino; Kerry A Naish; Krista M Nichols; Kristinn Ólafsson; Stig W Omholt; Yniv Palti; Graham S Plastow; Caird E Rexroad; Matthew L Rise; Rachael J Ritchie; Simen R Sandve; Patricia M Schulte; Alfredo Tello; Rodrigo Vidal; Jon Olav Vik; Anna Wargelius; José Manuel Yáñez
Journal:  BMC Genomics       Date:  2017-06-27       Impact factor: 3.969

9.  Rapid SNP discovery and genetic mapping using sequenced RAD markers.

Authors:  Nathan A Baird; Paul D Etter; Tressa S Atwood; Mark C Currey; Anthony L Shiver; Zachary A Lewis; Eric U Selker; William A Cresko; Eric A Johnson
Journal:  PLoS One       Date:  2008-10-13       Impact factor: 3.240

10.  Genome-wide association analysis reveals loci associated with resistance against Piscirickettsia salmonis in two Atlantic salmon (Salmo salar L.) chromosomes.

Authors:  Katharina Correa; Jean P Lhorente; María E López; Liane Bassini; Sudhir Naswa; Nader Deeb; Alex Di Genova; Alejandro Maass; William S Davidson; José M Yáñez
Journal:  BMC Genomics       Date:  2015-10-24       Impact factor: 3.969

View more
  27 in total

1.  Genomic Selection in Aquaculture Species.

Authors:  François Allal; Nguyen Hong Nguyen
Journal:  Methods Mol Biol       Date:  2022

2.  Temporal genome-wide DNA methylation signature of post-smolt Pacific salmon challenged with Piscirickettsia salmonis.

Authors:  Francisco Leiva; Scarlet Bravo; Killen Ko Garcia; Javier Moya; Osiel Guzman; Nicolás Bascuñan; Rodrigo Vidal
Journal:  Epigenetics       Date:  2020-12-31       Impact factor: 4.528

3.  Exploring genetic resistance to infectious salmon anaemia virus in Atlantic salmon by genome-wide association and RNA sequencing.

Authors:  O Gervais; A Barria; A Papadopoulou; R L Gratacap; B Hillestad; A E Tinch; S A M Martin; D Robledo; R D Houston
Journal:  BMC Genomics       Date:  2021-05-13       Impact factor: 3.969

4.  Genome-wide association and genomic prediction of resistance to viral nervous necrosis in European sea bass (Dicentrarchus labrax) using RAD sequencing.

Authors:  Christos Palaiokostas; Sophie Cariou; Anastasia Bestin; Jean-Sebastien Bruant; Pierrick Haffray; Thierry Morin; Joëlle Cabon; François Allal; Marc Vandeputte; Ross D Houston
Journal:  Genet Sel Evol       Date:  2018-06-08       Impact factor: 4.297

5.  Mapping and Sequencing of a Significant Quantitative Trait Locus Affecting Resistance to Koi Herpesvirus in Common Carp.

Authors:  Christos Palaiokostas; Diego Robledo; Tomas Vesely; Martin Prchal; Dagmar Pokorova; Veronika Piackova; Lubomir Pojezdal; Martin Kocour; Ross D Houston
Journal:  G3 (Bethesda)       Date:  2018-11-06       Impact factor: 3.154

6.  Whole Genome Linkage Disequilibrium and Effective Population Size in a Coho Salmon (Oncorhynchus kisutch) Breeding Population Using a High-Density SNP Array.

Authors:  Agustín Barría; Kris A Christensen; Grazyella Yoshida; Ana Jedlicki; Jong S Leong; Eric B Rondeau; Jean P Lhorente; Ben F Koop; William S Davidson; José M Yáñez
Journal:  Front Genet       Date:  2019-05-22       Impact factor: 4.599

7.  Genome-Wide Association Analysis for Resistance to Infectious Pancreatic Necrosis Virus Identifies Candidate Genes Involved in Viral Replication and Immune Response in Rainbow Trout (Oncorhynchus mykiss).

Authors:  Francisco H Rodríguez; Raúl Flores-Mara; Grazyella M Yoshida; Agustín Barría; Ana M Jedlicki; Jean P Lhorente; Felipe Reyes-López; José M Yáñez
Journal:  G3 (Bethesda)       Date:  2019-09-04       Impact factor: 3.154

Review 8.  Genomic Selection in Aquaculture: Application, Limitations and Opportunities With Special Reference to Marine Shrimp and Pearl Oysters.

Authors:  Kyall R Zenger; Mehar S Khatkar; David B Jones; Nima Khalilisamani; Dean R Jerry; Herman W Raadsma
Journal:  Front Genet       Date:  2019-01-23       Impact factor: 4.599

9.  Development and Validation of 58K SNP-Array and High-Density Linkage Map in Nile Tilapia (O. niloticus).

Authors:  Rajesh Joshi; Mariann Árnyasi; Sigbjørn Lien; Hans Magnus Gjøen; Alejandro Tola Alvarez; Matthew Kent
Journal:  Front Genet       Date:  2018-10-15       Impact factor: 4.599

10.  Genomic Selection for Growth Traits in Pacific Oyster (Crassostrea gigas): Potential of Low-Density Marker Panels for Breeding Value Prediction.

Authors:  Alejandro P Gutierrez; Oswald Matika; Tim P Bean; Ross D Houston
Journal:  Front Genet       Date:  2018-09-19       Impact factor: 4.599

View more

北京卡尤迪生物科技股份有限公司 © 2022-2023.