Mohamed Salem1,2, Rafet Al-Tobasei2,3, Ali Ali1, Daniela Lourenco4, Guangtu Gao5, Yniv Palti5, Brett Kenney6, Timothy D Leeds5.
Abstract
Detection of coding/functional SNPs that change the biological function of a gene may lead to identification of putative causative alleles within QTL regions and discovery of genetic markers with large effects on phenotypes. This study had two objectives. The first was to develop and validate a 50K transcribed gene SNP-chip using RNA-Seq data. To achieve this objective, two bioinformatics pipelines, GATK and SAMtools, were used to identify ~21K transcribed SNPs with allelic imbalances associated with important aquaculture production traits, including body weight, muscle yield, muscle fat content, shear force, and whiteness, in addition to resistance/susceptibility to bacterial cold-water disease (BCWD). SNPs were identified from pooled RNA-Seq data collected from ~620 fish, representing 98 families from growth-selected lines and 54 families from BCWD-selected lines with divergent phenotypes. In addition, ~29K transcribed SNPs without allelic imbalances were strategically added to build a 50K Affymetrix SNP-chip. The selected SNPs included two SNPs per gene from 14K genes and ~5K non-synonymous SNPs. The SNP-chip was used to genotype 1,728 fish. The average SNP calling rate for samples passing quality control (QC; 1,641 fish) was ≥ 98.5%. The second objective was to test the feasibility of using the new SNP-chip in genome-wide association (GWA) analysis to identify QTL explaining muscle yield variance. A GWA study on 878 fish (representing 197 families from 2 consecutive generations) with muscle yield phenotypes, genotyped for 35K polymorphic markers (passing QC), identified several QTL regions that together explained up to 28.40% of the additive genetic variance for muscle yield in this rainbow trout population. The most significant QTLs were on chromosomes 14 and 16, explaining 12.71% and 10.49% of the genetic variance, respectively. Many of the annotated genes in the QTL regions were previously reported as important regulators of muscle development and cell signaling.
No major QTLs were identified in a previous GWA study using a 57K genomic SNP chip on the same fish population. These results indicate improved detection power of the transcribed gene SNP-chip in the target trait and population, allowing identification of large-effect QTLs for important traits in rainbow trout.
Keywords: GWAS; SNP-chip; fillet yield; muscle; trout
Year: 2018 PMID: 30283492 PMCID: PMC6157414 DOI: 10.3389/fgene.2018.00387
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
SNP chip metric summary.
| SNP category | Number of SNPs | % of SNPs |
| Poly High Resolution | 32,273 | 64.5 |
| Other | 8,395 | 16.7 |
| Mono High Resolution | 3,458 | 6.9 |
| No Minor Hom | 2,725 | 5.4 |
| Call Rate Below Threshold | 2,705 | 5.4 |
| Off target variant | 450 | 0.9 |
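The percentages in the metric summary can be checked against the counts. The short sketch below assumes that the denominator is the sum of the six listed categories (the table itself does not state the denominator); the variable names are illustrative only.

```python
# Counts and percentages copied from the SNP chip metric summary.
counts = {
    "Poly High Resolution": 32273,
    "Other": 8395,
    "Mono High Resolution": 3458,
    "No Minor Hom": 2725,
    "Call Rate Below Threshold": 2705,
    "Off target variant": 450,
}
reported_pct = {
    "Poly High Resolution": 64.5,
    "Other": 16.7,
    "Mono High Resolution": 6.9,
    "No Minor Hom": 5.4,
    "Call Rate Below Threshold": 5.4,
    "Off target variant": 0.9,
}

# Assumption: every SNP on the chip falls into exactly one category,
# so the total is the sum of the six counts.
total = sum(counts.values())
shares = {name: 100 * n / total for name, n in counts.items()}

for name, share in shares.items():
    print(f"{name:28s} {counts[name]:6d}  computed {share:5.2f}%  reported {reported_pct[name]}%")
print("Total SNPs:", total)
```

Under this assumption the computed shares agree with the reported ones to within 0.1 percentage point; the small differences are consistent with rounding or truncation in the table.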
SNP chip Sample QC Summary.
| Number of input samples | 1,728 |
| Samples passing DQC | 1,722 |
| Samples passing DQC and QC CR | 1,641 |
| Samples passing DQC, QC CR and Plate QC | 1,641 (94.9%) |
| Number of failing samples | 87 |
| Number of Samples Genotyped | 1,641 |
| Average QC CR for the passing samples | 99.66 |
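The sample QC figures are internally consistent, as the arithmetic sketch below shows. It uses only the numbers from the table; the split of failures between the DQC step and the call-rate step is derived here, not stated in the source.

```python
# Values copied from the SNP chip sample QC summary.
n_input = 1728      # number of input samples
n_pass_dqc = 1722   # samples passing DQC
n_pass_all = 1641   # samples passing DQC, QC CR and plate QC

# Derived quantities (not listed in the table).
n_fail_dqc = n_input - n_pass_dqc       # samples lost at the DQC step
n_fail_cr = n_pass_dqc - n_pass_all     # samples lost at the QC call-rate step
n_fail = n_input - n_pass_all           # total failing samples
pass_rate = 100 * n_pass_all / n_input  # overall pass rate in percent

print(f"Failed DQC: {n_fail_dqc}, failed QC CR: {n_fail_cr}, total failed: {n_fail}")
print(f"Overall pass rate: {pass_rate:.2f}%")
```

The total of 87 failing samples matches the table, and the computed pass rate of about 94.97% corresponds to the 94.9% shown there, presumably after truncation.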
Figure 1. Minor allele frequency distribution of the polymorphic high-resolution SNPs in the SNP chip.
Figure 2. Number of SNPs per chromosome and SNP density distribution (SNPs per 100K nucleotides).
Figure 3. Percentage of SNP effects by gene region.
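Selected SNP markers explaining the largest proportion of genetic variance (>5%) for muscle yield on chromosome 14 using windows of 50 adjacent SNPs.
| Variance (%) | Chromosome | SNP position | Distance to next SNP | Strand | Gene ID | Gene annotation | Region/effect |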
| 5.95 | 14 | 60291342 | 16113 | + | etfdh | Electron Transfer Flavoprotein Dehydrogenase | CDS/nonSyn |
| 7.86 | 14 | 60307455 | 366 | + | etfdh | Electron Transfer Flavoprotein Dehydrogenase | CDS/syn |
| 10.36 | 14 | 60307821 | 8 | + | etfdh | Electron Transfer Flavoprotein Dehydrogenase | 3′UTR |
| 10.79 | 14 | 60307829 | 2256 | + | etfdh | Electron Transfer Flavoprotein Dehydrogenase | 3′UTR |
| 10.84 | 14 | 60310085 | 163538 | – | ppid | Peptidylprolyl Isomerase D | CDS/nonSyn |
| 10.90 | 14 | 60473623 | 421210 | + | rapgef2 | Rap Guanine Nucleotide Exchange Factor 2 | 3′UTR |
| 10.96 | 14 | 60894833 | 295302 | NA | NA | NA | NA |
| 11.00 | 14 | 61190135 | 558 | + | LOC110488945 | Prominin-1-A | 3′UTR |
| 11.00 | 14 | 61190693 | 7552 | + | LOC110488945 | Prominin-1-A | 3′UTR |
| 2.24 | 14 | 61198245 | 76178 | + | LOC110488947 | Fibroblast Growth Factor-Binding Protein 1 | CDS/nonSyn |
| 12.23 | 14 | 61274423 | 13691 | – | LOC110488948 | Cyclin-A2 | 3′UTR |
| 12.29 | 14 | 61288114 | 762 | – | LOC110488950 | Transmembrane Protein 33 | 3′UTR |
| 12.35 | 14 | 61288876 | 528124 | – | LOC110488950 | Transmembrane Protein 33 | CDS/syn |
| 12.36 | 14 | 61817000 | 18067 | – | LOC110488956 | Protein Farnesyltransferase/Geranylgeranyltransferase Type-1 Subunit Alpha | 3′UTR |
| 12.30 | 14 | 61835067 | 6866 | – | LOC110488957 | Glutathione S-Transferase P | 3′UTR |
| 12.26 | 14 | 61841933 | 319532 | – | LOC110488957 | Glutathione S-Transferase P | CDS/syn |
| 12.24 | 14 | 62161465 | 1101 | NA | NA | NA | NA |
| 12.45 | 14 | 62162566 | 79441 | NA | NA | NA | NA |
| 12.71 | 14 | 62242007 | 38699 | – | LOC110488962 | Inositol Polyphosphate 5-Phosphatase Ocrl-1 | CDS/nonSyn |
| 12.71 | 14 | 62280706 | 12616 | + | LOC110488963 | Chloride Intracellular Channel Protein 2 | 3′UTR |
| 12.70 | 14 | 62293322 | 4394 | + | LOC110488964 | C1Galt1-Specific Chaperone 1 | 3′UTR |
| 12.65 | 14 | 62297716 | 9021 | – | mcts1 | Mcts1, Re-Initiation And Release Factor | 3′UTR |
| 11.85 | 14 | 62306737 | 36808 | – | mcts1 | Mcts1, Re-Initiation And Release Factor | 5′UTR |
| 11.84 | 14 | 62343545 | 586 | + | lamp2 | Lysosomal Associated Membrane Protein 2 | CDS/nonSyn |
| 11.85 | 14 | 62344131 | 2211 | + | lamp2 | Lysosomal Associated Membrane Protein 2 | CDS/nonSyn |
| 11.78 | 14 | 62346342 | 306 | + | lamp2 | Lysosomal Associated Membrane Protein 2 | Intronic |
| 11.78 | 14 | 62346648 | 579 | + | lamp2 | Lysosomal Associated Membrane Protein 2 | Intronic |
| 11.77 | 14 | 62347227 | 29198 | + | lamp2 | Lysosomal Associated Membrane Protein 2 | Intronic |
| 11.66 | 14 | 62376425 | 304 | + | tmem255a | Transmembrane Protein 255A | CDS/syn |
| 10.92 | 14 | 62376729 | 3620 | + | tmem255a | Transmembrane Protein 255A | CDS/syn |
| 10.87 | 14 | 62380349 | 282 | + | tmem255a | Transmembrane Protein 255A | 3′UTR |
| 10.86 | 14 | 62380631 | 31094 | + | tmem255a | Transmembrane Protein 255A | 3′UTR |
| 10.86 | 14 | 62411725 | 1632 | + | upf3b | Upf3B, Regulator Of Nonsense Mediated Mrna Decay | CDS/nonSyn |
| 10.86 | 14 | 62413357 | 1931 | + | upf3b | Upf3B, Regulator Of Nonsense Mediated Mrna Decay | 3′UTR |
| 10.92 | 14 | 62415288 | 26359 | + | LOC110488974 | 60S Ribosomal Protein L39 | 3′UTR |
| 10.90 | 14 | 62441647 | 10087 | + | LOC110488975 | Septin-6 | CDS/syn |
| 10.88 | 14 | 62451734 | 10231 | + | LOC110488975 | Septin-6 | CDS/syn |
| 10.88 | 14 | 62461965 | 6983 | + | LOC110488975 | Septin-6 | 3′UTR |
| 10.75 | 14 | 62468948 | 89647 | NA | NA | NA | NA |
| 10.72 | 14 | 62558595 | 7052 | + | LOC110488979 | Ets-Related Transcription Factor Elf-1 | 3′UTR |
| 10.66 | 14 | 62565647 | 66310 | – | LOC110488980 | Tenomodulin | 3′UTR |
| 10.67 | 14 | 62631957 | 1503911 | – | LOC110488980 | Tenomodulin | CDS/nonSyn |
| 10.83 | 14 | 64135868 | 6948 | + | gla | Galactosidase Alpha | CDS/nonSyn |
| 9.18 | 14 | 64142816 | 2581 | – | LOC110488986 | 60S Ribosomal Protein L36A | CDS/syn |
| 7.03 | 14 | 64145397 | 20716 | – | LOC110488986 | 60S Ribosomal Protein L36A | CDS/nonSyn |
| 5.17 | 14 | 64166113 | NA | + | btk | Bruton Tyrosine Kinase | CDS/nonSyn |
Color intensities reflect changes in additive genetic variance (green is the highest and red is the lowest).
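The fourth column of the chromosome 14 table appears to be the distance in base pairs from each SNP to the next listed SNP. The sketch below checks that reading on the first five rows; the interpretation of the column is an inference from the numbers, not something stated in the table.

```python
# SNP positions (third column) for the first five chromosome 14 rows.
positions = [60291342, 60307455, 60307821, 60307829, 60310085]

# Fourth-column values reported for the first four of those rows.
reported_gaps = [16113, 366, 8, 2256]

# Distance from each SNP to the next one in the list.
gaps = [nxt - cur for cur, nxt in zip(positions, positions[1:])]

for pos, gap, rep in zip(positions, gaps, reported_gaps):
    print(f"{pos}: computed gap {gap}, reported {rep}")
```

The computed gaps match the reported values for these rows. The last SNP of the table has no successor, so it has no distance value.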
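Selected SNP markers explaining the largest proportion of genetic variance (>5%) for muscle yield on chromosome 16 using windows of 50 adjacent SNPs.
| Variance (%) | Chromosome | SNP position | Distance to next SNP | Strand | Gene ID | Gene annotation | Region/effect |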
| 4.62 | 16 | 39953311 | 12000 | + | tnfrsf5a | Tnf Receptor Superfamily Member 5A Precursor | 5'UTR |
| 5.09 | 16 | 39965311 | 3 | + | tnfrsf5a | Tnf Receptor Superfamily Member 5A Precursor | CDS/nonSyn |
| 6.03 | 16 | 39965314 | 689 | + | tnfrsf5a | Tnf Receptor Superfamily Member 5A Precursor | CDS/nonSyn |
| 6.83 | 16 | 39966003 | 608 | + | tnfrsf5a | Tnf Receptor Superfamily Member 5A Precursor | CDS/nonSyn |
| 7.79 | 16 | 39966611 | 666 | + | tnfrsf5a | Tnf Receptor Superfamily Member 5A Precursor | 3'UTR |
| 8.47 | 16 | 39967277 | 149527 | + | tnfrsf5a | Tnf Receptor Superfamily Member 5A Precursor | 3'UTR |
| 8.76 | 16 | 40116804 | 438 | NA | NA | NA | NA |
| 9.06 | 16 | 40117242 | 5021 | – | LOC110492067 | Kelch Protein 21 | CDS/syn |
| 9.04 | 16 | 40122263 | 471 | – | LOC110492067 | Kelch Protein 21 | CDS/syn |
| 9.04 | 16 | 40122734 | 206269 | – | LOC110492067 | Kelch Protein 21 | CDS/syn |
| 8.97 | 16 | 40329003 | 423 | + | LOC110492070 | 45 Kda Calcium-Binding Protein | 3'UTR |
| 8.88 | 16 | 40329426 | 430961 | + | LOC110492070 | 45 Kda Calcium-Binding Protein | 3'UTR |
| 8.88 | 16 | 40760387 | 133719 | + | LOC100136676 | Caspase-9 | CDS/syn |
| 8.87 | 16 | 40894106 | 16043 | + | LOC110491067 | Basement Membrane-Specific Heparan Sulfate Proteoglycan Core Protein | CDS/syn |
| 8.81 | 16 | 40910149 | 15660 | + | LOC110491067 | Basement Membrane-Specific Heparan Sulfate Proteoglycan Core Protein | CDS/nonSyn |
| 8.81 | 16 | 40925809 | 134 | NA | NA | NA | NA |
| 8.82 | 16 | 40925943 | 328 | NA | NA | NA | NA |
| 8.88 | 16 | 40926271 | 36300 | NA | NA | NA | NA |
| 8.88 | 16 | 40962571 | 1603 | + | LOC110492082 | Cdp-Diacylglycerol–Serine O-Phosphatidyltransferase | 3'UTR |
| 8.88 | 16 | 40964174 | 1011 | NA | NA | NA | NA |
| 8.89 | 16 | 40965185 | 134 | NA | NA | NA | NA |
| 8.89 | 16 | 40965319 | 214946 | NA | NA | NA | NA |
| 9.33 | 16 | 41180265 | 15995 | + | LOC110492084 | Membrane-Associated Guanylate Kinase, Ww And Pdz Domain-Containing Protein 3 | CDS/syn |
| 9.37 | 16 | 41196260 | 49825 | – | LOC110492085 | Tyrosine-Protein Phosphatase Non-Receptor Type 12 | CDS/syn |
| 9.82 | 16 | 41246085 | 3112 | + | LOC100136105 | Complement Receptor | CDS/syn |
| 9.82 | 16 | 41249197 | 474 | + | LOC100136105 | Complement Receptor | 3'UTR |
| 9.83 | 16 | 41249671 | 30475 | + | LOC100136105 | Complement Receptor | 3'UTR |
| 9.95 | 16 | 41280146 | 574 | + | c4bp | C4B-Binding Protein Alpha Chain | 3'UTR |
| 9.93 | 16 | 41280720 | 774 | + | c4bp | C4B-Binding Protein Alpha Chain | 3'UTR |
| 10.15 | 16 | 41281494 | 229 | NA | NA | NA | NA |
| 10.29 | 16 | 41281723 | 24001 | NA | NA | NA | NA |
| 10.33 | 16 | 41305724 | 20095 | – | LOC110492088 | Uncharacterized Loc110492088 | NA |
| 10.36 | 16 | 41325819 | 685099 | – | cd34a | Cd34A Molecule | 3'UTR |
| 10.44 | 16 | 42010918 | 5137 | – | slc26a9 | Solute Carrier Family 26 Member 9 | CDS/nonSyn |
| 10.45 | 16 | 42016055 | 176696 | – | slc26a9 | Solute Carrier Family 26 Member 9 | CDS/syn |
| 10.49 | 16 | 42192751 | 41683 | – | LOC110492098 | Cysteine/Serine-Rich Nuclear Protein 2 | CDS/syn |
| 9.58 | 16 | 42234434 | 23274 | + | LOC110492102 | Daz-Associated Protein 2 | 3'UTR |
| 9.68 | 16 | 42257708 | 1026 | – | LOC110492103 | Rac Gtpase-Activating Protein 1 | 3'UTR |
| 9.45 | 16 | 42258734 | 38505 | – | LOC110492103 | Rac Gtpase-Activating Protein 1 | 3'UTR |
| 8.01 | 16 | 42297239 | 2891 | + | LOC110492108 | Citrate Synthase, Mitochondrial | CDS/nonSyn |
| 8.01 | 16 | 42300130 | 5927 | + | LOC110492108 | Citrate Synthase, Mitochondrial | CDS/nonSyn |
| 7.38 | 16 | 42306057 | 101 | + | LOC110492108 | Citrate Synthase, Mitochondrial | 3'UTR |
| 6.57 | 16 | 42306158 | 92 | + | LOC110492108 | Citrate Synthase, Mitochondrial | 3'UTR |
| 6.01 | 16 | 42306250 | 1 | + | LOC110492108 | Citrate Synthase, Mitochondrial | 3'UTR |
| 5.19 | 16 | 42306251 | 60 | + | LOC110492108 | Citrate Synthase, Mitochondrial | 3'UTR |
| 4.54 | 16 | 42306311 | 303 | + | LOC110492108 | Citrate Synthase, Mitochondrial | 3'UTR |
| 3.90 | 16 | 42306614 | 605 | + | LOC110492108 | Citrate Synthase, Mitochondrial | 3'UTR |
| 3.19 | 16 | 42307219 | 57 | + | LOC110492108 | Citrate Synthase, Mitochondrial | 3'UTR |
| 2.53 | 16 | 42307276 | NA | + | LOC110492108 | Citrate Synthase, Mitochondrial | 3'UTR |
Color intensities reflect changes in additive genetic variance (green is the highest and red is the lowest).
Figure 4. Manhattan plot of the GWA analysis performed with WssGBLUP, showing the association between sliding genomic windows of 50 SNPs and muscle yield. Chromosomes 14 and 16 showed the highest peaks, with genomic loci together explaining up to 23.2% of the genetic variance. The blue line marks the threshold of 1% of additive genetic variance explained by SNPs.
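To illustrate the idea of sliding windows of 50 adjacent SNPs, the sketch below sums hypothetical per-SNP variance contributions over every window and expresses each sum as a percentage of the total. This is a simplification on toy data, not the WssGBLUP computation used in the study; the function name `window_variance_pct` and the toy values are assumptions made for illustration. The last lines check that the two reported peaks (12.71% and 10.49%) add up to the 23.2% quoted in the caption.

```python
def window_variance_pct(snp_var, window=50):
    """Percent of total variance in each sliding window of adjacent SNPs.

    Simplified sketch: treats per-SNP variance contributions as additive.
    """
    if len(snp_var) < window:
        raise ValueError("need at least `window` SNPs")
    total = sum(snp_var)
    running = sum(snp_var[:window])
    result = [100 * running / total]
    # Slide the window one SNP at a time, updating the running sum.
    for i in range(window, len(snp_var)):
        running += snp_var[i] - snp_var[i - window]
        result.append(100 * running / total)
    return result


# Toy data (hypothetical): 200 SNPs, ten of which carry a larger effect.
toy = [1.0] * 200
for i in range(100, 110):
    toy[i] = 10.0

pct = window_variance_pct(toy, window=50)
peak = max(pct)
print(f"{len(pct)} windows, peak window explains {peak:.2f}% of the toy variance")

# Sum of the two reported peak windows (chromosomes 14 and 16).
combined = 12.71 + 10.49
print(f"Combined chr14 + chr16: {combined:.1f}%")
```

A window-based summary like this smooths over single-marker noise, which is one reason adjacent SNPs in the tables above show nearly identical variance values.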
Figure 5. Correlation between muscle yield and CS activity in 96 samples. (A) The coefficient of determination (R²) between muscle yield and CS activity was 0.092 (p-value 0.002). (B) CS showed a 1.43-fold increase in the high-ranked fish compared with the low-ranked ones.
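The R² value in panel A can be related to a correlation coefficient and a test statistic. The sketch below assumes a simple linear regression with a single predictor, in which R² equals the squared Pearson correlation, and assumes a positive sign for the correlation (consistent with the increase reported in panel B). Neither assumption is stated explicitly in the caption.

```python
import math

r_squared = 0.092  # reported R² between muscle yield and CS activity
n = 96             # reported number of samples

# Assumption: one-predictor linear regression, so |r| = sqrt(R²).
r = math.sqrt(r_squared)

# t statistic for testing r = 0 with n - 2 degrees of freedom.
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r_squared)

print(f"r = {r:.3f}, t = {t_stat:.2f} with {n - 2} degrees of freedom")
```

Under these assumptions the correlation is modest (about 0.30), so CS activity accounts for only a small share of the variation in muscle yield even though the association is statistically detectable.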