Fazli Alpay, Yalda Zare, Mamat H Kamalludin, Xixia Huang, Xianwei Shi, George E Shook, Michael T Collins, Brian W Kirkpatrick.
Abstract
Paratuberculosis, or Johne's disease, is a chronic, granulomatous, gastrointestinal tract disease of cattle and other ruminants caused by the bacterium Mycobacterium avium subspecies paratuberculosis (MAP). Control of Johne's disease is based on programs of testing and culling animals positive for infection with MAP while concurrently modifying management to reduce the likelihood of infection. The current study is motivated by the hypothesis that genetic variation in host susceptibility to MAP infection can be dissected and quantifiable associations with genetic markers identified. For this purpose, a case-control, genome-wide association study was conducted using US Holstein cattle phenotyped for MAP infection using a serum ELISA and/or fecal culture test. Cases included cows positive for serum ELISA, fecal culture, or both. Controls consisted of animals negative for the serum ELISA test, or negative for both serum ELISA and fecal culture when both were available. Controls were matched with cases by herd and proximal birth date. A total of 856 cows (451 cases and 405 controls) were used in initial discovery analyses, and an additional 263 cows (159 cases and 104 controls) from the same herds were used as a validation data set. Data were analyzed in a single marker analysis controlling for relatedness of individuals (GRAMMAR-GC) and also in a Bayesian analysis in which multiple marker effects were estimated simultaneously (GenSel). For the latter, effects of non-overlapping 1 Mb marker windows across the genome were estimated. Results from the two discovery analyses were generally concordant; however, discovery results were generally not well supported in analysis of the validation data set. A combined analysis of discovery and validation data sets provided the strongest support for SNPs and 1 Mb windows on chromosomes 1, 2, 6, 7, 17 and 29.
Year: 2014 PMID: 25473852 PMCID: PMC4256300 DOI: 10.1371/journal.pone.0111704
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
SNP associations from GRAMMAR-GC analysis.
| SNP | BTA | Position (bp) | Alleles | MAF | Discovery nominal p-value | Discovery FDR | Discovery effect ± SE | Validation nominal p-value | Validation effect ± SE | Combined nominal p-value | Combined FDR | Combined effect ± SE |
| ARS-BFGL-NGS-7756 | 7 | 70,988,849 | G/A | 0.47 | 3.53×10⁻⁵ | 0.59 | 0.088±0.022 | 0.14 | −0.056±0.038 | 2.32×10⁻⁵ | 0.78 | 0.078±0.019 |
| ARS-BFGL-NGS-36375 | 2 | 15,709,188 | A/G | 0.28 | 2.16×10⁻² | 1.03 | 0.054±0.024 | 3.08×10⁻⁶ | 0.194±0.041 | 2.84×10⁻⁵ | 0.48 | 0.086±0.021 |
| ARS-BFGL-NGS-43717 | 17 | 9,392,845 | A/G | 0.36 | 9.79×10⁻⁴ | 0.89 | 0.070±0.022 | 0.02 | 0.093±0.039 | 4.86×10⁻⁵ | 0.35 | 0.076±0.019 |
| ARS-BFGL-NGS-2069 | 6 | 526,736 | G/A | 0.23 | 1.39×10⁻⁴ | 1.16 | 0.085±0.023 | 0.08 | 0.061±0.035 | 5.35×10⁻⁵ | 0.37 | 0.078±0.019 |
| Hapmap38264-BTA-96587 | 7 | 102,398,654 | C/A | 0.46 | 5.31×10⁻⁴ | 0.89 | 0.074±0.022 | 0.04 | 0.077±0.038 | 5.78×10⁻⁵ | 0.39 | 0.075±0.019 |
| ARS-BFGL-NGS-12309 | 29 | 32,671,085 | G/A | 0.45 | 2.85×10⁻⁴ | 0.96 | −0.077±0.022 | 0.04 | 0.078±0.038 | 6.06×10⁻⁵ | 0.34 | −0.074±0.019 |
| ARS-BFGL-NGS-110386 | 15 | 66,653,797 | G/A | 0.22 | 1.62×10⁻⁵ | 0.54 | 0.110±0.027 | 0.59 | −0.024±0.045 | 4.18×10⁻⁴ | 0.87 | 0.079±0.023 |
FDR: false discovery rate.
Figure 1. Manhattan plot for single marker (GRAMMAR-GC/GenABEL) analysis of combined discovery and validation data sets.
Each dot represents the results from the test of association for a single SNP. Minus log10 of the p-value is indicated on the y-axis and map location of the SNP is indicated on the x-axis.
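The y-axis transform described in this caption can be sketched in a few lines. The snippet below is only an illustration: the p-values are the combined-analysis nominal p-values for three SNPs from the GRAMMAR-GC table, and the helper name `minus_log10` is a hypothetical one chosen here, not something taken from the study's software.

```python
import math

# Combined-analysis nominal p-values copied from the GRAMMAR-GC table.
combined_p = {
    "ARS-BFGL-NGS-7756": 2.32e-5,
    "ARS-BFGL-NGS-36375": 2.84e-5,
    "ARS-BFGL-NGS-110386": 4.18e-4,
}


def minus_log10(p: float) -> float:
    """Return -log10(p), the value plotted on a Manhattan plot's y-axis.

    Hypothetical helper for illustration only.
    """
    if not 0.0 < p <= 1.0:
        raise ValueError("p-value must lie in (0, 1]")
    return -math.log10(p)


# Smaller p-values map to taller points on the plot.
manhattan_y = {snp: minus_log10(p) for snp, p in combined_p.items()}
```

Because the transform is monotone decreasing in p, the SNP with the smallest combined p-value in the table appears as the highest of these three points.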
Most significant 1 Mb windows from Bayes C analysis of discovery and combined discovery and validation data.
| BTA | Starting location (bp) | Number of SNPs in 1 Mb window | Discovery percent of total SNP variance | Discovery p>0 | Combined percent of total SNP variance | Combined p>0 |
| 6 | 202,769 | 10 | 1.26 | 0.12 | 29.50 | 0.92 |
| 7 | 70,299,314 | 9 | 4.63 | 0.29 | 5.44 | 0.54 |
| 2 | 1,039,834 | 21 | 0.37 | 0.06 | 3.71 | 0.47 |
| 2 | 15,001,586 | 19 | 0.05 | 0.02 | 2.36 | 0.29 |
| 29 | 32,033,056 | 16 | 0.52 | 0.06 | 2.14 | 0.24 |
| 17 | 9,027,765 | 18 | 0.32 | 0.05 | 1.98 | 0.24 |
| 1 | 128,031,876 | 19 | 0.48 | 0.07 | 1.44 | 0.26 |
| 15 | 66,042,287 | 14 | 2.93 | 0.25 | 0.58 | 0.08 |
| 8 | 113,012,644 | 8 | 2.87 | 0.22 | 0.04 | 0.01 |
p>0: proportion of iterations in which the window accounted for a proportion of genetic variation greater than zero.
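To relate the single-SNP table to the window table, it can help to see how a SNP position maps to a non-overlapping 1 Mb window. The sketch below assumes windows are aligned to multiples of 1,000,000 bp from the start of each chromosome, which the text here does not state explicitly; under that assumption, the "Starting location" column would be the position of the first SNP falling in each window. The names `WINDOW_BP` and `window_key` are hypothetical.

```python
from collections import defaultdict

WINDOW_BP = 1_000_000  # 1 Mb, the window size stated in the abstract

# (SNP, chromosome (BTA), position in bp) copied from the GRAMMAR-GC table.
snps = [
    ("ARS-BFGL-NGS-2069", 6, 526_736),
    ("ARS-BFGL-NGS-7756", 7, 70_988_849),
    ("Hapmap38264-BTA-96587", 7, 102_398_654),
    ("ARS-BFGL-NGS-36375", 2, 15_709_188),
]


def window_key(bta: int, bp: int) -> tuple[int, int]:
    """Return (chromosome, window index) for a position.

    Assumption: windows are aligned to multiples of WINDOW_BP from position 0.
    """
    return (bta, bp // WINDOW_BP)


# Group SNP names by the window they fall into.
windows = defaultdict(list)
for name, bta, bp in snps:
    windows[window_key(bta, bp)].append(name)
```

Under this assumption, ARS-BFGL-NGS-7756 (BTA 7, 70,988,849 bp) lands in the same window as the starting location 70,299,314 bp listed for BTA 7 in the window table, and ARS-BFGL-NGS-2069 shares a window with the BTA 6 entry starting at 202,769 bp.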
Figure 2. Manhattan plot for 1 Mb window (Bayes C/GenSel) analysis of combined discovery and validation data sets.
Each dot represents the percent of genetic variance explained by multiple SNPs within a 1 Mb window. Percent of variance is indicated on the y-axis and map location of the SNP is indicated on the x-axis.
Figure 3. Receiver operating characteristic (ROC) curves for five-fold cross-validation and for application of results from discovery data to validation data.
Models were developed using a Bayes C analysis implemented in GenSel. Each curve represents one model, with the black line in the figure on the left representing the average of the five-fold cross-validations. Area under the ROC curve is equivalent to the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance. A diagonal line from the lower left to upper right corner would represent a model with no predictive ability.
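The ranking interpretation of the area under the ROC curve given in this caption can be checked directly on a small example. The sketch below computes the fraction of (case, control) pairs in which the case receives the higher score, counting ties as one half. The scores are made-up toy values for illustration only, not results from the study, and `auc_by_ranking` is a hypothetical helper name.

```python
from itertools import product


def auc_by_ranking(scores_pos: list[float], scores_neg: list[float]) -> float:
    """Estimate AUC as the probability that a positive outranks a negative.

    Every (positive, negative) pair is compared; ties contribute 0.5.
    """
    if not scores_pos or not scores_neg:
        raise ValueError("need at least one positive and one negative score")
    total = 0.0
    for pos, neg in product(scores_pos, scores_neg):
        if pos > neg:
            total += 1.0
        elif pos == neg:
            total += 0.5
    return total / (len(scores_pos) * len(scores_neg))


# Toy predicted scores (hypothetical, not from the study).
case_scores = [0.9, 0.7, 0.4]
control_scores = [0.6, 0.3, 0.4]
auc = auc_by_ranking(case_scores, control_scores)
```

A classifier with no predictive ability, the diagonal line in the figure, corresponds to a value of 0.5 under this definition, and a perfect ranking gives 1.0.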