Joseph L Gage, Natalia de Leon, Murray K Clayton.
Abstract
The increasing popularity of high-throughput phenotyping technologies, such as image-based phenotyping, offers novel ways to quantify plant growth and morphology. These new methods can be more or less accurate and precise than traditional, manual measurements. Many large-scale phenotyping efforts are conducted to enable genome-wide association studies (GWAS), but it is unclear exactly how alternative methods of phenotyping will affect GWAS results. In this study we simulate phenotypes that are controlled by the same set of causal loci but have differing heritability, similar to two different measurements of the same morphological character. We then perform GWAS with the simulated traits and create receiver operating characteristic (ROC) curves from the results. The areas under the ROC curves (AUCs) provide a metric that allows direct comparison of GWAS results from different simulated traits. We use this framework to evaluate the effects of heritability and the number of causative loci on the AUCs of simulated traits; we also test the differences between AUCs of traits with differing heritability. We find that both increasing the number of causative loci and decreasing the heritability reduce a trait's AUC. We also find that when two traits are controlled by a greater number of causative loci, they are more likely to have significantly different AUCs as the difference between their heritabilities increases. When simulation results are applied to measures of tassel morphology, we find no significant difference between AUCs from GWAS using manual and image-based measurements of typical maize tassel characters. This finding indicates that both measurement methods have similar ability to identify genetic associations. These results provide a framework for deciding between competing phenotyping strategies when the ultimate goal is to generate and use phenotype-genotype associations from GWAS.
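The simulation idea in the abstract (two traits that share causal loci but differ in heritability) can be illustrated with a minimal sketch. It is not the authors' simulation code: the random genotype matrix, the normally distributed effect sizes, the sample sizes, and the function names `genetic_values` and `add_noise` are all assumptions made for illustration. The only relationship used is the standard definition of heritability as genetic variance divided by total phenotypic variance.

```python
import numpy as np

def genetic_values(genotypes, causal_idx, rng):
    """Sum of causal-locus effects per individual (effects ~ N(0, 1), an assumption)."""
    effects = rng.normal(size=len(causal_idx))
    return genotypes[:, causal_idx] @ effects

def add_noise(genetic, h2, rng):
    """Add residual noise so that var(genetic) / var(phenotype) is roughly h2."""
    var_g = genetic.var()
    # Solve h2 = var_g / (var_g + var_e) for the residual variance var_e.
    var_e = var_g * (1.0 - h2) / h2
    return genetic + rng.normal(scale=np.sqrt(var_e), size=genetic.shape[0])

rng = np.random.default_rng(0)
n_individuals, n_markers, n_causal = 2000, 500, 10   # hypothetical sizes

# Hypothetical genotypes coded 0/1/2; real marker data would replace this.
genotypes = rng.integers(0, 3, size=(n_individuals, n_markers)).astype(float)
causal_idx = rng.choice(n_markers, size=n_causal, replace=False)

# One set of genetic values, two "measurements" with different noise levels.
genetic = genetic_values(genotypes, causal_idx, rng)
trait_high = add_noise(genetic, h2=0.9, rng=rng)
trait_low = add_noise(genetic, h2=0.5, rng=rng)

print("realized h2 (high):", genetic.var() / trait_high.var())
print("realized h2 (low): ", genetic.var() / trait_low.var())
```

Because both traits are built from the same `genetic` vector, any difference in downstream GWAS performance between them would come from the noise level alone, which is the comparison the abstract describes.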
Keywords: AUC; GWAS; ROC; heritability
Year: 2018 PMID: 30262522 PMCID: PMC6222562 DOI: 10.1534/g3.118.200700
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. Correlations between manual and image-based phenotypic values. Scatter plots of best linear unbiased predictors (BLUPs) for manual vs. image-based measurements of tassel length (TL; A), spike length (SL; B), branch number (BN; C), and tassel weight (TW; D). Manually measured BLUPs are along the x-axis, while image-based measurements are on the y-axis. Values in the upper left corner of each plot are the Pearson correlation coefficients for each trait.
Comparison of manual and image-based trait heritabilities
| Trait | Abbreviation (manual) | Abbreviation (image-based) | Heritability (manual) | Heritability (image-based) |
|---|---|---|---|---|
| Tassel length | TL | TLp | 0.95 | 0.79 |
| Spike length | SL | SLp | 0.95 | 0.79 |
| Branch number | BN | BNp | 0.97 | 0.82 |
| Tassel weight | TW | TWp | 0.96 | 0.86 |
Heritabilities for four different tassel morphological traits, measured both manually and using image-based methods. TL, SL, and BN were measured in three environments, whereas TW, TLp, SLp, BNp, and TWp were measured in one environment.
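The heritabilities in the table give, for each trait, the absolute difference in heritability between the manual and image-based measurements (the quantity called D in the Figure 3 caption). A small sketch of that arithmetic, using only the values in the table; the dictionary layout and variable names are illustrative:

```python
# Heritabilities from the table: trait -> (manual, image-based).
heritability = {
    "TL": (0.95, 0.79),
    "SL": (0.95, 0.79),
    "BN": (0.97, 0.82),
    "TW": (0.96, 0.86),
}

# D = |h2_manual - h2_image|, rounded to the table's two decimal places.
abs_diff = {
    trait: round(abs(manual - image), 2)
    for trait, (manual, image) in heritability.items()
}

for trait, d in abs_diff.items():
    print(f"{trait}: D = {d:.2f}")
```

With the table's values, D is 0.16 for TL and SL, 0.15 for BN, and 0.10 for TW.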
Figure 2. Receiver operating characteristic curves for GWAS results of simulated traits controlled by 10 (A), 100 (B), or 1000 (C) causal loci. For each number of causal loci, simulations of traits with heritabilities ranging from 0.1 to 0.9 were replicated 10 times each. Each curve represents the average of the 10 replications for each combination of causal loci and heritability. TPR: true positive rate; FPR: false positive rate; h²: heritability.
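For readers unfamiliar with how a ROC curve and its AUC are derived from GWAS output, the following is a minimal sketch. It assumes that each marker has a p-value and a known causal/non-causal label, that a marker counts as a true positive only when it is itself causal, and that p-values have no ties. This excerpt does not say how the study defined true positives, so these are simplifications, and the function names are hypothetical.

```python
import numpy as np

def roc_from_pvalues(pvalues, is_causal):
    """Trace TPR and FPR as the significance threshold is relaxed marker by marker."""
    order = np.argsort(pvalues)                       # most significant first
    hits = np.asarray(is_causal, dtype=bool)[order]
    tpr = np.concatenate([[0.0], np.cumsum(hits) / hits.sum()])
    fpr = np.concatenate([[0.0], np.cumsum(~hits) / (~hits).sum()])
    return fpr, tpr

def area_under_curve(fpr, tpr):
    """Trapezoidal area under the ROC curve."""
    return float(np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2.0))

# Toy example: the two causal markers have the two smallest p-values.
pvalues = np.array([0.001, 0.2, 0.03, 0.8, 0.5])
is_causal = np.array([True, False, True, False, False])

fpr, tpr = roc_from_pvalues(pvalues, is_causal)
print("AUC:", area_under_curve(fpr, tpr))
```

In the toy example every causal marker outranks every non-causal marker, so the AUC is 1.0; a GWAS that ranked markers no better than chance would give an AUC near 0.5.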
Figure 3. Results of testing area under the curve for simulated phenotypes. The Z score for testing the difference of two AUCs is plotted against the absolute difference in heritability (D) between the two traits. Traits are controlled by 10 (A), 100 (B), or 1000 (C) causal loci. Small gray dots represent the Z score from a single pairwise test between simulated traits, while horizontal gray bars represent the median Z score for a given D. Larger colored dots represent the D estimates for real traits, plotted along the line that best fits the Z scores of the simulated data. Dashed lines represent the thresholds for significance at α = 0.05 (i.e., 2.5th and 97.5th percentiles), calculated from the empirical distribution of Z when D = 0. BN: branch number; SL: spike length; TL: tassel length; TW: tassel weight.
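The empirical significance thresholds described in the caption (the 2.5th and 97.5th percentiles of Z when D = 0) can be sketched as follows. The Z scores here are random placeholders standing in for the simulated pairwise tests; this excerpt does not describe how each Z score is computed, so they are taken as given. The function names are hypothetical.

```python
import numpy as np

def empirical_thresholds(null_z, alpha=0.05):
    """Two-sided thresholds from the empirical distribution of Z at D = 0."""
    lower, upper = np.percentile(null_z, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return float(lower), float(upper)

def is_significant(z, thresholds):
    """A Z score is significant if it falls outside the empirical thresholds."""
    lower, upper = thresholds
    return bool(z < lower or z > upper)

rng = np.random.default_rng(1)
# Placeholder null distribution: stands in for Z scores from trait pairs with D = 0.
null_z = rng.normal(size=10_000)

thresholds = empirical_thresholds(null_z)
print("thresholds:", thresholds)
print("Z = 0.4 significant?", is_significant(0.4, thresholds))
```

Taking thresholds from the simulated null rather than from a theoretical normal distribution means the cutoff reflects whatever spread the Z scores actually show when the two traits have equal heritability.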