Zhijian Yang, Wenzheng Xu, Ranran Zhai, Ting Li, Zheng Ning, Yudi Pawitan, Xia Shen.
Abstract
By integrating genome-wide association studies (GWAS) with transcriptomic data, different methods have linked human complex traits and diseases to relevant tissues and cell types. However, these methods often produce different results, and because no gold standard is currently accepted, the discoveries are difficult to evaluate. Here, by applying three methods to the same data source, we estimated the sensitivity and specificity of each method in the absence of a gold standard. We then established a more specific tissue-trait association atlas by combining the information captured by the different methods. Our triangulation strategy improves the performance of existing methods in establishing tissue-trait associations. The results provide better etiological and functional insights into the tissues underlying different human complex traits and diseases.
Keywords: genome-wide association studies (GWAS); likelihood inference; omics data integration; tissue specificity; tissue-trait association; transcriptomics
Year: 2022 PMID: 35444688 PMCID: PMC9014299 DOI: 10.3389/fgene.2022.798269
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.772
FIGURE 1. Simulations assessing the maximum likelihood estimation of operating characteristics in the absence of a gold standard. Three methods (represented by squares, circles, and triangles, respectively) with known true sensitivities and specificities (red dots) were simulated and applied to different total numbers of tests (n) with a pre-defined prevalence of true positives. The estimated (blue dots) prevalence (ρ), sensitivity (ϕ), and specificity (ψ′) parameters are visualized for 100 simulation repeats. TPR: true positive rate. FPR: false positive rate.
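The estimation summarized in Figure 1 can be illustrated with a two-class latent class model fitted by expectation-maximization (EM). The following is a minimal sketch, not the authors' implementation: it assumes that the three methods are conditionally independent given the true (unobserved) association status, and the function name `fit_latent_class`, the starting values, and the stopping rule are all hypothetical choices made for illustration.

```python
import numpy as np

def fit_latent_class(calls, n_iter=1000, tol=1e-9):
    """EM estimate of prevalence, sensitivity, and specificity.

    calls: (n, m) array of 0/1 results from m methods on n tests.
    Assumes (for illustration) that methods are conditionally
    independent given the true status.
    """
    calls = np.asarray(calls, dtype=float)
    n, m = calls.shape
    rho = 0.5                 # starting prevalence (arbitrary)
    sens = np.full(m, 0.7)    # starting sensitivities (arbitrary, > 0.5)
    spec = np.full(m, 0.7)    # starting specificities (arbitrary, > 0.5)
    for _ in range(n_iter):
        # E-step: posterior probability that each test is a true positive
        like_pos = rho * np.prod(sens**calls * (1 - sens)**(1 - calls), axis=1)
        like_neg = (1 - rho) * np.prod((1 - spec)**calls * spec**(1 - calls), axis=1)
        post = like_pos / (like_pos + like_neg)
        # M-step: weighted proportions given the posterior memberships
        new_rho = post.mean()
        new_sens = (post[:, None] * calls).sum(axis=0) / post.sum()
        new_spec = ((1 - post)[:, None] * (1 - calls)).sum(axis=0) / (1 - post).sum()
        delta = max(abs(new_rho - rho),
                    np.abs(new_sens - sens).max(),
                    np.abs(new_spec - spec).max())
        rho, sens, spec = new_rho, new_sens, new_spec
        if delta < tol:
            break
    return rho, sens, spec
```

With three binary methods the model has seven free parameters and the observed call patterns supply seven degrees of freedom, so the estimates are just identified. Starting sensitivities and specificities above 0.5 steer the fit toward the labeling in which a positive call indicates a true association.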
FIGURE 2. ROC for three distinct methods. For the real data application testing 27 traits vs. 44 tissues, we estimated the operating characteristics of each method (eQTL, LDSC, and RolyPoly). Each point represents the mean of 99 bootstrap estimates. The whiskers give the standard errors of TPRs and FPRs based on the bootstrap estimates. Each point was evaluated under a particular p-value threshold: 0.01, 0.02, … , 0.09, 0.1, 0.2, … , 0.9. TPR: true positive rate. FPR: false positive rate.
FIGURE 3. Tissue-trait association scoring combining specificity estimates of three distinct methods. Associations between 27 traits and 44 tissues were quantified by an association score. The association score sums the binary results from the different methods, weighted by the specificity estimates. The p-value threshold is 0.05 for all three methods (eQTL, LDSC, and RolyPoly). * marks associations at the 0.05 level for the combined false discovery rate (FDR) of the association score.
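The scoring rule described in Figure 3 can be sketched as a weighted sum. This is an illustrative sketch only: it assumes the weight for each method is its specificity estimate itself (the caption does not give the exact weighting function), and the function name `association_score` and the numeric values in the usage note are hypothetical.

```python
import numpy as np

def association_score(calls, spec):
    """Specificity-weighted sum of binary calls.

    calls: (n_pairs, n_methods) array of 0/1 significance calls,
           one row per tissue-trait pair.
    spec:  (n_methods,) array of estimated specificities,
           used directly as weights (an assumption).
    """
    calls = np.asarray(calls, dtype=float)
    spec = np.asarray(spec, dtype=float)
    return calls @ spec
```

For example, with hypothetical specificities of 0.9, 0.8, and 0.7, a tissue-trait pair called significant by all three methods scores about 2.4, a pair called only by the first method scores 0.9, and a pair called by none scores 0.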
FIGURE 4. The overall performance of three methods and the combined results in terms of FDR vs. the number of claimed significant discoveries. We calculated the FDR for each method and the overall FDR for the combined result based on the estimated operating characteristics. Each point was evaluated under a particular p-value threshold: 0.01, 0.02, … , 0.09, 0.1, 0.2, … , 0.9. Each curve was fitted for the exponential model of Y = 1 − e −. FDR: false discovery rate.
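One standard way to obtain an FDR from estimated operating characteristics is Bayes' rule: the expected share of false positives among all positive calls is (1 − ρ)(1 − ψ) / [ρϕ + (1 − ρ)(1 − ψ)], where ρ is the prevalence, ϕ the sensitivity, and ψ the specificity. The sketch below implements this relation for a single method; it may differ from the exact computation used by the authors, and the function name `expected_fdr` is hypothetical.

```python
def expected_fdr(rho, sens, spec):
    """Expected false discovery rate of one method.

    rho:  prevalence of true associations
    sens: sensitivity (true positive rate)
    spec: specificity (1 - false positive rate)
    """
    false_pos = (1.0 - rho) * (1.0 - spec)  # P(called positive and truly null)
    true_pos = rho * sens                   # P(called positive and truly associated)
    total = false_pos + true_pos
    if total == 0.0:
        raise ValueError("the method makes no positive calls; FDR is undefined")
    return false_pos / total
```

As a check on the arithmetic, a prevalence of 0.5 with sensitivity and specificity both at 0.8 gives 0.1 / (0.1 + 0.4) = 0.2.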