Yin-Chun Lin, Yu-Jen Liang, Hsin-Chou Yang.
Abstract
Meta-analysis is a method for enhancing statistical power through the integration of information from multiple studies. Various methods for integrating p-values (i.e., statistical significance), including Fisher's method under an independence assumption, the permutation method, and the decorrelation method, have been broadly used in bioinformatics and computational biotechnology studies. However, these methods have limitations related to statistical assumptions, computing efficiency, and accuracy of statistical significance estimation. In this study, we proposed a numerical integration method and examined its theoretical properties. Simulation studies were conducted to evaluate its Type I error, statistical power, computational efficiency, and estimation accuracy, and the results were compared with those of other methods. The results demonstrate that our proposed method performs well in terms of Type I error, statistical power, computing efficiency (regardless of sample size), and statistical significance estimation accuracy. P-value data from multiple large-scale genome-wide association studies (GWASs) and transcriptome-wide association studies (TWASs) were analyzed. The results demonstrate that our proposed method can be used to identify critical genomic regions associated with rheumatoid arthritis and asthma, increase statistical significance in individual GWASs and TWASs, and control for false positives more effectively than Fisher's method under an independence assumption. We created the software package Pbine, available on GitHub (https://github.com/Yinchun-Lin/Pbine).
Keywords: Decorrelation; Fisher’s method; GWAS, Genome-Wide Association Study; Genome-wide association study; MHC, Major Histocompatibility Complex; Meta-analysis; NARAC, North American Rheumatoid Arthritis Consortium; P-value combination; Permutation; SNP, Single Nucleotide Polymorphism; TWAS, Transcriptome-Wide Association Study; Transcriptome-wide association study; WTCCC, Wellcome Trust Case Control Consortium
Year: 2022 PMID: 35860413 PMCID: PMC9283883 DOI: 10.1016/j.csbj.2022.06.055
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155
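The abstract uses Fisher's method under an independence assumption as its baseline. As a minimal sketch of that baseline only (not the numerical integration method proposed in the paper, and not the Pbine API), the combined statistic T = -2 Σ ln p_i can be referred to a chi-square distribution with 2k degrees of freedom, whose survival function has a closed form for even degrees of freedom. The function name `fisher_combine` is hypothetical.

```python
import math


def fisher_combine(p_values):
    """Combine k p-values with Fisher's method, assuming independence."""
    if not p_values or any(not 0.0 < p <= 1.0 for p in p_values):
        raise ValueError("p-values must lie in (0, 1]")
    k = len(p_values)
    # half_t is T/2, where T = -2 * sum(ln p_i) is Fisher's statistic.
    half_t = -sum(math.log(p) for p in p_values)
    # Chi-square survival function with 2k degrees of freedom:
    # exp(-T/2) * sum_{i=0}^{k-1} (T/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_t / i
        total += term
    return math.exp(-half_t) * total


# Two p-values of 0.05 combine to roughly 0.0175 when treated as independent.
print(fisher_combine([0.05, 0.05]))
```

If the two input p-values are in fact positively correlated, this combined value overstates the evidence, which is the limitation the abstract attributes to the independence assumption.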
Fig. 1. (A) Type I error for the p-value combination methods. The x-axis represents the correlation; the y-axis represents Type I error. (B) Statistical power for the p-value combination methods. The x-axis represents the correlation; the y-axis represents statistical power. (C) Computation time of the proposed method and the permutation method. The x-axis indicates the number of permutations in thousands (K); the y-axis indicates computation time. Red, green, and blue lines denote gene sizes G of 10,000, 20,000, and 30,000, respectively. Squares, triangles, and circles denote sample sizes N of 3,000, 5,000, and 7,000, respectively. Solid, dotted, and dot-dashed lines represent correlation coefficients of 0.1, 0.5, and 0.9, respectively. (D)–(F) Estimation accuracy of the p-value. The x-axis represents the p-values of the benchmark method; the y-axis indicates the p-values of the other p-value combination methods (plotted in brown, blue, and green). The results for correlation coefficients of 0.3, 0.5, and 0.7 are arranged from left to right.
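Panel (A) plots Type I error against correlation. The simulation design used in the paper is not given in this record, so the following is only an illustrative sketch under assumed settings: two null test statistics drawn from a bivariate normal distribution with correlation `rho`, two-sided p-values, and Fisher's method applied as if the p-values were independent. For two p-values, Fisher's combined p-value reduces to q(1 - ln q) with q = p1·p2. All function names and defaults here are hypothetical.

```python
import math
import random


def two_sided_p(z):
    """Two-sided p-value of a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2.0))


def fisher_two(p1, p2):
    """Fisher's combined p-value for two p-values, assuming independence."""
    q = p1 * p2
    return q * (1.0 - math.log(q))


def type_i_error(rho, n_sim=20000, alpha=0.05, seed=1):
    """Estimate the null rejection rate when the two statistics have correlation rho."""
    rng = random.Random(seed)
    scale = math.sqrt(1.0 - rho * rho)
    rejections = 0
    for _ in range(n_sim):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + scale * rng.gauss(0.0, 1.0)  # corr(z1, z2) = rho
        if fisher_two(two_sided_p(z1), two_sided_p(z2)) < alpha:
            rejections += 1
    return rejections / n_sim


for rho in (0.0, 0.5, 0.9):
    print(rho, type_i_error(rho))
```

Under these assumptions the estimated rate stays near the nominal 0.05 at `rho = 0` and rises as the correlation grows, which is the kind of inflation a correlation-aware method is meant to avoid.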
Fig. 2. Manhattan plots for chromosome 6 in the meta-GWAS. This meta-GWAS contained 4,963 SNPs on chromosome 6. Fisher’s method and our method were employed. Each point indicates a SNP. The x-axis indicates the physical position of a SNP; the y-axis indicates the p-value on a −log10 scale. The green lines indicate false-positive events identified by Fisher’s method but not by our method; they involved six SNPs: rs11752073, rs2394102, rs3130014, rs3818528, rs9394169, and rs12697946. The orange lines indicate false-positive events identified by both Fisher’s method and our method. The light blue line indicates the SNP known to be associated with rheumatoid arthritis that was identified by Fisher’s method and our method but not by either of the two individual studies (Fig. S4). The red dashed line indicates the significance level after Bonferroni correction for multiple testing.
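The height of the Bonferroni line in a Manhattan plot follows directly from the number of tests. The sketch below assumes a family-wise significance level of 0.05, which is not stated in this record; the function name `bonferroni_line` is hypothetical.

```python
import math


def bonferroni_line(n_tests, alpha=0.05):
    """Return the per-test threshold and its height on a -log10 scale.

    alpha = 0.05 is an assumed family-wise level, not taken from the source.
    """
    threshold = alpha / n_tests
    return threshold, -math.log10(threshold)


# 4,963 SNPs on chromosome 6, as stated in the caption.
threshold, height = bonferroni_line(4963)
print(threshold, height)
```

With 4,963 tests and the assumed level of 0.05, the per-test threshold is close to 1e-5, so the dashed line would sit just below 5 on the −log10 axis.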
Fig. 3. Manhattan plots for 22 autosomes in the meta-TWAS. This meta-TWAS contained 3,175 genes overlapping between the two studied TWAS datasets. Fisher’s method and our method were employed. Each point indicates a gene. The x-axis indicates the physical position of a gene in an autosome; the y-axis indicates the p-value on a −log10 scale. The 11 purple lines indicate the genes for which Fisher’s method reported genetic association but our method did not. All genes, ANKRD55, were false-positively detected. The red dashed line indicates the significance level after Bonferroni correction for multiple testing.
Our method performs better than Fisher’s method when combining more than two p-values. Ten asthma-related genes and five asthma-unrelated genes were analyzed in this meta-TWAS. Adjusted p-values of Fisher’s method, our method with equal weights, and our method with unequal weights are provided. All p-values were adjusted using Bonferroni’s correction for multiple testing. Numbers in bold are statistically significant.
| | Gene name | Chromosome | | | |
|---|---|---|---|---|---|
| Asthma-related genes | | Chr. 6 | | | |
| | | Chr. 6 | | | |
| | | Chr. 6 | 0.0689027 | | |
| | | Chr. 6 | | | |
| | | Chr. 6 | | | |
| | | Chr. 6 | | | |
| | | Chr. 6 | | | |
| | | Chr. 6 | | | |
| | | Chr. 6 | | | |
| | | Chr. 6 | | | |
| Asthma-unrelated genes | | Chr. 6 | 0.0986147 | 0.2299088 | |
| | | Chr. 11 | 0.0915084 | | |
| | | Chr. 20 | 0.0805022 | 0.1086624 | |
| | | Chr. 17 | 0.1123372 | | |
| | | Chr. 19 | 0.1422346 | 0.1601654 | |
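The table reports Bonferroni-adjusted p-values and marks the significant ones. As a brief sketch of that adjustment, each raw p-value is multiplied by the number of tests and capped at 1, then compared with the significance level. The raw p-values, the number of tests, and the level of 0.05 below are hypothetical illustrations, not values from the study.

```python
def bonferroni_adjust(p_values, n_tests, alpha=0.05):
    """Return (adjusted p-value, is_significant) pairs.

    n_tests and alpha are supplied by the caller; the defaults and the
    inputs used below are hypothetical and not taken from the source.
    """
    results = []
    for p in p_values:
        adjusted = min(1.0, p * n_tests)  # cap at 1 so the result stays a probability
        results.append((adjusted, adjusted < alpha))
    return results


# Hypothetical raw p-values tested among 1,000 hypotheses.
for adjusted, significant in bonferroni_adjust([1e-6, 1e-4, 0.02], n_tests=1000):
    print(adjusted, significant)
```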