Mohamed N Saad, Mai S Mabrouk, Ayman M Eldeib, Olfat G Shaker.
Abstract
The human genome, which includes thousands of genes, represents a big data challenge. Rheumatoid arthritis (RA) is a complex autoimmune disease with a genetic basis. Many single-nucleotide polymorphism (SNP) association methods partition a genome into haplotype blocks. The aim of this genome-wide association study (GWAS) was to select the most appropriate haplotype block partitioning method for the North American Rheumatoid Arthritis Consortium (NARAC) dataset. The methods applied to the NARAC dataset were the individual SNP approach and three haplotype block methods: the four-gamete test (FGT), the confidence interval test (CIT), and the solid spine of linkage disequilibrium (SSLD). The strength of the association between each biomarker and RA was measured by the P-value after Bonferroni correction, and additional parameters were used to compare the output of the haplotype block methods. This work compares the individual SNP approach with the three haplotype block methods in order to select the method that can detect all the significant SNPs when applied alone. The GWAS results from the NARAC dataset obtained with the different methods are presented. The individual SNP, CIT, FGT, and SSLD methods detected 541, 1516, 1551, and 1831 RA-associated SNPs, respectively, and the individual SNP, FGT, CIT, and SSLD methods detected 65, 156, 159, and 450 significant SNPs, respectively, that were not detected by the other methods. Three hundred eighty-three SNPs were discovered by all three haplotype block methods and the individual SNP approach, while 1021 SNPs were discovered by all three haplotype block methods. The 383 SNPs detected by all the methods are promising candidates for studying RA susceptibility. A hybrid technique involving all four methods should be applied to detect the significant SNPs associated with RA in the NARAC dataset, but the SSLD method may be preferred because of its advantages when only a single method is used.
Keywords: Confidence interval test; Four-gamete test; Genome-wide association study; NARAC; Rheumatoid arthritis; Solid spine of linkage disequilibrium
Year: 2019 PMID: 30891314 PMCID: PMC6403413 DOI: 10.1016/j.jare.2019.01.006
Source DB: PubMed Journal: J Adv Res ISSN: 2090-1224 Impact factor: 10.479
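The four-gamete test named in the abstract can be illustrated with a minimal Python sketch. It assumes phased, biallelic haplotypes encoded as strings of `0`/`1`, and it follows the common convention that a pair of SNPs shows evidence of historical recombination when all four two-SNP haplotypes (gametes) are observed. The function names, the encoding, the greedy left-to-right block extension, and the `min_freq` cutoff are illustrative assumptions, not the pipeline used in the study.

```python
def four_gametes_observed(haplotypes, i, j, min_freq=0.01):
    """Return True if all four two-SNP gametes at sites i and j reach min_freq."""
    counts = {}
    for hap in haplotypes:
        gamete = (hap[i], hap[j])
        counts[gamete] = counts.get(gamete, 0) + 1
    total = len(haplotypes)
    common = [g for g, c in counts.items() if c / total >= min_freq]
    return len(common) == 4


def fgt_blocks(haplotypes, min_freq=0.01):
    """Greedily split SNP indices into blocks that contain no four-gamete pair."""
    n_snps = len(haplotypes[0])
    blocks, start = [], 0
    for end in range(1, n_snps + 1):
        # Close the current block at the last SNP, or when adding SNP `end`
        # would create a pair with all four gametes inside the block.
        if end == n_snps or any(
            four_gametes_observed(haplotypes, k, end, min_freq)
            for k in range(start, end)
        ):
            blocks.append(list(range(start, end)))
            start = end
    return blocks


# Hypothetical phased haplotypes over four SNPs (made up for illustration).
toy_haplotypes = ["0000", "0011", "1100", "1111"]
print(fgt_blocks(toy_haplotypes))  # [[0, 1], [2, 3]]
```

In the toy data, SNPs 0 and 1 show only two gametes, while SNPs 0 and 2 show all four, so the partition breaks between SNP 1 and SNP 2.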
Fig. 1. Snapshot of the NARAC dataset showing 10 samples with their corresponding 3 SNPs. The first column represents the individuals’ IDs. The second column refers to the affection status (0: case, 1: control). The third column shows the sex (F: female, M: male). The next columns correspond to the SNPs, with the first row providing the SNP ID. In each SNP cell, two identical alleles represent a homozygote, whereas two different alleles represent a heterozygote.
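The layout described in the Fig. 1 caption can be read with a few lines of Python. This is a minimal sketch under stated assumptions: the comma-separated format, the column names, the `_` separator between the two alleles, and the sample rows are all hypothetical, since the caption does not specify the file encoding. The affection code is kept as the raw value from the file.

```python
import csv
import io

# Hypothetical rows in the layout of Fig. 1; IDs, SNP names, and genotypes are made up.
SAMPLE = """ID,Affection,Sex,snp_a,snp_b,snp_c
S001,0,F,A_A,C_T,G_G
S002,1,M,A_G,C_C,G_G
"""


def zygosity(genotype, sep="_"):
    """Classify a two-allele genotype cell as homozygote or heterozygote."""
    allele_1, allele_2 = genotype.split(sep)
    return "homozygote" if allele_1 == allele_2 else "heterozygote"


def read_genotypes(text):
    """Parse the table: ID, affection status, sex, then one column per SNP."""
    reader = csv.DictReader(io.StringIO(text))
    snp_columns = reader.fieldnames[3:]  # everything after ID, Affection, Sex
    records = []
    for row in reader:
        records.append({
            "id": row["ID"],
            "affection": row["Affection"],
            "sex": row["Sex"],
            "zygosity": {snp: zygosity(row[snp]) for snp in snp_columns},
        })
    return records


records = read_genotypes(SAMPLE)
for record in records:
    print(record["id"], record["zygosity"])
```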
Fig. 2. Summary of the proposed system for the NARAC dataset.
Median block size (in bp) obtained by the three block methods for the general blocks and for the blocks significantly associated with RA.
| Chr no. | General blocks: CIT | General blocks: FGT | General blocks: SSLD | Significant blocks: CIT | Significant blocks: FGT | Significant blocks: SSLD |
|---|---|---|---|---|---|---|
| 1 | 8489 | 9547 | 13,549 | 64,634 | 47,700 | 34,467 |
| 2 | 8495 | 9645 | 14,342 | 24,123 | 11,756 | 23,312 |
| 3 | 7938 | 9240 | 13,544 | 7513 | 11,854 | 13,800 |
| 4 | 9947 | 11,083 | 13,544 | 3279 | 3279 | 0 |
| 5 | 8641 | 9697 | 14,102 | 22,052 | 15,381 | 18,456 |
| 6 | 8457 | 9583 | 13,944 | 8672 | 7448 | 10,123 |
| 7 | 8235 | 9008 | 13,869 | 27,949 | 4326 | 32,616 |
| 8 | 7149 | 7971 | 12,262 | 15,280 | 14,404 | 10,115 |
| 9 | 6324 | 7166 | 10,297 | 10,662 | 15,473 | 13,315 |
| 10 | 7464 | 8392 | 12,231 | 2462 | 669 | 9719 |
| 11 | 7764 | 8634 | 12,455 | 9746 | 9504 | 0 |
| 12 | 8043 | 8898 | 13,281 | 5705 | 5705 | 10,091 |
| 13 | 8346 | 9134 | 13,410 | 9913 | 4663 | 32,705 |
| 14 | 7458 | 8443 | 12,747 | 18,225 | 12,316 | 18,225 |
| 15 | 6151 | 7336 | 10,451 | 9321 | 11,213 | 14,822 |
| 16 | 4912 | 5562 | 8984 | 24,155 | 6893 | 64,712 |
| 17 | 6263 | 7535 | 9997 | 12,690 | 57,213 | 18,594 |
| 18 | 6811 | 7962 | 11,379 | 0 | 8210 | 11,265 |
| 19 | 6760 | 7930 | 10,833 | 9571 | 10,633 | 18,621 |
| 20 | 6413 | 6933 | 10,563 | 7448 | 6133 | 21,323 |
| 21 | 6784 | 7552 | 10,871 | 13,020 | 11,817 | 4704 |
| 22 | 5272 | 5986 | 8381 | 9298 | 10,650 | 24,936 |
Fig. 3. Comparison of the RA-associated results obtained by the three haplotype block partitioning methods. (a) The total number of significant blocks for each Chr. (b) The total number of associated SNPs for each Chr. (c) The total size of the significant blocks (in bp) for each Chr.
Results of the individual SNP approach compared to all three block methods.
| Chr no. | Total no. of significant SNPs obtained by the individual SNP method | No. of significant SNPs obtained by only the individual SNP method | No. of significant SNPs obtained by all three block methods | No. of significant SNPs obtained by all four methods |
|---|---|---|---|---|
| 1 | 4 | 3 | 8 | 1 |
| 2 | 2 | 2 | 0 | 0 |
| 3 | 5 | 3 | 7 | 0 |
| 4 | 5 | 2 | 0 | 0 |
| 5 | 6 | 4 | 8 | 2 |
| 6 | 432 | 12 | 916 | 367 |
| 7 | 7 | 3 | 2 | 0 |
| 8 | 11 | 3 | 14 | 1 |
| 9 | 11 | 4 | 16 | 7 |
| 10 | 5 | 2 | 0 | 0 |
| 11 | 2 | 1 | 0 | 0 |
| 12 | 3 | 1 | 6 | 0 |
| 13 | 0 | 0 | 0 | 0 |
| 14 | 5 | 2 | 11 | 1 |
| 15 | 3 | 3 | 5 | 0 |
| 16 | 7 | 5 | 0 | 0 |
| 17 | 4 | 2 | 11 | 0 |
| 18 | 3 | 1 | 0 | 0 |
| 19 | 5 | 2 | 13 | 2 |
| 20 | 8 | 5 | 3 | 1 |
| 21 | 7 | 2 | 0 | 0 |
| 22 | 6 | 3 | 1 | 1 |
Fig. 4. Number of RA biomarkers detected by each method: “all” biomarkers detected by the method, and biomarkers detected “only” by that method.
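The “all” and “only” counts, together with the SNPs shared by every method, are plain set operations. The sketch below shows one way to compute them in Python; the function name and the SNP identifiers are hypothetical and do not come from the NARAC results.

```python
def summarize_overlap(hits):
    """hits maps a method name to the set of SNP IDs it flagged as significant."""
    summary = {}
    for method, snps in hits.items():
        # Union of the SNPs found by every other method.
        others = set().union(*(s for m, s in hits.items() if m != method))
        summary[method] = {"all": len(snps), "only": len(snps - others)}
    shared_by_all = set.intersection(*hits.values())
    return summary, shared_by_all


# Hypothetical SNP IDs, purely for illustration.
toy_hits = {
    "individual": {"snp1", "snp2", "snp3"},
    "FGT": {"snp2", "snp3", "snp4"},
    "CIT": {"snp3", "snp4"},
    "SSLD": {"snp3", "snp5"},
}
summary, shared = summarize_overlap(toy_hits)
print(summary)
print(shared)
```

Restricting `toy_hits` to the three block methods before calling the function gives the "all three block methods" count in the same way.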
Fig. 5. Manhattan plot showing the associations between all the NARAC SNPs and RA susceptibility using the individual SNP approach. The genes with P-values lower than the genome-wide significance threshold are shown above the plot area.
The highly significant SNPs (with P-values lower than the genome-wide significance threshold) discovered by the individual SNP approach with the corresponding haplotype blocks.
| SNP ID | Chr | Position (bp) | Assoc. Allele | AAF | P-value | Gene/Nearest Genes | Haplotype Block (Method, P-value, No. of SNPs) | Haplotype Block Position (bp) (Start, End, Size) | Previously Studied in |
|---|---|---|---|---|---|---|---|---|---|
| | 1 | 3,352,541 | G | 0.956, 0.881 | 1.56 E-14 | | Not detected by any method | – | |
| | 1 | 114,089,610 | A | 0.155, 0.084 | 1.12 E-12 | | FGT, 8.5 E-13, 8 | 114075501, 114132504, 57,004 | |
| | | | | | | | CIT, 1.01 E-11, 10 | 114050631, 114141503, 90,873 | |
| | | | | | | | SSLD, 1.03 E-10, 33 | 113787838, 114132504, 344,667 | |
| | 2 | 37,860,221 | G | 0.994, 0.964 | 1.12 E-09 | | Not detected by any method | – | – |
| | 2 | 198,949,233 | G | 0.989, 0.956 | 2.94 E-09 | | Not detected by any method | – | – |
| | 3 | 58,957,115 | G | 0.995, 0.956 | 8.43 E-13 | | FGT, 1.51 E-07, 20 | 58754521, 59072633, 318,113 | – |
| | | | | | | | SSLD, 2.51 E-11, 9 | 58957115, 59057595, 100,481 | |
| | 4 | 12,775,151 | G | 0.195, 0.125 | 3.7 E-09 | | Not detected by any method | – | |
| | 4 | 113,564,881 | G | 0.966, 0.923 | 3.84 E-08 | | Not detected by any method | – | – |
| | 5 | 71,792,426 | G | 0.930, 0.865 | 3.22 E-10 | | Not detected by any method | – | – |
| | 5 | 133,075,674 | G | 0.820, 0.738 | 1.77 E-09 | | FGT, 3.51 E-06, 9 | 133065358, 133094704, 29,347 | |
| | | | | | | | CIT, 2.95 E-06, 9 | 133057095, 133094704, 37,610 | |
| | | | | | | | SSLD, 2.1 E-07, 6 | 133075674, 133094129, 18,456 | |
| | 7 | 129,556,365 | G | 0.990, 0.948 | 5.95 E-12 | | Not detected by any method | – | – |
| | 7 | 63,170,795 | A | 0.996, 0.963 | 1.47 E-11 | | SSLD, 3.6 E-11, 4 | 63138417, 63170795, 32,379 | – |
| | 7 | 100,536,496 | G | 0.991, 0.960 | 8.12 E-09 | | SSLD, 7.17 E-08, 2 | 100522057, 100536496, 14,440 | – |
| | 8 | 131,021,293 | G | 0.982, 0.938 | 2.18 E-10 | | Not detected by any method | – | – |
| | 8 | 20,402,898 | G | 0.916, 0.860 | 3.9 E-08 | | FGT, 1.21 E-07, 6 | 20385189, 20404428, 19,240 | |
| | 9 | 123,233,908 | G | 0.993, 0.940 | 2.25 E-16 | | Not detected by any method | – | |
| | 9 | 81,666,969 | G | 0.959, 0.906 | 1.42 E-09 | | FGT, 1.69 E-08, 2 | 81666969, 81670581, 3613 | |
| | | | | | | | CIT, 1.08 E-07, 2 | 81662684, 81666969, 4286 | |
| | | | | | | | SSLD, 1.21 E-07, 3 | 81662684, 81670581, 7898 | |
| | 9 | 120,785,936 | A | 0.390, 0.303 | 6.24 E-09 | | FGT, 4.66 E-08, 14 | 120720054, 120810962, 90,909 | |
| | | | | | | | CIT, 8.03 E-08, 8 | 120720054, 120807548, 87,495 | |
| | | | | | | | SSLD, 4.5 E-08, 12 | 120720054, 120807548, 87,495 | |
| | 9 | 120,769,793 | G | 0.468, 0.380 | 1.24 E-08 | | FGT, 4.66 E-08, 14 | 120720054, 120810962, 90,909 | |
| | | | | | | | CIT, 8.03 E-08, 8 | 120720054, 120807548, 87,495 | |
| | | | | | | | SSLD, 4.5 E-08, 12 | 120720054, 120807548, 87,495 | |
| | 9 | 120,732,452 | A | 0.388, 0.304 | 2.27 E-08 | | FGT, 4.66 E-08, 14 | 120720054, 120810962, 90,909 | |
| | | | | | | | CIT, 8.03 E-08, 8 | 120720054, 120807548, 87,495 | |
| | | | | | | | SSLD, 4.5 E-08, 12 | 120720054, 120807548, 87,495 | |
| | 9 | 120,720,054 | A | 0.387, 0.304 | 2.76 E-08 | | FGT, 4.66 E-08, 14 | 120720054, 120810962, 90,909 | |
| | | | | | | | CIT, 8.03 E-08, 8 | 120720054, 120807548, 87,495 | |
| | | | | | | | SSLD, 4.5 E-08, 12 | 120720054, 120807548, 87,495 | |
| | 9 | 120,781,544 | G | 0.475, 0.389 | 3.78 E-08 | | FGT, 4.66 E-08, 14 | 120720054, 120810962, 90,909 | |
| | | | | | | | CIT, 8.03 E-08, 8 | 120720054, 120807548, 87,495 | |
| | | | | | | | SSLD, 4.5 E-08, 12 | 120720054, 120807548, 87,495 | |
| | 10 | 105,403,030 | G | 0.958, 0.897 | 6.12 E-11 | | Not detected by any method | – | – |
| | 10 | 49,767,825 | A | 0.677, 0.592 | 2.66 E-08 | | SSLD, 4.84 E-08, 6 | 49767825, 49777543, 9719 | |
| | 10 | 71,550,864 | A | 0.976, 0.939 | 4.16 E-08 | | FGT, 1.91 E-06, 2 | 71550196, 71550864, 669 | – |
| | 12 | 46,702,024 | C | 0.907, 0.819 | 3 E-12 | | FGT, 1.23 E-07, 3 | 46700325, 46703575, 3251 | – |
| | 12 | 119,263,543 | G | 0.943, 0.888 | 1.72 E-08 | | Not detected by any method | – | – |
| | 14 | 104,050,531 | G | 0.997, 0.973 | 1.94 E-08 | | FGT, 5.69 E-06, 8 | 104045894, 104062173, 16,280 | – |
| | 16 | 82,588,153 | G | 0.516, 0.405 | 1.16 E-09 | | Not detected by any method | – | – |
| | 16 | 1,481,462 | G | 0.954, 0.904 | 1.77 E-08 | | Not detected by any method | – | – |
| | 17 | 73,740,166 | C | 0.817, 0.714 | 7.38 E-11 | | Not detected by any method | – | – |
| | 18 | 44,295,753 | G | 0.924, 0.865 | 7.13 E-09 | | Not detected by any method | – | – |
| | 20 | 35,485,260 | G | 0.956, 0.888 | 3.55 E-13 | | Not detected by any method | – | |
| | 20 | 57,826,397 | C | 0.852, 0.779 | 6.53 E-09 | | FGT, 1 E-08, 2 | 57826397, 57832814, 6418 | |
| | | | | | | | SSLD, 1 E-08, 2 | 57826397, 57832814, 6418 | |
| | 22 | 20,321,624 | G | 0.930, 0.854 | 6.04 E-12 | | FGT, 5.08 E-08, 7 | 20264229, 20321624, 57,396 | – |
| | | | | | | | CIT, 1.09 E-08, 3 | 20313153, 20321624, 8472 | |
| | | | | | | | SSLD, 1.09 E-06, 3 | 20321624, 20346559, 24,936 | |
| | 22 | 18,112,909 | G | 0.844, 0.767 | 4.08 E-08 | | FGT, 1.02 E-05, 2 | 18112175, 18112909, 735 | – |
| | | | | | | | CIT, 1.02 E-05, 2 | 18112175, 18112909, 735 | |
Assoc. Allele: Associated Allele.
AAF: Associated Allele Frequency.
P-values are calculated based on the chi-squared test.
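As an illustration of the chi-squared test and the Bonferroni correction mentioned in the abstract, the following Python sketch tests a 2×2 table of allele counts (cases vs. controls, associated allele vs. other allele). It assumes a Pearson chi-squared statistic with one degree of freedom and no continuity correction. The allele counts and the number of tests are made up, and the function names are hypothetical; the exact settings used in the study are not stated here.

```python
import math


def allelic_chi_square(case_assoc, case_other, ctrl_assoc, ctrl_other):
    """Pearson chi-squared test (1 df) on a 2x2 table of allele counts."""
    table = [[case_assoc, case_other], [ctrl_assoc, ctrl_other]]
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def bonferroni(p_value, n_tests):
    """Bonferroni-adjusted P-value, capped at 1."""
    return min(1.0, p_value * n_tests)


# Hypothetical allele counts and a hypothetical number of tested SNPs.
chi2, p = allelic_chi_square(30, 70, 10, 90)
print(chi2, p, bonferroni(p, n_tests=1000))
```

Only the standard library is used, so the sketch has no external dependencies.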
Fig. 6. Comparison of the CIT and SSLD methods on the same significant haplotype block in the PHF19-TRAF1-C5 region. (a) LD plot showing the CIT block comprising eight biomarkers. (b) LD plot for the SSLD block including twelve biomarkers.
Disease enrichment analysis for the genes of the “never been reported” biomarkers.
| Gene name | Region | Functional pathway related to RA | Diseases affected by the gene |
|---|---|---|---|
| | 2p21 | Induces pseudopodia formation in fibroblasts | Schizophrenia |
| | 2p22.2 | | Lung cancer |
| | 2q33 | Affects the bone density and the level of osteocalcin | Osteoporosis, hip bone size variation in females |
| | 2q33 | Affects the activity of osteoblasts and the differentiation of immunocytes, plays a role in immune regulation, and elevations in the level of alkaline phosphatase | Cleft palate |
| | 3p14.2 | | |
| | 4q25 | Plays a role in the activation of IL-1, TRAF6, and IKK, affects the activation of NF-kappa-B | |
| | 5q13.2 | Plays a role in regulating the expression of genes in response to estrogen, affects the differentiation of dendritic cells and the production of IL-4, IL-10, IL-12, and NF-kappa-B | Osteoporosis |
| | 7q32 | | Benign hypertrophic prostate, prostate cancer |
| | 7q11.21 | | |
| | 7q22.1 | | Alzheimer's disease |
| | 7q22.1 | | |
| | 8q24.21 | | Endometriosis |
| | 10q24.33 | Affects the activity of osteoclast | Breast cancer, melanoma |
| | 10q22.1 | | Ovarian cancer, retinoblastoma |
| | 12q13.11 | Plays a role in the activation of IL-6, Osteoarthritis, chondrodysplasia, epiphyseal dysplasia, joint deformity, spondyloepiphyseal dysplasia | Stickler and Wagner syndromes |
| | 12q13.1 | Plays a role in the activation of IL-6 | Prostate cancer |
| | 12q24.1-q24.31 | | Liver cancer, hepatoma, glioma and melanoma |
| | 14q32.33 | | |
| | 14q32.33 | | |
| | 16q23.3 | | |
| | 16p13.3 | | |
| | 16p13.3 | | Glioma |
| | 17q25.3 | | Cataract |
| | 18q21.1 | | Sepsis |
| | 18q21.1 | | Hearing function |
| | 22q11.21 | | Insulinoma |
| | 22q11.21 | Involved in cytokinesis | Juvenile parkinsonism |
| | 22q11.21-q11.23 | | Bernard-Soulier syndrome |
| | 22q11.21 | Expands T lymphocytes activity, affects the activity of fibroblastic growth factor | DiGeorge syndrome, pharyngeal and aortic arch defects |
Fig. 7. LD plot for the TBX1 region showing a biomarker in this study (rs1005133) and a previously detected biomarker (rs4819522).
Block similarity among the haplotype block methods for the twenty-two Chrs.
| Chr no. | CIT | FGT | SSLD |
|---|---|---|---|
| 1 | 88% | 21% | 23% |
| 2 | 39% | 0% | 0% |
| 3 | 34% | 45% | 20% |
| 4 | 100% | 0% | 0% |
| 5 | 40% | 21% | 30% |
| 6 | 76% | 74% | 71% |
| 7 | 9% | 32% | 6% |
| 8 | 39% | 30% | 34% |
| 9 | 49% | 29% | 25% |
| 10 | 0% | 0% | 0% |
| 11 | 53% | 0% | 0% |
| 12 | 74% | 18% | 21% |
| 13 | 71% | 0% | 0% |
| 14 | 17% | 36% | 24% |
| 15 | 39% | 33% | 23% |
| 16 | 0% | 0% | 54% |
| 17 | 52% | 51% | 35% |
| 18 | 0% | 0% | 0% |
| 19 | 64% | 52% | 43% |
| 20 | 50% | 18% | 27% |
| 21 | 75% | 0% | 11% |
| 22 | 53% | 2% | 4% |
The ability of each haplotype block method to capture the significant SNPs determined with the individual SNP approach.
| Chr no. | Individual SNP | CIT | FGT | SSLD |
|---|---|---|---|---|
| 1 | 4 | 1 | 1 | 1 |
| 2 | 2 | 0 | 0 | 0 |
| 3 | 5 | 1 | 2 | 1 |
| 4 | 5 | 3 | 3 | 0 |
| 5 | 6 | 2 | 2 | 2 |
| 6 | 432 | 381 | 387 | 415 |
| 7 | 7 | 0 | 2 | 3 |
| 8 | 11 | 6 | 6 | 2 |
| 9 | 11 | 7 | 7 | 7 |
| 10 | 5 | 0 | 1 | 2 |
| 11 | 2 | 1 | 1 | 0 |
| 12 | 3 | 0 | 1 | 1 |
| 13 | 0 | 0 | 0 | 0 |
| 14 | 5 | 2 | 2 | 1 |
| 15 | 3 | 0 | 0 | 0 |
| 16 | 7 | 0 | 1 | 1 |
| 17 | 4 | 1 | 2 | 1 |
| 18 | 3 | 0 | 2 | 0 |
| 19 | 5 | 2 | 2 | 3 |
| 20 | 8 | 1 | 3 | 3 |
| 21 | 7 | 5 | 4 | 0 |
| 22 | 6 | 2 | 3 | 1 |
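The counts in this table can be turned into capture percentages with simple arithmetic. The Python sketch below does this for the row with 432 significant SNPs from the individual SNP approach. The helper name `capture_rate` is hypothetical, and expressing capture as a percentage is an illustrative choice rather than a metric reported in the table.

```python
def capture_rate(captured, reference):
    """Percentage of the individual-SNP hits also recovered by a block method."""
    if reference == 0:
        return None  # undefined when the individual SNP approach found nothing
    return 100.0 * captured / reference


# Counts taken from the table row with 432 individual-SNP hits.
row = {"Individual SNP": 432, "CIT": 381, "FGT": 387, "SSLD": 415}
rates = {
    method: capture_rate(row[method], row["Individual SNP"])
    for method in ("CIT", "FGT", "SSLD")
}
for method, rate in rates.items():
    print(f"{method}: {rate:.1f}%")
```

For this row the block methods recover roughly 88% (CIT), 90% (FGT), and 96% (SSLD) of the individual-SNP hits.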