Ricardo A Cifuentes, Daniel Restrepo-Montoya, Juan-Manuel Anaya.
Abstract
There is genetic evidence of similarities and differences among autoimmune diseases (AIDs) that warrants a general overview of what has been published. Our aim was therefore to determine the main shared genes and the extent to which they contribute to building clusters of AIDs. We combined a text-mining approach, which builds clusters of genetic concept profiles (GCPs) from the literature in MedLine, with knowledge of protein-protein interactions to confirm whether the genes in the GCPs encode proteins that truly interact. We found three clusters in which the genes with the highest contribution encoded proteins that showed strong and specific interactions. After projecting the AIDs on a plane, two clusters could be discerned: Sjögren's syndrome-systemic lupus erythematosus, and autoimmune thyroid disease-type 1 diabetes-rheumatoid arthritis. Our results support the common origin of AIDs and the role of genes involved in apoptosis such as CTLA4, FASLG, and IL10.
Year: 2012 PMID: 22474574 PMCID: PMC3303588 DOI: 10.1155/2012/792106
Source DB: PubMed Journal: Autoimmune Dis ISSN: 2090-0430
Examples of literature-based knowledge discovery tools.
| Tool | Mined data | URL |
|---|---|---|
| ANNI | MedLine | |
| Arrowsmith¹ | MedLine, OVID | |
| Arrowsmith² | UMLS concepts in title words (MedLine) | |
| BITOLA | MeSH and LocusLink | |
| LitLinker | UMLS | |
| FACTA | MedLine | |
| FAUN | MedLine | |

¹ University of Chicago
² University of Illinois at Chicago
For more information about biomedical text mining tools, visit http://arrowsmith.psych.uic.edu/arrowsmith_uic/tools.html.
Examples of tools to analyze biological pathways.
| Tool | Analyzed data | URL |
|---|---|---|
| Cytoscape | 220 diverse databases | |
| BIANA | UniProt, GenBank, IntAct, KEGG, and PFAM | |
| Pathway Studio | MedLine | |
| Patika | Reactome, UniProt, Entrez Gene, and GO | |
| Genes2networks | BIND, DIP, IntAct, MINT, pdzbase, SAVI, Stelzl, vidal, ncbi hprd, and KEGG mammalian | |
Figure 1. Flowchart of the methodology. AITD: autoimmune thyroid disease, SS: primary Sjögren's syndrome, SLE: systemic lupus erythematosus, MS: multiple sclerosis, RA: rheumatoid arthritis, T1D: type 1 diabetes, and SSc: systemic sclerosis.
Figure 2. Clustering of seven autoimmune diseases. SLE: systemic lupus erythematosus, SS: Sjögren's syndrome, T1D: type 1 diabetes, AITD: autoimmune thyroid disease, RA: rheumatoid arthritis, MS: multiple sclerosis, SSc: systemic sclerosis.
Genes with a contribution higher than 0.1% to the found clusters of the studied autoimmune diseases.
| Cluster 1. SLE-SS gene | % | Cluster 2. T1D-AITD gene | % | Cluster 4. RA-MS gene | % |
|---|---|---|---|---|---|
| | 27.91 | | 32.4 | | 39.5 |
| | 27.46 | | 28.6 | | 20.7 |
| | 19.8 | | 6.7 | | 5.2 |
| | 6.6 | | 6.7 | | 2.2 |
| | 2.7 | | 6.4 | | 0.6 |
| | 2.6 | | 4.6 | | 0.6 |
| | 1.0 | | 3.6 | | 0.6 |
| | 0.9 | | 1.7 | | 0.5 |
| | 0.8 | | 1.5 | | 0.5 |
| | 0.6 | | 0.5 | | 0.5 |
| | 0.5 | | 0.5 | | 0.4 |
| | 0.5 | | 0.5 | | 0.4 |
| | 0.4 | | 0.4 | | 0.4 |
| | 0.4 | | 0.3 | | 0.3 |
| | 0.2 | | 0.2 | | 0.3 |
| | 0.2 | | 0.2 | | 0.3 |
| | 0.2 | | 0.2 | | 0.3 |
| | 0.2 | | 0.2 | | 0.2 |
| | 0.2 | | 0.2 | | 0.2 |
| | 0.2 | | 0.2 | | 0.2 |
| | 0.2 | | 0.2 | | |
| | 0.2 | | | | |
| | 0.2 | | | | |
| | 0.2 | | | | |
SLE: systemic lupus erythematosus, SS: Sjögren's syndrome, T1D: type 1 diabetes, AITD: autoimmune thyroid disease, RA: rheumatoid arthritis, MS: multiple sclerosis, %: percentage of contribution to the cluster.
Figure 3. Network analysis of the genes that contribute to the clusters of autoimmune diseases. Solid squares: genes with a contribution higher than 0.1% that are shared by more than one cluster. Dotted squares: genes with a contribution higher than 0.1% from the SLE-SS cluster. Solid ovals: genes with a contribution higher than 0.1% from the T1D-AITD cluster. Dotted ovals: genes with a contribution higher than 0.1% from the RA-MS cluster. The other nodes are significant intermediary nodes (the asterisk indicates a nonsignificant intermediary node).
Significance of intermediates sorted by z-score.
| Gene name | Links | Links in background | Links to seed | Links in subnetwork | z-score |
|---|---|---|---|---|---|
| HLA-DQA2 | 3 | 11429 | 2 | 60 | 15.852 |
| DARC | 4 | 11429 | 2 | 60 | 13.692 |
| LCK | 67 | 11429 | 6 | 60 | 9.548 |
| PRTN3 | 9 | 11429 | 2 | 60 | 9.007 |
| APCS | 10 | 11429 | 2 | 60 | 8.522 |
| FN1 | 62 | 11429 | 5 | 60 | 8.215 |
| IGFBP7 | 11 | 11429 | 2 | 60 | 8.103 |
| PTPN13 | 12 | 11429 | 2 | 60 | 7.737 |
| CASP1 | 18 | 11429 | 2 | 60 | 6.215 |
| A2M | 24 | 11429 | 2 | 60 | 5.293 |
| DCN | 25 | 11429 | 2 | 60 | 5.171 |
| NCL | 30 | 11429 | 2 | 60 | 4.655 |
| C3 | 31 | 11429 | 2 | 60 | 4.566 |
| JAK2 | 116 | 11429 | 4 | 60 | 4.356 |
| PTPRC | 35 | 11429 | 2 | 60 | 4.248 |
| THBS1 | 37 | 11429 | 2 | 60 | 4.108 |
| ARRB1 | 44 | 11429 | 2 | 60 | 3.690 |
| TRADD | 63 | 11429 | 2 | 60 | 2.910 |
| PIK3R1 | 133 | 11429 | 3 | 60 | 2.761 |
| FYN | 153 | 11429 | 3 | 60 | 2.457 |
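
The tabulated z-scores appear consistent with a one-sample binomial proportions test that compares the share of an intermediate's links pointing to seed genes in the subnetwork with its share of links in the background network. The sketch below is an assumption about that calculation, not a formula confirmed by the source; the function and variable names are hypothetical. It gives values close to, but not identical with, the tabulated ones, so the exact computation used by the network tool may differ slightly.

```python
from math import sqrt


def intermediate_z_score(links, links_in_background, links_to_seed, links_in_subnetwork):
    """One-sample binomial proportion z-test (assumed form, not confirmed by the source)."""
    # Expected proportion: the node's share of all links in the background network.
    p_background = links / links_in_background
    # Observed proportion: the node's share of links within the seed subnetwork.
    p_observed = links_to_seed / links_in_subnetwork
    # Standard error of the observed proportion under the background expectation.
    standard_error = sqrt(p_background * (1 - p_background) / links_in_subnetwork)
    return (p_observed - p_background) / standard_error


# Counts copied from three rows of the table:
# (links, links in background, links to seed, links in subnetwork)
rows = {
    "HLA-DQA2": (3, 11429, 2, 60),
    "LCK": (67, 11429, 6, 60),
    "FYN": (153, 11429, 3, 60),
}

for gene, counts in rows.items():
    print(gene, round(intermediate_z_score(*counts), 2))
```

Under this reading, a node with few links overall but several links to seed genes (such as HLA-DQA2) scores higher than a highly connected hub with a similar number of seed links (such as FYN).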
Relevance on autoimmunity GWAS of the genes with a contribution higher than 1% to two or more clusters of the studied autoimmune diseases.
| Gene | Full name | Location | GWAS catalogue | Reference |
|---|---|---|---|---|
| HLA-DQB1 | Major histocompatibility complex, class II, DQ beta 1 | 6p21.3 | MS, PBC, RA, SSc, CD, UC, CrD | [ |
| CD4 | CD4 molecule | 12pter-p12 | — | — |
| CTLA4 | Cytotoxic T-lymphocyte-associated protein 4 | 2q33 | T1D, RA, MS, SLE, CD | [ |
| FASLG | Fas ligand (TNF superfamily, member 6) | 1q23 | CD, CrD | — |
| IL1B | Interleukin 1, beta | 2q14 | — | — |
| IL10 | Interleukin 10 | 1q31-q32 | T1D, SLE, UC, CrD | [ |
MS: multiple sclerosis, PBC: primary biliary cirrhosis, RA: rheumatoid arthritis, SSc: systemic sclerosis, CD: celiac disease, CrD: Crohn's disease, T1D: type 1 diabetes, SLE: systemic lupus erythematosus, UC: ulcerative colitis, PSO: psoriasis.
Figure 4. Projection of the seven studied autoimmune diseases on a plane. This figure shows the shared space of the genetic concept profiles of the studied AIDs (underlined), according to the matching value of their genetic concept profiles. Genes with a contribution to clustering higher than 0.2% are shown; the asterisk indicates genes shared by two clusters.