Alper Uzun, Alyse Laliberte, Jeremy Parker, Caroline Andrew, Emily Winterrowd, Surendra Sharma, Sorin Istrail, James F Padbury.
Abstract
Genome-wide association studies (GWAS) query the entire genome in a hypothesis-free, unbiased manner. Since they have the potential for identifying novel genetic variants, they have become a very popular approach to the investigation of complex diseases. Nonetheless, since the success of the GWAS approach varies widely, the identification of genetic variants for complex diseases remains a difficult problem. We developed a novel bioinformatics approach to identify the nominal genetic variants associated with complex diseases. To test the feasibility of our approach, we developed a web-based aggregation tool to organize the genes, genetic variations and pathways involved in preterm birth. We used semantic data mining to extract all published articles related to preterm birth. All articles were reviewed by a team of curators. Genes identified from public databases and archives of expression arrays were aggregated with genes curated from the literature. Pathway analysis was used to impute genes from pathways identified in the curations. The curated articles and collected genetic information form a unique resource for investigators interested in preterm birth. The Database for Preterm Birth exemplifies an approach that is generalizable to other disorders for which there is evidence of significant genetic contributions.
Year: 2012 PMID: 22323062 PMCID: PMC3275764 DOI: 10.1093/database/bar069
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. (A) Workflow for retrieval of articles, curation and extraction of genes from literature, microarray data and gene interpolation for pathway analysis. (B) Total number of genes, their associated original sources and number of unique pathways represented.
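The aggregation step in this workflow can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function name `aggregate_genes`, the source labels and the placeholder gene symbols are all hypothetical, and the only idea taken from the text is that genes from several sources are merged while each gene's original sources are recorded.

```python
from collections import defaultdict


def aggregate_genes(sources):
    """Merge gene symbols from several sources, recording where each came from."""
    provenance = defaultdict(set)
    for source_name, genes in sources.items():
        for gene in genes:
            # Normalize case and whitespace so one symbol is not counted twice.
            provenance[gene.strip().upper()].add(source_name)
    return dict(provenance)


# Hypothetical placeholder inputs; these are NOT data from dbPTB.
sources = {
    "literature": ["GENE_A", "gene_b", "GENE_C"],
    "databases": ["GENE_B", "GENE_D"],
    "pathway_analysis": ["GENE_C", "GENE_E "],
}

merged = aggregate_genes(sources)

# Number of unique genes contributed by each source (a gene may have several).
per_source = {
    name: sum(name in origins for origins in merged.values())
    for name in sources
}

print(len(merged), per_source)
```

Keeping a set of sources per gene, rather than a flat merged list, is what would allow a summary like panel (B) to report both the total number of genes and where each one originated.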
Top 15 Journals with articles extracted for curation in dbPTB
| Journal | Number of articles for curation |
|---|---|
| 1. | 84 |
| 2. | 46 |
| 3. | 34 |
| 4. | 32 |
| 5. | 17 |
| 6. | 14 |
| 7. | 13 |
| 8. | 13 |
| 9. | 13 |
| 10. | 12 |
| 11. | 12 |
| 12. | 11 |
| 13. | 11 |
| 14. | 10 |
| 15. | 9 |
Figure 2. Number of genes among chromosomes identified from curated articles, databases and pathway analysis.
Figure 3. Representative examples of chromosomal location of genes for chromosomes 6 and 11.
Top functions of genes identified by pathway analysis
| Function | Number of networks |
|---|---|
| Inflammatory Response | 6 |
| Small Molecule Biochemistry | 5 |
| Cellular Development | 4 |
| Hematological System Development and Function | 4 |
| Cardiovascular Disease | 3 |
| Cellular Function and Maintenance | 3 |
| Connective Tissue Development and Function | 3 |
| Drug Metabolism | 3 |
| Genetic Disorder | 3 |
| Cell Signaling | 2 |
| Cellular Assembly and Organization | 2 |
| Connective Tissue Disorders | 2 |
| Embryonic Development | 2 |
| Hematological Disease | 2 |
| Infectious Disease | 2 |
| Inflammatory Disease | 2 |
| Lipid Metabolism | 2 |
| Molecular Transport | 2 |
| Amino Acid Metabolism | 1 |
| Antigen Presentation | 1 |
| Antimicrobial Response | 1 |
| Carbohydrate Metabolism | 1 |
| Cardiovascular System Development and Function | 1 |
| Cell Cycle | 1 |
| Cell Death | 1 |
| Cell-mediated Immune Response | 1 |
| Cell-To-Cell Signaling and Interaction | 1 |
| Cellular Compromise | 1 |
| Cellular Growth and Proliferation | 1 |
| Dermatological Diseases and Conditions | 1 |
| DNA Replication | 1 |
| Hematopoiesis | 1 |
| Infection Mechanism | 1 |
| Nucleic Acid Metabolism | 1 |
| Organismal Functions | 1 |
| Organismal Injury and Abnormalities | 1 |
| Organismal Survival | 1 |
| Organ Morphology | 1 |
| Recombination and Repair | 1 |
| Skeletal and Muscular Disorders | 1 |
| Skeletal and Muscular System Development and Function | 1 |
| Tissue Morphology | 1 |
The number of times each gene was included in different networks is also shown.
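A count of how often each gene appears across networks can be sketched as follows. This is an illustrative assumption about how such a tally might be computed, not the pathway-analysis software's own method; the helper `network_membership_counts`, the network names and the gene symbols are hypothetical placeholders.

```python
from collections import Counter


def network_membership_counts(networks):
    """Count, for each gene, the number of distinct networks that include it."""
    counts = Counter()
    for members in networks.values():
        # set() ensures a gene is counted once per network, even if listed twice.
        counts.update(set(members))
    return counts


# Hypothetical placeholder networks; these are NOT results from the paper.
networks = {
    "network_1": ["GENE_A", "GENE_B"],
    "network_2": ["GENE_A", "GENE_C", "GENE_C"],
    "network_3": ["GENE_A", "GENE_B"],
}

counts = network_membership_counts(networks)

for gene, n in counts.most_common():
    print(gene, n)
```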