Abstract
Identifying molecular biomarkers from large-scale biological data has become an important task for scientists assessing the phenotypic states of cells or organisms that correlate with the genotypes of diseases. In this paper, we propose a text-mining-based method to discover biomarkers from PubMed. First, we construct a database based on a dictionary; then we use a finite state machine to identify the biomarkers. Our text-mining method provides a highly reliable approach to discovering biomarkers in the PubMed database.
Year: 2012 PMID: 23197989 PMCID: PMC3502861 DOI: 10.1155/2012/135780
Source DB: PubMed Journal: Comput Math Methods Med ISSN: 1748-670X Impact factor: 2.238
Figure 1: The flow chart of the biomarker discovery.
The dictionary of the biomarkers.
| Gene | Protein | Pathway | Disease |
|---|---|---|---|
| P53 | P53 | Ras | Diabetes |
| APC | APC | Wnt | Breast cancer |
| MDM2 | Pten | Death receptor pathway | Liver cancer |
| Ras | HCC | Ether lipid metabolism | Huntington |
| Axin-1 | HPR | Thiamine metabolism | Liver cirrhosis |
| LCE2B | | Porphyrin and chlorophyll metabolism | Prostate cancer |
| AXIN1 | | | Leukemia |
| SLC22A1 | | | |
Figure 2: The identification of the biomarker using the finite state machine.
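The dictionary-plus-finite-state-machine step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the machine is a token-level trie whose nodes are states and whose accepting states carry the dictionary categories, and that the longest match wins. The names `DICTIONARY`, `build_fsm`, and `find_mentions` are hypothetical, and the terms are a small excerpt of the dictionary table.

```python
import re

# Hypothetical excerpt of the dictionary table; keys follow its column headers.
DICTIONARY = {
    "Gene": ["P53", "MDM2", "SLC22A1"],
    "Protein": ["P53", "Pten"],
    "Pathway": ["Wnt", "Death receptor pathway", "Thiamine metabolism"],
    "Disease": ["Diabetes", "Breast cancer", "Liver cancer", "Liver cirrhosis"],
}

# Key marking an accepting state. Tokens are alphanumeric, so it cannot collide.
ACCEPT = "$categories"


def tokenize(text):
    """Lowercase the text and split it into alphanumeric (optionally hyphenated) tokens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def build_fsm(dictionary):
    """Build a token trie: each node is a state, each token a transition."""
    root = {}
    for category, terms in dictionary.items():
        for term in terms:
            state = root
            for token in tokenize(term):
                state = state.setdefault(token, {})
            # A term listed under several columns accepts with several categories.
            state.setdefault(ACCEPT, []).append(category)
    return root


def find_mentions(text, fsm):
    """Scan left to right and return (matched term, categories) for the longest matches."""
    tokens = tokenize(text)
    mentions = []
    i = 0
    while i < len(tokens):
        state, j, last = fsm, i, None
        # Follow transitions while the next token is accepted by the current state.
        while j < len(tokens) and tokens[j] in state:
            state = state[tokens[j]]
            j += 1
            if ACCEPT in state:
                last = (j, state[ACCEPT])  # remember the longest accepting prefix
        if last is None:
            i += 1  # no dictionary term starts at this token
        else:
            end, categories = last
            mentions.append((" ".join(tokens[i:end]), categories))
            i = end  # resume scanning after the match
    return mentions


FSM = build_fsm(DICTIONARY)
```

Calling `find_mentions(sentence, FSM)` on an abstract sentence returns the lowercased dictionary terms found in it, each with the categories under which the term is listed. Matching on whole tokens rather than characters keeps a term such as "Liver cancer" from firing on the word "liver" alone.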
The list of biomarker-disease associations mined from PubMed.
| EntrezID | Gene name | Symbol |
|---|---|---|
| 11914 | ALPHA 1,4-GALACTOSYLTRANSFERASE | A4GALT |
| 3558 | ACETOACETYL-COA SYNTHETASE | AACS |
| 5758 | ABHYDROLASE DOMAIN CONTAINING 1 | ABHD1 |
| 18925 | ACYL-COA THIOESTERASE 12 | ACOT12 |
| 17809 | ACYL-COA THIOESTERASE 2 | ACOT2 |
| 17766 | ACYL-COA THIOESTERASE 4 | ACOT4 |
| 15426 | ACYL-COA SYNTHETASE BUBBLEGUM FAMILY MEMBER 1 | ACSBG1 |
| 11191 | ACYL-COA SYNTHETASE BUBBLEGUM FAMILY MEMBER 2 | ACSBG2 |
Figure 3: A disease-related gene association network. Green nodes are genes, and the nodes in other colors are diseases.
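A network of this kind can be represented as a bipartite graph with genes on one side and diseases on the other. The sketch below is illustrative only: the source does not state how edges were derived, so the function simply takes gene-disease pairs as input. `build_network` is a hypothetical name, and `PAIRS` holds placeholder labels, not associations reported in the paper.

```python
from collections import defaultdict


def build_network(pairs):
    """Return two adjacency maps for a bipartite gene-disease graph."""
    gene_to_diseases = defaultdict(set)
    disease_to_genes = defaultdict(set)
    for gene, disease in pairs:
        # Each pair is one undirected edge, stored in both directions.
        gene_to_diseases[gene].add(disease)
        disease_to_genes[disease].add(gene)
    return dict(gene_to_diseases), dict(disease_to_genes)


# Placeholder pairs for illustration only; not results from the paper.
PAIRS = [
    ("GENE_A", "Disease X"),
    ("GENE_A", "Disease Y"),
    ("GENE_B", "Disease X"),
]

genes, diseases = build_network(PAIRS)
```

Storing both directions makes it cheap to list the diseases linked to a gene or the genes linked to a disease, which are the two lookups such a figure supports.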