Şenay Kafkas, Ian Dunham, Johanna McEntyre.
Abstract
BACKGROUND: We present the Europe PMC literature component of Open Targets, a target validation platform that integrates various types of evidence to aid drug target identification and validation. The component identifies target-disease associations in documents from the Europe PMC literature database and ranks those documents by confidence, using rules that draw on expert-provided heuristic information. The confidence score of a document represents how valuable the document is for validating a given target-disease association; it reflects the credibility of the association as judged from the properties of the text. The component has supplied the platform with regularly updated data since December 2015.
Keywords: Document ranking; Information retrieval; Target validation; Target-disease associations; Text mining
Year: 2017 PMID: 28587637 PMCID: PMC5461726 DOI: 10.1186/s13326-017-0131-3
Source DB: PubMed Journal: J Biomed Semantics
Comparison on the target-disease association data in the Target Validation Platform (release 1.2)
| Evidence Type | Total number of distinct target-disease associations | Overlap with Gene Expression | Overlap with Genetic Associations | Overlap with Affected Pathways | Overlap with Animal Models | Overlap with Somatic Mutations | Overlap with Known Drugs | Total number of exclusively identified associations |
|---|---|---|---|---|---|---|---|---|
| Literature Mining | 1,168,365 | 197,943 | 56,228 | 2506 | 99,836 | 19,801 | 19,811 | 850,179 |
| Gene Expression | 909,960 | X | 18,945 | 901 | 35,616 | 32,795 | 9913 | 669,330 |
| Genetic Associations | 129,826 | X | X | 1912 | 26,504 | 3626 | 2133 | 62,999 |
| Affected Pathways | 3613 | X | X | X | 1045 | 310 | 163 | 714 |
| Animal Models | 602,995 | X | X | X | X | 2965 | 4421 | 486,167 |
| Somatic Mutations | 58,941 | X | X | X | X | X | 1845 | 16,197 |
| Known Drugs | 57,319 | X | X | X | X | X | X | 33,005 |
Total number of distinct target-disease associations in the platform is 2,485,000
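As a reading aid for the table above, the short sketch below computes what fraction of each evidence type's associations is found by that evidence type alone. The figures are copied from the "total" and "exclusively identified" columns; the dictionary and function names are illustrative and not part of the platform.

```python
# (total distinct associations, exclusively identified associations),
# copied from the release 1.2 comparison table above.
ASSOCIATIONS = {
    "Literature Mining": (1_168_365, 850_179),
    "Gene Expression": (909_960, 669_330),
    "Genetic Associations": (129_826, 62_999),
    "Affected Pathways": (3_613, 714),
    "Animal Models": (602_995, 486_167),
    "Somatic Mutations": (58_941, 16_197),
    "Known Drugs": (57_319, 33_005),
}

# Total stated beneath the table.
PLATFORM_TOTAL = 2_485_000


def exclusive_share(evidence_type: str) -> float:
    """Fraction of an evidence type's associations that no other type reports."""
    total, exclusive = ASSOCIATIONS[evidence_type]
    return exclusive / total


def platform_share(evidence_type: str) -> float:
    """Fraction of all platform associations covered by an evidence type."""
    total, _ = ASSOCIATIONS[evidence_type]
    return total / PLATFORM_TOTAL


if __name__ == "__main__":
    for name in ASSOCIATIONS:
        print(f"{name}: {exclusive_share(name):.1%} exclusive, "
              f"{platform_share(name):.1%} of platform")
```

By this calculation, somewhat under three quarters of the literature-mined associations are not reported by any other evidence type.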
Sentence location weights in abstracts
| Sentence Location | Weight |
|---|---|
| First or second | 2 |
| Last | 5 |
| Other | 3 |
Section weights in full text articles
| Section | Weight |
|---|---|
| Title | 10 |
| Abstract | See sentence location weights in abstracts |
| Results, Figure, Table | 5 |
| Discussion, Conclusion | 2 |
| Introduction, Case Study, Appendix, Other | 1 |
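The two weight tables can be read as a lookup from where a sentence appears to a numeric weight. The sketch below shows only that lookup, under stated assumptions: the function names are hypothetical, sections not listed in the table are treated as "Other", and a sentence that is both second and last in a two-sentence abstract is treated as "first or second". How the weights are combined into a document confidence score is not described here, so the sketch does not attempt it.

```python
# Weights copied from the two tables above.
ABSTRACT_LOCATION_WEIGHTS = {"first_or_second": 2, "last": 5, "other": 3}

SECTION_WEIGHTS = {
    "title": 10,
    "results": 5, "figure": 5, "table": 5,
    "discussion": 2, "conclusion": 2,
    "introduction": 1, "case study": 1, "appendix": 1, "other": 1,
}


def abstract_location(index: int, n_sentences: int) -> str:
    """Classify a 0-based sentence index within an abstract.

    Assumption: "first or second" takes precedence over "last"
    when an abstract has only one or two sentences.
    """
    if not 0 <= index < n_sentences:
        raise ValueError("index must satisfy 0 <= index < n_sentences")
    if index < 2:
        return "first_or_second"
    if index == n_sentences - 1:
        return "last"
    return "other"


def sentence_weight(section: str, index=None, n_sentences=None) -> int:
    """Return the table weight for a sentence in the given section.

    For the abstract, the sentence position selects the weight;
    for every other section, the section name alone does.
    """
    key = section.strip().lower()
    if key == "abstract":
        if index is None or n_sentences is None:
            raise ValueError("abstract sentences need index and n_sentences")
        return ABSTRACT_LOCATION_WEIGHTS[abstract_location(index, n_sentences)]
    # Assumption: any section not named in the table counts as "Other".
    return SECTION_WEIGHTS.get(key, SECTION_WEIGHTS["other"])
```

For example, `sentence_weight("Title")` returns 10, and `sentence_weight("Abstract", 5, 6)` returns 5 because the sixth of six sentences is the last one.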
Comparison of the associations by disease in the Target Validation Platform (release 1.2)
| Evidence Type | Total number of distinct associations by disease | Overlap with Gene Expression | Overlap with Genetic Associations | Overlap with Affected Pathways | Overlap with Animal Models | Overlap with Somatic Mutations | Overlap with Known Drugs | Total number of exclusively identified associations by disease |
|---|---|---|---|---|---|---|---|---|
| Literature Mining | 5801 | 405 | 3546 | 504 | 3403 | 494 | 1489 | 1304 |
| Gene Expression | 723 | X | 520 | 196 | 309 | 460 | 328 | 25 |
| Genetic Associations | 5912 | X | X | 527 | 3725 | 530 | 1193 | 1336 |
| Pathways | 567 | X | X | X | 443 | 168 | 310 | 9 |
| Animal Models | 4 | X | X | X | X | 281 | 752 | 811 |
| Somatic Mutations | 919 | X | X | X | X | X | 354 | 113 |
| Known Drugs | 1800 | X | X | X | X | X | X | 179 |
Total number of distinct associations by disease in the platform is 9426
Comparison of the associations by target data in the Target Validation Platform (release 1.2)
| Evidence Type | Total number of associations by target | Overlap with Gene Expression | Overlap with Genetic Associations | Overlap with Affected Pathways | Overlap with Animal Models | Overlap with Somatic Mutations | Overlap with Known Drugs | Total number of exclusively identified associations by target |
|---|---|---|---|---|---|---|---|---|
| Literature Mining | 14,728 | 14,217 | 8670 | 664 | 5187 | 3903 | 736 | 321 |
| Gene Expression | 29,842 | X | 9817 | 671 | 5449 | 4125 | 743 | 14,148 |
| Genetic Associations | 10,200 | X | X | 561 | 4072 | 3165 | 569 | 217 |
| Pathways | 690 | X | X | X | 379 | 324 | 70 | 4 |
| Animal Models | 5497 | X | X | X | X | 3744 | 484 | 8 |
| Somatic Mutations | 4138 | X | X | X | X | X | 330 | 2 |
| Known Drugs | 756 | X | X | X | X | X | X | 1 |
Total number of distinct associations by target in the platform is 30,592
Fig. 1 The CTGF and male breast carcinoma association
Fig. 2 The ST3GAL4 and diabetes mellitus association