Assigning roles to protein mentions: the case of transcription factors.

Hui Yang, John Keane, Casey M Bergman, Goran Nenadic.

Abstract

Transcription factors (TFs) play a crucial role in gene regulation, and providing structured and curated information about them is important for genome biology. Manual curation of TF-related data is time-consuming and always lags behind the actual knowledge available in the biomedical literature. Here we present a machine-learning text mining approach for identifying and tagging protein mentions that play a TF role in a given context, to support the curation process. More precisely, the method explicitly identifies those protein mentions in text that refer to their potential TF functions. The prediction features are engineered from the results of shallow parsing and domain-specific processing (recognition of relevant appearing in phrases), and a phrase-based Conditional Random Fields (CRF) model is used to capture the content and context information of candidate entities. The proposed approach for the identification of TF mentions has been tested on a set of evidence sentences from the TRANSFAC and FlyTF databases. It achieved an F-measure of around 51.5% with a precision of 62.5% using 5-fold cross-validation. The experimental results suggest that the phrase-based CRF model benefits from the flexibility to use correlated domain-specific features that describe the dependencies between TFs and other entities. To the best of our knowledge, this work is one of the first attempts to apply text-mining techniques to the task of assigning semantic roles to protein mentions.
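
The abstract reports an F-measure and a precision but no recall. As a rough reading aid, the recall implied by those two figures can be back-calculated. This is a minimal sketch that assumes the reported F-measure is the balanced F1 score (the harmonic mean of precision and recall), which the abstract does not state explicitly; the function name `implied_recall` is hypothetical and not taken from the paper.

```python
def implied_recall(f1: float, precision: float) -> float:
    """Solve F1 = 2*P*R / (P + R) for R, given F1 and P.

    Assumption: the F-measure is the balanced F1 score.
    """
    denom = 2 * precision - f1
    if denom <= 0:
        # No valid recall exists when F1 >= 2 * precision.
        raise ValueError("F1 must be smaller than 2 * precision")
    return f1 * precision / denom


# Figures quoted in the abstract: F-measure ~51.5%, precision 62.5%.
recall = implied_recall(0.515, 0.625)
print(f"implied recall: {recall:.3f}")  # roughly 0.438
```

Under that assumption the implied recall is roughly 44%, i.e. the model favours precision over coverage. Because the abstract gives the F-measure only as "around 51.5%", the derived value is approximate.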

Year: 2009  PMID: 19364541  DOI: 10.1016/j.jbi.2009.04.001

Source DB: PubMed  Journal: J Biomed Inform  ISSN: 1532-0464  Impact factor: 6.317
