Utility of General and Specific Word Embeddings for Classifying Translational Stages of Research.

Vincent Major, Alisa Surkis, Yindalon Aphinyanaphongs.

Abstract

Conventional text classification models make a bag-of-words assumption, reducing text to word occurrence counts per document. Recent algorithms such as word2vec can learn semantic meaning and similarity between words in an entirely unsupervised manner from a contextual window, and do so much faster than previous methods. Each word is projected into a vector space such that words with similar meanings, such as "strong" and "powerful", fall in the same general region of that Euclidean space. Open questions about these embeddings include their utility across classification tasks and the optimal properties and source of documents for constructing broadly functional embeddings. In this work, we demonstrate the usefulness of pre-trained embeddings for classification in our task, and show that custom word embeddings, built in the domain and for the task, can improve performance over word embeddings learnt on more general data such as news articles or Wikipedia.
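
The contrast drawn in the abstract can be illustrated with a minimal sketch. It does not reproduce the authors' pipeline: the three-dimensional vectors in `TOY_EMBEDDINGS` are invented for illustration (real word2vec vectors are learned from a corpus and typically have hundreds of dimensions), and averaging word vectors into a document vector is only one common way to feed embeddings to a classifier, assumed here because the abstract does not state how the embeddings were used.

```python
from collections import Counter
import math


def bag_of_words(text):
    """Reduce a document to word occurrence counts; word order is discarded."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_embedding(text, embeddings):
    """Average the vectors of known words into one fixed-length document vector."""
    vectors = [embeddings[w] for w in text.lower().split() if w in embeddings]
    if not vectors:
        return None  # no word of the document is in the vocabulary
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


# Hypothetical vectors, made up for this example only.
TOY_EMBEDDINGS = {
    "strong": [0.9, 0.1, 0.0],
    "powerful": [0.8, 0.2, 0.1],
    "mouse": [0.0, 0.9, 0.4],
}

# Bag-of-words view: "strong" and "powerful" would be unrelated columns.
counts = bag_of_words("strong drug strong effect")

# Embedding view: similar words sit close together in vector space.
sim_close = cosine(TOY_EMBEDDINGS["strong"], TOY_EMBEDDINGS["powerful"])
sim_far = cosine(TOY_EMBEDDINGS["strong"], TOY_EMBEDDINGS["mouse"])

# Document feature vector built from the embeddings of its known words.
doc_vec = mean_embedding("strong powerful effect", TOY_EMBEDDINGS)
```

In the count representation, a document containing "strong" and one containing "powerful" share no feature, whereas in the embedding representation their vectors are close, which is the property the abstract relies on.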

Year:  2018        PMID: 30815185      PMCID: PMC6371342     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


  2 in total

1.  A comparative evaluation of off-the-shelf distributed semantic representations for modelling behavioural data.

Authors:  Francisco Pereira; Samuel Gershman; Samuel Ritter; Matthew Botvinick
Journal:  Cogn Neuropsychol       Date:  2016 May-Jun       Impact factor: 2.468

2.  Classifying publications from the clinical and translational science award program along the translational research spectrum: a machine learning approach.

Authors:  Alisa Surkis; Janice A Hogle; Deborah DiazGranados; Joe D Hunt; Paul E Mazmanian; Emily Connors; Kate Westaby; Elizabeth C Whipple; Trisha Adamus; Meridith Mueller; Yindalon Aphinyanaphongs
Journal:  J Transl Med       Date:  2016-08-05       Impact factor: 5.531

  3 in total

1.  Identifying translational science through embeddings of controlled vocabularies.

Authors:  Qing Ke
Journal:  J Am Med Inform Assoc       Date:  2019-06-01       Impact factor: 4.497

2.  Word embeddings trained on published case reports are lightweight, effective for clinical tasks, and free of protected health information.

Authors:  Zachary N Flamholz; Andrew Crane-Droesch; Lyle H Ungar; Gary E Weissman
Journal:  J Biomed Inform       Date:  2021-12-14       Impact factor: 6.317

3.  Methods for identifying biomedical translation: a systematic review. (Review)

Authors:  Javier Padilla-Cabello; Antonio Santisteban-Espejo; Ruben Heradio; Manuel J Cobo; Miguel A Martin-Piedra; Jose A Moral-Munoz
Journal:  Am J Transl Res       Date:  2022-04-15       Impact factor: 3.940

