
Stochastic Gradient Descent and the Prediction of MeSH for PubMed Records.

W John Wilbur, Won Kim.

Abstract

Stochastic Gradient Descent (SGD) has gained popularity for solving large-scale supervised machine learning problems. It provides a rapid method for minimizing a number of loss functions and is applicable to Support Vector Machine (SVM) and logistic optimizations. However, SGD does not provide a convenient stopping criterion. Generally, an optimal number of iterations over the data may be determined using held-out data. Here we compare stopping predictions based on held-out data with simply stopping at a fixed number of iterations, and show that the latter works as well as the former for a number of commonly studied text classification problems. In particular, fixed stopping works well for MeSH® predictions on PubMed® records. We also surveyed the published algorithms for SVM learning on large data sets and chose three for comparison with fixed-iteration SGD: PROBE, SVMperf, and Liblinear. We find that SGD with a fixed number of iterations performs as well as these alternative methods and is much faster to compute. As an application, we made SGD-SVM predictions for all MeSH terms and used the Pool Adjacent Violators (PAV) algorithm to convert these predictions to probabilities. Such probabilistic predictions lead to ranked MeSH term predictions superior to previously published results on two test sets.
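
The two techniques named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `sgd_svm` and `pav`, the Pegasos-style step size `1 / (lam * t)`, the regularization constant `lam`, and the default of 10 epochs are all assumptions chosen for illustration. The first function trains a linear SVM by SGD on the hinge loss and simply stops after a fixed number of passes over the data. The second applies Pool Adjacent Violators to turn raw scores into non-decreasing probability estimates.

```python
import random


def sgd_svm(X, y, epochs=10, lam=0.01, seed=0):
    """Linear SVM trained by SGD on the hinge loss, stopped after a fixed
    number of epochs (no held-out data is used to decide when to stop).

    X: list of feature lists; y: labels in {-1, +1}.
    The step-size schedule is an assumption (Pegasos-style), not the paper's.
    """
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Gradient step on the L2 regularizer shrinks the weights.
            w = [(1.0 - eta * lam) * wj for wj in w]
            if margin < 1.0:
                # Hinge loss is active: move toward classifying example i.
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def pav(scores, labels):
    """Pool Adjacent Violators: fit a non-decreasing step function mapping
    score to the estimated probability that the label is 1.

    labels are in {0, 1}. Returns a list of (max_score_in_block, probability).
    Tied scores get no special handling in this sketch.
    """
    blocks = []  # each block holds [sum_of_labels, count, max_score]
    for s, l in sorted(zip(scores, labels)):
        blocks.append([float(l), 1, s])
        # Merge backwards while the previous block's mean is not smaller.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] >= blocks[-1][0] / blocks[-1][1]):
            total, count, top = blocks.pop()
            blocks[-1][0] += total
            blocks[-1][1] += count
            blocks[-1][2] = top
    return [(top, total / count) for total, count, top in blocks]
```

In use, the scores passed to `pav` would be the SVM outputs `sum(w_j * x_j)` on labeled examples, with the labels recoded from {-1, +1} to {0, 1}. Ranking candidate terms by the resulting probabilities mirrors the application the abstract describes.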


Year:  2014        PMID: 25954431      PMCID: PMC4419959     

Source DB:  PubMed          Journal:  AMIA Annu Symp Proc        ISSN: 1559-4076


  4 in total

1.  Predicting MeSH Beyond MEDLINE.

Authors:  Adam K Kehoe; Vetle I Torvik; Matthew B Ross; Neil R Smalheiser
Journal:  Proc 1st Workshop Sch Web Min (2017)       Date:  2017-02

2.  Automatic Assignment of Non-Leaf MeSH Terms to Biomedical Articles.

Authors:  Ramakanth Kavuluru; Anthony Rios
Journal:  AMIA Annu Symp Proc       Date:  2015-11-05

3.  12 years on - Is the NLM medical text indexer still useful and relevant?

Authors:  James Mork; Alan Aronson; Dina Demner-Fushman
Journal:  J Biomed Semantics       Date:  2017-02-23

4.  MeSH Now: automatic MeSH indexing at PubMed scale via learning to rank.

Authors:  Yuqing Mao; Zhiyong Lu
Journal:  J Biomed Semantics       Date:  2017-04-17
