An analysis of statistical term strength and its use in the indexing and retrieval of molecular biology texts.

W J Wilbur, Y Yang.

Abstract

The biological literature presents a difficult challenge to information processing in its complexity, its diversity, and its sheer volume. Much of the diversity resides in its technical terminology, which has also become voluminous. In an effort to deal more effectively with this large vocabulary and improve information processing, a method of focus has been developed which allows one to classify terms based on a measure of their importance in describing the content of the documents in which they occur. The measurement is called the strength of a term and is a measure of how strongly the term's occurrences correlate with the subjects of documents in the database. If term occurrences are random, then there will be no correlation and the strength will be zero, but if, for any subject, the term is either always present or never present, its strength will be one. We give here a new, information-theoretic interpretation of term strength, review some of its uses in focusing the processing of documents for information retrieval, and describe new results obtained in document categorization.

Year:  1996        PMID: 8725772     DOI: 10.1016/0010-4825(95)00055-0

Source DB:  PubMed          Journal:  Comput Biol Med        ISSN: 0010-4825            Impact factor:   4.589


  23 in total

1.  Including biological literature improves homology search.

Authors:  J T Chang; S Raychaudhuri; R B Altman
Journal:  Pac Symp Biocomput       Date:  2001

2.  UMLS concept indexing for production databases: a feasibility study.

Authors:  P Nadkarni; R Chen; C Brandt
Journal:  J Am Med Inform Assoc       Date:  2001 Jan-Feb       Impact factor: 4.497

3.  The NLM Indexing Initiative.

Authors:  A R Aronson; O Bodenreider; H F Chang; S M Humphrey; J G Mork; S J Nelson; T C Rindflesch; W J Wilbur
Journal:  Proc AMIA Symp       Date:  2000

4.  What's related? Generalizing approaches to related articles in medicine.

Authors:  H R Strasberg; C D Manning; T C Rindfleisch; K L Melmon
Journal:  Proc AMIA Symp       Date:  2000

5.  Use of general-purpose negation detection to augment concept indexing of medical documents: a quantitative study using the UMLS.

Authors:  P G Mutalik; A Deshpande; P M Nadkarni
Journal:  J Am Med Inform Assoc       Date:  2001 Nov-Dec       Impact factor: 4.497

6.  Update on XplorMed: A web server for exploring scientific literature.

Authors:  Carolina Perez-Iratxeta; Antonio J Pérez; Peer Bork; Miguel A Andrade
Journal:  Nucleic Acids Res       Date:  2003-07-01       Impact factor: 16.971

Review 7.  Natural Language Processing methods and systems for biomedical ontology learning.

Authors:  Kaihong Liu; William R Hogan; Rebecca S Crowley
Journal:  J Biomed Inform       Date:  2010-07-18       Impact factor: 6.317

8.  Semi-automatic indexing of full text biomedical articles.

Authors:  Clifford W Gay; Mehmet Kayaalp; Alan R Aronson
Journal:  AMIA Annu Symp Proc       Date:  2005

9.  A document clustering and ranking system for exploring MEDLINE citations.

Authors:  Yongjing Lin; Wenyuan Li; Keke Chen; Ying Liu
Journal:  J Am Med Inform Assoc       Date:  2007-06-28       Impact factor: 4.497

10.  From episodes of care to diagnosis codes: automatic text categorization for medico-economic encoding.

Authors:  Patrick Ruch; Julien Gobeill; Imad Tbahriti; Antoine Geissbühler
Journal:  AMIA Annu Symp Proc       Date:  2008-11-06