
Extracting noun phrases for all of MEDLINE.

N A Bennett, Q He, K Powell, B R Schatz.

Abstract

A natural language parser that could extract noun phrases for all medical texts would be of great utility in analyzing content for information retrieval. We discuss the extraction of noun phrases from MEDLINE, using a general parser not tuned specifically for any medical domain. The noun phrase extractor is made up of three modules: tokenization; part-of-speech tagging; noun phrase identification. Using our program, we extracted noun phrases from the entire MEDLINE collection, encompassing 9.3 million abstracts. Over 270 million noun phrases were generated, of which 45 million were unique. The quality of these phrases was evaluated by examining all phrases from a sample collection of abstracts. The precision and recall of the phrases from our general parser compared favorably with those from three other parsers we had previously evaluated. We are continuing to improve our parser and evaluate our claim that a generic parser can effectively extract all the different phrases across the entire medical literature.
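The three-module structure named in the abstract (tokenization, part-of-speech tagging, noun phrase identification) can be illustrated with a minimal sketch. The abstract does not describe the parser's internals, so everything below is an assumption made for illustration: the toy `LEXICON`, the rule that unknown words default to nouns, the `ADJ* NOUN+` phrase pattern, and all function names are hypothetical and are not the authors' implementation. The last helper shows how precision and recall might be computed against a hand-checked set of phrases.

```python
import re

# Hypothetical toy lexicon (an assumption); a real tagger would use a
# trained model or a full lexicon rather than this handful of entries.
LEXICON = {
    "the": "DET", "a": "DET", "an": "DET",
    "of": "PREP", "in": "PREP", "for": "PREP", "from": "PREP",
    "is": "VERB", "extracts": "VERB", "improves": "VERB",
    "natural": "ADJ", "medical": "ADJ", "general": "ADJ",
}

def tokenize(text):
    """Module 1: split text into lowercase word tokens and punctuation marks."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*|[^\sa-z0-9]", text.lower())

def tag(tokens):
    """Module 2: lexicon lookup; unknown alphanumeric tokens default to NOUN."""
    tagged = []
    for tok in tokens:
        if tok in LEXICON:
            tagged.append((tok, LEXICON[tok]))
        elif tok[0].isalnum():
            tagged.append((tok, "NOUN"))
        else:
            tagged.append((tok, "PUNCT"))
    return tagged

def noun_phrases(tagged):
    """Module 3: collect maximal runs matching ADJ* NOUN+ (determiners dropped)."""
    phrases, current, has_noun = [], [], False
    for tok, pos in tagged:
        if pos == "ADJ" and not has_noun:
            current.append(tok)
        elif pos == "NOUN":
            current.append(tok)
            has_noun = True
        else:
            # Any other tag closes the current candidate phrase.
            if has_noun:
                phrases.append(" ".join(current))
            current, has_noun = [], False
            if pos == "ADJ":  # an adjective after a noun starts a new candidate
                current.append(tok)
    if has_noun:
        phrases.append(" ".join(current))
    return phrases

def extract(text):
    """Run the three modules in sequence."""
    return noun_phrases(tag(tokenize(text)))

def precision_recall(extracted, gold):
    """Compare extracted phrases with a hand-checked gold set."""
    extracted, gold = set(extracted), set(gold)
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall

if __name__ == "__main__":
    found = extract("A general parser extracts noun phrases from the medical literature.")
    print(found)
    # Total versus unique phrase counts, as reported for the full collection.
    print(len(found), len(set(found)))
    print(precision_recall(found, ["general parser", "noun phrases",
                                   "medical literature", "parser"]))
```

Keeping the modules as separate functions means any one of them, for example the tagger, could be swapped for a stronger component without touching the other two.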


Year:  1999        PMID: 10566444      PMCID: PMC2232564     

Source DB:  PubMed          Journal:  Proc AMIA Symp        ISSN: 1531-605X


  7 in total

1.  Extending a natural language parser with UMLS knowledge.

Authors:  A T McCray
Journal:  Proc Annu Symp Comput Appl Med Care       Date:  1991

2.  Taming MEDLINE with concept spaces.

Authors:  J Alper
Journal:  Science       Date:  1998-09-18       Impact factor: 47.728

3.  Validation of clinical problems using a UMLS-based semantic parser.

Authors:  H S Goldberg; C Hsu; V Law; C Safran
Journal:  Proc AMIA Symp       Date:  1998

4.  Information retrieval in digital libraries: bringing search to the net. (Review)

Authors:  B R Schatz
Journal:  Science       Date:  1997-01-17       Impact factor: 47.728

5.  A general natural-language text processor for clinical radiology.

Authors:  C Friedman; P O Alderson; J H Austin; J J Cimino; S B Johnson
Journal:  J Am Med Inform Assoc       Date:  1994 Mar-Apr       Impact factor: 4.497

6.  Associating semantic grammars with the SNOMED: processing medical language and representing clinical facts into a language-independent frame.

Authors:  B Do Amaral Marcio; Y Satomura
Journal:  Medinfo       Date:  1995

7.  Computer auditing of surgical operative reports written in English.

Authors:  J M Lamiell; Z M Wojcik; J Isaacks
Journal:  Proc Annu Symp Comput Appl Med Care       Date:  1993
  6 in total

1.  Corpus-based statistical screening for phrase identification.

Authors:  W Kim; W J Wilbur
Journal:  J Am Med Inform Assoc       Date:  2000 Sep-Oct       Impact factor: 4.497

2.  Finding UMLS Metathesaurus concepts in MEDLINE.

Authors:  Suresh Srinivasan; Thomas C Rindflesch; William T Hole; Alan R Aronson; James G Mork
Journal:  Proc AMIA Symp       Date:  2002

3.  Identifying well-formed biomedical phrases in MEDLINE® text.

Authors:  Won Kim; Lana Yeganova; Donald C Comeau; W John Wilbur
Journal:  J Biomed Inform       Date:  2012-06-08       Impact factor: 6.317

4.  Improved identification of noun phrases in clinical radiology reports using a high-performance statistical natural language parser augmented with the UMLS specialist lexicon.

Authors:  Yang Huang; Henry J Lowe; Dan Klein; Russell J Cucina
Journal:  J Am Med Inform Assoc       Date:  2005-01-31       Impact factor: 4.497

5.  ChemTok: A New Rule Based Tokenizer for Chemical Named Entity Recognition.

Authors:  Abbas Akkasi; Ekrem Varoğlu; Nazife Dimililer
Journal:  Biomed Res Int       Date:  2016-01-28       Impact factor: 3.411

6.  PubMed Phrases, an open set of coherent phrases for searching biomedical literature.

Authors:  Sun Kim; Lana Yeganova; Donald C Comeau; W John Wilbur; Zhiyong Lu
Journal:  Sci Data       Date:  2018-06-12       Impact factor: 6.444

