From paragraph to graph: latent semantic analysis for information visualization.

Thomas K Landauer, Darrell Laham, Marcia Derr.

Abstract

Most techniques for relating textual information rely on intellectually created links such as author-chosen keywords and titles, authority indexing terms, or bibliographic citations. Similarity of the semantic content of whole documents, rather than just titles, abstracts, or overlap of keywords, offers an attractive alternative. Latent semantic analysis provides an effective dimension reduction method for the purpose that reflects synonymy and the sense of arbitrary word combinations. However, latent semantic analysis correlations with human text-to-text similarity judgments are often empirically highest at approximately 300 dimensions. Thus, two- or three-dimensional visualizations are severely limited in what they can show, and the first and/or second automatically discovered principal component, or any three such for that matter, rarely capture all of the relations that might be of interest. It is our conjecture that linguistic meaning is intrinsically and irreducibly very high dimensional. Thus, some method to explore a high dimensional similarity space is needed. But the 2.7 × 10^7 projections and infinite rotations of, for example, a 300-dimensional pattern are impossible to examine. We suggest, however, that the use of a high dimensional dynamic viewer with an effective projection pursuit routine and user control, coupled with the exquisite abilities of the human visual system to extract information about objects and from moving patterns, can often succeed in discovering multiple revealing views that are missed by current computational algorithms. We show some examples of the use of latent semantic analysis to support such visualizations and offer views on future needs.
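
As an illustration of the dimension reduction the abstract describes, the following is a minimal sketch of latent semantic analysis using a truncated singular value decomposition in NumPy. It is not the authors' implementation: the toy term-by-document matrix, the helper names `lsa_embed` and `cosine`, and the choice of k = 2 are all hypothetical and chosen only to keep the example small. The last lines show one count that is consistent with the 2.7 × 10^7 figure, namely the number of ordered triples of distinct axes among 300; that reading is an assumption, not something the abstract states.

```python
import numpy as np

def lsa_embed(term_doc, k):
    """Return k-dimensional document vectors from a truncated SVD.

    term_doc is a (terms x documents) count matrix. Each row of the
    result is one document, scaled by the retained singular values.
    """
    u, s, vt = np.linalg.svd(term_doc, full_matrices=False)
    return vt[:k].T * s[:k]

def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical toy counts. Rows are terms, columns are documents.
# Documents 0 and 1 use different words ("car" vs "automobile")
# but share context ("engine"); document 3 is about something else.
X = np.array([
    [2, 0, 1, 0],   # car
    [0, 2, 1, 0],   # automobile
    [1, 1, 2, 0],   # engine
    [0, 0, 0, 2],   # flower
    [0, 0, 0, 1],   # petal
], dtype=float)

docs = lsa_embed(X, k=2)

# Similarity from raw keyword overlap versus similarity in the latent space.
sim_raw = cosine(X[:, 0], X[:, 1])
sim_lsa = cosine(docs[0], docs[1])
print(f"raw overlap: {sim_raw:.2f}, latent space: {sim_lsa:.2f}")

# Assumed reading of the projection count: ordered choices of
# 3 distinct axes out of 300 dimensions.
n_axis_triples = 300 * 299 * 298
print(n_axis_triples)  # 26730600, roughly 2.7 x 10^7
```

In this toy case the two documents that share context but not keywords end up much closer in the reduced space than their raw keyword overlap suggests, which is the synonymy effect the abstract refers to. With real corpora the abstract reports that roughly 300 dimensions often work best, so any two- or three-axis view of such a space shows only a small slice of it.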

Year:  2004        PMID: 15037748      PMCID: PMC387298          DOI: 10.1073/pnas.0400341101

Source DB:  PubMed          Journal:  Proc Natl Acad Sci U S A        ISSN: 0027-8424            Impact factor:   11.205

