Self organization of a massive document collection.

T Kohonen, S Kaski, K Lagus, J Salojarvi, J Honkela, V Paatero, A Saarela.

Abstract

This article describes the implementation of a system that organizes vast document collections according to textual similarity. The system is based on the self-organizing map (SOM) algorithm and uses statistical representations of the documents' vocabularies as feature vectors. The main goal of our work has been to scale up the SOM algorithm so that it can handle large amounts of high-dimensional data. In a practical experiment we mapped 6,840,568 patent abstracts onto a 1,002,240-node SOM. The feature vectors were 500-dimensional vectors of stochastic figures obtained as random projections of weighted word histograms.
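The pipeline named in the abstract (weighted word histogram, random projection, SOM) can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not give the weighting scheme, the projection matrix, or the training schedule, so the Gaussian projection matrix, the Gaussian neighborhood, and all names, sizes, and parameter values below are assumptions chosen for illustration (3 dimensions and a 2x2 map rather than 500 dimensions and 1,002,240 nodes).

```python
import math
import random

def weighted_histogram(tokens, vocab, weights):
    """Count vocabulary words in a document, scaling each count by a per-word weight."""
    index = {word: i for i, word in enumerate(vocab)}
    hist = [0.0] * len(vocab)
    for token in tokens:
        i = index.get(token)
        if i is not None:          # words outside the vocabulary are ignored
            hist[i] += weights[i]
    return hist

def random_projection_matrix(out_dim, in_dim, seed=0):
    """Assumed projection: i.i.d. Gaussian entries scaled by 1/sqrt(out_dim)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(out_dim)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(in_dim)]
            for _ in range(out_dim)]

def project(matrix, vec):
    """Map a high-dimensional histogram to a lower-dimensional feature vector."""
    return [sum(r * v for r, v in zip(row, vec)) for row in matrix]

def best_matching_unit(som, x):
    """Index of the node whose model vector is closest to x (squared Euclidean)."""
    return min(range(len(som)),
               key=lambda k: sum((m - xi) ** 2 for m, xi in zip(som[k], x)))

def som_update(som, grid, x, alpha, sigma):
    """One online SOM step with an assumed Gaussian neighborhood around the winner."""
    c = best_matching_unit(som, x)
    for k, model in enumerate(som):
        d2 = (grid[k][0] - grid[c][0]) ** 2 + (grid[k][1] - grid[c][1]) ** 2
        h = alpha * math.exp(-d2 / (2.0 * sigma ** 2))
        som[k] = [m + h * (xi - m) for m, xi in zip(model, x)]
    return c

# Toy usage with hypothetical vocabulary and weights.
vocab = ["map", "patent", "neural", "engine"]
weights = [1.0, 0.5, 2.0, 1.5]
doc = "neural map of patent text neural".split()

hist = weighted_histogram(doc, vocab, weights)
R = random_projection_matrix(3, len(vocab))
x = project(R, hist)

grid = [(i, j) for i in range(2) for j in range(2)]   # 2x2 map coordinates
som = [[0.0] * 3 for _ in grid]                       # model vectors start at zero
bmu = som_update(som, grid, x, alpha=0.5, sigma=1.0)
```

The point of the projection step is that the map only ever compares low-dimensional vectors, so the cost of each winner search depends on the projected dimension rather than on the vocabulary size.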

Year:  2000        PMID: 18249786     DOI: 10.1109/72.846729

Source DB:  PubMed          Journal:  IEEE Trans Neural Netw        ISSN: 1045-9227


  14 in total

1.  Data processing for 3D mass spectrometry imaging.

Authors:  Xingchuang Xiong; Wei Xu; Livia S Eberlin; Justin M Wiseman; Xiang Fang; You Jiang; Zejian Huang; Yukui Zhang; R Graham Cooks; Zheng Ouyang
Journal:  J Am Soc Mass Spectrom       Date:  2012-03-03       Impact factor: 3.109

2.  A New-Fangled FES-k-Means Clustering Algorithm for Disease Discovery and Visual Analytics.

Authors:  Tonny J Oyana
Journal:  EURASIP J Bioinform Syst Biol       Date:  2010-06-27

3.  Allostatic Load as a Complex Clinical Construct: A Case-Based Computational Modeling Approach.

Authors:  J Galen Buckwalter; Brian Castellani; Bruce McEwen; Arun S Karlamangla; Albert A Rizzo; Bruce John; Kyle O'Donnell; Teresa Seeman
Journal:  Complexity       Date:  2015-12-23       Impact factor: 2.833

4.  A formal algorithm for verifying the validity of clustering results based on model checking.

Authors:  Shaobin Huang; Yuan Cheng; Dapeng Lang; Ronghua Chi; Guofeng Liu
Journal:  PLoS One       Date:  2014-03-07       Impact factor: 3.240

5.  Statistical sleep pattern modelling for sleep quality assessment based on sound events.

Authors:  Hongle Wu; Takafumi Kato; Masayuki Numao; Ken-Ichi Fukui
Journal:  Health Inf Sci Syst       Date:  2017-10-30

6.  Performance study of the application of Artificial Neural Networks to the completion and prediction of data retrieved by underwater sensors.

Authors:  Carlos Baladrón; Javier M Aguiar; Lorena Calavia; Belén Carro; Antonio Sánchez-Esguevillas; Luis Hernández
Journal:  Sensors (Basel)       Date:  2012-02-02       Impact factor: 3.576

7.  Analytical methods in untargeted metabolomics: state of the art in 2015. (Review)

Authors:  Arnald Alonso; Sara Marsal; Antonio Julià
Journal:  Front Bioeng Biotechnol       Date:  2015-03-05

8.  On the Use of Self-Organizing Map for Text Clustering in Engineering Change Process Analysis: A Case Study.

Authors:  Massimo Pacella; Antonio Grieco; Marzia Blaco
Journal:  Comput Intell Neurosci       Date:  2016-12-04

9.  Random selection of factors preserves the correlation structure in a linear factor model to a high degree.

Authors:  Antti J Tanskanen; Jani Lukkarinen; Kari Vatanen
Journal:  PLoS One       Date:  2018-12-21       Impact factor: 3.240

10.  Group analysis of self-organizing maps based on functional MRI using restricted Frechet means.

Authors:  Arnaud P Fournel; Emanuelle Reynaud; Michael J Brammer; Andrew Simmons; Cedric E Ginestet
Journal:  Neuroimage       Date:  2013-03-25       Impact factor: 6.556
