Large-scale clustering of cDNA-fingerprinting data.

R Herwig, A J Poustka, C Müller, C Bull, H Lehrach, J O'Brien.

Abstract

Clustering is one of the main mathematical challenges in large-scale gene expression analysis. We describe a clustering procedure based on a sequential k-means algorithm with additional refinements that is able to handle high-throughput data on the order of hundreds of thousands of data items measured on hundreds of variables. The practical motivation for our algorithm is oligonucleotide fingerprinting (a method for simultaneous determination of the expression level of every active gene of a specific tissue), although the algorithm can also be applied to other large-scale projects such as EST clustering and qualitative clustering of DNA-chip data. As a pairwise similarity measure between two p-dimensional data points, x and y, we introduce mutual information, which can be interpreted as the amount of information about x in y, and vice versa. We show that for our purposes this measure is superior to commonly used metric distances, for example, Euclidean distance. We also introduce a modified version of mutual information as a novel method for validating clustering results when the true clustering is known. The performance of our algorithm with respect to experimental noise is shown by extensive simulation studies. The algorithm is tested on a subset of 2029 cDNA clones coming from 15 different genes from a cDNA library derived from human dendritic cells. Furthermore, the clustering of these 2029 cDNA clones is demonstrated when the entire set of 76,032 cDNA clones is processed.
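To make the similarity measure in the abstract concrete, the following is a minimal sketch, not the authors' implementation. It assumes each fingerprint has already been discretized (for example to 0/1 per oligonucleotide probe) and treats the p positions of two fingerprints as paired samples. The function names `mutual_information` and `sequential_cluster`, the fixed threshold, and the choice of the first member as cluster representative are all assumptions made for illustration; the refinements of the published procedure and its modified validation measure are not reproduced here.

```python
import math
from collections import Counter


def mutual_information(x, y):
    """Mutual information (in bits) between two equal-length discrete sequences."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    n = len(x)
    if n == 0:
        raise ValueError("sequences must be non-empty")
    px = Counter(x)            # marginal counts of x
    py = Counter(y)            # marginal counts of y
    pxy = Counter(zip(x, y))   # joint counts of (x_i, y_i)
    mi = 0.0
    for (a, b), count in pxy.items():
        # p(a,b) * log2(p(a,b) / (p(a) * p(b))), written with raw counts
        mi += (count / n) * math.log2(count * n / (px[a] * py[b]))
    return mi


def sequential_cluster(items, threshold):
    """One-pass clustering: join the most similar cluster or open a new one.

    Assumption: each cluster is represented by its first member. A
    k-means-style method would maintain and update a centroid instead.
    """
    representatives = []
    labels = []
    for item in items:
        best_label, best_score = None, threshold
        for label, rep in enumerate(representatives):
            score = mutual_information(item, rep)
            if score >= best_score:
                best_label, best_score = label, score
        if best_label is None:
            # No existing cluster is similar enough: start a new one.
            representatives.append(item)
            best_label = len(representatives) - 1
        labels.append(best_label)
    return labels


if __name__ == "__main__":
    a = (0, 0, 1, 1, 0, 0, 1, 1)
    b = (0, 1, 0, 1, 0, 1, 0, 1)
    print(mutual_information(a, a))
    print(mutual_information(a, b))
    print(sequential_cluster([a, a, b], threshold=0.5))
```

Because mutual information depends only on how consistently the symbols of x map to the symbols of y, a fingerprint and its exact complement score as highly as two identical fingerprints; a practical pipeline would need to decide whether that behavior is wanted. The same `mutual_information` function can also be applied to two lists of cluster labels (true versus computed) as a plain, unmodified agreement score.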

Year:  1999        PMID: 10568749      PMCID: PMC310829          DOI: 10.1101/gr.9.11.1093

Source DB:  PubMed          Journal:  Genome Res        ISSN: 1088-9051            Impact factor:   9.043


