Text mining of DNA sequence homology searches.

Abstract

Primary tasks in analysis and annotation of expressed sequence tag (EST) datasets are to identify similarity among sequences by unsupervised clustering and assign putative function based on BLAST homology searches. We investigated the usefulness of text mining as a simple approach for further higher-level clustering of EST datasets using IBM Intelligent Miner for Text v2.3 tools. Agglomerative and k-means clustering tools were used to cluster BLASTx homology search documents from two onion EST datasets and optimised by pre-processing and pruning. Subjective evaluation confirmed that these tools provided biologically useful and complementary views of the two libraries, provided new insights into their composition and revealed clusters previously identified by human experts. We compared BLASTx textual clusters for two gene families with their DNA sequence-based clusters and confirmed that these shared similar morphology.
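The agglomerative step described in the abstract can be illustrated with a minimal sketch. This is not the IBM Intelligent Miner for Text implementation, whose internals the abstract does not describe. It assumes each BLASTx result is reduced to a short hit description, weighted by TF-IDF, and merged by average-linkage cosine similarity. The stop-word list (standing in for the pruning step), the function names, and the toy hit descriptions are all hypothetical and are not taken from the onion EST datasets.

```python
import math
import re
from collections import Counter

# Hypothetical pruning list: uninformative terms common in BLAST hit descriptions.
STOP_WORDS = {"protein", "putative", "like", "hypothetical"}


def tokenize(text):
    """Lower-case a hit description and drop pruned terms."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict of term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(docs)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed inverse document frequency, so no weight is zero.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + doc_freq[t])) + 1) for t, c in counts.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def agglomerative(vectors, k):
    """Merge the most similar pair of clusters (average linkage) until k remain."""
    clusters = [[i] for i in range(len(vectors))]
    while len(clusters) > k:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                sims = [cosine(vectors[a], vectors[b])
                        for a in clusters[i] for b in clusters[j]]
                score = sum(sims) / len(sims)
                if best is None or score > best[0]:
                    best = (score, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return [sorted(c) for c in clusters]


# Toy hit descriptions (illustrative only).
docs = [
    "alliinase precursor Allium cepa",
    "alliinase Allium sativum",
    "chalcone synthase Petunia hybrida",
    "chalcone synthase like protein Arabidopsis thaliana",
]
clusters = agglomerative(tfidf_vectors(docs), k=2)
```

With these toy inputs the two alliinase descriptions end up in one cluster and the two chalcone synthase descriptions in the other. A k-means variant would reuse the same vectors but assign documents to the nearest of k centroids rather than merging pairs.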

Entities: Species

Year: 2003 PMID: 15130818

Source DB: PubMed Journal: Appl Bioinformatics ISSN: 1175-5636

Cited

2 in total

1. Mapping SNP-anchored genes using high-resolution melting analysis in almond.

Authors: Shu-Biao Wu; Iraj Tavassolian; Gholamreza Rabiei; Peter Hunt; Michelle Wirthensohn; John P Gibson; Christopher M Ford; Margaret Sedgley
Journal: Mol Genet Genomics Date: 2009-06-14 Impact factor: 3.291

2. MCAM: multiple clustering analysis methodology for deriving hypotheses and insights from high-throughput proteomic datasets.

Authors: Kristen M Naegle; Roy E Welsch; Michael B Yaffe; Forest M White; Douglas A Lauffenburger
Journal: PLoS Comput Biol Date: 2011-07-21 Impact factor: 4.475
