Gclust: trans-kingdom classification of proteins using automatic individual threshold setting.

Abstract

MOTIVATION: Trans-kingdom protein clustering has remained difficult because of the large sequence divergence between eukaryotes and prokaryotes and the presence of transit sequences in organellar proteins. Large-scale protein clustering that includes such divergent organisms needs a heuristic that efficiently selects similar proteins by setting a proper homolog threshold for each protein. Here a method is described that uses two similarity measures and an organism count.
RESULTS: The Gclust software constructs minimal homolog groups from all-against-all BLASTP results by single-linkage clustering. Major points include (i) estimation of the domain structure of proteins; (ii) exclusion of multi-domain proteins; (iii) explicit consideration of transit peptides; and (iv) heuristic estimation of a similarity threshold for the homologs of each protein by the entropy-optimized organism count method. The resultant clusters were evaluated in light of the power law. The software was used to construct protein clusters for up to 95 organisms.
AVAILABILITY: Software and data are available at http://gclust.c.u-tokyo.ac.jp/Gclust_Download.html.
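
The single-linkage step named in the abstract can be illustrated with a minimal sketch. This is not the Gclust implementation: the function name `single_linkage_clusters`, the `(query, subject, evalue)` hit format, and the single global cutoff `max_evalue` are all assumptions made for illustration. Gclust, by contrast, estimates an individual threshold for each protein, which this sketch does not attempt.

```python
from collections import defaultdict


def single_linkage_clusters(hits, max_evalue=1e-10):
    """Group proteins that are linked by at least one accepted hit.

    hits: iterable of (query, subject, evalue) tuples (hypothetical format).
    max_evalue: one global cutoff, a simplification; Gclust sets a
    separate threshold per protein.
    """
    parent = {}

    def find(x):
        # Register unseen proteins as their own cluster root.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for query, subject, evalue in hits:
        root_q, root_s = find(query), find(subject)
        # Single linkage: one accepted hit is enough to merge two clusters.
        if evalue <= max_evalue and root_q != root_s:
            parent[root_s] = root_q

    groups = defaultdict(set)
    for protein in list(parent):
        groups[find(protein)].add(protein)

    # Largest clusters first; ties broken alphabetically for stable output.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Toy data: A-B and B-C pass the cutoff, C-D does not, E only hits itself.
example_hits = [
    ("A", "B", 1e-50),
    ("B", "C", 1e-20),
    ("C", "D", 1e-3),
    ("E", "E", 0.0),
]
clusters = single_linkage_clusters(example_hits)
print(clusters)
```

In the toy data, A and C end up in the same group even though no hit links them directly, because both are linked to B. This chaining is the defining property of single linkage, and it is also why multi-domain proteins can wrongly bridge unrelated groups, which is consistent with the abstract listing their exclusion as a major point.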

Substances:
Proteins

Year: 2009 PMID: 19158159 DOI: 10.1093/bioinformatics/btp047

Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937

Cited

21 in total

1. Sequencing and analysis of the complete organellar genomes of Parmales, a closely related group to Bacillariophyta (diatoms).

Authors: Naoyuki Tajima; Kenji Saitoh; Shusei Sato; Fumito Maruyama; Mutsuo Ichinomiya; Shinya Yoshikawa; Ken Kurokawa; Hiroyuki Ohta; Satoshi Tabata; Akira Kuwata; Naoki Sato
Journal: Curr Genet Date: 2016-04-18 Impact factor: 3.886

2. Diverse origins of enzymes involved in the biosynthesis of chloroplast peptidoglycan.

Authors: Naoki Sato; Hiroyoshi Takano
Journal: J Plant Res Date: 2017-04-05 Impact factor: 2.629

3. Lipid Pathway Databases with a Focus on Algae.

Authors: Naoki Sato; Takeshi Obayashi
Journal: Methods Mol Biol Date: 2021

4. Genomic structure of an economically important cyanobacterium, Arthrospira (Spirulina) platensis NIES-39.

Authors: Takatomo Fujisawa; Rei Narikawa; Shinobu Okamoto; Shigeki Ehira; Hidehisa Yoshimura; Iwane Suzuki; Tatsuru Masuda; Mari Mochimaru; Shinichi Takaichi; Koichiro Awai; Mitsuo Sekine; Hiroshi Horikawa; Isao Yashiro; Seiha Omata; Hiromi Takarada; Yoko Katano; Hiroki Kosugi; Satoshi Tanikawa; Kazuko Ohmori; Naoki Sato; Masahiko Ikeuchi; Nobuyuki Fujita; Masayuki Ohmori
Journal: DNA Res Date: 2010-03-04 Impact factor: 4.458

5. CyanoClust: comparative genome resources of cyanobacteria and plastids.

Authors: Naobumi V Sasaki; Naoki Sato
Journal: Database (Oxford) Date: 2010-01-08 Impact factor: 3.451

6. Elucidating genome structure evolution by analysis of isoapostatic gene clusters using statistics of variance of gene distances.

Authors: Naobumi V Sasaki; Naoki Sato
Journal: Genome Biol Evol Date: 2009-12-18 Impact factor: 3.416

10. Detection and characterization of phosphatidylcholine in various strains of the genus Chlamydomonas (Volvocales, Chlorophyceae).

Authors: Kenta Sakurai; Natsumi Mori; Naoki Sato
Journal: J Plant Res Date: 2014-06-20 Impact factor: 2.629