Yongsheng Yuan, Lei Lin, Qiwen Dong, Xiaolong Wang, Minghui Li.
Abstract
In this paper, a new method that uses Latent Semantic Analysis (LSA) to represent protein sequences is proposed for the protein classification problem. A protein is vectorized according to its content of biological "words": patterns, generated with the TEIRESIAS algorithm, and motifs, generated with the MEME/MAST system. LSA is then applied to these vectors to obtain more precise description vectors of the proteins. The resulting vectors are used with a Support Vector Machine (SVM) to classify proteins. Experiments on family-level protein classification using the Structural Classification of Proteins (SCOP) database show that this method performs better than other state-of-the-art methods.
Year: 2005 PMID: 17282075 DOI: 10.1109/IEMBS.2005.1616306
Source DB: PubMed Journal: Conf Proc IEEE Eng Med Biol Soc ISSN: 1557-170X
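
The vectorization and LSA steps described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy sequences, the fixed-string "words" (stand-ins for TEIRESIAS patterns and MEME motifs, which in practice are richer than plain substrings), and the function names `word_count_matrix` and `lsa_embed` are all assumptions made for illustration. LSA is approximated here by a truncated singular value decomposition computed with NumPy, and the SVM classification step is omitted.

```python
import numpy as np


def word_count_matrix(proteins, words):
    """Build a word-by-protein count matrix (rows: words, columns: proteins).

    Each entry is the number of non-overlapping occurrences of a word in a
    sequence. Plain substrings are a simplifying assumption.
    """
    return np.array(
        [[seq.count(word) for seq in proteins] for word in words],
        dtype=float,
    )


def lsa_embed(matrix, k):
    """Return one k-dimensional vector per protein via truncated SVD.

    With matrix = U @ diag(s) @ Vt, the columns of diag(s[:k]) @ Vt[:k]
    are the protein coordinates in the k-dimensional latent space.
    """
    _, s, vt = np.linalg.svd(matrix, full_matrices=False)
    return (s[:k, None] * vt[:k]).T  # shape: (n_proteins, k)


# Hypothetical toy data: two pairs of similar sequences.
proteins = ["MKVLAAGIVK", "MKVLSAGIVR", "GHTPEELLKQ", "GHTPDELLRQ"]
words = ["MKV", "AGIV", "GHTP", "ELL"]

counts = word_count_matrix(proteins, words)
vectors = lsa_embed(counts, k=2)
```

In this toy case, sequences that share the same words end up with identical latent vectors, while sequences with no words in common end up orthogonal. In a full pipeline, vectors like these would be passed to a classifier such as an SVM.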