Pufeng Du, Shuwang Gu, Yasen Jiao.
Abstract
The general form of pseudo-amino acid composition (PseAAC) has been widely used to represent protein sequences when predicting protein structural and functional attributes. We developed the program PseAAC-General to generate various modes of Chou's general PseAAC, such as the gene ontology mode, the functional domain mode, and the sequential evolution mode. The program also allows users to define their own modes. In every mode, 544 physicochemical properties of the amino acids are available to choose from. The computing efficiency is at least 100 times that of existing programs, which facilitates extensive studies on proteins and peptides. PseAAC-General is freely available via SourceForge. It runs on both Linux and Windows.
Year: 2014 PMID: 24577312 PMCID: PMC3975349 DOI: 10.3390/ijms15033495
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Comparison of program features.
| Program Functions | PseAAC-General | PseAAC-Builder | Propy | PseAAC Server |
|---|---|---|---|---|
| Physicochemical Properties | 544 | 544 | 8 | 6 |
| Output Features | | | | |
| Type I PseAAC | Y | Y | Y | Y |
| Type II PseAAC | Y | Y | Y | Y |
| Amino acid composition | Y | Y | Y | Y |
| di-Peptide composition | Y | Y | Y | Y |
| tri-Peptide composition | Y | N | Y | N |
| Normalized Moreau-Broto autocorrelation | Y | N | Y | N |
| Moran autocorrelation | Y | N | Y | N |
| Geary autocorrelation | Y | N | Y | N |
| Composition-Transition-Distribution (CTD) | Y | N | Y | N |
| Quasi-sequence order | Y | N | Y | N |
| Gene ontology mode | Y | N | N | N |
| Functional domain mode | Y | N | N | N |
| Sequential evolution mode | Y | N | N | N |
| Other functions | | | | |
| User defined | Y | N | N | N |
| Online updates | Y | N | N | N |
| Graphical User Interface (GUI) | Y | Y | N | Y |
| Execution efficiency | ~17,000 seqs/s | ~170 seqs/s | N.A. | ~15 seqs/s |
The compared program functions fall into three groups: the physicochemical properties, the sequence features that can be generated, and other functional properties of the software. Y = yes; N = no.
The execution time for PseAAC-General and PseAAC-Builder was measured as wall-clock time on a dataset containing over 510,000 sequences. The execution time for PseAAC Server was measured on a dataset of 500 sequences because of limitations of the service and of the internet connection. The execution time for Propy was not tested because of limitations of the testing environment. Seqs/s means sequences per second.
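The throughput figures in the table can be related to the "at least 100 times" claim with a little arithmetic. The sketch below uses only the approximate rates reported in the table; the function names are hypothetical, and the run-time estimate assumes exactly 510,000 sequences, whereas the text only says "over 510,000". PseAAC Server is left out of the run-time estimate because it was measured on a much smaller dataset.

```python
# Approximate throughput in sequences per second, as reported in the table.
RATES = {
    "PseAAC-General": 17_000,
    "PseAAC-Builder": 170,
    "PseAAC Server": 15,
}


def speedup(fast, slow):
    """Ratio of the two reported throughputs (dimensionless)."""
    return RATES[fast] / RATES[slow]


def seconds_for(n_sequences, program):
    """Estimated wall-clock seconds to process n_sequences at the reported rate."""
    return n_sequences / RATES[program]


if __name__ == "__main__":
    print(speedup("PseAAC-General", "PseAAC-Builder"))  # ratio versus PseAAC-Builder
    print(speedup("PseAAC-General", "PseAAC Server"))   # ratio versus PseAAC Server
    # Assumed dataset size of 510,000 sequences (the text says "over 510,000").
    print(seconds_for(510_000, "PseAAC-General"))
    print(seconds_for(510_000, "PseAAC-Builder"))
```

Under these assumptions the ratio against PseAAC-Builder is 100, and the same dataset would take roughly half a minute with PseAAC-General versus roughly 50 minutes with PseAAC-Builder.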
Figure 1. The data flow of PseAAC-General. The input is a set of sequences in FASTA format. The output is the general form PseAAC. The mode of the general form PseAAC is chosen by the user. For modes that are implemented by Binary Extension Modules or Lua script modules, the corresponding modules must be loaded as well.
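The FASTA-to-PseAAC data flow can be illustrated with a minimal Python sketch of the classic Type I PseAAC. This is not the PseAAC-General implementation or its interface: the function names (`read_fasta`, `standardize`, `type1_pseaac`), the default values of `lam` and `w`, and the use of the Kyte-Doolittle hydropathy scale as the single physicochemical property are all assumptions made for illustration. The program itself offers 544 properties, none of which are reproduced here.

```python
import io
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Kyte-Doolittle hydropathy values, used only as a stand-in property.
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        yield header, "".join(chunks)


def standardize(prop):
    """Rescale a property to zero mean and unit standard deviation over the 20 residues."""
    values = [prop[a] for a in AMINO_ACIDS]
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return {a: (prop[a] - mean) / sd for a in AMINO_ACIDS}


def type1_pseaac(seq, props, lam=2, w=0.05):
    """Return the 20 + lam Type I PseAAC features of one sequence."""
    residues = [r for r in seq if r in AMINO_ACIDS]  # skip non-standard residues
    n = len(residues)
    if n <= lam:
        raise ValueError("sequence must be longer than lam")
    freqs = [residues.count(a) / n for a in AMINO_ACIDS]
    thetas = []
    for k in range(1, lam + 1):
        # Mean squared property difference between residues k positions apart.
        total = 0.0
        for i in range(n - k):
            a, b = residues[i], residues[i + k]
            total += sum((p[a] - p[b]) ** 2 for p in props) / len(props)
        thetas.append(total / (n - k))
    denom = sum(freqs) + w * sum(thetas)
    return [f / denom for f in freqs] + [w * t / denom for t in thetas]


if __name__ == "__main__":
    fasta = io.StringIO(">demo hypothetical peptide\nMKTAYIAKQR\nQISFVKSHFS\n")
    props = [standardize(KD)]
    for header, seq in read_fasta(fasta):
        features = type1_pseaac(seq, props, lam=3)
        print(header, len(features))
```

The first 20 features are the weighted amino acid composition and the remaining `lam` features carry the sequence-order correlation, so the whole vector sums to 1. Other modes listed in the table (gene ontology, functional domain, sequential evolution) need external data and are not covered by this sketch.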