Elo Leung, Amy Huang, Eithon Cadag, Aldrin Montana, Jan Lorenz Soliman, Carol L Ecale Zhou.
Abstract
BACKGROUND: Here we introduce the Protein Sequence Annotation Tool (PSAT), a web-based sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integrating multiple sequence-based bioinformatics tools, (2) enable functional annotation and enzyme prediction over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools.
Year: 2016 PMID: 26792120 PMCID: PMC4721133 DOI: 10.1186/s12859-016-0887-y
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 PSAT web-based architecture: currently available computational sources for sequence annotations
Fig. 2 Real-time messaging system between the Django web application, RabbitMQ, Celery workers, and MySQL database
List of predicted enzymes in the naphthalene degradation pathway
| EC Number | Enzyme Description | Number of Genes (RV1423) | Number of Genes (H. seropedicae) |
|---|---|---|---|
| 1.1.1.1 | Alcohol dehydrogenase | 3 | 3 |
| 1.2.1.65 | Salicylaldehyde dehydrogenase | 2 | 0 |
| 1.3.1.29 | Cis-1,2-dihydro-1,2-dihydroxynaphthalene dehydrogenase | 1 | 0 |
| 1.13.11.56 | 1,2-dihydroxynaphthalene dioxygenase | 1 | 0 |
| 1.14.13.172 | Salicylate 1-monooxygenase | 5 | 0 |
| 1.14.12.12 | Naphthalene 1,2-dioxygenase | 3 | 0 |
| 4.1.2.45 | Trans-o-hydroxybenzylidenepyruvate hydratase-aldolase | 1 | 0 |
| 5.99.1.4 | 2-hydroxychromene-2-carboxylate isomerase | 2 | 0 |
Genes with putative function (EC number and enzyme description) were identified in RV1423 along with the number of genes encoding these enzymes in the RV1423 and H. seropedicae genomes
Runtime analyses of back-end processing
| Genome | Number of Proteins | Parallel (HH:MM:SS) | Serial (HH:MM:SS) |
|---|---|---|---|
| | 2755 | 1:57:16 | 2:27:00 |
| | 3858 | 2:35:41 | 2:31:10 |
| | 4741 | 3:33:57 | 2:24:24 |
| | 4958 | 3:35:34 | 2:30:09 |
Jobs were run in parallel mode on each of four different genomes and in serial mode on 100 proteins randomly selected from the corresponding genome.
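To make the two timing columns comparable, the figures can be normalized to seconds per protein. The following is a minimal sketch, assuming (as the caption suggests) that each parallel time covers the full genome and each serial time covers a 100-protein sample; the names `to_seconds`, `per_protein_seconds`, `RUNS`, and `SERIAL_SAMPLE` are illustrative and not part of PSAT.

```python
# Illustrative only: normalizes the runtime table to seconds per protein.
# Assumption: parallel times cover the whole genome, serial times cover
# a 100-protein random sample (our reading of the table caption).

SERIAL_SAMPLE = 100  # proteins per serial run, per the caption

# (number of proteins, parallel HH:MM:SS, serial HH:MM:SS), copied from the table
RUNS = [
    (2755, "1:57:16", "2:27:00"),
    (3858, "2:35:41", "2:31:10"),
    (4741, "3:33:57", "2:24:24"),
    (4958, "3:35:34", "2:30:09"),
]


def to_seconds(hms: str) -> int:
    """Convert an H:MM:SS string to a number of seconds."""
    hours, minutes, seconds = (int(part) for part in hms.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def per_protein_seconds(n_proteins: int, parallel: str, serial: str) -> tuple[float, float]:
    """Return (parallel, serial) wall-clock seconds per protein."""
    parallel_rate = to_seconds(parallel) / n_proteins
    serial_rate = to_seconds(serial) / SERIAL_SAMPLE
    return parallel_rate, serial_rate


if __name__ == "__main__":
    for n_proteins, parallel, serial in RUNS:
        par, ser = per_protein_seconds(n_proteins, parallel, serial)
        print(
            f"{n_proteins} proteins: parallel {par:.2f} s/protein, "
            f"serial {ser:.2f} s/protein, ratio {ser / par:.1f}x"
        )
```

Under that reading, the per-protein ratio is a rough indicator of the throughput gain from parallel execution; it is not a controlled speedup measurement, since the serial runs used a sample rather than the full genome.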