Daniel P Depledge, Andrew R Dalby.
Abstract
BACKGROUND: Single amino acid repeats (SAARs) make up a significant proportion of every proteome determined to date. They have been shown to be functionally and medically significant, and are associated with cancers and with neurodegenerative diseases such as Huntington's chorea, in which a polyglutamine repeat is responsible for the disease. The COPASAAR database is a new tool to facilitate the rapid analysis of single amino acid repeats at the proteome level. The database aims to simplify the comparison of repeat distributions between proteomes in order to provide a better understanding of their function and evolution.
Year: 2005 PMID: 16078990 PMCID: PMC1199582 DOI: 10.1186/1471-2105-6-196
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Dominantly inherited neurodegenerative diseases are associated with abnormally expanded tracts of glutamine residues.
| Protein | Gln SAARs | Other SAARs and notes |
| Huntington's disease protein | 1 SAAR (23 residues) | 2 Pro repeats (11 and 10 residues) |
| Spinocerebellar ataxin type 1 | 2 SAARs (15 and 12 residues) | The two Gln repeats are separated by 4 residues |
| Androgen receptor (Kennedy's disease) | 3 SAARs (21, 6 and 5 residues) | 1 Pro repeat (8 residues) |
Legend: These SAARs are often accompanied by at least 2 other long SAARs of different amino acids. Gln – Glutamine, Pro – Proline, Ala – Alanine, Gly – Glycine, Glu – Glutamic acid.
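Repeat counts and lengths of the kind listed in the table can be obtained by scanning a sequence for runs of one residue. The following is a minimal sketch, not the authors' method: the function name `find_saars`, the `min_len` threshold and the toy sequence are all assumptions made for illustration. The toy sequence is invented so that its run lengths match the Huntington's disease row (one 23-residue Gln run, two Pro runs of 11 and 10 residues).

```python
import re
from typing import List, Tuple


def find_saars(sequence: str, residue: str, min_len: int = 5) -> List[Tuple[int, int]]:
    """Return (start, length) for each run of `residue` at least `min_len` long.

    `min_len` is an illustrative threshold, not one taken from the paper.
    Start positions are 0-based.
    """
    # Build a pattern such as "Q{5,}": the residue repeated min_len or more times.
    pattern = re.compile(f"{re.escape(residue)}{{{min_len},}}")
    return [(m.start(), m.end() - m.start()) for m in pattern.finditer(sequence.upper())]


# Toy sequence, invented for illustration (not a real protein sequence).
toy = "MA" + "Q" * 23 + "LK" + "P" * 11 + "GS" + "P" * 10 + "E"

print(find_saars(toy, "Q"))  # glutamine runs as (start, length)
print(find_saars(toy, "P"))  # proline runs as (start, length)
```

A regular expression keeps the sketch short; for whole proteomes a single pass that records runs of every residue at once would avoid rescanning each sequence 20 times.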
Figure 1. Database schema for COPASAAR. Note that each of the species_repeats, species_expected, protein_repeats and protein_expected tables is repeated 20 times, once for each amino acid.
Figure 2. Example SQL script used to query the database for all human proteins with an alanine repeat of 6 amino acids.
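The SQL script itself is not shown here, so a query of the kind the caption describes can only be sketched under assumptions. In the example below the table name `protein_repeats_ala`, its columns (`protein_id`, `species`, `repeat_length`, `repeat_count`) and the three rows are hypothetical; only the idea of one protein_repeats table per amino acid comes from the schema description. SQLite is used solely to keep the sketch self-contained and is not claimed to be the engine behind COPASAAR.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical layout: one protein_repeats table per amino acid (here alanine).
conn.execute(
    """CREATE TABLE protein_repeats_ala (
           protein_id    TEXT,
           species       TEXT,
           repeat_length INTEGER,
           repeat_count  INTEGER
       )"""
)

# Invented rows for illustration only.
conn.executemany(
    "INSERT INTO protein_repeats_ala VALUES (?, ?, ?, ?)",
    [
        ("PROT_A", "Homo sapiens", 6, 1),
        ("PROT_B", "Homo sapiens", 4, 2),
        ("PROT_C", "Mus musculus", 6, 1),
    ],
)


def proteins_with_repeat(conn, species, length):
    """Return IDs of proteins in `species` with an alanine repeat of exactly `length`."""
    rows = conn.execute(
        "SELECT protein_id FROM protein_repeats_ala "
        "WHERE species = ? AND repeat_length = ? AND repeat_count > 0 "
        "ORDER BY protein_id",
        (species, length),
    )
    return [row[0] for row in rows]


print(proteins_with_repeat(conn, "Homo sapiens", 6))
```

Passing the species and repeat length as bound parameters, rather than formatting them into the SQL string, lets the same query serve other species and lengths safely.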
The proportion of a proteome composed of SAARs and the percentage of proteomes in each kingdom with a greater number of SAARs than the mean. *The overall mean is 13.18%.
| Kingdom | Proportion of proteome composed of SAARs | Proteomes above the overall mean* |
| Eukaryotes | 14.5% | 95% |
| Archaea | 13.3% | 45% |
| Bacteria | 13.1% | 32% |
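One way to read "proportion of a proteome composed of SAARs" is the number of residues that lie inside repeats divided by the total number of residues. The sketch below follows that reading only as an assumption: the function name `saar_fraction`, the default `min_len` of 2 and the two toy sequences are illustrative and are not taken from the paper.

```python
from itertools import groupby
from typing import Iterable


def saar_fraction(sequences: Iterable[str], min_len: int = 2) -> float:
    """Fraction of residues that sit in runs of one residue of length >= min_len.

    `min_len` is an assumed threshold; the paper's own definition may differ.
    """
    total = 0
    in_repeats = 0
    for seq in sequences:
        total += len(seq)
        # groupby yields one group per maximal run of identical residues.
        for _, run in groupby(seq):
            run_length = len(list(run))
            if run_length >= min_len:
                in_repeats += run_length
    return in_repeats / total if total else 0.0


# Two invented toy sequences standing in for a proteome.
toy_proteome = ["MAAKL", "GGGSP"]
print(f"{saar_fraction(toy_proteome):.1%}")
```

Comparing such a per-proteome value with the overall mean would give the "above the mean" classification used in the table, subject to the same assumed definition.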
SAAR composition by amino acid.
| Arginine | 0.75% | 0.75% | 0.74% |
| Lysine | 1.0% | 1.05% | 0.72% |
| Glutamic Acid | 0.89% | ||
| Aspartic Acid | 0.67% | 0.65% | 0.55% |
| Glutamine | 0.57% | 0.18% | 0.42% |
| Asparagine | 0.63% | 0.37% | 0.35% |
| Histidine | 0.17% | 0.09% | 0.13% |
| Proline | 0.83% | 0.42% | 0.43% |
| Tyrosine | 0.23% | 0.37% | 0.21% |
| Tryptophan | 0.04% | 0.04% | 0.04% |
| Serine | 0.90% | 0.85% | |
| Threonine | 0.68% | 0.60% | 0.59% |
| Glycine | 0.90% | 1.10% | 1.09% |
| Alanine | 1.14% | ||
| Methionine | 0.10% | 0.11% | 0.12% |
| Cysteine | 0.10% | 0.03% | 0.04% |
| Phenylalanine | 0.38% | 0.38% | 0.37% |
| Leucine | |||
| Valine | 0.80% | 1.03% | |
| Isoleucine | 0.61% | 0.74% |
Legend: The dominant amino acids are leucine and serine (Eukaryotes), leucine and glutamic acid (Archaea), and leucine and alanine (Bacteria). Amino acids considered highly abundant are highlighted in bold.