N Rathankar, K A Nirmala, Varun Khanduja, H G Nagendra.
Abstract
High-throughput genome sequencing has led to a data explosion in sequence databanks and an imbalance in sequence-structure-function relationships, leaving a substantial fraction of proteins classified as hypothetical proteins. Functions can be assigned to such proteins by analyzing and characterizing the domains they are made up of. Domains are the basic evolutionary units of proteins, and most proteins contain multiple domains. Fused domains (overlapping domains), in which the sequences of two or more domains overlap, form a subset of multidomain proteins. These fused domains result from gene fusion events, and their implication in diseases is well established. Hence, this paper attempts to identify fused-domain-containing hypothetical proteins from the human genome that are homologous to parkinsonian targets present in the KEGG database. The study identified 18 hypothetical proteins with domains fused with ubiquitin domains and with homology to targets in the parkinsonian pathway.
Year: 2011 PMID: 22389811 PMCID: PMC3263550 DOI: 10.5402/2011/265253
Source DB: PubMed Journal: ISRN Neurol ISSN: 2090-5505
Figure 1. 17 clusters in fused hypothetical proteins containing 36 domains.
Summary of 17 clusters along with their involvement in diseases and major functions. Words in bold indicate neurodegenerative disorders and the role of ubiquitin domains in these disorders.
| Cluster no. | No. of domains | Function of domains | No. of sequences | Disease implication |
|---|---|---|---|---|
| 1 | 36 | | 106 | |
| 2 | 11 | NTF2, | 11 | Fatty acid disorders |
| 3 | 20 | Myosin Motor | 135 | Familial hypertrophic cardiomyopathy, |
| 4 | 9 | Cyclophilin | 3 | Immunosuppression, antiviral activity |
| 5 | 16 | vWFA & PH, | 196 | Von Willebrand disease, thrombotic thrombocytopenic purpura (TTP) |
| 6 | 6 | PH & PTB | 70 | Cardiovascular diseases |
| 7 | 10 | tRNA synthase | 15 | Aminoacyl tRNA synthetase: Charcot-Marie-Tooth disease type 2D, Mobius syndrome, cardiac disorders |
| 8 | 13 | | 32 | |
| 9 | 5 | RNA binding | 1 | Myxoid liposarcoma, SARS |
| 10 | 5 | SIR 2 | 3 | |
| 11 | 5 | HMG box | 10 | |
| 12 | 4 | PI3K | 2 | Cancer, diabetes and respiratory |
| 13 | 4 | Sm & Sm-like | 8 | Inflammatory bowel disease, Salla disease, diabetes |
| 14 | 3 | Methyl-CpG and PH | 3 | Cancer |
| 15 | 3 | EVH1 | 1 | Wiskott-Aldrich syndrome |
| 16 | 3 | CGH & Ntn | 1 | Lysosomal storage disease and |
| 17 | 3 | Nidogen, thyroglobulin type 1 | 6 | Human gastrointestinal cancer, cancer, acute leukemia, heart diseases |
A description of genes, domains, and type of inheritance for Parkinson's disease (source: Nirit Lev and Melamed [1]).
| Gene/locus/assignment | Domains present | Inheritance | Age of onset |
|---|---|---|---|
| α-Synuclein/SNCA/PARK1 & 4 | Synuclein/ | Autosomal dominant/susceptibility | Early/late |
| Parkin/PRKN/PARK2 | Parkin and | Autosomal recessive/possible susceptibility | Juvenile/early |
| Ubiquitin C-terminal hydrolase/UCH-L1/PARK5 | Peptidase, | Autosomal dominant/susceptibility | Late |
| DJ-1/DJ-1/PARK7 | GATase/ | Autosomal recessive | Early |
Figure 2. Parkinson's disease pathway from the KEGG disease database. Proteins circled in red are those whose domains are fused with ubiquitin domains (source: the KEGG disease pathway database).
Hypothetical proteins homologous to KEGG sequences with fused domains in Parkinson's disease pathway.
| Sl. no. | KEGG protein ID | KEGG protein Gi ID | Cluster no. | No. of hypothetical proteins | Hypothetical protein Gi IDs |
|---|---|---|---|---|---|
| 1 | UB | 11024714 | 1 | 1 | 5912028 |
| 2 | PARK2 | 4758884 | 1 | 5 | 10241759, 12052812, 44662819, 57997480, 37589137 |
| 3 | UBA1 | 23510338 | 2 | 1 | 12053109 |
| | | | 8 | 4 | 7018418, 7018436, 34304594, 63994165 |
| 4 | PINK1 | 14165272 | 5 | 7 | 1905906, 3510234, 5912043, 12053281, 52545876, 57997093, 57997188 |
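The per-target counts in this table can be cross-checked against the total of 18 hypothetical proteins reported in the abstract. The following is a minimal sketch: the Gi IDs are copied from the table, while the dictionary layout and the names `hits` and `count_per_target` are illustrative assumptions, not part of the original analysis.

```python
# Gi IDs copied from the table, keyed by (KEGG target, cluster number).
# The data structure itself is an assumption made for this illustration.
hits = {
    ("UB", 1): [5912028],
    ("PARK2", 1): [10241759, 12052812, 44662819, 57997480, 37589137],
    ("UBA1", 2): [12053109],
    ("UBA1", 8): [7018418, 7018436, 34304594, 63994165],
    ("PINK1", 5): [1905906, 3510234, 5912043, 12053281,
                   52545876, 57997093, 57997188],
}


def count_per_target(hits):
    """Sum the number of hypothetical proteins for each KEGG target."""
    counts = {}
    for (target, _cluster), gi_ids in hits.items():
        counts[target] = counts.get(target, 0) + len(gi_ids)
    return counts


per_target = count_per_target(hits)
total = sum(per_target.values())
# Guard against the same Gi ID being listed under two targets.
unique_total = len({gi for gi_ids in hits.values() for gi in gi_ids})

print(per_target)
print(total, unique_total)
```

Both the simple sum and the count of distinct Gi IDs come to 18, so no hypothetical protein is listed under more than one target.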
Figure 3. Domains in cluster-1.
Conservation of domain fusions in Parkinson's disease targets and human hypothetical sequences in cluster-1.
| KEGG sequence | Hypothetical protein's Gi ID | Fusion region: Cd00196 | Fusion region: Cd01769 | Fusion region: Cd01809 | Fusion region: target (UB or PARK2) | Sequence identity with the target |
|---|---|---|---|---|---|---|
| UB | | 14–82, 90–158, 166–234 | 14–82, 90–158, 166–234 | 11–82, 87–158, 163–234 | 4–72, 4–72, 5–72 | 98 |
| | | 50–107 | 52–107 | 50–105 | 4–72, 4–72, 5–72 | 30 |
| | 44662819 | 4–72 | 4–70 | 3–70 | 4–72, 4–72, 5–72 | 30 |
| | 37589137 | 4–72 | 4–70 | 3–70 | 4–72, 4–72, 5–72 | 30 |
| | 57997480 | 21–87 | 20–87 | 17–87 | 4–72, 4–72, 5–72 | 36 |
| PARK2 | | 14–82, 90–158, 166–234 | 14–82, 90–158, 166–234 | 11–82, 87–158, 163–234 | 4–72, 4–72, 5–72 | 30 |
| | | 50–107 | 52–107 | 50–105 | 4–72, 4–72, 5–72 | 29 |
| | 10241759 | 1–67 | 1–67 | 1–65 | 4–72, 4–72, 5–72 | 30 |
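Since fused domains are defined here as domains whose sequence ranges overlap, the residue ranges in this table can be checked for overlap directly. The sketch below assumes inclusive, 1-based ranges and treats any shared residue as an overlap; the paper's actual overlap criterion is not stated, and the function names are hypothetical.

```python
def overlap_length(a, b):
    """Number of residues shared by two inclusive (start, end) ranges."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return max(0, end - start + 1)


def fused_pairs(domains):
    """Return (domain, domain, shared residues) for every overlapping pair."""
    names = sorted(domains)
    pairs = []
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            shared = overlap_length(domains[first], domains[second])
            if shared > 0:  # assumption: one shared residue is enough
                pairs.append((first, second, shared))
    return pairs


# Domain ranges for gi 57997480, copied from the table.
domains = {"Cd00196": (21, 87), "Cd01769": (20, 87), "Cd01809": (17, 87)}
pairs = fused_pairs(domains)
for first, second, shared in pairs:
    print(f"{first} and {second} share {shared} residues")
```

For this protein all three domain assignments cover nearly the same stretch, which is what the term "fused" describes in this context.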
Figure 4. Multiple sequence alignment of the fused domains in 6 unique hypothetical proteins and their target sequences (PARK2 and ubiquitin).
Figure 5. PROSITE signature PS00299 comparison in hypothetical sequences for a ubiquitin domain. Square brackets in the signature indicate that any one of the enclosed residues may occur at that position, whereas x(3) indicates any three amino acids. Red-colored residues are strictly conserved, blue-colored residues are those present in the regular expression pattern, and orange-colored residues are the mutant residues observed in the mutant database.
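The signature notation described in this caption maps closely onto regular expressions. The following is a minimal sketch of that mapping, assuming standard PROSITE pattern syntax; the pattern used in the demonstration is a made-up example and is not the PS00299 signature, and `prosite_to_regex` is a hypothetical helper name.

```python
import re


def prosite_to_regex(pattern):
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        prefix = suffix = ""
        if element.startswith("<"):   # anchor to the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):     # anchor to the C-terminus
            suffix, element = "$", element[:-1]
        # Split "x(3)" or "x(2,4)" into the residue part and repeat counts.
        match = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = match.groups()
        if core == "x":               # any amino acid
            core = "."
        elif core.startswith("{"):    # any residue except those listed
            core = "[^" + core[1:-1] + "]"
        # "[...]" alternatives and single letters are already valid regex.
        if low is None:
            repeat = ""
        elif high is None:
            repeat = "{" + low + "}"
        else:
            repeat = "{" + low + "," + high + "}"
        parts.append(prefix + core + repeat + suffix)
    return "".join(parts)


# Illustrative pattern only; NOT the real PS00299 signature.
demo_regex = prosite_to_regex("K-x(2)-[LIVM]-x-[DE]")
demo_hit = re.search(demo_regex, "AAKTGLSDAA")
print(demo_regex, demo_hit.start())
```

Scanning each hypothetical sequence with the translated expression is one simple way to locate signature positions, which can then be compared against known mutation sites.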
Mutational analysis of the hypothetical proteins with ubiquitin domains.
| Sl. no. | Gi ID | Mutational positions | Function as predicted by the Protein Mutant Database (PMD) |
|---|---|---|---|
| 1 | 5912028 | Nil | No change |
| 2 | 37589137 | I30V | Stability is retained. |
| | | G35K | Melting temperature at pH 3.0 decreases. |
| 3 | 44662819 | I30V | Stability is retained. |
| | | G35K | Melting temperature at pH 3.0 decreases. |
| 4 | 57997480 | G35S | Melting temperature at pH 3.0 decreases. |
| | | K48R | Increase in morphologic response of cells to canavanine, accumulation of high-molecular-weight ubiquitin conjugates and proteome substrates is observed. |
| 5 | 12052812 | R42L | Ubiquitin adenylate affinity for E1 protein decreases. |
| | | G35K | Melting temperature at pH 3.0 decreases. |
| 6 | 10241759 | R42L | Ubiquitin adenylate affinity for E1 protein decreases. |
| | | G35K | Melting temperature at pH 3.0 decreases. |
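The mutation codes in this table follow the usual convention of wild-type residue, position, and substituted residue (for example, I30V replaces isoleucine at position 30 with valine). As a rough sketch, the codes can be parsed and tallied by position; the data are copied from the table, while the names `mutations` and `parse_mutation` are assumptions made for illustration.

```python
import re
from collections import Counter

# Mutation codes per Gi ID, copied from the table ("Nil" becomes an empty list).
mutations = {
    5912028: [],
    37589137: ["I30V", "G35K"],
    44662819: ["I30V", "G35K"],
    57997480: ["G35S", "K48R"],
    12052812: ["R42L", "G35K"],
    10241759: ["R42L", "G35K"],
}


def parse_mutation(code):
    """Split a code such as 'I30V' into (wild type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"unrecognized mutation code: {code!r}")
    return match.group(1), int(match.group(2)), match.group(3)


# Count how many proteins carry a substitution at each position.
position_counts = Counter(
    parse_mutation(code)[1]
    for codes in mutations.values()
    for code in codes
)
print(position_counts.most_common())
```

Position 35 is substituted in five of the six proteins, which makes it the most frequently altered site in this set.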
Figure 6. Domains in cluster-2.
Conservation of domain fusions in Parkinson's disease targets and human hypothetical sequences in cluster-2.
| KEGG sequence | Hypothetical protein's Gi ID | Fusion region: Cd01492 | Fusion region: Cd01491 | Fusion region: UBA1 | Sequence identity |
|---|---|---|---|---|---|
| UBA1 | 12053109 | 13–162 | 13–162 | 54–162 | 52/160 = 32% |
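The identity figure in this table is a ratio of identical positions to compared positions (52 of 160). A minimal sketch of that arithmetic follows, assuming identity is counted over all alignment columns including gaps; the paper does not state its denominator convention, and the toy alignment below is invented for illustration.

```python
def identity_counts(aligned_a, aligned_b):
    """Return (identical columns, alignment length) for two aligned strings."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b)
        if a == b and a != "-"  # a gap never counts as a match
    )
    return matches, len(aligned_a)


def percent_identity(matches, length):
    """Whole-number percent identity, truncated rather than rounded."""
    return matches * 100 // length


# Toy alignment (hypothetical sequences, not taken from the paper).
toy_matches, toy_length = identity_counts("MQIF-VKT", "MQLFAVRT")

# Counts reported in the table for gi 12053109 against UBA1.
table_identity = percent_identity(52, 160)
print(toy_matches, toy_length, table_identity)
```

The exact ratio 52/160 is 32.5%, so the 32% in the table corresponds to truncation rather than rounding up.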
Figure 7. Pairwise sequence alignment of the fused domains between the ubiquitin sequence and its homolog hypothetical sequence (gi:12053109).
Figure 8. Domains in cluster-5.
Conservation of domain fusions in Parkinson's disease targets and human hypothetical sequences in cluster-5.
| KEGG sequence | Hypothetical protein's Gi ID | Fusion region: Cd00180 | Fusion region: Cd00192 | Fusion region: PINK1 | Sequence identity |
|---|---|---|---|---|---|
| PINK1 | 57997188 | 14–288 | 19–273 | 271–501 | 17 |
| | 5912043 | 58–340 | 63–337 | 271–501 | 8 |
| | 52545876 | 147–405 | 152–388 | 271–501 | 18 |
| | 1905906 | 53–303 | 57–299 | 271–501 | 19 |
| | 12053281 | 173–419 | 173–415 | 271–501 | 11 |
| | 57997093 | 199–438 | 193–433 | 271–501 | 19 |
| | 3510234 | 28–287 | 34–284 | 271–501 | 13 |
Figure 9. Multiple-sequence alignment of the fused domains in 7 unique hypothetical proteins and the PINK1 sequence.
Figure 10. Domains in cluster-8.
Conservation of domain fusions in Parkinson's disease targets and human hypothetical sequences in cluster-8.
| KEGG sequence | Hypothetical protein's Gi ID | Fusion region: Cd01488 | Fusion region: Cd01489 | Fusion region: Cd01490 | Fusion region: UBA1 | Sequence identity |
|---|---|---|---|---|---|---|
| UBA1 | 63994165 | 198–377 | 198–397 | 197–512 | 470–671 | 53 |
| | 34304594 | 1–168 | 1–169 | 1–303 | 470–671 | 48 |
| | 7018436 | 30–192 | 30–200 | 30–419 | 470–671 | 31 |
| | 7018418 | 71–368 | 71–331 | 71–342 | 470–671 | 25 |
Figure 11. Multiple-sequence alignment of the fused domains in 4 unique hypothetical proteins and their target sequence (UBA1).