Bandana Kumari, Ravindra Kumar, Vipin Chauhan, Manish Kumar.
Abstract
BACKGROUND: In both prokaryotic and eukaryotic proteins, repeated occurrences of a single amino acid or a small group of amino acids are found. These regions are termed low complexity regions (LCRs). It has been observed that amino acid bias in LCRs is directly linked to their uncontrolled expansion and amyloid formation. However, a comparative analysis of the behavior of LCRs based on their constituent amino acids and their association with amyloidogenic propensity is not available.
Keywords: Amino acid runs; Amyloids; Functional annotation; Low complexity regions
Year: 2018 PMID: 30397544 PMCID: PMC6214233 DOI: 10.7717/peerj.5823
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Distribution of low complexity regions and amyloids predicted in them.
| | Ala | Cys | Ile | Leu | Met | Val | Trp | Phe | Arg | Lys | Asn | Asp | Glu | Gln | Gly | His | Pro | Ser | Thr | Tyr |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| LCRs | 273 | 2 | 11 | 142 | 1 | 13 | 0 | 11 | 48 | 102 | 352 | 108 | 227 | 391 | 181 | 49 | 205 | 330 | 128 | 7 |
| Proteins | 265 | 2 | 11 | 141 | 1 | 13 | 0 | 10 | 47 | 100 | 315 | 108 | 225 | 351 | 177 | 49 | 195 | 325 | 124 | 7 |
| Residues | 2005 | 12 | 55 | 849 | 6 | 61 | 0 | 78 | 272 | 627 | 5967 | 851 | 1721 | 4789 | 1348 | 364 | 1481 | 2435 | 996 | 35 |
| Amyloid-forming proteins | 30 (123) | 1 (1) | 11 (11) | 42 (88) | 0 (0) | 5 (7) | 0 (0) | 10 (10) | 0 (13) | 1 (41) | 54 (253) | 1 (47) | 4 (80) | 39 (244) | 3 (87) | 1 (20) | 0 (93) | 15 (152) | 6 (71) | 7 (7) |
| Amyloid-forming residues | 76 (1092) | 1 (1) | 55 (202) | 171 (1580) | 0 (0) | 18 (60) | 0 (0) | 78 (121) | 0 (118) | 1 (1415) | 170 (11051) | 1 (695) | 4 (1098) | 90 (6891) | 3 (1392) | 1 (194) | 0 (1261) | 28 (2716) | 9 (1983) | 35 (111) |

| | Positively charged | Negatively charged | Hydrophobic | Polar |
|---|---|---|---|---|
| LCRs | 211 | 629 | 709 | 2756 |
| Proteins | 203 | 614 | 698 | 2390 |
| Residues | 1360 | 5534 | 6164 | 33000 |
| Amyloid-forming proteins | 1 (74) | 8 (224) | 506 (590) | 174 (1189) |
| Amyloid-forming residues | 1 (2140) | 9 (3011) | 4302 (10239) | 437 (28311) |
Note:
Values in parentheses are the total number of amyloid-forming proteins/residues within LCRs in the total dataset. The categorization of amino acids on the basis of physico-chemical properties is as follows: Positively charged: Arg and/or Lys; Negatively charged: Glu and/or Asp; Hydrophobic: any of Cys, Ile, Leu, Met, Phe, Trp, Val or their combination; Polar: any of Arg, Lys, Asn, Gln, Asp, Glu.
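The grouping rule in the note can be illustrated with a minimal sketch. It assumes that an LCR is given as a string of one-letter amino acid codes and that it belongs to a group when every residue in it comes from that group's set. The function name `classify_lcr` is hypothetical, and because the Polar set overlaps the two charged sets, the sketch simply reports every matching group; how the authors resolved such overlaps is not stated here.

```python
# Residue groups as listed in the table note, written in one-letter codes.
GROUPS = {
    "positively charged": set("RK"),   # Arg, Lys
    "negatively charged": set("ED"),   # Glu, Asp
    "hydrophobic": set("CILMFWV"),     # Cys, Ile, Leu, Met, Phe, Trp, Val
    "polar": set("RKNQDE"),            # Arg, Lys, Asn, Gln, Asp, Glu
}


def classify_lcr(segment: str) -> list[str]:
    """Return every group whose residue set contains all residues of the segment.

    Assumption (not from the source): a segment matches a group only when
    all of its residues belong to that group, and overlapping groups are
    all reported.
    """
    residues = set(segment.upper())
    if not residues:
        return []
    return [name for name, members in GROUPS.items() if residues <= members]


if __name__ == "__main__":
    for seg in ["KKRKK", "LLVVIL", "QQNQ", "AAAA"]:
        print(seg, classify_lcr(seg))
```

Under these assumptions a Lys/Arg segment falls into both the positively charged and polar groups, while a poly-Ala segment matches none of the four groups, since Ala is not listed in any of them.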
Figure 1. Amino acid composition of experimentally characterized amyloids found in low complexity regions (Data source: AmyPro).
Figure 2. Length-wise distribution of low complexity regions in protein sequences which are predicted to form amyloids by Waltz.
Figure 3. Top 5 enriched (A) molecular functions, (B) biological processes, and (C) cellular components in homopolymeric repeats.
In the figure, white indicates the presence of GO terms and red indicates their absence. The description of GO terms is as follows:
For Molecular Functions: GO:0000166, Nucleotide binding; GO:0001882, Nucleoside binding; GO:0001883, Purine nucleoside binding; GO:0003677, DNA binding; GO:0003700, Transcription factor activity; GO:0008270, Zinc ion binding; GO:0017076, Purine nucleotide binding; GO:0030528, Transcription regulator activity; GO:0030554, Adenyl nucleotide binding; GO:0043167, Ion binding; GO:0043169, Cation binding; GO:0043565, Sequence-specific DNA binding; GO:0046872, Metal ion binding; GO:0046914, Transition metal ion binding.
For Biological Processes: GO:0000270, Peptidoglycan metabolic process; GO:0006259, DNA metabolic process; GO:0006350, Transcription; GO:0006351, Transcription, DNA-dependent; GO:0006355, Regulation of Transcription, DNA-dependent; GO:0006357, Regulation of Transcription from RNA polymerase II promoter; GO:0006396, RNA processing; GO:0006412, Translation; GO:0006418, tRNA aminoacylation for protein translation; GO:0006644, Phospholipid metabolic process; GO:0006796, Phosphate metabolic process; GO:0007010, Cytoskeleton organization; GO:0007049, Cell cycle; GO:0007166, Cell surface receptor linked signal transduction; GO:0008104, Protein localization; GO:0008610, Lipid biosynthetic process; GO:0008654, Phospholipid biosynthetic process; GO:0019637, Organophosphate metabolic process; GO:0022604, Regulation of cell morphogenesis; GO:0030203, Glycosaminoglycan metabolic process; GO:0032774, RNA biosynthetic process; GO:0033554, Cellular response to stress; GO:0034660, ncRNA metabolic process; GO:0043038, Amino acid activation; GO:0043039, tRNA aminoacylation; GO:0045449, Regulation of Transcription; GO:0051252, Regulation of RNA metabolic process; GO:0051276, Chromosome organization; GO:0051301, Cell division.
For Cellular Components: GO:0000267, Cell fraction; GO:0005654, Nucleoplasm; GO:0005694, Chromosomal part; GO:0005856, Cytoskeleton; GO:0005886, Plasma membrane; GO:0009536, Plastid; GO:0016021, Integral to membrane; GO:0031090, Organelle membrane; GO:0031224, Intrinsic to membrane; GO:0031226, Intrinsic to plasma membrane; GO:0031974, Membrane-enclosed lumen; GO:0031981, Nuclear lumen; GO:0043228, Non-membrane-bounded organelle; GO:0043232, Intracellular non-membrane-bounded organelle; GO:0043233, Organelle lumen; GO:0044459, Plasma membrane part; GO:0070013, Intracellular organelle lumen.
Figure 4. Top 5 enriched (A) molecular functions, (B) biological processes, and (C) cellular components in LCRs containing amino acids of similar physico-chemical properties.
In the figure, white indicates the presence of GO terms and red indicates their absence. The description of GO terms is as follows:
For Molecular Functions: GO:0000166, Nucleotide binding; GO:0003677, DNA binding; GO:0003700, Transcription factor activity, sequence-specific DNA binding; GO:0004252, Serine-type endopeptidase activity; GO:0004674, Protein serine/threonine kinase activity; GO:0004872, Receptor activity; GO:0004930, G-protein coupled receptor activity; GO:0005509, Calcium ion binding; GO:0005515, Protein binding; GO:0005524, ATP binding; GO:0008270, Zinc ion binding; GO:0016787, Hydrolase activity; GO:0044822, poly(A) RNA binding; GO:0046872, Metal ion binding.
For Biological Processes: GO:0000122, Negative regulation of transcription from RNA polymerase II promoter; GO:0006281, DNA repair; GO:0006351, Transcription, DNA-templated; GO:0006355, Regulation of transcription, DNA-templated; GO:0006468, Protein phosphorylation; GO:0006508, Proteolysis; GO:0006810, Transport; GO:0007165, Signal transduction; GO:0007275, Multicellular organism development; GO:0015031, Protein transport; GO:0016310, Phosphorylation; GO:0045087, Innate immune response; GO:0045944, Positive regulation of transcription from RNA polymerase II promoter; GO:0055085, Transmembrane transport.
For Cellular Components: GO:0005576, Extracellular region; GO:0005622, Intracellular; GO:0005623, Cell; GO:0005634, Nucleus; GO:0005654, Nucleoplasm; GO:0005694, Chromosome; GO:0005730, Nucleolus; GO:0005737, Cytoplasm; GO:0005886, Plasma membrane; GO:0005887, Integral component of plasma membrane; GO:0016020, Membrane; GO:0016021, Integral component of membrane; GO:0030054, Cell junction.
Pathways and functional classes of human proteome with predicted aggregation tendency within low complexity regions.
| Category | Polar | Hydrophobic | ||
|---|---|---|---|---|
| 5HT2 type receptor mediated signaling pathway->SNARE Complex | – | – | – | |
| 5HT3 type receptor mediated signaling pathway->SNARE Complex | – | – | – | |
| 5HT4 type receptor mediated signaling pathway->SNARE Complex | – | – | – | |
| Adrenaline and noradrenaline biosynthesis->amine translocator | – | – | – | |
| Alzheimer disease-presenilin pathway->Matrix metalloprotease | – | – | – | |
| Angiogenesis->Phosphatidylinositol 3-kinase | – | – | – | |
| Apoptosis signaling pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| Axon guidance mediated by netrin->Phosphatidylinositol 3-kinase | – | – | – | |
| B cell activation->Phosphatidylinositol 3-kinase | – | – | – | |
| Beta1 adrenergic receptor signaling pathway->SNARE Complex | – | – | – | |
| Beta2 adrenergic receptor signaling pathway->SNARE Complex | – | – | – | |
| Beta3 adrenergic receptor signaling pathway->SNARE Complex | – | – | – | |
| Blood coagulation->Plasmin | – | – | – | |
| Blood coagulation->Plasminogen | – | – | – | |
| Cadherin signaling pathway->Cadherin | – | – | – | |
| CCKR signaling map->MMP9 | – | – | – | |
| CCKR signaling map->p110 | – | – | – | |
| Cortocotropin releasing factor receptor signaling pathway->SNARE Complex | – | – | – | |
| Dopamine receptor mediated signaling pathway->SNARE Complex | – | – | – | |
| EGF receptor signaling pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| Endothelin signaling pathway->Adenylate cyclase | – | – | – | |
| Endothelin signaling pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| FGF signaling pathway->fibroblast growth factor | – | – | – | |
| FGF signaling pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| GABA-B receptor II signaling->adenylate cyclase | – | – | – | |
| Heterotrimeric G-protein signaling pathway-Gi alpha and Gs alpha mediated pathway->Adenylyl cyclase | – | – | – | |
| Heterotrimeric G-protein signaling pathway-Gi alpha and Gs alpha mediated pathway->Gs-protein coupled receptor | – | – | – | |
| Heterotrimeric G-protein signaling pathway-Gi alpha and Gs alpha mediated pathway->Gi protein coupled receptor | – | – | – | |
| Heterotrimeric G-protein signaling pathway-Gq alpha and Go alpha mediated pathway->Go-protein coupled receptor | – | – | – | |
| Huntington disease->alpha-Adaptin | – | – | – | |
| Hypoxia response via HIF activation->Phosphatidylinositol 3-kinase | – | – | – | |
| Inflammation mediated by chemokine and cytokine signaling pathway->Chemokine receptor | – | – | – | |
| Inflammation mediated by chemokine and cytokine signaling pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| Insulin/IGF pathway-protein kinase B signaling cascade->Phosphatidylinositol 3-kinase | – | – | – | |
| Integrin signaling pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| Interleukin signaling pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| Ionotropic glutamate receptor pathway->N-ethylmaleimide-sensitive factor attachment protein receptor | – | – | – | |
| Metabotropic glutamate receptor group II pathway->N-ethylmaleimide-sensitive factor attachment protein receptor | – | – | – | |
| Metabotropic glutamate receptor group III pathway->N-ethylmaleimide-sensitive factor attachment protein receptor | – | – | – | |
| Muscarinic acetylcholine receptor 1 and 3 signaling pathway->N-ethylmaleimide-sensitive factor attachment protein receptor | – | – | – | |
| Muscarinic acetylcholine receptor 2 and 4 signaling pathway->N-ethylmaleimide-sensitive factor attachment protein receptor | – | – | – | |
| Nicotinic acetylcholine receptor signaling pathway->N-ethylmaleimide-sensitive factor attachment protein receptor | – | – | – | |
| Opioid proenkephalin pathway->SNARE Complex | – | – | – | |
| Opioid proopiomelanocortin pathway->SNARE Complex | – | – | – | |
| Opioid prodynorphin pathway->SNARE Complex | – | – | – | |
| Oxytocin receptor mediated signaling pathway->SNARE Complex | – | – | – | |
| p53 pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| PDGF signaling pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| PI3 kinase pathway->p110 | – | – | – | |
| PI3 kinase pathway->Activated p110 | – | – | – | |
| Plasminogen activating cascade->Plasmin | – | – | – | |
| Plasminogen activating cascade->Plasminogen | – | – | – | |
| Plasminogen activating cascade->pro-matrix metalloprotease 9 | – | – | – | |
| Ras Pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| T cell activation->Phosphatidylinositol 3-kinase | – | – | – | |
| TGF-beta signaling pathway->Transforming growth factor beta | – | – | – | |
| Thyrotropin-releasing hormone receptor signaling pathway->SNARE Complex | – | – | – | |
| VEGF signaling pathway->Phosphatidylinositol 3-kinase | – | – | – | |
| Wnt signaling pathway->Cadherin | – | – | – | |
| Wnt signaling pathway->secreted frizzled-related protein | – | – | – | |
| Apolipoprotein (PC00052) | – | – | – | |
| Aspartic protease (PC00053) | – | – | – | |
| Cation transporter (PC00068) | – | – | – | |
| Cell adhesion molecule (PC00069) | – | – | – | |
| Chemokine (PC00074) | – | – | ||
| DNA binding protein (PC00009) | – | – | – | |
| Enzyme modulator (PC00095) | – | – | – | |
| Glycosyltransferase (PC00111) | – | – | – | |
| G-protein coupled receptor (PC00021) | – | – | – | |
| G-protein modulator (PC00022) | – | – | – | |
| Growth factor (PC00112) | – | – | – | |
| Homeodomain transcription factor (PC00119) | – | – | – | |
| Ion channel (PC00133) | – | – | – | |
| Immunoglobulin receptor superfamily (PC00124) | – | – | – | |
| Intermediate filament binding protein (PC00130) | – | – | – | |
| Kinase (PC00137) | – | – | – | |
| Membrane-bound signaling molecule (PC00152) | – | – | – | |
| Metalloprotease (PC00153) | – | – | – | |
| Non-receptor serine/threonine protein kinase (PC00167) | – | – | – | |
| Protease (PC00190) | – | – | – | |
| Protease inhibitor (PC00191) | – | – | – | |
| Serine protease (PC00203) | – | – | – | |
| Signaling molecule (PC00207) | – | – | – | |
| Transmembrane receptor regulatory/adaptor protein (PC00226) | – | – | – | |
| Transporter (PC00227) | – | – | – | |
| Type I cytokine receptor (PC00231) | – | – | ||
| Voltage-gated sodium channel (PC00243) | – | – | – | |
| Winged helix/forkhead transcription factor (PC00246) | – | – | – | |