Atul Kumar Upadhyay1,2, Ramanathan Sowdhamini1.
Abstract
Computational approaches to high-throughput data are gaining importance because of the explosion of sequences in the post-genomic era. This explosion of sequence data creates a large gap between sequence, structure, and function, since the experimental techniques used to determine structure and function are expensive, time-consuming, and laborious. There is therefore an urgent need to develop computational approaches for biological systems. Engagement of proteins in quaternary arrangements, such as domain swapping, might be relevant for higher compatibility of such genes under stress conditions. In this study, the capacity to engage in domain swapping was predicted from sequence information alone across the whole genome of holy basil (Ocimum tenuiflorum), which is well known as an anti-stress agent. Approximately one-fourth of the proteins of O tenuiflorum are predicted to undergo three-dimensional (3D)-domain swapping. Furthermore, functional annotation was carried out on all the predicted domain-swap sequences from O tenuiflorum and Arabidopsis thaliana to examine their distribution across Pfam protein families and gene ontology (GO) terms. These domain-swapped protein sequences are associated with many Pfam protein families and a wide range of GO annotation terms. A comparative analysis of domain-swap-predicted sequences in O tenuiflorum with gene products in A thaliana reveals that around 26% (2522 sequences) are close homologues across the 2 genomes. Functional annotation suggests that the predicted domain-swap sequences are involved in diverse molecular functions, such as gene regulation under abiotic stress conditions and adaptation to different environmental niches. 
Finally, the positively predicted sequences of A thaliana and O tenuiflorum were also examined for their presence in the stress regulome, as recorded in our STIFDB database, to check the involvement of these proteins in different abiotic stresses.
Keywords: Machine-learning approaches; Random Forest; genomes and proteome; protein sequences; three-dimensional-domain swapping
Year: 2019 PMID: 30692846 PMCID: PMC6335655 DOI: 10.1177/1177932218821362
Source DB: PubMed Journal: Bioinform Biol Insights ISSN: 1177-9322
Figure 1. Cartoon representation of 3D-domain swapping.
Prediction result of 3D-domain swapping by RF approach on different plant genomes.
| S. No. | Genomes | Total reviewed sequences | Positive prediction by RF |
|---|---|---|---|
| 1 | | 12 033 | 7694 (64%) |
| 2 | | 186 | 48 (26%) |
| 3 | | 400 | 208 (52%) |
| 4 | | 423 | 183 (43%) |
| 5 | | 36 768 | 9419 (25%) |
Abbreviation: RF, Random Forest.
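The percentages in the table above can be recomputed from the reported counts. The following is a minimal sketch, not part of the original study: the names `rf_counts` and `positive_percent` are illustrative assumptions, and rows are keyed by serial number because the genome names are not given in the table.

```python
# Counts copied from the table above, keyed by serial number (S. No.):
# (total reviewed sequences, positive predictions by RF).
# The dictionary and function names are hypothetical, chosen for illustration.
rf_counts = {
    1: (12033, 7694),
    2: (186, 48),
    3: (400, 208),
    4: (423, 183),
    5: (36768, 9419),
}


def positive_percent(total: int, positive: int) -> float:
    """Return positive predictions as a percentage of all reviewed sequences."""
    return round(100 * positive / total, 1)


# Percentage of positive predictions for each table row.
percentages = {
    row: positive_percent(total, positive)
    for row, (total, positive) in rf_counts.items()
}

for row, pct in percentages.items():
    total, positive = rf_counts[row]
    print(f"{row}: {positive} of {total} sequences ({pct}%)")
```

Under this calculation, rows 1 to 4 agree with the table after rounding to whole numbers, while row 5 comes out at about 25.6%, which the table reports as 25% and the abstract describes as approximately one-fourth.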
Figure 2. Distribution of homologues of predicted 3D-domain-swapped proteins of Tulsi in different plant genomes.
Figure 3. Venn diagram showing the number of common proteins to 3D-domain swapping and biotic and abiotic stresses in A thaliana.
Some of the plant protein crystal structures with 3D-domain swapping.
| S. No. | Monomer | Domain swap | Description |
|---|---|---|---|
| 1 | 1GNU | 1WZ3 | Ubiquitin-like |
| 2 | 1G6J | 1GJZ | Ubiquitin-like |
| 3 | 1KMZ | 1XY7 | Glyoxalase/bleomycin resistance |
| 4 | 1GQ9 | 1W77 | Nucleotide-diphospho-sugar transferases |
| 5 | 1KL7 | 1E5X | Threonine synthase |
| 6 | 1X91 | 1X8Z | Plant invertase/pectin methylesterase |
| 7 | – | 1Z84 | Galactose-1-phosphate uridyltransferase-like |
| 8 | – | 1MLV | Ribulose-1,5 bisphosphate |
| 9 | 2A5V | 1EKJ | Beta-carbonic anhydrase |
| 10 | – | 1L3A | Plant transcriptional regulator pbf-2 |
| 11 | – | 3A8R | Respiratory burst NADPH oxidase |
| 12 | – | 1Z7W | Cysteine synthase |
| 13 | – | 2Q48 | Protein AT5G48480 |
| 14 | – | 2NTX | EMB\|CAB41934.1 |
| 15 | – | 2AAO | Calcium-dependent protein kinase |
| 16 | – | 2Q4H | Nucleocapsid protein |
| 17 | – | 2O66 | PII protein |
| 18 | – | 1Z7Y | Cysteine synthase |
| 19 | – | 2PC5 | DUTP pyrophosphate-like protein |
| 20 | – | 2P90 | DUTP pyrophosphate-like protein |
| 21 | – | 1MLV | Serine/threonine-protein kinase 10 |
| 22 | – | 2BHW | Chlorophyll a-b binding protein AB80 |