Intrinsically disordered proteins of viruses: Involvement in the mechanism of cell regulation and pathogenesis.

Pushpendra Mani Mishra, Navneet Chandra Verma, Chethana Rao, Vladimir N Uversky, Chayan Kanti Nandi.

Abstract

Intrinsically disordered proteins (IDPs) are inherently flexible and are distinguished from other proteins by their lack of any fixed structure. This dynamic behavior has earned IDPs the name "Dancing Proteins." The exploration of these dancing proteins in viruses has only just started, and crucial details, such as the correlation between rapid evolution, high mutation rate, and the accumulation of disordered content in viral proteomes, are only partially understood. To gain a complete understanding of this correlation, the complexity of virus-mediated cell hijacking and pathogenesis in the host organism needs to be deciphered. Furthermore, to tackle virus-mediated diseases, it is necessary to identify specific patterns within viral and host IDPs, such as aggregation and molecular recognition features (MoRFs), and their association with virulence, host range, and the rate of evolution of viruses. This book chapter summarizes these details and suggests novel opportunities for further research on IDPs in viruses.
© 2020 Elsevier Inc. All rights reserved.

Keywords:  Aggregation; Cell hijacking; Dancing proteins; IDPs; MoRFs; Viruses

Year:  2020        PMID: 32828463      PMCID: PMC7129803          DOI: 10.1016/bs.pmbts.2020.03.001

Source DB:  PubMed          Journal:  Prog Mol Biol Transl Sci        ISSN: 1877-1173            Impact factor:   3.622


This book chapter, entitled “Intrinsically disordered proteins of viruses: involvement in the mechanism of cell regulation and pathogenesis,” discusses in detail the intrinsically disordered protein (IDP)-mediated functional mechanisms, pathogenesis, structural regulation, and regulation of the host cell by the complex viral proteome. To give a complete picture of IDPs and their role in viruses, the chapter starts with a brief introduction to IDPs, their atypical properties, and the instrumental and computational techniques used to characterize them. Next, the chapter describes the IDP-related aspects of viruses. Possible modes of viral IDP molecular mimicry and of host IDP-mediated regulation of host cells are discussed, and a diagrammatic model is proposed. Subsequently, the origin of viruses and their special properties are described, and the importance of viral structural, non-structural, and other proteins is emphasized. Furthermore, the prevalence of IDPs in viruses and its comparison with the three domains of life (Archaea, Bacteria, and Eukarya) are discussed in detail. 
The last portion of this chapter explains various IDP-associated patterns in viruses and their relation to host range, pathogenicity, and protein aggregation. Next, the structural and functional importance of IDPs in different viruses (bacteriophages, plant viruses, and animal viruses) is discussed. The examples of these viruses and the descriptions of their IDP-associated mechanisms are taken from the referenced publications. Lastly, the chapter summarizes the content covered and discusses the future outlook for studying IDP prevalence, distribution, and disorder-related mechanisms in viral proteomes. We hope that this chapter will help readers grasp the concept of IDPs and the IDP perspective on viruses, and that it will spawn novel ideas for deciphering the complexity of viral pathogenesis and for drug discovery. For instance, the relationship between IDP prevalence and patterns of pathogenesis and host range has been explored and demonstrated in a few viruses; however, other related patterns have not been explored completely. Additionally, the mechanisms of cell regulation by the disordered viral proteome are not completely understood. The proposed model will form a basis for further research and understanding. By the authors.

A general introduction to intrinsically disordered proteins (IDPs) and their major properties

Intrinsically disordered proteins (IDPs)

The structure-function paradigm, widely accepted for more than a century, holds that the biological functions of proteins are linked to their rigid three-dimensional (3D) structures. The normal functioning of most globular proteins (e.g., enzymes) requires an orderly arrangement of the functional groups of amino acids within the protein's unique 3D structure to facilitate the catalysis of chemical reactions or other related functions. However, recent research has demonstrated that a large fraction of the genome-encoded proteins of many organisms lack well-defined 3D structures yet still play important roles in cellular function. Such proteins are generally known as intrinsically disordered proteins (IDPs) [2, 3, 4]. They also have many alternative names, such as natively denatured, natively unfolded, intrinsically unstructured, natively disordered, dancing proteins, protein clouds [10, 11], 4D, malleable [13, 14, 15], chameleon, vulnerable, intrinsically disordered, intrinsically unfolded, intrinsically denatured, flexible, mobile, pliable, rheomorphic, and partially folded proteins. These names reflect the properties of IDPs observed in different experiments conducted at different times. Computational analysis reveals that more than one-third of eukaryotic proteins harbor intrinsically disordered regions (IDRs) longer than 30 residues [24, 25, 26, 27, 28, 29, 30, 31]. In isolation in solution, IDPs lack a unique 3D structure, either in part or completely. The high abundance of IDPs is associated with their functional importance in many crucial cellular processes, such as signaling, recognition, and regulation, which rely on high-specificity, low-affinity interactions and on binding to multiple partners. Disorder-based signaling interactions can be many-to-one or one-to-many. 
The functions of IDPs are tuned by various post-translational modifications (PTMs), alternative splicing, and induced folding. The high prevalence of IDPs in various diseases suggests that the root cause of these diseases is not only protein misfolding but also mis-signaling and misidentification. The peculiar behavior of IDPs has drawn attention to them as drug targets for modulating protein-protein interactions.

Properties of IDPs

Amino acid sequence defines disorderliness

Although IDPs are biologically active molecules, they tend to adopt extended, mobile, dynamic or collapsed conformational ensembles at the level of either tertiary or secondary structure. A comparison of the amino acid sequences of IDPs with those of ordered proteins demonstrates a noticeable enrichment in disorder-promoting amino acids, such as Ala, Arg, Gly, Gln, Ser, Glu, Lys, and Pro, paralleled by a significant depletion in order-promoting amino acids: Ile, Leu, Val, Trp, Tyr, Phe, Cys, and Asn. In addition to amino acid composition, several other sequence attributes are involved: 14 Å contact number, coordination number, hydropathy, the combined content of Cys + Phe + Tyr + Trp, volume, the combined content of Arg + Glu + Ser + Pro, bulkiness, net charge, and β-sheet propensity. Together, these provide a reliable basis for differentiating disordered proteins from other proteins [18, 33, 34, 35, 36].
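
The compositional contrast described above can be illustrated with a short calculation. The following Python sketch is not a disorder predictor; it only counts the two residue classes named in the text. The function name composition_profile and the two toy sequences are hypothetical and serve only as an illustration.

```python
from collections import Counter

# Residue classes as listed in the text (one-letter codes).
DISORDER_PROMOTING = set("ARGQSEKP")  # Ala, Arg, Gly, Gln, Ser, Glu, Lys, Pro
ORDER_PROMOTING = set("ILVWYFCN")     # Ile, Leu, Val, Trp, Tyr, Phe, Cys, Asn


def composition_profile(sequence: str) -> dict:
    """Return the fractions of disorder- and order-promoting residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    counts = Counter(seq)
    n = len(seq)
    disorder = sum(counts[aa] for aa in DISORDER_PROMOTING) / n
    order = sum(counts[aa] for aa in ORDER_PROMOTING) / n
    return {"disorder_promoting": disorder, "order_promoting": order}


# Hypothetical toy sequences, for illustration only.
idr_like = "MSEEPKKSGGSPEQKRAAPS"
globular_like = "MVLIVCFWYNILVVCFYWIL"
print(composition_profile(idr_like))
print(composition_profile(globular_like))
```

A sequence enriched in the first class and depleted in the second is only a hint of disorder; the other attributes listed above (hydropathy, net charge, and so on) would be needed for a more reliable assessment.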

Intrinsic disorder and binding promiscuity

One important property of IDPs is their binding promiscuity: they interact with multiple partners and can act as highly connected nodes, or hubs, within protein-protein interaction (PPI) networks. Hubs are vital for the normal functioning and stability of PPI networks in any organism, and it has been shown that deletion of a hub protein can be lethal for the organism [37, 38, 39, 40, 41, 42, 43, 44, 45]. Illustrative examples of disordered hub proteins, which bind roughly 10–100 partners, are p21, p27, p53, BRCA1, XPA, α-synuclein, and estrogen receptor. IDRs within disordered hub proteins occur in at least one of two functional forms. The first is a disordered binding site that interacts with a specific partner and adopts an ordered conformation upon interaction. The second is a flexible linker that connects two ordered domains and allows their unrestricted movement.
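
The notion of a hub as a highly connected node can be made concrete by counting distinct partners in an interaction list. The sketch below is a minimal illustration: the cutoff of 10 partners is an assumption taken from the lower end of the range quoted above, and the function name find_hubs and the toy network are hypothetical.

```python
from collections import defaultdict


def find_hubs(interactions, min_partners=10):
    """Return {protein: partner_count} for proteins with at least
    min_partners distinct binding partners."""
    partners = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # ignore self-interactions
        partners[a].add(b)
        partners[b].add(a)
    return {p: len(s) for p, s in partners.items() if len(s) >= min_partners}


# Hypothetical toy network: "HUB" binds 12 partners, "X" binds only two.
edges = [("HUB", f"P{i}") for i in range(12)] + [("X", "P0"), ("X", "P1")]
print(find_hubs(edges))
```

In a real PPI dataset the cutoff would need to be chosen with the coverage and noise of the data in mind.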

Properties of charge

Charged amino acid residues help to establish the structure and function of proteins. A high content of charged residues is a conspicuous feature of highly disordered proteins (native pre-molten globules (PMGs) and native coils). A high net charge is important for the extended conformation of IDPs: it has been observed repeatedly that, in an aqueous environment, sequences that lack certain hydrophobic residues and are rich in polar uncharged amino acids form heterogeneous ensembles of collapsed structures [50, 51, 52, 53, 54, 55, 56]. Analysis of a number of highly charged polypeptides revealed that the intrinsic preference of a polypeptide backbone for collapsed structures depends on its charge content.
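
Charge content can be estimated directly from sequence. The following sketch counts formal charges under the simplifying assumption that Lys and Arg carry +1, Asp and Glu carry -1, and His is neutral near pH 7; the function name charge_summary and the example sequences are hypothetical.

```python
POSITIVE = set("KR")  # Lys, Arg; His treated as neutral near pH 7 (assumption)
NEGATIVE = set("DE")  # Asp, Glu


def charge_summary(sequence: str) -> dict:
    """Return net charge, net charge per residue, and fraction of charged residues."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    n = len(seq)
    pos = sum(aa in POSITIVE for aa in seq)
    neg = sum(aa in NEGATIVE for aa in seq)
    return {
        "net_charge": pos - neg,
        "net_charge_per_residue": (pos - neg) / n,
        "fraction_charged": (pos + neg) / n,
    }


# Hypothetical sequences: one highly charged with a strong net charge,
# one equally rich in charges but net-neutral.
print(charge_summary("EEEEDDKGSP"))
print(charge_summary("KKKKEEDDGS"))
```

The two examples show why both quantities are reported: a sequence can be rich in charged residues and still have zero net charge.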

Human disease association

Analysis of various databases of human disease-related proteins, together with other observations, has established that IDPs contribute to the pathogenesis of many human ailments and act as common players across diseases [8, 57]. A few examples of diseases in which IDPs/IDRs are involved are listed below. Cancer: intrinsic disorder has been observed in many cancer-associated proteins, such as p53, p57kip2, c-Fos, Bcl-2 and Bcl-XL, thyroid cancer-associated protein TC-1, and protein components of cancer-causing viruses. Down's syndrome: non-filamentous deposits of intrinsically disordered amyloid-β (Aβ). Alzheimer's disease: the IDPs associated with this disease are found in deposits of Aβ, tau, and the α-synuclein NAC fragment [64, 65, 66, 67]. Other diseases in which intrinsic disorder in protein components has been reported include the family of polyQ diseases; a variant of Alzheimer's disease, dementia with Lewy bodies, diffuse Lewy body disease, Parkinson's disease, Hallervorden–Spatz disease, and multiple system atrophy; prion disease; and argyrophilic grain disease, myotonic dystrophy, motor neuron disease with neurofibrillary tangles, subacute sclerosing panencephalitis, and Niemann-Pick disease type C. Intrinsic disorder has also been reported in protein components of viruses that cause various human diseases, such as AIDS and cutaneous diseases.

Ensemble structures of IDPs

IDPs do not have specific fixed structures; they exist as dynamic ensembles, much like clouds of protein conformations. In these protein clouds, atomic positions and backbone Ramachandran angles have no fixed values and vary significantly over time. Despite their dynamic nature, these protein clouds can be represented by a fairly limited number of low-energy conformations (although still significantly more than the single low-energy state typical of ordered proteins) [10, 72, 73]. Structural details of IDPs are necessary to understand their regulatory mechanisms and their involvement in cellular functions, and various methods have been developed for ensemble modeling of IDPs [74, 75, 76].

Hydration property

Because of differences in structure and structure-associated properties, ordered and disordered proteins differ in their degree of hydration. The degree of hydration is significantly higher for IDPs than for globular proteins of similar size, and it also differs between partially and fully disordered proteins [77, 78, 79]. In addition to retaining a large amount of water, IDPs have a high propensity to bind charged solute ions. Both properties play an important protective role in biological systems. For example, under water-stressed conditions, D. radiodurans is able to protect its enzyme Nudix hydrolase from denaturation owing to these properties of the IDRs of this protein. Several plant and free-living insect species also protect themselves by exploiting the capacity of IDPs and IDRs for extensive hydration and binding of solute ions.

Property of induced folding

Many IDPs undergo (at least partial) disorder-to-order transitions upon binding to specific partners. The free energy required for the transition comes from the interface contacts, which results in an association of low net free energy despite a highly specific interaction [18, 38, 39, 81, 82]. In IDPs/IDRs, this coupling of high specificity with low affinity seems to ensure both specific binding and the reversibility needed to complete a signaling cascade. IDPs/IDRs can change their shapes to readily bind multiple different partners. It has also been shown that, in their unbound conformational ensembles, IDPs/IDRs have a preference for the structure they are most likely to adopt after binding [81, 83, 84].

Interactability of IDPs

Interactions of IDPs with their partners are characterized by a diverse range of binding modes, which give rise to many unusually shaped complexes. Some of these complexes are relatively static, so their structures can be determined by X-ray crystallography. The most extensively studied binding modes of IDPs are molecular recognition features (MoRFs). MoRFs are short, interaction-prone, intrinsically disordered protein segments. These regions have an intrinsic propensity for order, but it is not strong enough to ensure their folding in the unbound state. Upon binding to specific partners, however, MoRFs undergo a disorder-to-order transition. Such regions are chiefly involved in molecular recognition. MoRFs are classified according to their structures in the bound state: α-helix-forming α-MoRFs, β-strand-forming β-MoRFs, irregular ι-MoRFs that become ordered without any regular secondary structure, and complex MoRFs that contain two or more types of secondary structure [85, 86]. In addition to MoRFs, other known binding modes are Pullers, Penetrators, Flexible Wrappers [89, 90, 91, 92], Connectors and Armature [93, 94, 95, 96, 97], Huggers [98, 99, 100], Stackers or β-Arcs, Intertwined Strings [102, 103, 104], Long Cylindrical Containers, Tweezers and Forceps, Grabbers, Tentacles, and Chameleons [16, 109, 110, 111, 112].
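
The MoRF classes described above can be expressed as a simple decision rule on the bound-state secondary structure. The sketch below assumes a DSSP-style per-residue string in which H marks α-helix and E marks β-strand, and it treats every other character as irregular. It is a deliberately simplified rule based on the mere presence of helix or strand, not the procedure used in the cited studies; the function name classify_morf is hypothetical.

```python
def classify_morf(bound_ss: str) -> str:
    """Classify a MoRF from its bound-state secondary structure string.

    Assumes DSSP-style codes: 'H' = alpha-helix, 'E' = beta-strand;
    any other character is treated as irregular structure.
    """
    ss = bound_ss.upper()
    if not ss:
        raise ValueError("secondary structure string must not be empty")
    has_helix = "H" in ss
    has_strand = "E" in ss
    if has_helix and has_strand:
        return "complex MoRF"
    if has_helix:
        return "alpha-MoRF"
    if has_strand:
        return "beta-MoRF"
    return "iota-MoRF"


# Hypothetical bound-state assignments for four short segments.
for ss in ("CCHHHHHHHHCC", "CEEEEEC", "CCCSTTCCC", "HHHHCCEEEE"):
    print(ss, "->", classify_morf(ss))
```

A more careful rule would weigh the fraction of residues in each state rather than their presence alone.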

Roles of IDPs in protein interaction and PPI networks

IDPs/IDRs contribute to binding diversity in three different ways: first, they may serve as the structural basis for hub protein promiscuity; second, they may bind to structured hub proteins; and third, an IDR can act as a flexible linker between functional domains and facilitate binding diversity through a linker-enabled mechanism. Researchers have found a vast range of functions for IDPs/IDRs. A few examples illustrate the types of biological activity carried out by IDPs/IDRs: (1) IDPs contain sites for various post-translational modifications (PTMs), such as phosphorylation, methylation, glycosylation, ADP-ribosylation, and acetylation; (2) IDRs can provide an entropic spring (rubber-like) property; (3) IDPs contain autoinhibitory domains; (4) IDPs/IDRs possess binding sites for DNA, rRNA, mRNA, tRNA, metal ions, and other proteins; (5) IDRs include regulatory protease digestion sites; (6) nuclear localization signals are located within IDRs; (7) IDRs provide flexible linkers between structured domains; and (8) IDPs, such as p21 and p27, mediate cell regulation. Fig. 1 provides details of the involvement of IDPs in crucial cellular functions and processes.
Fig. 1

Involvement of IDPs in various cellular processes [114, 115].


Predictors of intrinsic disorder

The compositional differences between ordered proteins and IDPs facilitated the development of various disorder predictors. The earliest predictors were based on amino acid composition. Later predictors were built on basic physical principles and machine learning algorithms that use characteristic features of IDPs/IDRs, such as net charge, hydrophobicity, and other sequence features. As of 2009, more than 50 predictors of intrinsic disorder had been developed and published, and this number has likely doubled since then. Predictors of intrinsic disorder are likely to improve further if the appropriate sequence information is encoded into the prediction algorithm. Examples of common predictors include various members of the PONDR family, DISOPRED, FoldIndex, IUPred, DisEMBL, DISOPRED2, and RONN [121, 122], to name a few.
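
Whatever predictor is used, its per-residue output is typically post-processed into disordered regions and an overall disorder percentage. The sketch below shows one way to do this, assuming the predictor returns one score per residue and that scores at or above 0.5 indicate disorder; both the cutoff and the function names are assumptions for illustration and are not tied to any specific tool named above. The 30-residue length criterion follows the definition of long IDRs quoted earlier in the chapter.

```python
def disordered_regions(scores, threshold=0.5, longer_than=30):
    """Return (start, end) pairs (0-based, end exclusive) for runs of
    consecutive residues scoring >= threshold that are longer than
    `longer_than` residues."""
    regions = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # a new disordered run begins here
        elif start is not None:
            if i - start > longer_than:
                regions.append((start, i))
            start = None
    # close a run that extends to the end of the sequence
    if start is not None and len(scores) - start > longer_than:
        regions.append((start, len(scores)))
    return regions


def percent_disorder(scores, threshold=0.5):
    """Percentage of residues scoring at or above the threshold."""
    if not scores:
        raise ValueError("scores must not be empty")
    return 100.0 * sum(s >= threshold for s in scores) / len(scores)


# Hypothetical per-residue scores: a 35-residue disordered stretch,
# a 10-residue ordered stretch, and a short 5-residue disordered tail.
scores = [0.9] * 35 + [0.1] * 10 + [0.8] * 5
print(disordered_regions(scores))
print(percent_disorder(scores))
```

Only the first stretch is reported as a long IDR, while both disordered stretches count toward the overall percentage.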

Structural assessment of IDPs through biophysical techniques

IDPs can exist globally in three functional conformational states, depending on the environment and the content of residual structure. In order of increasing disorder, these are the molten globule (MG), pre-molten globule (PMG), and random-coil-like (RC-like) states. IDPs can therefore adopt either extended conformations (RC and PMG) or remain globally collapsed (MG). Conformational and spectroscopic studies of IDPs have confirmed the important notion that IDPs do not form a homogeneous structural class but instead span a range from fully extended (RC-like) to compact (MG-like) conformations. The protein trinity hypothesis, proposed by Keith Dunker to accommodate the three best-known conformations of a protein molecule within a functional framework, postulates that a biologically active protein molecule can exist in three conformationally different native states: an ordered form, a state with collapsed disorder (molten globule, MG), and a state with extended disorder (RC). The functional form is represented by any of the three conformations or by transitions between them. Subsequently, this model was extended to accommodate an extra conformation, the PMG, which is intermediate between MG and RC. Many biophysical techniques can be applied to the conformational analysis and structure determination of IDPs. Some of these techniques provide information indirectly, while others provide more quantitative structural data. Nuclear magnetic resonance (NMR) is one of the most powerful techniques for deriving quantitative structural information. Wide-line NMR relaxation experiments characterize IDPs and provide details about the hydration layer in the vicinity of disordered regions in the extended, open state. 
Additionally, the diffusion coefficient of a protein can be measured by pulsed-field gradient NMR, from which hydrodynamic parameters can be derived. Structural transitions in IDPs can be mapped and documented by electron paramagnetic resonance (EPR) spectroscopy. The introduction of a new generation of EPR spin labels that target residues other than cysteine has expanded the scope of this technique [125, 126, 127, 128, 129]. Small-angle X-ray scattering (SAXS) and small-angle neutron scattering (SANS) yield quantitative information, allow the investigation of transient intermediates, and provide detailed information about the nature of IDPs. Single-molecule techniques, such as fluorescence resonance energy transfer (FRET) [130, 131, 132], high-speed atomic force microscopy (HS-AFM) [133, 134], and AFM-based force spectroscopy (FS), are tools for exploring the dynamics and structure of IDPs. Single-molecule FRET (SM-FRET) reports on changes in the distance between two residues and, through the intramolecular distance distribution, on conformational equilibria on time scales of less than a second. AFM-based SM-FS is particularly suited to sensing the formation of secondary structure and to probing time scales from milliseconds to seconds. HS-AFM is used for direct observation of dynamic processes and structural dynamics of biological molecules, with a temporal resolution from subsecond to sub-100 ms [134, 136]. To date, various dynamic processes have been visualized successfully using this approach. HS-AFM is applicable to both IDPs and well-structured proteins. Other complementary methods that can be used to study protein disorder are sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE), gel filtration (size exclusion chromatography), and analysis of specific behavior in acidic and high-temperature environments. 
In SDS-PAGE analysis, the observed mobility of IDPs is anomalous. This is explained by less efficient binding of SDS molecules to highly charged IDPs than to globular proteins of similar molecular mass. The apparent molecular mass determined by this method is 1.2- to 1.8-fold higher than the molecular mass calculated from the protein sequence or measured by mass spectrometry. An unusually high apparent molecular mass of IDPs is also observed by gel filtration (size exclusion chromatography). The specific behavior of IDPs under particular environmental conditions, such as stability in an acidic environment and insensitivity to high temperature, has been described for several IDPs, including caldesmon, microtubule-associated protein 2 (MAP2), involucrin, and α-synuclein. These conditions usually cause denaturation and/or precipitation of globular proteins out of solution. This difference in behavior between IDPs and globular proteins forms the basis for the purification of IDPs [141, 142, 143, 144]. It also provides the first clue to the unusual structural conformation of IDPs.
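
The SDS-PAGE observation above amounts to a simple ratio check. The sketch below computes the apparent-mass window implied by the 1.2- to 1.8-fold range quoted in the text and flags a gel estimate that reaches the lower bound. The function names and the 20 kDa example protein are hypothetical, and a positive result is only the kind of first clue described above, not proof of disorder.

```python
def expected_apparent_mass(sequence_mass_kda, low_fold=1.2, high_fold=1.8):
    """Return the (low, high) apparent mass window in kDa expected for an
    IDP on SDS-PAGE, using the fold range quoted in the text."""
    if sequence_mass_kda <= 0:
        raise ValueError("mass must be positive")
    return sequence_mass_kda * low_fold, sequence_mass_kda * high_fold


def looks_anomalous(apparent_mass_kda, sequence_mass_kda, low_fold=1.2):
    """True if the apparent/sequence mass ratio reaches the lower bound."""
    return apparent_mass_kda / sequence_mass_kda >= low_fold


# Hypothetical protein with a sequence-derived mass of 20 kDa.
low, high = expected_apparent_mass(20.0)
print(f"expected apparent mass: {low:.1f} to {high:.1f} kDa")
print(looks_anomalous(30.0, 20.0))  # ratio 1.5
print(looks_anomalous(21.0, 20.0))  # ratio 1.05
```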

The dark proteomes of viruses

Intrinsic disorder gives viral proteins high flexibility, whether they are wholly or partially disordered. This provides viral proteins with the capability for quick adaptation to a changing environment, survival in the environment of the host body, and evasion of the host's defense mechanisms. To accomplish these tasks, viral genomes exhibit a high mutation rate. For example, the rate of nucleotide substitution per position per generation falls in the range of 10⁻⁵ to 10⁻³ for ribonucleic acid (RNA) viruses and 10⁻⁸ to 10⁻⁵ for deoxyribonucleic acid (DNA) viruses, whereas eukaryotes and bacteria show a mutation rate of 10⁻⁹. Because viral genomes are highly compact and often contain overlapping reading frames, even a single mutation can affect more than one viral protein. Throughout its life cycle, a virus makes many interactions with various components of the host cell. The cycle begins with attachment and entry, proceeds through hijacking of the cellular machinery, synthesis of viral components, and assembly of viral particles, and ends with exit from the host cell in the form of new infectious particles. All of these stages rely heavily on the intrinsic disorder of viral proteins.
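
To put these per-site rates in perspective, they can be converted into an expected number of substitutions per genome per generation by multiplying by genome length. The sketch below does this arithmetic. The genome length of 10,000 nucleotides is an assumed round number chosen for illustration, not a value from the chapter, and the function name is hypothetical.

```python
def substitutions_per_genome(rate_per_site: float, genome_length: int) -> float:
    """Expected substitutions per genome per generation (rate x length)."""
    return rate_per_site * genome_length


# Per-site rate ranges as quoted in the text.
RATE_RANGES = {
    "RNA viruses": (1e-5, 1e-3),
    "DNA viruses": (1e-8, 1e-5),
    "eukaryotes and bacteria": (1e-9, 1e-9),
}

GENOME_LENGTH = 10_000  # assumption: illustrative genome size in nucleotides

for group, (low, high) in RATE_RANGES.items():
    low_n = substitutions_per_genome(low, GENOME_LENGTH)
    high_n = substitutions_per_genome(high, GENOME_LENGTH)
    print(f"{group}: {low_n:.0e} to {high_n:.0e} substitutions per genome per generation")
```

Under this assumption, an RNA virus would accumulate on the order of 0.1 to 10 substitutions per genome in every generation, which illustrates why tolerance to mutation matters so much for viral proteins.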

Involvement of IDPs in the pathogen-host mediated regulation of cell cycle

IDPs are involved in numerous host cell pathways that control the cell cycle. Several illustrative examples are discussed here and shown in Fig. 2. (1) A preformed helical portion of the cyclin-dependent kinase inhibitor 1B (p27Kip1) is associated with positive and negative regulation of the cell cycle. (2) A preformed helical structure in the disordered N-terminal transactivation domain (TAD) of p53 determines the interaction of this protein with mouse double minute 2 homolog (Mdm2); any change in the amino acid residues of this molecular determinant region affects Mdm2 binding and, subsequently, cell regulation and apoptosis. (3) Conformational fluctuations in intrinsically disordered cellular proteins transiently expose dynamic interaction motifs, leading to post-translational modifications (PTMs) and interactions with various target proteins that affect cell cycle control. (4) Early mitotic inhibitor protein-1 (EMI-1), which contains a zinc-binding domain embedded in intrinsically disordered regions, inhibits the anaphase-promoting complex/cyclosome (APC/C), which controls cell division by promoting ubiquitin-mediated degradation of cyclins and other proteins involved in cell cycle regulation. Kinase phosphorylation regulates the interaction between EMI-1 and APC/C. Through molecular mimicry, intrinsically unstructured viral proteins can take the place of host IDPs involved in various cell regulatory processes (a few of which are discussed above) and hijack the host cell machinery [114, 149, 150, 151, 152].
Fig. 2

Intrinsic disorder-controlled (A) structural change in the cyclin-dependent kinase inhibitor 1B (p27Kip1) and (B) p53 TAD-mediated Mdm2 binding are associated with cell cycle regulation and apoptosis. (C) Conformational fluctuation and transient exposure of an interaction motif lead to PTMs that subsequently control target binding and, ultimately, regulation of the cell cycle. (D) IDRs control the zinc-binding domain of EMI-1, which upon interaction with the APC/C complex controls ubiquitin (UBQ)-mediated degradation of cyclins and other proteins related to cell cycle regulation. (E) Viruses hijack the cellular machinery through one or more cell regulation pathways by using their proteins to mimic host IDPs/IDRs in cell cycle pathways.

In addition to the aforementioned pathways, viruses can use histone mimicry to control gene expression and, ultimately, the cell cycle regulation network of host cells. Many unconventional RNA-binding proteins containing IDRs can also play an important role in the control of the cellular machinery by viruses at the battlefront between host and pathogen. Besides mimicking host cell functions, viral protein complexes directly attack cellular components and disrupt their normal functions. For instance, the disordered oncoproteins of many cancer-causing viruses attack the complex of retinoblastoma protein (pRb) and E2F and disturb the normal cell regulation mechanism, as shown in Fig. 3.
Fig. 3

Host cell cycle regulation influenced by the attack of viral protein components on the pRb-E2F complex. The viral protein complex forcibly releases E2F from the pRb-E2F complex and abruptly drives cell cycle progression in an uncontrolled way. Blue shows the normal pathway of G1-to-S progression, while red shows virus-induced uncontrolled progression from G1 to S.

Phosphofurin acidic cluster sorting proteins (PACS), which act as traffic modulators, first appeared in lower metazoans. Their later evolution in vertebrates integrated cytoplasmic trafficking and inter-organellar communication with nuclear gene expression. In the course of evolution, the functional diversity of PACS proteins increased in vertebrates through the acquisition of phosphorylation sites and nuclear trafficking signals within their disordered regions. The protein trafficking pathways mediated by the variants PACS-1 and PACS-2 are hijacked by viruses for immune evasion, multiplication, and pathogenesis. However, the complete mechanism is yet to be deciphered. To accomplish all of the above functions, viral proteins interact with many cellular components, including nucleic acids, proteins, and membranes. The presence of intrinsic disorder in viral proteins is advantageous for their interactions with host cell components. The ease of these interactions can be explained by the lack of a rigid 3D structure and the high structural flexibility of viral IDPs and IDR-containing proteins, which allow them to interact with many binding partners at a time. The linking of functional domains and their promiscuity are achieved through interaction with partner IDRs, whose flexibility plays a major role in bringing two or more domains into proximity to perform a particular function. 
The flexible linker function of IDRs in viral proteins confers the advantage of escaping recognition by the host immune system: the viral protein interacts with the host protein in such a way that viral epitopes become difficult for components of the host immune system to recognize. The typically high mutation rate of viruses can be tolerated because these flexible regions are free of structural constraints and are therefore less susceptible to the damaging effects of mutation. The likely explanation behind all these observations points toward the involvement of IDPs. The first and pivotal observation, an abundance of intrinsic disorder in the replicative complex of paramyxoviruses, has been confirmed.157, 158, 159 The availability and continuous growth of bioinformatics tools over the last decades, together with the development of sensitive biophysical experimental techniques, led to the identification of an abundance of IDPs in viruses.31, 160, 161, 162, 163

Origin of viruses and their exclusive properties

Among all replicating entities, viruses are the most numerous and are therefore considered the most abundant biological entities on Earth. For instance, the total number of cells of all living creatures on Earth is at least an order of magnitude lower than the number of viral particles.165, 166 The number of viruses can be estimated by counting virus-like particles in the environment. For example, 1 mL of natural water contains as many as 2.5 × 10⁸ viral particles. Viruses are parasitic in nature and can be found in high abundance in infected cells of Bacteria, Archaea, and Eukarya, or even in other viruses.164, 166, 168 The discovery of a small icosahedral virophage named Sputnik established the concept of one virus infecting another: Sputnik infects Acanthamoeba polyphaga mimivirus (APMV), which in turn infects amoebae. APMV is a member of the Megaviridae family.170, 171, 172 Infection of APMV by Sputnik is damaging and has many deleterious effects on APMV; e.g., capsid assembly becomes abnormal and abortive viral forms appear. This breach in the normal morphogenesis of APMV is explained by the fact that final morphogenesis normally takes place in the cytoplasm-independent replication center of APMV, whereas infection with Sputnik and its multiplication at this center hinder the center's normal function. From the structural perspective, viruses have a very simple organization. However, they display various shapes and do not share a single common morphology. The genome of every virus is made up of double- or single-stranded DNA or RNA and is encapsulated within a protective protein coat known as the capsid. Enveloped viruses have an additional lipid envelope that contains a number of membrane proteins. The envelope lies above the matrix protein, which forms an additional proteinaceous coat. 
In addition to non-structural proteins, some complex viruses contain numerous accessory and regulatory proteins, all of which help in the assembly of the viral capsid. Viruses display a wide array of diversity in the structure of their genomes and in their mechanisms of replication and transcription. The viral genome can consist of single- or double-stranded DNA or single- or double-stranded RNA and can be transcribed via a negative-sense, positive-sense, or ambisense mechanism. This diversity in genomic structure and mechanism of function has led to the classification of viruses into seven major classes. In this classification, all DNA-based viruses are placed in classes I, II, and VII, which contain dsDNA viruses, ssDNA viruses, and dsDNA viruses that replicate via a single-stranded RNA (ssRNA) intermediate, respectively. The remaining four classes, III, IV, V, and VI, contain RNA viruses: double-stranded RNA (dsRNA) viruses, ssRNA viruses of positive (+) sense, ssRNA viruses of negative (−) sense, and ssRNA viruses of positive (+) sense that replicate via a DNA intermediate, respectively. Certain features set viruses apart from living organisms: the absence of a defined cell-like structure and the inability to maintain homeostasis or to reproduce outside the cellular environment, because viruses lack their own metabolism and depend on the host cell to make new products. Other features, such as the presence of a genome, the ability to replicate and to create copies of themselves by self-assembly, and continuous evolution by natural selection, make viruses similar to living organisms. These unusual properties make it difficult to agree on a common view of viruses. 
It is difficult to decide whether viruses are organisms at the edge of life, different and special with respect to living cellular organisms, or nonliving organic structures with a self-driven ability to interact with living organisms. The recent discovery of genes encoding metabolic proteins in giant viruses challenged the previous view that viruses lack such genes. Certain bacteria, such as Mycoplasma, Rickettsia, and Chlamydia, are obligate intracellular parasites, exactly as viruses are. All this calls for a reconsideration of the criteria that describe living organisms. The origin of viruses is incompletely understood, and three chief hypotheses have been put forth to explain it. The first is the coevolution hypothesis, according to which viruses and cells appeared simultaneously in the early history of the Earth; since their emergence, viruses have depended on cellular life. The second is known as the cellular origin hypothesis or the vagrancy hypothesis. It assumes that viruses evolved from pieces of DNA or RNA that escaped from the genes of larger organisms. Examples of potential candidates for this escaped genetic material are (1) plasmids, which are naked DNA molecules physically separated from the chromosome that can replicate independently, and (2) transposons, which are DNA pieces that can move from one place to another within the genome and replicate. The third is the regressive or degeneracy hypothesis, which proposes that viruses originated from a parasitic cell that shed all genes not required to support parasitism. According to other hypotheses, the root of viral origin can also be traced to the nucleoprotein world that transiently existed during the transition from the RNA world to the modern DNA-RNA-protein world. 
RNA viruses appeared either by reduction of, or escape from, primitive RNA-containing cells. These RNA viruses are also considered the evolutionary starting point for some DNA viruses. Viruses are thought to have originated in the early phase of the evolution of life, when the first living cells evolved, and to have existed ever since. This could be a reason why viruses are able to infect cells from all three domains of life: Eukarya, Archaea, and Bacteria. The ancient origin of viruses and their rapid evolution offer a possible explanation for the lack of homology between the major viral proteins and the proteins of cellular organisms.
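
The seven-class scheme described earlier in this section can be written down as a small lookup table. The following is only a minimal sketch: the names `GenomeProfile`, `CLASSES`, and `classify` are hypothetical and introduced here for illustration, and the table encodes nothing beyond the class definitions given in the text.

```python
from typing import NamedTuple, Optional


class GenomeProfile(NamedTuple):
    """Hypothetical description of a viral genome (illustrative only)."""
    nucleic_acid: str            # "DNA" or "RNA"
    strands: str                 # "ds" (double-stranded) or "ss" (single-stranded)
    sense: Optional[str]         # "+", "-", or None when not applicable
    intermediate: Optional[str]  # replication intermediate: "DNA", "RNA", or None


# Classes I-VII exactly as listed in the text above.
CLASSES = {
    GenomeProfile("DNA", "ds", None, None): "I",
    GenomeProfile("DNA", "ss", None, None): "II",
    GenomeProfile("RNA", "ds", None, None): "III",
    GenomeProfile("RNA", "ss", "+", None): "IV",
    GenomeProfile("RNA", "ss", "-", None): "V",
    GenomeProfile("RNA", "ss", "+", "DNA"): "VI",
    GenomeProfile("DNA", "ds", None, "RNA"): "VII",
}


def classify(profile: GenomeProfile) -> str:
    """Return the class (I-VII) for a genome profile, or raise KeyError."""
    return CLASSES[profile]


if __name__ == "__main__":
    # A (+) sense ssRNA genome that replicates via a DNA intermediate.
    print(classify(GenomeProfile("RNA", "ss", "+", "DNA")))
```

A dictionary keyed on the full profile keeps the seven definitions visible side by side; a profile that matches none of them raises `KeyError` instead of being assigned silently.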

Classification of viral proteins and their functions

Viruses contribute to the evolution of life through their ability to promote horizontal gene transfer and through their proposed role in the origin of DNA and of its replication mechanisms among different life forms. The amalgamation of foreign genes, often from unrelated organisms, and the modification of replication machinery lead to continuous evolution and genetic diversity. DNA fragments of viral origin make up between 3% and 8% of the human genetic material. The viral origin of a few DNA replication proteins and their subsequent transfer to cellular organisms support a key role of viruses in the emergence of DNA and the subsequent development of replication mechanisms. These virus-mediated developmental processes were essential for the evolution of the eukaryotic nucleus and potentially for the development of the three domains of life. A new classification of the life forms present on Earth has been proposed. According to this classification, all ribosome-encoding organisms, which include Archaea, Bacteria, and Eukaryotes, are placed in one class, and all viruses are placed in a separate class of capsid-encoding organisms, which depend on a ribosome-encoding host to complete their life cycle, contain nucleic acids and proteins, and can self-assemble into nucleocapsids.

Viral structural proteins

The viral capsid is the protective coat surrounding the viral genome. Monomeric protein subunits, termed capsomers or protomers, combine to build the shell of the capsid. Tight association of the RNA- or DNA-based genome with capsid protein forms a nucleoprotein complex. The viral nucleoprotein complex can interact with both nucleic acids and proteins and is thereby multifunctional. Capsid structure is determined by the arrangement of capsomers; on this basis, a capsid can be helical, icosahedral, or complex in shape. A highly ordered helical structure is characteristic of the capsids of helical, rod-shaped, and filamentous viruses, which are generally formed around a central axis by packing a single type of capsomer. The genetic material, RNA or DNA, occupies the central cavity of the capsid, and the positive charge of the capsid protein and the negative charge of the viral genome maintain an electrostatic interaction between them. Helical viruses vary in size and can be very long and flexible or very short and rigid. The capsid length of a helical virus is defined by its genome size, whereas its diameter is defined by the size and arrangement of the capsomers. Well-known examples of filamentous viruses are Sulfolobus islandicus filamentous virus (SIFV), Tobacco mosaic virus (TMV), Acidianus filamentous virus 1 (AFV1), and bacteriophage fd. In icosahedral viruses, the capsids are either icosahedral or nearly spherical with icosahedral symmetry. Although the number of capsomers theoretically required to form such an icosahedral structure is 60, in reality the majority of icosahedral viruses have more than 60. Viral capsids are often made up of more than one capsid protein. For instance, the capsid of human papillomavirus (HPV) is made of major (L1) and minor (L2) capsid proteins. 
In icosahedral viruses, the capsid is made up of more than 60 identical subunits. To build the icosahedral shape, the same protein occupies sites with different symmetries. How identical subunits with identical unique 3D structures fit into different symmetries in different environments is an intriguing puzzle that has been the topic of long-lasting debate. A few viruses have complex capsid structures that are neither completely helical nor icosahedral and contain extra structures, such as protein tails or complex outer walls. One of the best-studied complex viruses is bacteriophage T4. Its characteristic feature is an icosahedral head on top of a helical tail. A hexagonal base plate with extended, protruding proteinaceous fibers sits at the end of the tail. This tail structure acts as a molecular syringe that allows T4 to bind the host bacterium and transfer its genome into it. Certain viruses acquire a lipid membrane around the capsid from the host. This membrane coat is known as the viral envelope, and it can also contain viral glycoproteins, for example, gp160 of human immunodeficiency virus (HIV), which comprises the transmembrane subunit gp41 and the surface subunit gp120; the proton-selective ion channel M2 protein of influenza virus; and hemagglutinin (HA) and neuraminidase (NA) in other enveloped viruses. The functional roles of these surface-incorporated viral glycoproteins are rather diverse. A few of the glycoproteins that protrude from the viral lipid bilayer, for example, NA, HA, and gp120, play a number of important roles in early-stage infection, typically associated with attachment of the virus to the host cell and penetration into it. Other viral glycoproteins perform further functions related to the life cycle of enveloped viruses. 
For instance, the M2 proton channel of influenza A virus has a crucial role in the early and late stages of the influenza replication cycle. Exposing the viral contents to the host cytoplasm requires hydrogen ions to lower the pH inside the virion. At lower pH, M1 dissociates from the ribonucleoprotein and viral uncoating is initiated. The supply of hydrogen ions from the endosome into the viral particle is mediated by the integral homotetrameric membrane protein (M2 proton channel) situated in the viral envelope. This ion channel is proton-selective and is gated by low pH. In enveloped viruses, the viral envelope is attached to the core via matrix proteins. Matrix proteins play their role once the virus has entered the host cell. In addition to releasing the genetic material from the viral core, matrix proteins have various regulatory roles exerted through interactions with host components. For instance, the influenza virus matrix protein M1 is involved in the inhibition of viral transcription, the export of viral ribonucleoprotein from the nucleus, and budding.186, 187

Viral non-structural proteins

Non-structural proteins do not form the capsid structure. Instead, they participate in viral multiplication and have multiple regulatory functions. Below are some illustrative examples of the non-structural proteins of a few viruses and their involvement in crucial viral functions. HPV open reading frames (ORFs) are classified into early (E) and late (L) types on the basis of their location within the viral genome. The HPV early ORFs code for non-structural proteins. Both E1 and E2 participate in viral replication and in the regulation of transcription at an early stage. E1 binds to the origin of replication and exhibits helicase and ATPase activity,188, 189 while E2 facilitates E1 binding to the origin of replication by forming a complex with it.189, 190, 191 E2 also acts as a transcription factor, regulating early gene expression (both positively and negatively) by attaching to specific recognition sites within the upstream regulatory region (URR).192, 193 The differentiation-dependent productive phase of the viral life cycle is promoted by the highly expressed protein E4, which is involved in a number of important functions.194, 195, 196 In vitro studies found that E5 has weak transforming capabilities.197, 198 It disrupts MHC class II maturation and is involved in HPV late functions.200, 201 The E6 and E7 proteins are primarily involved in the progression of HPV-mediated malignant cells that ultimately cause invasive carcinoma. In high-risk HPVs they act as oncoproteins, at least in part by targeting the cell cycle regulators and tumor suppressors p53 and Rb. 
Another example of the diversity of functional roles attributed to non-structural proteins is given by hepatitis C virus (HCV), where the interaction of non-structural proteins with hVAP-33 (VAMP-associated protein A, a human cellular vesicle membrane transport protein), with lipid raft membranes, and with each other leads to the formation of the HCV RNA replication complex, also called the HCV replicon. Non-structural proteins also take part in immunomodulation. The non-structural protein NS1 of West Nile virus (WNV) has been shown to be involved in immunomodulation: experiments found that both cell-surface-associated and soluble NS1 can bind and recruit the complement regulatory protein factor H. This activity decreases complement activation and reduces complement recognition of infected cells, which minimizes the targeting of WNV by the immune system. The non-structural C protein of rinderpest virus also modulates immunity, but via a different mechanism: it specifically blocks the action of type 1 and type 2 interferons, which are responsible for inducing the innate immune response. Many non-structural V proteins of paramyxoviruses have likewise been shown to counter the antiviral response. Finally, gene transactivation may require viral non-structural proteins. For instance, the non-structural protein NS1 of the autonomous parvovirus minute virus of mice (MVM) is required for activation of the p39 promoter, which controls the transcription of the gene that encodes the capsid protein. Because of an overlapping transcription unit in MVM, the gene that codes for NS-1 also codes for NS-2. This gene is transcribed from the P4 promoter.

Viral accessory and regulatory proteins

Many crucial viral functions are performed by accessory and regulatory proteins, whose indirect functional roles range from regulating the transcription rate of viral genes that encode structural proteins to modifying host cell functions. For instance, the replication of HIV-1 is actively controlled by the production of several accessory proteins (Nef, Vpu, Vif, and Vpr) and two regulatory proteins (Rev and Tat). These regulatory and accessory proteins control various aspects of the viral life cycle and also regulate host cell functions, such as gene regulation and apoptosis. A number of accessory proteins are, in fact, responsible for infection in vivo. For instance, Vif overcomes the host defense mechanism, while Nef increases viral pathogenesis by targeting bystander cells.

Role of bioinformatics in divulging the dark proteome of viruses

Viral proteins have many unusual features that are lacking in the cellular proteins of other organisms.160, 179 These specific features help viruses adapt quickly to a hostile environment while providing the means to control the cellular machinery easily. The absence of corresponding features in the proteomes of other organisms might reflect the ancient origin of viruses and their genomes from a cellular lineage that is now extinct. In addition to these peculiar features, the viral proteome contains frequent short disordered regions that generally lack hydrophobic residues and lysine, while containing polar residues and residues that are not involved in the formation of regular secondary structure.148, 160 The polar residues are required for specific recognition and for stabilizing the interaction with a partner molecule through hydrogen bonding in the bound state, and they maintain randomness in the isolated state. The loosely packed, disorder-enriched viral proteome resists the negative effects of mutations, which are quite common in viruses. To evaluate the correlation between structure, function, and extent of disorder in viral proteomes, a Pfam database analysis was carried out. The disordered regions of viruses are mainly associated with protein-protein interaction, recognition, signal transduction, and regulation. Viruses hijack the host cell machinery and use it for their own functions through their ability to mimic the short linear motifs (SLiMs) of host proteins. SLiMs are embedded in disordered regions and play a great number of diverse roles, such as directing proteins to the correct subcellular localization, targeting host proteins for proteasomal degradation, cell signaling, deregulating cell cycle checkpoints, and altering the transcription of host genes. 
The proportion of SLiMs varies with the requirements of the virus; hence the number of disordered regions can vary from one viral family to another. Recent studies also determined that there is no specific correlation between genome size and disorder content in viruses. Bioinformatics plays an important role in revealing the intrinsic disorder of these small biological machines, which replicate only within a host, and in establishing the structural, functional, and regulatory networks discussed and referenced in the preceding paragraph.
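
Motif mimicry of this kind is usually explored by scanning protein sequences for short patterns. The sketch below is only an illustration under stated assumptions: the pattern `L.C.E` (often written LxCxE and commonly reported as a pRb-binding motif) is not taken from this chapter, the sequence is an invented toy example rather than a real viral protein, and `find_motif` is a hypothetical helper.

```python
import re
from typing import List, Tuple


def find_motif(sequence: str, pattern: str) -> List[Tuple[int, str]]:
    """Return (1-based start position, matched text) for every occurrence.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    lookahead = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.group(1)) for m in lookahead.finditer(sequence)]


if __name__ == "__main__":
    # Assumption: LxCxE-like pattern, where "." stands for any residue.
    motif = "L.C.E"
    # Invented toy sequence, not a real protein.
    toy_sequence = "MAAGSLHCDEPPSSLQCEE"
    for position, match in find_motif(toy_sequence, motif):
        print(position, match)
```

A real analysis would restrict such hits to predicted disordered regions and check them against curated motif resources, since short patterns match by chance in many sequences.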

Prevalence of IDPs in viruses in the context of the three domains of life

Many studies and evaluations of the IDP fraction in evolutionarily distant species were conducted in the last decade.24, 25, 26, 27, 28, 29, 213 Based on the major outcomes, it was generally concluded that eukaryotic proteomes have a higher proportion of IDPs and IDRs than prokaryotic proteomes. These observations were justified by the specific functional repertoire of IDPs/IDRs, which are mainly involved in recognition, regulation, and signaling. The regulatory networks of eukaryotic organisms, especially multicellular ones, depend explicitly on the ability of IDPs/IDRs to perform multiple vital functions.2, 38, 39 Although function is considered an important driving force of evolutionary change, changes in the proteome itself cannot be ignored. It is tempting to assume a relationship between the morphological complexity of an organism and its proteome size. This trend holds for the difference between eukaryotes and prokaryotes, but it cannot be applied among eukaryotic species, where wide variations in nuclear genome size have been reported and termed the C-value paradox. The C-value, which is simply the amount of haploid DNA present in the cells of an organism, was described as a significant quantity that could be used to estimate and look into the nature of the gene.214, 215, 216 In comparison to the human genome, the genome of the plant Paris japonica is nearly 50 times larger, and the genomes of some unicellular protists are also much larger. For instance, the Polychaos dubium genome is 210 times the size of the human genome and is the largest known genome. Cells of some salamanders contain 40 times more DNA than human cells. 
The mystery of the relation between eukaryotic genome size and gene number was solved by the discovery of non-coding DNA, which revealed that most eukaryotic DNA is non-coding and hence is not part of genes. This discovery also suggested that the description of organisms should not be based solely on the total number of protein-encoding genes and that the number of encoded proteins should be taken into account. However, recent findings showed a poor correlation between the complexity of a given organism and its proteome size. For instance, the whole proteome of the nematode Caenorhabditis elegans contains ~ 20,000 proteins, similar to the number of proteins encoded by the human genome. A study conducted in 2012 that analyzed predicted intrinsic disorder in the proteomes of 3484 organisms, including viruses, revealed a number of significant details about the proteomes of various organisms. Table 1 lists the prevalence of intrinsic disorder in the proteins of different viruses deposited in the DisProt database.
Table 1

DisProt ID, UniProt accession, protein name, source organism, and disorder content of viral proteins deposited in the DisProt database.

| No. | DisProt ID | UniProt accession | Protein name | Organism | Disorder content (%) |
|---|---|---|---|---|---|
| 1 | DP00003 | P03265 | DNA-binding protein | Human adenovirus C serotype 5 | 9.83 |
| 2 | DP00005 | P03045 | Antitermination protein N | Escherichia phage lambda | 100.00 |
| 3 | DP00024 | P03129 | Protein E7 | Human papillomavirus type 16 | 100.00 |
| 4 | DP00034 | P03661 | Attachment protein G3P | Enterobacteria phage fd | 5.66 |
| 5 | DP00048 | P03406 | Protein Nef | Human immunodeficiency virus type 1 group M subtype B (isolate BRU/LAI) | 59.22 |
| 6 | DP00064 | P03607 | Capsid protein | Southern cowpea mosaic virus | 22.94 |
| 7 | DP00066 | P27285 | Structural polyprotein | Sindbis virus subtype Ockelbo (strain Edsbyn 82-5) | 9.08 |
| 8 | DP00087 | P68336 | Tegument protein VP16 | Human herpesvirus 2 (strain HG52) | 27.14 |
| 9 | DP00101 | P12493 | Gag polyprotein | Human immunodeficiency virus type 1 group M subtype B (isolate NY5) | 5.60 |
| 10 | DP00133 | P03422 | Phosphoprotein | Measles virus (strain Edmonston) | 62.72 |
| 11 | DP00148 | P03347 | Gag polyprotein | Human immunodeficiency virus type 1 group M subtype B (isolate BH10) | 10.74 |
| 12 | DP00160 | P04851 | Nucleoprotein | Measles virus (strain Edmonston) | 23.71 |
| 13 | DP00182 | P03087 | Major capsid protein VP1 | Simian virus 40 | 14.09 |
| 14 | DP00189 | P04324 | Protein Nef | Human immunodeficiency virus type 1 group M subtype B (isolate PCV12) | 21.84 |
| 15 | DP00284 | P16009 | Baseplate central spike complex protein gp5 | Enterobacteria phage T4 | 17.74 |
| 16 | DP00288 | Q06253 | Antitoxin phd | Escherichia phage P1 | 100.00 |
| 17 | DP00410 | P12497 | Gag-Pol polyprotein | Human immunodeficiency virus type 1 group M subtype B (isolate NY5) | 6.41 |
| 18 | DP00419 | P03176 | Thymidine kinase | Human herpesvirus 1 (strain 17) | 15.69 |
| 19 | DP00424 | P04325 | Protein Rev | Human immunodeficiency virus type 1 group M subtype B (isolate PCV12) | 62.07 |
| 20 | DP00447 | P12579 | Phosphoprotein | Human respiratory syncytial virus A (strain Long) | 100.00 |
| 21 | DP00566 | P13102 | Hemagglutinin | Influenza A virus (strain A/Whale/Maine/328/1984 H13N2) | 6.18 |
| 22 | DP00573 | P03305 | Genome polyprotein | Foot-and-mouth disease virus (isolate Bovine/Germany/O1Kaufbeuren/1966 serotype O) | 1.67 |
| 23 | DP00583 | P16006 | Deoxycytidylate deaminase | Enterobacteria phage T4 | 10.88 |
| 24 | DP00588 | P27958 | Genome polyprotein | Hepatitis C virus genotype 1a (isolate H) | 2.72 |
| 25 | DP00615 | Q9WMX2 | Genome polyprotein | Hepatitis C virus genotype 1b (isolate Con1) | 3.39 |
| 26 | DP00627 | Q05323 | Hexameric zinc-finger protein VP30 | Zaire ebolavirus (strain Mayinga-76) | 4.86 |
| 27 | DP00629 | Q07097 | Nucleoprotein | Sendai virus (strain Fushimi) | 23.66 |
| 28 | DP00640 | Q89933 | Nucleoprotein | Measles virus (strain Edmonston B) | 24.00 |
| 29 | DP00673 | P06935 | Genome polyprotein | West Nile virus | 3.06 |
| 30 | DP00674 | Q69422 | Genome polyprotein | Hepatitis GB virus B | 5.38 |
| 31 | DP00675 | P19711 | Genome polyprotein | Bovine viral diarrhea virus (isolate NADL) | 2.56 |
| 32 | DP00685 | Q98157 | Viral macrophage inflammatory protein 2 | Human herpesvirus 8 type P (isolate GK18) | 25.53 |
| 33 | DP00686 | Q9IH62 | Glycoprotein G | Nipah virus | 3.65 |
| 34 | DP00697 | Q9IK92 | Nucleoprotein | Nipah virus | 25.00 |
| 35 | DP00698 | O89339 | Nucleoprotein | Hendra virus (isolate Horse/Autralia/Hendra/1994) | 25.00 |
| 36 | DP00699 | Q9IK91 | Phosphoprotein | Nipah virus | 57.26 |
| 37 | DP00700 | O55778 | Phosphoprotein | Hendra virus (isolate Horse/Autralia/Hendra/1994) | 57.14 |
| 38 | DP00726 | Q5UPJ7 | Tyrosine--tRNA ligase | Acanthamoeba polyphaga mimivirus | 6.07 |
| 39 | DP00741 | P03040 | Regulatory protein cro | Escherichia phage lambda | 16.67 |
| 40 | DP00750 | Q38151 | 39 protein | Bacillus phage SPP1 | 46.83 |
| 41 | DP00764 | O89467 | Protein Tat | Equine infectious anemia virus | 69.23 |
| 42 | DP00808 | P24937 | Pre-protein VI | Human adenovirus C serotype 5 | 57.60 |
| 43 | DP00820 | O73557 | RING finger protein Z | Lassa virus (strain Mouse/Sierra Leone/Josiah/1976) | 58.59 |
| 44 | DP00842 | P12506 | Protein Tat | Human immunodeficiency virus type 1 group M subtype D (isolate Z2/CDC-Z34) | 100.00 |
| 45 | DP00847 | P20220 | Protein F-112 | Sulfolobus spindle-shape virus 1 | 34.82 |
| 46 | DP00849 | Q9Q8E9 | M156R | Myxoma virus (strain Lausanne) | 43.14 |
| 47 | DP00850 | P36932 | Integrase | Escherichia phage P2 | 34.42 |
| 48 | DP00871 | A4ZNR2 | Nuclear export protein | Influenza A virus | 100.00 |
| 49 | DP00875 | P69723 | Virion infectivity factor | Human immunodeficiency virus type 1 group M subtype B (isolate HXB2) | 27.08 |
| 50 | DP00876 | P14340 | Genome polyprotein | Dengue virus type 2 (strain Thailand/NGS-C/1944) | 2.95 |
| 51 | DP00895 | P03421 | Phosphoprotein | Human respiratory syncytial virus A (strain A2) | 42.32 |
| 52 | DP00898 | P13338 | RNA polymerase-associated protein Gp33 | Enterobacteria phage T4 | 36.61 |
| 53 | DP00919 | C6KEI3 | Protein Nef | Human immunodeficiency virus 1 | 50.00 |
| 54 | DP00929 | P04608 | Protein Tat | Human immunodeficiency virus type 1 group M subtype B (isolate HXB2) | 100.00 |
| 55 | DP00932 | P35926 | Recombination enhancement function protein | Escherichia phage P1 | 40.86 |
| 56 | DP00939 | P04859 | Phosphoprotein | Sendai virus (strain Harris) | 7.57 |
| 57 | DP00947 | O10609 | Protein E7 | Human papillomavirus type 45 | 71.70 |
| 58 | DP00948 | P59595 | Nucleoprotein | Human SARS coronavirus | 42.42 |
| 59 | DP00965 | P0C6L3 | Small delta antigen | Hepatitis delta virus genotype I (isolate D380) | 75.90 |
| 60 | DP00976 | P04578 | Envelope glycoprotein gp160 | Human immunodeficiency virus type 1 group M subtype B (isolate HXB2) | 12.27 |
| 61 | DP00978 | P35961 | Envelope glycoprotein gp160 | Human immunodeficiency virus type 1 group M subtype B (isolate YU-2) | 55.28 |
| 62 | DP00986 | Q8QWD4 | VP4 | Enterovirus D68 | 40.58 |
| 63 | DP00998 | Q05127 | Polymerase cofactor VP35 | Zaire ebolavirus (strain Mayinga-76) | 8.53 |
| 64 | DP00999 | P03315 | Structural polyprotein | Semliki forest virus | 11.33 |
| 65 | DP01012 | P27392 | Protein P16 | Enterobacteria phage PRD1 | 31.62 |
| 66 | DP01013 | P68927 | Excisionase | Escherichia phage HK022 | 30.56 |
| 67 | DP01016 | Q20MD5 | Matrix protein 2 | Influenza A virus (strain A/Udorn/1972 H3N2) | 21.88 |
| 68 | DP01031 | Q99IB8 | Genome polyprotein | Hepatitis C virus genotype 2a (isolate JFH-1) | 3.10 |
| 69 | DP01039 | Q85258 | Polyprotein | Potato virus Y | 31.65 |
| 70 | DP01043 | Q80FJ1 | Membrane fusion protein p14 | Reptilian orthoreovirus | 24.00 |
| 71 | DP01059 | Q71FK2 | Coat protein | Pepino mosaic virus | 8.44 |
| 72 | DP01060 | A8CDV5 | Latent membrane protein 2A | Epstein-Barr virus (strain GD1) | 100.00 |
| 73 | DP01087 | Q1PAB4 | Protein Tat | Human immunodeficiency virus 1 | 100.00 |
| 74 | DP01129 | P12296 | Genome polyprotein | Mengo encephalomyocarditis virus | 0.61 |
| 75 | DP01142 | O92972 | Genome polyprotein | Hepatitis C virus genotype 1b (strain HC-J4) | 8.54 |
| 76 | DP01150 | P03255 | Early E1A protein | Human adenovirus C serotype 5 | 48.10 |
| 77 | DP01151 | P03259 | Early E1A protein | Human adenovirus A serotype 12 | 100.00 |
| 78 | DP01186 | P03086 | Agnoprotein | JC polyomavirus | 43.66 |
| 79 | DP01188 | Q5XXP4 | Polyprotein P1234 | Chikungunya virus (strain 37997) | 0.69 |
| 80 | DP01245 | P12823 | Genome polyprotein | Dengue virus type 2 (strain Puerto Rico/PR159-S1/1969) | 0.59 |
| 81 | DP01256 | Q32ZE1 | Genome polyprotein | Zika virus | 2.22 |
| 82 | DP01295 | Q98XH7 | Protein Tat | Human immunodeficiency virus 1 | 100.00 |
| 83 | DP01305 | P08392 | Major viral transcription factor ICP4 | Human herpesvirus 1 (strain 17) | 2.23 |
| 84 | DP01336 | P03709 | DNA-packaging protein FI | Escherichia phage lambda | 29.55 |
| 85 | DP01391 | P03520 | Phosphoprotein | Vesicular stomatitis Indiana virus (strain San Juan) | 22.64 |
| 86 | DP01393 | P04880 | Phosphoprotein | Vesicular stomatitis Indiana virus (strain Mudd-Summers) | 22.64 |
| 87 | DP01394 | P04879 | Phosphoprotein | Vesicular stomatitis Indiana virus (strain Glasgow) | 22.64 |
| 88 | DP01395 | Q8B0H3 | Phosphoprotein | Vesicular stomatitis Indiana virus (strain 94GUB Central America) | 22.64 |
| 89 | DP01405 | Q5V913 | Nucleoprotein | Influenza B virus | 12.50 |
| 90 | DP01428 | P03120 | Regulatory protein E2 | Human papillomavirus type 16 | 21.92 |
| 91 | DP01466 | A4L7I2 | Non-structural polyprotein | Chikungunya virus | 8.04 |
| 92 | DP01468 | A3RMR8 | Non-structural polyprotein | Chikungunya virus | 8.04 |
| 93 | DP01469 | A4L7I4 | Non-structural polyprotein | Chikungunya virus | 8.04 |
| 94 | DP01481 | Q5UPT2 | Probable uracil-DNA glycosylase | Acanthamoeba polyphaga mimivirus | 25.41 |
| 95 | DP01512 | P03050 | Transcriptional repressor arc | Salmonella phage P22 | 100.00 |
| 96 | DP01539 | O57173 | Protein F1 | Vaccinia virus (strain Ankara) | 22.52 |
| 97 | DP01615 | P03126 | Protein E6 | Human papillomavirus type 16 | 13.29 |
| 98 | DP01616 | P10104 | Fibritin | Enterobacteria phage T4 | 23.20 |
| 99 | DP01618 | P03070 | Large T antigen | Simian virus 40 | 11.72 |
| 100 | DP01621 | E5LC01 | LANA | Human herpesvirus 8 | 4.74 |
| 101 | DP01625 | P07567 | Gag polyprotein | Mason-Pfizer monkey virus | 3.04 |
| 102 | DP01642 | P06492 | Tegument protein VP16 | Human herpesvirus 1 (strain 17) | 27.14 |
| 103 | DP01759 | Q0GBY3 | Phosphoprotein | Rabies virus (strain China/MRV) | 22.90 |
| 104 | DP01762 | P03714 | Head-tail connector protein FII | Escherichia phage lambda | 35.04 |
| 105 | DP01780 | P21736 | Protein E7 | Human papillomavirus type 45 | 50.94 |
| 106 | DP01806 | Q67953 | Large envelope protein | Hepatitis B virus | 24.27 |
| 107 | DP01843 | P03404 | Protein Nef | Human immunodeficiency virus type 1 group M subtype B (isolate BH10) | 12.14 |
| 108 | DP01928 | P03254 | Early E1A protein | Human adenovirus C serotype 2 | 81.66 |
| 109 | DP01929 | P17763 | Genome polyprotein | Dengue virus type 1 (strain Nauru/West Pac/1974) | 0.29 |
| 110 | DP01930 | P29990 | Genome polyprotein | Dengue virus type 2 (strain Thailand/16681/1984) | 0.62 |
| 111 | DP01931 | Q2YHF0 | Genome polyprotein | Dengue virus type 4 (strain Thailand/0348/1991) | 0.86 |
| 112 | DP01983 | Q9Q8N4 | Probable host range protein 2-3 | Myxoma virus (strain Lausanne) | 26.11 |
| 113 | DP01984 | B4Y891 | Capsid protein VP1 | Adeno-associated virus | 29.06 |
| 114 | DP02042 | Q98325 | Viral CASP8 and FADD-like apoptosis regulator | Molluscum contagiosum virus subtype 1 | 22.41 |
| 115 | DP02051 | P14335 | Genome polyprotein | Kunjin virus (strain MRM61C) | 0.52 |
| 116 | DP02071 | P04383 | Capsid protein | Carnation mottle virus | 23.28 |
| 117 | DP02128 | P06437 | Envelope glycoprotein B | Human herpesvirus 1 (strain KOS) | 12.39 |
| 118 | DP02194 | P68466 | Protein K7 | Vaccinia virus (strain Western Reserve) | 16.78 |
| 119 | DP02203 | Q9Q6P4 | Genome polyprotein | West Nile virus (strain NY-99) | 0.61 |
| 120 | DP02204 | Q5UB51 | Genome polyprotein | Dengue virus type 3 (strain Singapore/8120/1995) | 0.77 |
| 121 | DP02208 | A0A140GKJ0 | TAP transporter inhibitor ICP47 | Human herpesvirus 1 | 37.50 |
| 122 | DP02212 | P05769 | Genome polyprotein | Murray valley encephalitis virus (strain MVE-1-51) | 0.70 |
| 123 | DP02256 | P26554 | Protein E6 | Human papillomavirus type 51 | 7.28 |
| 124 | DP02261 | P13848 | Capsid assembly scaffolding protein | Bacillus phage phi29 | 20.41 |
| 125 | DP02291 | P04486 | Tegument protein VP16 | Human herpesvirus 1 (strain F) | 16.12 |
| 126 | DP02334 | Q98148 | Kaposi's sarcoma-associated herpes-like virus ORF73 homolog | Human herpesvirus 8 | 4.56 |
The table lists the DisProt ID, UniProt ID, protein name, source organism, and identified disordered content of each protein.

Continuous spectrum of the proteome size space

Analysis of IDPs in 3484 proteomes from different species revealed a continuous spectrum of proteome sizes across eukaryotes, bacteria, archaea, and viruses, as depicted in Fig. 1A. Eukaryotic proteomes vary widely in size, from about 4000 proteins in unicellular species to ~ 20,000 proteins in multicellular species. Bacterial proteomes contain 500–8000 proteins, and only a small portion of bacterial species have proteomes smaller than 1500 proteins. Archaeal proteomes fall within the much narrower range of 1500–3000 proteins. Viral proteomes are very compact, being limited to fewer than 1000 proteins. A log-scale plot (Fig. 1B of the cited study) shows that more than 200 viruses encode only a single polyprotein, and that relatively few viruses encode between 15 and 30 proteins in comparison with viruses of other proteome sizes. So far, nine large mimiviruses are known, each containing more than 500 proteins. Their proteomes are so large that they are nearly equal in size to the proteomes of some small bacteria. Therefore, the continuous spectrum of proteome size runs from viruses to archaea, then to unicellular eukaryotes, and lastly to multicellular eukaryotes. The range of bacterial proteome sizes overlaps with the ranges of viruses, archaea, and unicellular eukaryotes.

Disordered residues fraction in various proteomes

Bacteria

Disorder content in the majority of bacterial species is estimated to be between 18% and 28%, which is quite low. Although a small number of bacteria show disorder content as high as 35%, this value corresponds to the lower boundary of the fraction of disordered residues predicted for unicellular and multicellular eukaryotes (Fig. 1 and Fig. 2 of the cited study).

Archaea

Based on the estimated disordered content, Archaea can be split into three classes. Class one comprises 61 analyzed organisms whose proteomic disordered content ranges from 12% to 21%. Class two consists of 4 organisms whose disordered content varies from 21% to 32%. Class three contains 8 organisms whose estimated disordered content ranges from 32% to 38%. The comparatively higher percentage of disorder in class three species is attributed to the peculiarities of their habitats: studies confirm that the highly disordered archaeal species are halophiles and methanophiles. Global disorder predictors are generally developed using training sets of non-halophilic proteins under normal physiological conditions of 100–150 mM NaCl. The accuracy of such predictors may therefore vary for the proteins of extremophilic microorganisms that survive under hypersaline conditions. Halophilic microorganisms are salt-loving extremophiles whose optimal growth occurs in salt-rich environments. To maintain an appropriate osmotic environment in their cytoplasm, these microorganisms use a “salting-in” strategy, through which they accumulate molar concentrations of chloride and potassium. This strategy requires extensive adaptation of the intracellular proteins, which must maintain proper conformation and activity at near-saturating salt concentrations. The proteomes of these “salting-in” organisms are highly acidic, and the corresponding proteins are remarkably unstable in low-salt conditions, while remaining soluble and active in hypersaline (salt-rich) conditions that are usually detrimental to the proteins of non-halophilic organisms. 
Furthermore, a salt-rich environment determines the structure-to-function capability of these proteins. In line with their physiological environment, these proteins bind excess salt and water under solvent conditions, and this binding depends on the acidic amino acid residues present on the protein surface.222, 223, 224, 225, 226, 227, 228, 229, 230 For these reasons, the prediction of high disorder in these organisms may simply represent a prediction error.
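
The three archaeal classes described above can be written down as a simple binning rule. The following Python sketch is only an illustration: the function name and the choice of half-open intervals are assumptions made here (the quoted ranges share their end points, and the text does not say which class a boundary value belongs to).

```python
from typing import Optional

# Class boundaries (percent disordered residues) as quoted in the text.
# Treating each interval as half-open [low, high) is an assumption of this
# sketch, because the quoted ranges share their end points.
ARCHAEAL_CLASSES = [
    ("class 1", 12.0, 21.0),
    ("class 2", 21.0, 32.0),
    ("class 3", 32.0, 38.0),
]


def archaeal_disorder_class(percent_disorder: float) -> Optional[str]:
    """Return the class label for a proteome-wide disorder percentage."""
    for label, low, high in ARCHAEAL_CLASSES:
        if low <= percent_disorder < high:
            return label
    # Include the upper end of the last class so the quoted maximum is covered.
    last_label, _, last_high = ARCHAEAL_CLASSES[-1]
    if percent_disorder == last_high:
        return last_label
    return None  # outside the ranges quoted in the text
```

A value that falls in class 3 would, following the argument above, be a candidate for re-examination rather than a confirmed case of high disorder, since predictors trained on non-halophilic proteins may overestimate disorder in halophiles.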

Eukaryotes

Analysis of disorder levels among non-viral proteomes revealed that unicellular and multicellular eukaryotes generally have the highest content of IDPs/IDRs in their proteomes, with the fraction of disordered residues ranging between 35% and 45%. However, a group of unicellular eukaryotes has levels of disordered residues in the range of 45–50%. This group includes Cryptococcus neoformans (CRYNE, 47.1%), Neurospora crassa (NEUCR, 48.2%), Plasmodium falciparum (PLAF7, 49.5%), Plasmodium yoelii (PLAYO, 46.0%), and Ustilago maydis (USTMA, 49.9%). The observed high variability and high levels of predicted disorder are in line with an earlier study that revealed enrichment of predicted disorder in the proteins of early-branching eukaryotes in comparison with typical eukaryotic proteins from the Swiss-Prot database and ordered proteins from the PDB. Some protozoa contain as much as twice the fraction of IDRs with ≥ 30 disordered residues found in a representative set of Swiss-Prot proteins, and a sevenfold higher fraction than that found in the PDB Select 25 set of proteins. It is noteworthy that more disordered proteins were found in parasitic protozoa than in non-parasitic protists. For instance, 35% of the proteins encoded by genes on chromosomes 2 and 3 of P. falciparum were predicted to contain long IDRs (i.e., longer than 40 residues). A more recent study indicated that the amount of disorder in P. falciparum had been underestimated and proposed that 52–67% of the proteins of this organism contain long disordered regions. The latest study, which examined the prevalence of disorder in the proteomes of many apicomplexan parasites, demonstrated that the primate malaria parasite (P. knowlesi) and the human malaria parasites (P. falciparum and P. vivax) contain more disordered regions than rodent malaria parasites. 
Additionally, more disorder was reported in the proteins expressed at the sporozoite stage of P. falciparum than in those expressed at other stages of its life cycle. It has been proposed that the high abundance of disorder in the proteome of this unicellular organism is related to its adaptation to a changing environment throughout its life cycle, as it is able to infect many different hosts. In simple words, the abundance of intrinsic disorder in apicomplexan parasites may have evolved as a way to adapt to a parasitic lifestyle. Overall, comparison of the proteomes of different life forms and their disorder contents revealed that, with increasing proteome size, the lower bound of the fraction of disordered residues appears to increase continuously, whereas the upper bound decreases in viruses and increases among bacteria, archaea, and eukaryotes. Therefore, species whose proteomes contain between 1000 and 2000 proteins show the least variance in the fraction of disordered residues. Nevertheless, when the variance of the fraction of disordered residues is measured for the different domains of life, the largest variance (70%) is found for viruses, whereas the smallest (12%) is found for multicellular eukaryotes.
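
The two quantities used throughout this section, the fraction of disordered residues and the presence of long IDRs, can both be derived from per-residue disorder scores. The Python sketch below shows one way to do this. It is a minimal illustration under stated assumptions: the function names are hypothetical, the 0.5 score cutoff is a commonly used convention rather than a value taken from the studies discussed here, and the input scores are assumed to come from whichever disorder predictor the reader prefers.

```python
from typing import List, Tuple


def disordered_segments(scores: List[float],
                        threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, of consecutive
    residues whose score is at or above the threshold (assumed cutoff)."""
    segments = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i  # a new disordered run begins here
        elif start is not None:
            segments.append((start, i))
            start = None
    if start is not None:  # the run extends to the end of the sequence
        segments.append((start, len(scores)))
    return segments


def percent_disorder(scores: List[float], threshold: float = 0.5) -> float:
    """Percentage of residues predicted to be disordered."""
    if not scores:
        return 0.0
    n_disordered = sum(1 for score in scores if score >= threshold)
    return 100.0 * n_disordered / len(scores)


def long_idrs(scores: List[float], min_length: int = 30,
              threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Disordered segments with at least min_length residues. The text
    quotes cutoffs of >= 30 residues and of more than 40 residues."""
    return [(a, b) for a, b in disordered_segments(scores, threshold)
            if b - a >= min_length]
```

For a toy 100-residue profile with 35 high-scoring residues at the start and 5 at the end, `percent_disorder` reports 40% and `long_idrs` returns only the first segment, which shows why the two measures can rank proteins differently.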

Viruses

The fraction of disordered residues varies among viral proteomes, as shown in Fig. 1 of the cited reference. For example, the avian carcinoma virus proteome has the highest fraction of disordered residues (77.3%), while human coronavirus NL63 has a very low fraction (7.3%). A few viral species are highly enriched in disordered residues: 20 small viruses that encode ≤ 5 proteins have a disorder content of 50% or greater. With increasing proteome size, the disorder content of viruses appears to converge to the range of 20–40%. The prediction of a high content of intrinsically disordered residues in viruses is in good agreement with a study showing that many proteins of bacteriophages, viruses, bacteria, and archaea are significantly depleted in hydrophobic residues and enriched in polar (hydrophilic) residues. Some viral IDRs have likely evolved to help viruses deal with their hostile habitats, in addition to being profoundly involved in the functioning of their proteins. Other IDRs have evolved to accommodate alternative splicing, antisense transcription, and gene overlapping in a way that makes more efficient use of genetic material. Unlike non-specific hydrophobic interactions, polar residues are capable of specific recognition and can establish strong hydrogen bonds with partner molecules. An increased amount of polar residues in viral proteins could be linked to an increasing demand for disorder in the unbound state and for specific recognition and stabilization in bound states.31, 210

Predicted IDPs pattern relation to viral transmission and host tropism

A model has been proposed to categorize coronaviruses on the basis of the distribution patterns of intrinsic disorder within their nucleocapsid (N) and membrane (M) proteins. This categorization allows quick determination of the transmission behaviors (route, mode, and mechanisms) of various coronaviruses regardless of their genetic proximity. For instance, shell rigidity has been reported in viruses transmitted by the oral-fecal route, because a rigid shell protects the virions from damage; this rigidity is directly linked to the level of intrinsic disorder in the N and M proteins. The envelope protein gp120 of HIV-1 contains both ordered and disordered regions. Its V3 loop is a disordered region that is important for controlling viral entry mediated by chemokine co-receptors on immune cells. Viruses that use the chemokine co-receptors CCR5, CXCR4, or both are known as R5-tropic, X4-tropic, and dual-tropic (R5X4), respectively. HIV-1 variants use different chemokine receptors while infecting the host. The switch from R5 to X4 is related to disease progression and pathogenesis; however, the reason for the switch is largely unknown. Xiaowei Jiang et al. hypothesized that this change is associated with sequence variation and intrinsic disorder. Detailed analysis by the same group using a nonparametric statistical approach determined that disorder propensity in the V3 domain increases upon switching from the dual/R5-tropic to the X4-tropic virus. This increased structural disorder of the V3 domain is thus associated with HIV-1 cell tropism. This study forms a basis for identifying hidden patterns in IDPs and their association with distinguishing viral characteristics.

Aggregation in viral proteins and its relation to intrinsic disorder

Hijacking of the host cellular machinery and modulation of regulatory networks and their components often result in the formation of insoluble inclusions or aggregates that usually contain viral structural components. Viruses use these aggregates to build large complexes containing both viral and host proteins that promote viral replication, transcription, translation, and intra- and intercellular transport. The aggregated structure housing the viral-host complex mainly protects it from cellular degradation. Although the role and mechanism of function of these aggregates in specific viruses are not completely understood, in most cases the pattern of aggregates and their associated characteristics help in unraveling the behavior of viruses and in their quantification and identification. A deeper understanding of the association between aggregation behavior and intrinsic disorder might provide additional information about viral infection. Fig. 4 shows an analysis of intrinsic disorder predisposition and intrinsic propensity for aggregation (and intrinsic solubility) in the genome polyproteins of Japanese encephalitis virus (JEV), enterovirus-71 (EV-71), and Zika virus (ZIKV).
Fig. 4

Comparative analysis of intrinsic disorder content and intrinsic aggregation (and intrinsic solubility) propensity in the genome polyproteins of three viruses that impacted India: JEV (UniProt ID: P27395), EV-71 (UniProt ID: Q66478), and ZIKV (UniProt ID: A0A024B7W1). Panels A, C, and E show the intrinsic disorder (and MoRF) propensity in JEV, EV-71, and ZIKV, respectively, determined by IUPred2A, while panels B, D, and F show the aggregation propensity (and intrinsic solubility) in JEV, EV-71, and ZIKV, respectively, determined by the CamSol method (Vendruscolo lab software).238, 239 A relation between IDP content, aggregation propensity, and viral infection pattern could be established.

Functional prominences of disordered viral proteins: Examples from bacteriophages, plant, and animal viruses

Viral proteins are atypical because of their poor homology to the proteins of modern cells, which has led to the proposal that viruses are very primitive. To evade the defense mechanisms of the host, viruses must be able to survive both outside and inside the host and to adapt quickly to fast-changing surroundings. To keep pace with this changing environment, viruses mutate at a very high rate (10⁻⁵ to 10⁻³ nucleotide exchanges per generation for RNA viruses and 10⁻⁸ to 10⁻⁵ for DNA viruses). This high mutation rate is attributed to the lack of RNA repair mechanisms. By comparison, the average mutation rate in bacteria and eukaryotes is 10⁻⁹ nucleotide exchanges per generation. Because the viral genome is quite compact and many reading frames overlap, a single mutation might affect more than one viral protein. During the various stages of the viral life cycle, from early entry to the formation and exit of new infectious viral particles, viral proteins usually interact with multiple components of the host cell. To perform crucial functions associated with life cycle events, viruses interact with host nucleic acids and proteins, even though large gaps exist between viral and host proteins.178, 240 These features invite a closer look at the unique characteristics of viral proteins from a biophysical perspective. The presence of intrinsic disorder in the viral proteome provides plasticity that confers numerous functional advantages. The flexibility of an IDP/IDR and its lack of a compact rigid structure enable multiple interactions. The binding promiscuity of IDRs is facilitated by various mechanisms, whose operation depends on the extent of the flexible linking property. 
This flexible linking property gives viral proteins an additional advantage in eluding the host immune system by making it difficult for the immune system to properly recognize epitopes. High disorder in the viral proteome can also be a way to deal with the high frequency of mutations. The deleterious effects of mutations are buffered by the high adaptability and the low level of interaction between amino acids (flexibility) in IDPs, because an unstructured IDP has less to lose upon substitution than a highly ordered structure. Although viral proteins clearly benefit from the flexibility conferred by disordered residues, not all viral proteins are IDPs or contain IDRs. There is a relation between the disorder content of a protein and its location within the virion, as confirmed by a comparative analysis that applied disorder predictors to viral proteins. That study began with the construction of a database of viral proteins from HIV and influenza-related viruses, followed by comparison of protein sequence, predicted structure, function, and location within the virion. The outcomes (particularly for influenza virus) demonstrated a correlation between the level of disorder in a protein and its proximity to the RNA core of the virion: the closer a protein is located to the core, the higher its percentage of disorder. This relation could be explained by more extensive interactions with viral RNA. Nucleic acid-binding proteins are commonly disordered or at least have disordered regions at the site of nucleic acid binding. 
In the case of HIV, the correlation between proximity to the core and high disorder content has not been observed, possibly because of the presence of enzymes around the core region, which are predominantly structured proteins.243, 244, 245 The matrix proteins of HIV and influenza A virus have rather different disorder contents: the HIV matrix protein is predicted to be highly disordered, while the influenza A virus matrix protein is less disordered or somewhat ordered. Concerning surface proteins, gp120 of HIV has low disorder content across all analyzed strains, while gp41 was found to be highly disordered. The surface proteins of influenza A virus, NA and HA, are predicted to be mostly disordered. However, subsequent studies revealed that the predicted disorder content varies among subtypes and suggested that this variability could be linked to the level of virulence.241, 246
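
To give a sense of scale for the mutation rates quoted in this section, the following Python sketch converts them into an expected number of changes per genome copy. It is a back-of-the-envelope illustration only: it assumes that the quoted rates apply per nucleotide site per generation, and the 10,000-nucleotide genome length is a hypothetical round number chosen for the example, not a value from the text.

```python
from typing import Tuple

# Rate ranges as quoted in the text (nucleotide exchanges per generation).
# Interpreting them as per-site rates is an assumption of this sketch.
RNA_VIRUS_RATES = (1e-5, 1e-3)
DNA_VIRUS_RATES = (1e-8, 1e-5)
CELLULAR_RATE = 1e-9  # average quoted for bacteria and eukaryotes

# Hypothetical genome length used only for illustration.
ILLUSTRATIVE_GENOME_NT = 10_000


def expected_mutations(rate_per_site: float, genome_length_nt: int) -> float:
    """Expected number of nucleotide exchanges per genome per generation."""
    if rate_per_site < 0 or genome_length_nt < 0:
        raise ValueError("rate and genome length must be non-negative")
    return rate_per_site * genome_length_nt


def expected_range(rates: Tuple[float, float],
                   genome_length_nt: int) -> Tuple[float, float]:
    """Apply expected_mutations to the lower and upper end of a rate range."""
    low, high = rates
    return (expected_mutations(low, genome_length_nt),
            expected_mutations(high, genome_length_nt))


if __name__ == "__main__":
    print("RNA virus:", expected_range(RNA_VIRUS_RATES, ILLUSTRATIVE_GENOME_NT))
    print("DNA virus:", expected_range(DNA_VIRUS_RATES, ILLUSTRATIVE_GENOME_NT))
    print("Cellular rate:", expected_mutations(CELLULAR_RATE, ILLUSTRATIVE_GENOME_NT))
```

Under these assumptions, an RNA virus would accumulate roughly 0.1 to 10 changes per genome copy in each generation, several orders of magnitude more than the same stretch of sequence would at the cellular rate. Combined with overlapping reading frames, this is consistent with the argument above that substitution-tolerant disordered regions help buffer the effects of frequent mutation.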

Flexible promiscuity of viral proteins

IDPs can interact with several distinct partners because of their conformational flexibility and adaptability. When a single IDR binds to many partners, it can adopt a different structure with each of them. IDPs demonstrate different interaction modes: they can form very stable complexes or shuttle between bound and unbound states with their partners, acting as on-off switches in signaling pathways. Depending on the surrounding environment, IDPs adopt different conformations and function accordingly. Binding promiscuity is an important and required feature of viral proteomes: despite encoding many proteins, viruses depend on host cell machinery to complete their life cycle, and binding promiscuity helps them to do so. Binding promiscuity and interaction types are explained in earlier sections of this chapter. The compact genomes of viruses restrict them to encoding few proteins, but the presence of IDRs or global disorder allows these proteins to take part in different tasks by interacting with various partners. A few examples illustrate how the binding promiscuity of viral proteins is related to their intrinsic disorder. Replication of the RNA genome of hepatitis delta virus (HDV) requires the translation of a single basic protein known as the delta antigen (δAg). δAg is a small protein of 195 amino acid residues that has no known enzymatic activity but is essential for replication of the viral genome. Experimental CD measurements and computational analysis with disordered protein meta-predictors have shown this protein to be an IDP. Completion of the HDV replication cycle depends on this protein and on various components of the host cell. 
Therefore, the importance of the binding promiscuity of δAg is easy to appreciate: it interacts with multiple components of the host cell for various reasons and through different approaches, although the exact purpose of these interactions is still unclear and widely studied.249, 250 In vitro analysis showed that, in addition to HDV RNAs, δAg binds to other RNAs and even dsDNA, which indicates a lack of specificity. The HCV NS5A protein, which is involved in viral replication and viral particle assembly, provides another example.110, 251 NS5A is a membrane-associated protein that has both disordered and ordered regions. An anchor attaches its N-terminal region to the membrane, while its cytoplasmic portion contains three domains and is mostly disordered. Among these three domains, domain I (D1) is highly conserved and ordered, while domains II (D2) and III (D3) are highly disordered and less conserved.253, 254 The promiscuity of NS5A is well studied, and some of the interactions that involve its disordered domains have been identified. D2-associated binding motifs that appear to affect host regulatory pathways, such as apoptosis and signaling, demonstrate distinct interaction patterns that have been described in detail. A third example of binding promiscuity was described for the measles virus (MeV) nucleoprotein (N), which forms the nucleocapsid of the virus. The intrinsically disordered regions located at the C-terminus of the N protein interact with the phosphoprotein of the viral polymerase complex and perform functions required for replication and transcription. Besides interacting with the phosphoprotein, the N protein interacts through its C-terminal tail with several host components, including a cellular receptor and the cellular cytoskeleton. The phosphoprotein of MeV is an important cofactor of the polymerase complex and is required for recruitment of the transcriptional machinery through its long disordered regions. 
It has been observed that when the IDRs of the phosphoprotein and the N protein bind to each other, most of the flexibility disappears, although some flexibility remains, which represents residual disorder within the complex. This finding suggests that the IDRs of the N protein act as a platform for interaction with various protein partners for the completion of cellular processes. Structural disorder has subsequently been shown to be a common feature of the nucleoproteins of paramyxoviruses. Disordered (intrinsically unstructured) regions were found together with structured regions in the nucleoproteins and phosphoproteins of Hendra and Nipah viruses.

Intrinsic disorder in viral proteome regions affected by alternative splicing and overlapping reading frames

In the course of evolution, to maximize the use of their limited genomes for regulatory and structural proteins, viruses adopted sophisticated genetic organization and mechanisms, such as alternative splicing of polycistronic RNA, that are necessary for the controlled expression of regulatory viral proteins. Viruses have also evolved their genetic constitution, genomic structure, and mechanisms of transcription and replication to make efficient use of positive, negative, and even ambisense transcription. One example of such viruses is human T-cell lymphotropic virus type 1 (HTLV-1), a delta-retrovirus that causes HTLV-1-associated myelopathy, adult T-cell leukemia (ATL), and Strongyloides stercoralis hyperinfection. The economical usage of genetic material by HTLV-1 is attributed to the wide accumulation of intrinsically disordered proteins in its proteome. This parallels the occurrence of intrinsic disorder in HIV-1 proteins, where intrinsic disorder was observed at the post-translational cleavage sites that lead to the production of Gag, Pro, and Pol from the Gag-pro and Gag-pro-pol polyproteins, and at the cleavage sites of polyproteins that yield the MA, NC, CA, RT, TM, IN, and SU proteins.

Intrinsic disorder in viral genome-linked proteins

In a few viruses, a protein named viral genome-linked protein (VPg) is bound to the 5′ end of the RNA genome through a phosphodiester bond formed between the hydroxyl group of a Thr/Ser/Tyr residue and the 5′ phosphate group of the RNA.260, 261, 262 VPgs are highly diverse in size and sequence. For example, VPg is 2–4 kDa in members of Comoviridae and Picornaviridae, 10–26 kDa in members of Caliciviridae, Sobemoviruses, and Potyviridae, and up to 90 kDa in members of Birnaviridae. VPg plays a key role in major steps of the viral life cycle, such as cell-to-cell movement, replication, and translation. Since VPg performs these crucial functions in either its mature or its precursor form, processing of the VPg precursor represents one of the regulatory mechanisms of its multifunctionality. The multifunctional role of VPg is defined by its multitude of interactions with different viral and host proteins. VPg interacts with itself, the cylindrical inclusion helicase, the cylindrical inclusion protein, nuclear inclusion protein b, the helper component protease, the coat protein, the eukaryotic translation initiation factors eIF4A, eIF4E, eIF3, and eIF4G, and the poly(A)-binding protein.262, 264, 265, 266, 267, 268, 269, 270, 271, 272 The polyfunctionality and binding promiscuity of VPgs are due, at least to some extent, to their intrinsically disordered nature. Intrinsic disorder of VPg has been reported for many viruses through characterization of the individual proteins. These viruses include rice yellow mottle virus (RYMV), Sesbania mosaic virus (SeMV), potato virus Y (PVY), potato virus A (PVA), and lettuce mosaic virus (LMV).262, 273, 274, 275, 276 Computational analysis showed that functionally important disorder is present in VPgs representative of viral diversity, including those of four members of the Caliciviridae family, six potyviruses, and six sobemoviruses. 
The disordered VPg components are associated with the regulation of enzymatic activity in different viruses273, 277 in addition to the specific regulation and transport of viral RNA from one cell to another.

Intrinsic disorder in matrix proteins and nucleocapsid of HIV-related viruses

To determine the intrinsic disorder content of viral proteins, bioinformatic studies were carried out on the matrix proteins of a few viruses.241, 279 These studies revealed that the matrix proteins p17 of SIVmac and HIV-1 possess high disorder content, while low disorder was observed in the matrix protein of equine infectious anemia virus (EIAV). The matrix protein p17 of HIV-1, also known as the MA protein, is a 132-amino-acid polypeptide that lines the inner surface of the virion membrane and holds the RNA-containing viral core in place. The N-terminal part of the p17 matrix protein is myristylated.280, 281 p17 associates with the inner leaflet of the viral membrane, forms a protective shell, and participates in virion assembly. Co-translational myristylation of the p17 N-terminus provides a targeting signal for transport of the Gag polyprotein to the plasma membrane.280, 281 A specific feature, the presence of a set of basic residues within the first 50 amino acid residues of p17, enables its involvement in membrane targeting. In addition to performing a number of functions in the viral replication cycle, p17 could be involved in nuclear import, possibly through its specific nuclear localization sequence. The HIV-1 nucleocapsid protein is a 55-residue protein that contains two zinc finger domains flanked by a linker composed of basic amino acids, which is required for nucleic acid interaction.285, 286 The nucleocapsid protein covers the genomic RNA inside the virion core. An important function of the nucleocapsid protein is in viral genomic RNA assembly: it binds to the signal sequence of full-length RNAs and transports them into the assembling virion. Within the virion, the nucleocapsid protein binds to ssRNA non-specifically through its highly charged basic regions, compacting it and protecting it from nucleases. 
The nucleocapsid protein also acts as a chaperone for viral RNA and facilitates several nucleic acid-associated steps of the viral life cycle, such as melting of secondary structure within RNA, annealing of the tRNA primer, stimulation of integration, and promotion of DNA exchange reactions during reverse transcription.157, 158, 159 Computational prediction reveals that the nucleocapsid protein p7 (NC) is highly disordered except for a few regions that correspond to the zinc finger domains and possess ordered structure, identified as α-MoRFs. The flexible nature of p7 explains its multiple functional roles, such as participation in RNA chaperoning and viral replication.

Replicative complex of Paramyxoviridae and Rhabdoviridae members: Intrinsic disorder and disorder-to-order transitions

Paramyxoviridae and Rhabdoviridae are the members of the mononegavirales order consisting of viruses with non-segmented ssRNA genome of negative polarity. In mononegavirales, genome is tightly encapsidated by the nucleoprotein within a helical nucleocapsid. The viral nucleocapsid serves as a substrate for both replication and transcription. Both replication and transcription are performed by the viral RNA-dependent RNA polymerase (RDRP) that consists of complex formed between the viral large protein (L) and phosphoprotein (P). P protein acts as an essential polymerase cofactor and recruits the L-protein onto the nucleocapsid template. Beyond its role as a polymerase cofactor, it also acts as chaperone for the N-protein in a way that it prevents their illegitimate self-assembly when genomic RNA synthesis does not occur and maintain them in a soluble form (N°) within a complex (N°−P) and used for the encapsidation of Nascent RNA chain during replication. The significant functional importance of N and P protein appears due to their involvement in numerous protein-protein interactions within the internal (viral) and external (Host) PPI networks. Multiple biological functions occur due to this interactability. Including modulation of both acquired and innate immunity. Experiments have proven the abundance of disorder in the N and P protein of these viruses. The persistence of disorder in the C-terminal domain of nucleoprotein (NTAIL), even after complex formation, indicates potential role of this region in binding,259, 291, 292 as described in case of MeV NTAIL, whose first 20 amino acids interacts with cellular nucleoprotein receptor293, 294 and C-terminal region interact with the major inducible heat shock protein Hsp 70 that leads to both viral replication and transcription. 
The disordered nature of NTAIL in measles and Hendra viruses has also been confirmed in the context of the full-length N protein, which forms nucleocapsid-like particles (NLPs) when expressed in a heterologous system.296, 297, 298 Initially, it was thought that the C-terminal X domain (XD) of the phosphoprotein triggers a major conformational rearrangement within the nucleocapsid, which gives the viral polymerase access to the RNA genome.259, 292, 299 However, recent NMR studies rule out this possibility and provide the first direct observation of the interaction between XD and the intact nucleocapsid in the Paramyxoviridae. The disordered NTAIL region is partially exposed at the surface of the nucleocapsid, which allows it to interact with numerous protein partners. Indeed, MeV NTAIL interacts with various viral partners, such as P, the P-L complex, and the matrix protein. Besides viral components, it also interacts with host cell components, such as interferon regulatory factor 3 (IRF3), Hsp70, peroxiredoxin 1, casein kinase II, the cell protein responsible for the nuclear export of N, and possibly components of the cell cytoskeleton.305, 306 Additionally, the NTAIL of MeV nucleocapsids released from infected cells binds to the cell receptors involved in MeV-induced immunosuppression.293, 294 Disorder in the P protein has been reported in both Paramyxoviridae157, 158, 307, 308, 309, 310, 311 and Rhabdoviridae.312, 313, 314 The P protein in members of these families has a highly modular organization that consists of alternating ordered and disordered regions. In Paramyxoviridae, the P protein possesses a large disordered region (about 400 residues) in its N-terminal (PNT) domain.
Several interactions of the PNT domains of MeV and Sendai virus (SeV) have been reported: the PNT domain of MeV interacts with N and cellular proteins,315, 316 and that of SeV interacts with the unassembled form of N (N°) and with the L protein.317, 318 While the C-terminal nucleocapsid-binding region of P adopts a compact, stably folded conformation in members of Rhabdoviridae and in the majority of Paramyxovirinae, it remains disordered in respiratory syncytial virus, a member of the Pneumovirinae subfamily.311, 319 The N-terminal region of the P protein from Rhabdoviridae and Paramyxoviridae that is involved in binding to N° has been reported to contain an α-MoRF.158, 312, 313, 320 This binding-induced folding of the α-MoRF has so far been demonstrated only for vesicular stomatitis virus (VSV), a rhabdovirus. The structure of the VSV N°−P complex was solved and showed that, although the binding region adopts an α-helical conformation, the flanking regions remain flexible. The P protein α-MoRF binds at the same site on N that is responsible for binding RNA and other N proteins, thereby preventing the polymerization of N. These results link the different processes and may explain the mechanism of initiation of viral RNA synthesis. In MeV, a limited proteolysis study carried out in the presence of a secondary structure stabilizer (TFE) provided evidence for a disorder-to-order transition of the disordered N-terminal region of P (PNT). The presence of disordered domains in both the P and N proteins allows controlled, coordinated, dynamic interactions between the surface of the nucleocapsid template and the polymerase complex, which could extend over successive turns of the helix. The long disordered regions in viral proteins enable them to act as linkers between binding partners and to participate in large macromolecular assemblies as scaffolding engines.322, 323

Intrinsic disorder in capsid proteins

IDRs provide flexibility and thus facilitate the rapid conformational changes of proteins that are required for viral capsid assembly. For instance, the VP4 protein of foot-and-mouth disease virus (FMDV) has low structure content but plays a crucial role in capsid assembly. As most viral proteins are synthesized as part of a polyprotein, the presence of IDRs at the proteolytic sites makes cleavage easier and faster and generates independent functional chains.324, 325 Because of these properties, IDRs in viral proteins provide a self-driven mechanism of self-assembly.

Intrinsic disorder in Flaviviridae core proteins

Members of the Flaviviridae family are non-segmented, positive-sense, single-stranded RNA viruses whose genome size varies between 9.6 and 12.3 kb. The family comprises the genera Flavivirus, Hepacivirus, and Pestivirus. The N-terminal region of the viral core protein is highly basic and interacts with RNA in a sequence-specific manner to accomplish its various functions. The core protein is released from the rest of the polyprotein to initiate the functions required for further maturation and multiplication of the virus. The RNA chaperoning activity of the core protein has been confirmed in in vitro assays; in addition, the core protein is responsible for packaging and condensation of viral genomic RNA during viral morphogenesis. The core protein mediates several interactions with host proteins that support viral persistence and pathogenicity and is simultaneously involved in functions related to viral replication. Biophysical and biochemical studies done so far on the Flaviviridae family confirm the widespread use of core protein IDRs among its member viruses, despite low sequence similarity and other pronounced differences in their modular organization.

Disordered capsid protein of ZIKV and DENV

The capsid proteins of DENV and ZIKV are highly disordered relative to the other proteins encoded by their genomes. The disorder content is 33.3% in ZIKV and 36% in DENV. This high level of disorder suggests that these regions are involved in virus-mediated functions at the host-pathogen interface. The major functions of the ZIKV capsid are nucleocapsid assembly and participation in the infection process through interaction with cellular proteins and modulation of cellular metabolism, apoptosis, and the immune response. The major functions of the DENV capsid protein are RNA binding and RNA chaperone activity, nucleocapsid assembly, lipid droplet accumulation, and interaction with host components. Despite considerable knowledge of the functions and disorder status of the DENV and ZIKV capsid proteins, the exact mechanism by which IDRs control the various functions of these proteins is yet to be discovered. Fig. 5 shows the MoRF positions in the (A) ZIKV and (B) DENV capsid proteins predicted by the MoRFchibi SYSTEM server. The pattern of MoRF positions and numbers in the capsid proteins of these viruses could be analyzed in detail to identify the factors associated with their specific functions.
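A disorder content such as the percentages quoted above is typically derived from per-residue predictor scores. The following is a minimal sketch under stated assumptions: the 0.5 cutoff is a common convention for disorder predictors rather than something specified in this chapter, the function name `percent_disorder` is hypothetical, and the scores are toy values, not ZIKV or DENV data.

```python
from typing import Sequence


def percent_disorder(scores: Sequence[float], threshold: float = 0.5) -> float:
    """Return the percentage of residues whose disorder score is >= threshold.

    The 0.5 threshold is an assumed convention, not a value from the chapter.
    """
    if not scores:
        raise ValueError("scores must not be empty")
    disordered = sum(1 for s in scores if s >= threshold)
    return 100.0 * disordered / len(scores)


# Hypothetical per-residue scores for a 12-residue toy protein.
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.4, 0.3, 0.2, 0.1, 0.2, 0.3, 0.4, 0.45]
print(f"{percent_disorder(toy_scores):.1f}% of residues predicted disordered")
```

Different predictors and cutoffs give different percentages, so values from different sources are comparable only when the same predictor and threshold were used.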
Fig. 5

The capsid protein MoRF positions (toggled gray bars) predicted by the MoRFchibi system for (A) ZIKV (UniProt ID: Q32ZE1 | 1-104); (B) DENV (UniProt ID: P33478 | 1-100). Three MoRF regions of different lengths are observed in the capsid protein of both viruses, located within disordered areas. These MoRF regions play a crucial role by recognizing, interacting with, and inducing conformational changes in viral as well as host proteins.


The fd phage coat protein pVIII undergoes transitions from order to disorder form

The fd bacteriophage is filamentous in shape, belongs to the Inovirus genus, and infects enterobacteria such as E. coli.332, 333 The coat protein of fd phage undergoes transitions from disordered to ordered and from ordered to disordered states to regulate the molecular mechanisms of its penetration and assembly. The structural transitions in the fd pVIII coat protein indicate that a molten globule (MG; partially disordered) intermediate is involved in macromolecular assembly and disassembly.

Capsid protease: An illustrative example of an intrinsically disordered enzyme in Semliki forest virus

IDRs play a role in the activation and deactivation of the enzymatic activity of viral proteins, as in the case of Semliki forest virus (SFV). SFV belongs to the Alphavirus genus of enveloped positive-strand RNA viruses with an icosahedral nucleocapsid and spherical morphology.335, 336 The N-terminal region of the SFV polyprotein (residues 1–267) is an intramolecular serine protease that cleaves itself off after Trp267 from the rest of the polyprotein and yields the mature capsid protein. After this autocleavage, the free carboxyl group of Trp267 interacts with the catalytic triad, consisting of His145, Asp167, and Ser219, which inactivates the enzyme.

Intrinsic disorder in the nucleocapsid protein of SARS-CoV

The nucleocapsid protein (N) of severe acute respiratory syndrome coronavirus (SARS-CoV) plays a crucial role in viral viability and in packaging of the genomic RNA. However, the exact mechanism by which the N protein binds genomic RNA is not completely understood. The two domains of the N protein, NTD and CTD, are flanked by long stretches of disordered regions that account for almost half of its entire length. Both domains bind RNA through their flanking disordered regions. Although bioinformatics studies report low sequence homology among the N proteins of different coronaviruses, the flexible linker region of the N protein in all coronaviruses starts with an SR-rich region and ends with a region enriched in basic residues. These features are hallmarks of protein disorder. The overall isoelectric point (pI) of these flexible linkers is high, which is consistent with their RNA-binding ability. These findings suggest that the physicochemical features are likely to be conserved across different groups of Coronaviridae. This observation highlights the role of intrinsic disorder in the N protein, whether in multisite nucleic acid binding or in RNP packaging.
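The compositional features described above (SR-rich start, basic residues, high pI) can be checked on any linker sequence with simple residue counts. The sketch below is illustrative only: the function names are hypothetical, the sequence is an invented toy string rather than a real coronavirus linker, and the charge estimate is a crude count that ignores histidine, the termini, and pH-dependent ionization. A positive estimate is merely consistent with a high pI; it is not a pI calculation.

```python
def residue_fraction(seq: str, residues: str) -> float:
    """Fraction of positions in seq occupied by any residue in `residues`."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return sum(seq.count(r) for r in residues.upper()) / len(seq)


def net_charge_estimate(seq: str) -> int:
    """Crude net charge near neutral pH: (K + R) - (D + E)."""
    seq = seq.upper()
    return seq.count("K") + seq.count("R") - seq.count("D") - seq.count("E")


# Hypothetical toy linker, NOT a real N protein sequence.
toy_linker = "SSRSSSRSRSNSRGKKPRQKRTA"
print("S/R fraction:", round(residue_fraction(toy_linker, "SR"), 2))
print("estimated net charge:", net_charge_estimate(toy_linker))
```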

Intrinsic disorder in influenza virus surface glycoproteins

Surface glycoproteins are required for the fusion of the viral membrane with the host membrane and thus mediate entry into the target cell.339, 340, 341 One of the best-studied membrane fusion proteins is the influenza virus HA. HA is a homotrimeric type I transmembrane surface glycoprotein responsible for the binding of the virus to the host receptor, its internalization, and the subsequent membrane fusion events within the endosome of the infected cell. The presence of HA at the viral surface in high numbers makes it the most abundant antigen, and it contains the primary neutralizing epitopes for antibodies. A recent bioinformatics study revealed that, although many viral membrane proteins are mostly ordered, intrinsic disorder is still present in these proteins, which suggests that IDRs might have crucial functions. For instance, the virulent influenza A strains 1918 H1N1 and H5N1 differ from the less virulent or nonvirulent strains H3N2 and 1930 H1N1 in the disorder content of their HA proteins.

Intrinsic disorder in influenza virus non-structural protein 2

During viral replication, non-structural protein 2 of the influenza virus interacts with the nuclear export machinery. It acts as an adaptor molecule between the viral ribonucleoprotein complex and the nuclear export machinery. Various techniques, such as differential scanning calorimetry (DSC), hydrodynamic techniques, and limited proteolysis, have demonstrated high levels of disorder in this protein.

Intrinsic disorder in human adenovirus type 5 early transcription unit 1B

Early transcription unit 1B (E1B) of human adenovirus type 5 encodes a set of proteins that participate in several important viral functions, such as viral replication and adenovirus-mediated cell transformation.344, 345 An interesting feature of these proteins is that they are expressed from overlapping reading frames of the 2.28 kb E1B mRNA through alternative splicing between a common splice donor site and one of three possible splice acceptor sites. As a result, the encoded proteins share a common N-terminus and differ in their C-termini.346, 347 This feature gives these proteins one of their names, E1BN proteins. Computational analysis, together with NMR and CD, shows that E1B-93R is a typical IDP and that the N-terminal region within E1B and other E1BN proteins is likely to be intrinsically disordered.

Intrinsic disorder in non-structural HCV proteins

HCV NS5A is a key protein involved in viral replication that also plays a role in viral particle assembly. Numerous interactions of NS5A with viral and host proteins have been reported. NS5A is a membrane-associated protein with an anchor at its N-terminal region, and its C-terminal region is divided into three domains, D1, D2, and D3. D1 is highly conserved and less disordered, while D2 and D3 are less conserved and highly disordered.252, 253, 254, 349 Their high disorder content underlies the dynamic behavior of D2 and D3 and makes them a hub-like center for multiple interactions. NS5A-D2 is important for NS5A function and is involved in molecular interactions with the RDRP (NS5B) and PKR. The interactions established by NS5A-D2 interfere with host signaling pathways and apoptosis. Although NS5A-D3 is mostly disordered, it contains short ordered elements at its N-terminus. In a recent study, NS5A-D3 proteins from two HCV strains were found to exhibit a propensity for partial folding into an α-helix. NMR analysis revealed two putative α-helices, for which a molecular model could be proposed. The conservation of the first α-helix in all genotypes and its amphipathic character suggest that it could correspond to a MoRE and hence promote interaction with suitable biological partner(s). One such partner is cyclophilin A (CypA). Cyclophilins are cellular factors crucial for HCV replication. Interestingly, cyclosporin completely abrogates the interaction between HCV NS5A-D3 and CypA. CypA, together with NS5A and NS5B, forms a crucial component of the multi-protein complex that supports RNA transcription and replication.

Intrinsic disorder in the HDV basic protein δAg

Among the many animal viruses known so far, HDV has the smallest RNA genome, which codes for a single protein known as δ-antigen (δAg). From a structural perspective, this protein comprises a coiled-coil domain, a nuclear localization signal (NLS), and an RNA-binding domain. δAg self-oligomerizes into a dodecameric structure associated with HDV genomic RNA.248, 353 Computational and experimental analysis of eight clades of HDV shows that this protein is highly disordered.

Intrinsic disorder in HIV-1 accessory and regulatory proteins

The Tat protein of HIV-1 is an important factor in viral pathogenesis that serves as a transactivator of viral transcription. The activity of Tat depends on its interaction with the transactivation response region (TAR), a short nascent stem-bulge-loop leader RNA. TAR is present at the 5′ end of all viral transcripts. Tat displays typical characteristics of IDPs, including a high net charge combined with low overall hydrophobicity. The intrinsic disorder of Tat has also been demonstrated by CD and NMR studies. The Rev protein also plays a regulatory role in HIV-1. It is a basic protein of 116 residues that belongs to the ARM family of RNA-binding proteins. Rev binds to the Rev response element (RRE) of viral mRNA in the cytoplasm of the host cell and is therefore essential for viral replication. Monomeric Rev adopts an MG state, as confirmed by hydrodynamic and spectroscopic studies. Recent biophysical studies of the Rev ARM, which is associated with RNA binding, suggest that it is intrinsically disordered not only in the isolated state but also when embedded in an oligomerization-deficient Rev mutant.

Intrinsic disorder in non-structural HPV E6 and E7 proteins

The large family of papillomaviruses (PVs) includes small DNA viruses that infect mammals, reptiles, and birds. At least 100 different types of HPV have been reported to date; they act as cofactors in the development of carcinomas of the head, neck, genital tract, and epidermis and also cause papillomas and benign warts. HPVs are classified into two classes on the basis of their association with cancer: low-risk types (HPV-6, HPV-11) and high-risk types (HPV-16, HPV-18, and HPV-45). Like all DNA tumor viruses, HPV hijacks the replication machinery and forces the infected cell to enter the S phase of the cell cycle. The transforming activity of high-risk HPVs is exerted mainly through E7, one of their two oncoproteins. E7 is responsible for the pathogenesis and maintenance of human cervical cancer and participates in numerous cellular processes, including DNA synthesis, transcription, transformation, cell growth, and apoptosis. E7 interacts with the tumor suppressor protein Rb and interferes with its tumor suppression activity. Rb acts as a guardian of the cell cycle through its involvement in the control of the G1/S transition. Therefore, Rb is critical in determining whether the cell progresses normally or undergoes transformation. Besides interacting with proteins of the Rb family, E7 also interacts with histone deacetylase, the kinase p33CDK2 and cyclin A, protein phosphatase 2A (PP2A), and the cyclin-dependent kinase inhibitor p21cip1. By forming a complex with E7, PP2A is sequestered and excluded from its interaction with protein kinase B (PKB), also known as Akt. PKB is one of several second messenger kinases that are activated via cell attachment and growth factor signaling and that send a signal to the cell nucleus to prevent apoptosis, thereby promoting cell survival during proliferation.
The interaction between PP2A and E7 inhibits PKB/Akt dephosphorylation, which keeps PKB/Akt signaling activated. The broad range of molecular interactions of E7 depends on the flexible disordered region present within the protein. Previous studies on recombinant E7 reveal that its structure can be described as an elongated dimer that changes conformation upon a small change in pH and gains α-helicity upon exposure to solvents. Biophysical characterization of E7 from HPV-45 by far-UV CD and NMR revealed that its N-terminal region (E7N, amino acids 1–40) is disordered, while its C-terminal domain (amino acids 41–98) is well structured, with a unique zinc-binding fold. The intrinsically unstructured N-terminal region of E7 contains binding sites and casein kinase II phosphorylation sites.292, 366, 367 The CD spectra recorded for the different conformations as a function of temperature and pH indicated a polyproline II-like structure. The structural stability is maintained by phosphorylation, which results in increased transformation activity in the cell. The transforming proteins E6 and E7 of high-risk HPVs incorporate high amounts of intrinsic disorder.

Intrinsically unstructured N protein of λ bacteriophage

In λ bacteriophage, the N protein (λN) plays an important role in gene transcription. In the absence of this protein, transcription of the phage genome is reduced to 2%, with only early gene transcription remaining. λN positively regulates transcription in λ bacteriophage and promotes the expression of genes located downstream of termination signals. λN acts as an antitermination transcription factor: it binds to an RNA sequence (the box B segment) and to multiple proteins in the transcription complex, where it serves as an important regulator of the antitermination complex that allows transcription through termination sites during phage gene expression. Interactions of λN with the host bacterial RNA polymerase and the factor NusA have also been observed. λN demonstrates all the features of an unstructured, flexible protein that are typical of IDPs. These features include high net charge and low hydrophobicity, as well as structural asymmetry determined through various experiments.370, 371, 372, 373, 374
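The combination of high net charge and low hydrophobicity mentioned above is often evaluated with a charge-hydropathy calculation. The sketch below is one possible formulation under stated assumptions, none of which are taken from this section: it uses the Kyte-Doolittle hydropathy scale rescaled to 0-1, counts only K/R and D/E as charged, and applies the commonly quoted linear boundary (mean net charge = 2.785 × mean hydropathy − 1.151). The function names are hypothetical, and the test sequences are artificial homopolymers, not λN.

```python
# Kyte-Doolittle hydropathy values (assumed scale for this sketch).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(seq: str) -> float:
    """Mean Kyte-Doolittle hydropathy, rescaled so that R = 0 and I = 1."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    # A KeyError here means the sequence has a non-standard residue code.
    return sum((KYTE_DOOLITTLE[aa] + 4.5) / 9.0 for aa in seq) / len(seq)


def mean_net_charge(seq: str) -> float:
    """Absolute mean net charge per residue, counting K/R as +1 and D/E as -1."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    net = seq.count("K") + seq.count("R") - seq.count("D") - seq.count("E")
    return abs(net) / len(seq)


def above_ch_boundary(seq: str) -> bool:
    """True if the sequence lies on the 'disordered' side of the assumed boundary."""
    boundary = 2.785 * mean_hydropathy(seq) - 1.151
    return mean_net_charge(seq) > boundary


print(above_ch_boundary("KKKKKKKK"))  # highly charged, low hydropathy
print(above_ch_boundary("IIIILLLL"))  # uncharged, highly hydrophobic
```

Such a two-parameter test is coarse: it classifies whole chains, so a protein with a folded domain and a long disordered tail can fall on either side of the line.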

Intrinsic disorder in the Hordeivirus movement TGBp1 protein

Plant viral infection spreads from one infected site to another through specialized proteins known as movement proteins (MPs), which facilitate the movement of viruses within the plant body. These MPs have a wide range of functions. They interact with viral proteins and RNA to form a ribonucleoprotein complex that facilitates cell-to-cell and long-distance movement of the viral genome in the plant, and they mediate interactions with cytoskeleton components and the endoplasmic reticulum. Three movement proteins, TGBp1 (528 residues), TGBp2 (204 residues), and TGBp3 (155 residues), encoded by the “triple gene block” (TGB), are reported in hordeiviruses. The N-terminal region of TGB1 of Barley stripe mosaic virus (residues 1–180) is predicted to be highly disordered, whereas the C-terminal region is not, as shown in Fig. 6.
Fig. 6

Intrinsic disorder prediction for the TGB1 protein (UniProt ID: P04867) of Barley stripe mosaic virus by the IUPred2A server.
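A per-residue disorder profile like the one in this figure can be reduced to a list of contiguous disordered segments. The following is a minimal sketch, assuming a score cutoff of 0.5 and a minimum segment length chosen purely for illustration; the function name `disordered_regions` is hypothetical and the scores are toy values, not the predictor output for P04867.

```python
from typing import List, Sequence, Tuple


def disordered_regions(
    scores: Sequence[float], threshold: float = 0.5, min_length: int = 3
) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) runs with score >= threshold.

    Runs shorter than min_length residues are discarded.
    """
    regions: List[Tuple[int, int]] = []
    start = None  # 1-based start of the run currently being extended
    for pos, score in enumerate(scores, start=1):
        if score >= threshold:
            if start is None:
                start = pos
        else:
            if start is not None and pos - start >= min_length:
                regions.append((start, pos - 1))
            start = None
    # Close a run that extends to the last residue.
    if start is not None and len(scores) - start + 1 >= min_length:
        regions.append((start, len(scores)))
    return regions


# Hypothetical scores: a disordered start, a short blip, and a disordered end.
toy_profile = [0.9, 0.8, 0.7, 0.2, 0.6, 0.1, 0.9, 0.9, 0.9, 0.9]
print(disordered_regions(toy_profile))
```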


Summary and outlook

This chapter summarizes the current knowledge on protein intrinsic disorder, discusses various peculiar features of IDPs, including their involvement in PPI networks and other biological roles, and introduces different disorder predictors. It also discusses the intrinsic disorder perspective of viruses, the role of IDPs and IDRs in virus-facilitated host mechanisms, the prevalence of intrinsic disorder in viral proteomes, and the functional prominence of disordered viral proteins. The roles of IDRs in various structural and non-structural viral proteins, such as capsid, nucleocapsid, genome-linked, surface glycoprotein, matrix, accessory, and regulatory proteins, have been summarized. The roles of IDPs/IDRs in specific function-oriented proteins of different viruses have been elaborated, such as the membrane-binding protein λN of bacteriophage, hordeivirus movement protein TGBp1, influenza virus non-structural protein 2, the basic protein δAg of HDV, and human adenovirus type 5 early transcription unit 1B. The importance of intrinsic disorder for alternative splicing and overlapping reading frames in viral proteomes is also discussed. Viruses mainly cause pathogenesis by hijacking the cell machinery and modulating its functions, e.g., by altering IDP components involved in the host cell cycle control mechanism. Viral IDPs mediate successful infection and regulate pathogenesis at multiple levels. Therefore, knowledge of intrinsic disorder and structural flexibility in virus-host interactions and associated functions is crucial for a better understanding of viral pathogenesis. The involvement of IDPs/IDRs in the mechanism of viral infection is not completely understood. This chapter should therefore give readers a better understanding of the importance of IDPs/IDRs in the functional mechanisms and viral components that are essential for the completion of crucial phases of the viral life cycle.
Finally, the IDPs/IDPRs of viruses are considered potential drug targets because of their high prevalence in viral proteomes and their ubiquitous involvement in host-pathogen regulation. In conclusion, the involvement of IDPs in viral pathogenesis should be seriously considered in order to unlock the complex riddles of viral infection and its associated patterns, cellular control, and exploitation strategies, and to guide near-future drug development approaches that target disordered regions.