Literature DB >> 36051878

Host-pathogen protein-nucleic acid interactions: A comprehensive review.

Anuja Jain^1, Shikha Mittal^1,2, Lokesh P Tripathi^3,4, Ruth Nussinov^5,6, Shandar Ahmad^1.

Abstract

Recognition of pathogen-derived nucleic acids by host cells is an effective host strategy to detect pathogenic invasion and trigger immune responses. In the context of pathogen-specific pharmacology, there is a growing interest in mapping the interactions between pathogen-derived nucleic acids and host proteins. Insight into the principles of the structural and immunological mechanisms underlying such interactions and their roles in host defense is necessary to guide therapeutic intervention. Here, we discuss the newest advances in studies of molecular interactions involving pathogen nucleic acids and host factors, including their drug design, molecular structure and specific patterns. We observed that two groups of nucleic acid recognizing molecules, Toll-like receptors (TLRs) and the cytoplasmic retinoic acid-inducible gene (RIG)-I-like receptors (RLRs) form the backbone of host responses to pathogen nucleic acids, with additional support provided by absent in melanoma 2 (AIM2) and DNA-dependent activator of Interferons (IFNs)-regulatory factors (DAI) like cytosolic activity. We review the structural, immunological, and other biological aspects of these representative groups of molecules, especially in terms of their target specificity and affinity and challenges in leveraging host-pathogen protein-nucleic acid interactions (HP-PNI) in drug discovery.


Keywords: Drug design; Host pathogen interactions; Immunological response; Protein-nucleic acid interactions; Structural biology; Toll-like receptors

Year: 2022 PMID: 36051878 PMCID: PMC9420432 DOI: 10.1016/j.csbj.2022.08.001

Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155

Introduction

Investigating the interactions between the molecules of invading species (pathogens) and the cellular machinery of the invaded (host) organism tasked with counteracting them can help decode the detailed mechanisms of infections and immune-related diseases. These host-pathogen interactions involve pathogen molecules, called Pathogen-Associated Molecular Patterns (PAMPs), and host molecules, called Pattern Recognition Receptors (PRRs). The interactions between PRRs and PAMPs enable the host immune system to discriminate between self and foreign bodies before the stimulation of adaptive immunity. PRRs are present either on the surface or in the interior compartments of various host cell types such as dendritic cells (DCs), epithelial cells, mast cells, monocytes and granulocytes [1]. They are primarily germline-encoded receptors recognizing PAMPs [2], closely associated with danger-associated molecular patterns (DAMPs) [3]. They are also involved in activating transcription factors that regulate cytokine expression. Among the many groups of PRRs, the most studied are TLRs, NOD-like Receptors (NLRs), RLRs, C-type lectin receptors (CLRs) and AIM2-like receptors (ALRs) [4], [5]. These PRRs recognize PAMPs either in the nucleus and cytoplasm (such as cyclic GMP-AMP synthase (cGAS)–stimulator of interferon genes (STING) and the RLRs) [6] or in the extracellular environment (such as TLRs) [7], [8], [9]. As both types of interaction occur at the onset of disease, they are attractive targets for potential preventive and therapeutic interventions. Early recognition of PAMPs by PRRs is aimed at eliminating the pathogen, preventing its entry into host cells and triggering an adaptive immune response [9].
The PRR-PAMP host-pathogen interactions trigger a cascade of innate immune response reactions, including kinase pathway activation, production of effector molecules, and selective transcription factor stimulation. These events guide the immune system toward mounting either anti-inflammatory or pro-inflammatory responses [10]. From the molecular standpoint, PRR-PAMP host-pathogen molecular interactions include proteins, nucleic acids, carbohydrates and metabolites. Of these, protein–protein interactions (PPIs) have been widely studied and reviewed [11], [12], [13]. Host-virus and host-bacteria (microbe) interactions such as those involving non-coding RNAs and metabolites have also been well-documented [14], [15], [16], [17]. On the other hand, although many protein-nucleic acid interactions, crucial to the host defence, immune response and pathogen life cycle have been investigated across different species, the information is widely scattered and sometimes incoherent. Analysis and in-depth structure-based understanding of host-pathogen recognition via pathogen nucleic acid fragments or more broadly their genomic DNA is generally lacking in the literature. Although the field of protein-nucleic acid interactions is one of the most actively pursued topics in computational and experimental biology, dominant studies on the subject have largely focused on host–host dynamics [18], [19], [20]. In this review, we aim to provide a comprehensive overview of studies on host-pathogen interactions from the perspective of pathogen nucleic acids and their recognition by host proteins. We observed that an extensive body of literature is available that may provide deep insights into target specificity, systems-level responses and drug targeting of nucleic acid recognition machinery. Currently, most of it is reported in a focused and domain-specific manner, making it inconvenient to develop a holistic assessment, integrating immunological, therapeutic and structural perspectives. 
Here, we first provide an overview of the nature of the interactions and the diseases in which specific HP-PNIs are implicated. Next, we examine the disease and cellular specificity of common pathogens and their receptors and the therapeutic interventions that are available and being actively pursued. We also survey sequence, structural and expression level studies in the context of individual interactions and high throughput analysis. Finally, we discuss potential applications and future directions in the study of pathogen nucleic acid sensing by proteins.

Types of HP-PNIs

Given the diversity of the interaction sites that pathogen molecules encounter upon gaining cellular entry, we first review the literature on the spatial regulation of pathogen recognition. Nucleic acids (DNA or RNA) are polyanionic molecules. They are normally intracellular, but upon cell death or injury they are released into the extracellular environment and can stimulate or inhibit the host immune response by binding to PRRs. In general, PRRs are protein molecules that interact with a nucleic acid through different types of intermolecular forces. These include electrostatic interactions (e.g., salt bridges), dipolar interactions (such as hydrogen bonding and van der Waals interactions), entropic effects (hydrophobic interactions) and dispersion forces (base stacking) [21]. Water molecules often also facilitate binding, for example by screening the electrostatic repulsion between similar charges on complementary molecules [22]. The interactions can be sequence-specific (tight) or non-specific (loose) [21], [23], [24], [25]. Specific protein–DNA interactions are commonly mediated by an α-helical motif in the protein that inserts itself into the major groove of the DNA, thereby recognizing and interacting with a specific nucleotide sequence. The interactions are typically facilitated by H-bonds and salt bridges [26]. However, concomitant conformational changes in the DNA, such as sequence-dependent kinking, helical dislocation, untwisting and intercalation, can contribute significantly to this recognition process [27]. Proteins that recognize DNA act through independently folded binding domains such as the following:
Winged helix-turn-helix proteins, composed of two roughly perpendicular α-helices linked by a β-turn or loop; zinc-coordinating proteins, which entail the tetrahedral coordination of 1–2 zinc ions with conserved cysteine and histidine residues in an α-helix and a 2-stranded β-sheet; zipper-type proteins, such as the leucine zipper, which has an α-helix with a leucine at every 7th amino acid; other α-helix proteins, e.g., those using α-helices as the main binding motif; other β-sheet proteins, which use β-strands as recognition and binding motifs; and β-hairpin/ribbon proteins, which contain small 2- and 3-stranded β-sheets or hairpin motifs that bind to the DNA major or minor grooves [28]. Interestingly, some non-enzymatic proteins that do not have a well-defined secondary structural motif use multi-domain subunits for DNA recognition. Enzymes form another group of proteins, which recognize DNA based on their biological function rather than structure. They mainly use combinations of α-helices, β-strands and loops to form domains such as a DNA-recognition domain that reads the sequence, a catalytic domain containing the enzyme’s active site and, where applicable, a dimerization domain [27], [28]. While the interactions of RNA with proteins are similar to those of DNA, the complex secondary and tertiary structures of RNA provide an important additional mechanism. At the detailed structural level, an RNA molecule is recognized by RNA-binding modules such as the following. RNA recognition motifs (RRM): a four-stranded anti-parallel β-sheet with two helices packed in a βαββαβ topology, which interacts with 4 nucleotides of ssRNA through stacking, electrostatics and hydrogen bonding; hnRNP K homology domain (KH domain): a three-stranded β-sheet packed against three α-helices, which recognizes 4 nucleotides of ssRNA through hydrophobic interactions between non-aromatic residues.
Based on its topology, the KH domain can be further grouped into two subfamilies, type I (βααββα topology) and type II (αββααβ topology); double-stranded RNA-binding domain (dsRBD), with a shape-specific dsRNA minor-major-minor groove pattern interacting with the sugar-phosphate backbone; zinc finger motifs, typically classified based on the residues used to coordinate zinc, cysteine and histidine. One example is ZnF-C2H2, which contains nine C2H2 zinc fingers, of which fingers 1–3, 5 and 7–9 interact with DNA through hydrogen bonding in the major groove, while fingers 4–6 interact with the 5S RNA through electrostatic contacts to two RNA loops. Another group of zinc fingers (ZnF-CCCH) has a stacking interaction between aromatic residues and bases, and the sterile alpha motif (SAM domain) has a shape-dependent recognition of RNA stem-loops, mainly through interactions with the sugar-phosphate backbone and a single base in the loop [29], [30], [31], [32], [33], [34], [35]. In general, HP-PNIs are affected by neighboring proteins, small molecules, and physical conditions such as temperature or pH. Understanding such interactions is important for a full picture of the physiological processes and pathology of the host, and for drug design [21]. As the first line of defense in terms of protein-nucleic acid interactions, the TLR family of receptors is the best characterized group of PRRs, and most of its members can recognize intracellular as well as extracellular pathogen molecules. Most TLRs have been conserved through evolution [36]. Of these, TLR3, TLR7, TLR8 and TLR9 can recognize extracellular (pathogen) nucleic acids in the endosome [37]. Currently, 13 members are known in the mammalian TLR family [4]. Among those, TLR1–TLR9 are conserved between humans and mice, TLR10 is not functional in mice because of a retrovirus insertion, and TLR11, TLR12 and TLR13 are lost in human genomes [380].
Even though the specific association between TLRs or other PRRs towards each pathogen is not fully understood, a broad range of pathways that they activate have been reported. Studies focused on characterizing (1) the nature of the immune response towards the types of involved diseases, (2) sub-cellular location in which the interaction occurs, and (3) types of pathogen molecules recognized by each PRR. For example, extracellular CpG-DNA and RNA have been responsible for the pathogenicity in rheumatoid arthritis, SLE, toxic shock and bacterial sepsis [38], [39], [40]. Blood coagulation under severe tissue damage conditions, caused by secreted nucleic acids has been presented as a possible mechanism of pathogenesis in these diseases [40]. Apart from the extracellular recognition, host-pathogen interactions also take place in various subcellular locations. The most studied cytosolic PRRs are DAI, AIM2, protein kinase receptor (PKR) and the RIG-I [41]. Specifically, TLR3 is reported to recognize dsRNA [42], TLR7 and TLR8 bind to ssRNA [43], [44] and TLR9 identifies DNA-containing unmethylated CpGs [44]. Further DAI and AIM2 recognize dsDNA while PKR and RIG-I respond to single and double-stranded viral RNAs [41]. Aberrations arising due to an under-performing PRR-based recognition system pose a grave threat to hosts against a wide range of pathogens, whereas their hyperactivity poses a potential threat of autoimmune diseases [45]. Some nucleic acid-sensing hyperactivity autoimmune disorders include SLE, Aicardi-Goutieres syndrome, spondyloenchondrodysplasia, and STING-associated vasculopathy with onset in infancy [38], [46], [47]. The roles of host-pathogen interactions in some of these diseases are reviewed below. SLE is an autoimmune disease involving HP-PNIs. TLR7 and TLR9 have been implicated with stimulation of type I IFNs in SLE [48]. 
The role of TLR7 has been clearly established by showing that its overexpression in mouse models of lupus leads to SLE, whereas its absence protects from the disease [49]. On the other hand, the role of TLR9 in SLE is not as well understood. TLR9 is known to play a significant role in SLE by producing autoantibodies in mouse models [50], and its overexpression is observed in SLE patients [51]. However, in contrast to TLR7, deletion of TLR9 in mice results in a more severe disease phenotype, suggesting a protective role of TLR9 [52]. The exact synergy or competition between TLR7 and TLR9 and their detailed mechanism of molecular recognition in SLE are still not fully understood. Another autoimmune disease involving protein-nucleic acid interactions is Type-1 diabetes. Among the animal models, studies have shown that in transgenic rat insulin promoter (RIP)-B7.1 or RIP-LCMV mice, administration of TLR3 or TLR7 is required for stimulation of Type-1 diabetes [53]. Conversely, TLR3 was shown to protect against the disease occurrence in some studies [54]. TLR9 is also involved in the stimulation of Type-1 diabetes [55]. Again, a complete picture of how the various TLRs confer or protect from this disease is still lacking. Another autoimmune disease involving protein-nucleic acid interactions is rheumatoid arthritis, mediated by hyper-activation of synovial fibroblasts [56] in synovial tissues and the overexpression of TLR3, TLR7 and TLR9 together with TLR2 and TLR4 [57], [58]. TLR3 and TLR4 are hyper-activated during the onset and the end stages, indicating their roles in disease pathogenesis [57], [59], [60], [61]. However, the role of TLR9 in rheumatoid arthritis appears contradictory [62]: TLR9 expression was shown to trigger the disease [63], whereas injection of CpG DNA into mice produced an anti-inflammatory response and prevented arthritis [64], [65].
These studies highlight how TLRs target specific types of nucleic acids and can mistake self for a DAMP, and how the introduction of a competitive DNA can intervene in this process. Inflammatory bowel diseases (IBD) of the gastrointestinal tract, namely Crohn’s disease and ulcerative colitis, involve TLR2 and TLR4, whereas TLR3 and TLR9 have a protective role [66], [67]. Similar to SLE protection, CpG injections in the murine model reduce the severity of the disease through a pro-inflammatory secretion [67]. TLR9 overexpression is involved in the onset of multiple sclerosis, characterized by an immune response and inflammation that lead to neuronal injury [68]. Mouse models deficient in MyD88 of the TLR pathways are resistant to experimental autoimmune encephalitis (EAE), whereas mice deficient in TLR9 develop disease with decreased severity, suggesting a synergistic role for MyD88 and TLR9 in EAE [69]. cGAS is also involved in autoinflammatory diseases [70], [71], [72], [73], [74], with the cGAS-STING pathway triggering an anti-tumor immune response [75], [76]. cGAS recognizes DNA derived from endogenous tumour cells and triggers the cGAS-STING pathway and the production of IFN, which acts on CD8+ cells to kill tumour cells [77], [78], [79]. In summary, protein-nucleic acid host-pathogen interactions are primarily driven by TLRs, which are involved in either protecting against or triggering autoimmune diseases due to defective regulation of the immune system. Treatments may include injecting nucleic acid PAMP-like molecules and blocking the interaction of TLRs that recognize nucleic acids. The challenge lies in clearly deciphering the roles of individual TLRs in a disease before we use them as targets. The critical aspects of characterization are whether a TLR is protective against or serves as a promoter of the autoimmune response, and exact quantification of the consequences of a specific intervention.
This leads to the question of the molecular specificity of PRRs beyond the animal models and available clinical data. Such issues can best be investigated by looking at the detailed atomic structures of the involved molecules in isolation or in complex with their targets. In the next section, we review the status of knowledge on these very issues.

Specificity and structural basis of pathogen nucleic acid recognition by host PRRs

In the last section, we took a disease-level view of various PRR-PAMP interactions. At the molecular level, nucleic acids from different pathogens are recognized by endosomal and cytoplasmic PRRs. Nucleic acid-recognizing PRRs include endosomal TLRs and cytoplasmic DNA sensors like cGAS, DAI, IFN-γ-inducible protein (IFI16) and AIM2, as well as RLRs (RIG-I, melanoma differentiation-associated protein 5 (MDA5)) and NLRs. Each PRR recognizes a specific class of pathogen nucleic acids. For example, TLR3 recognizes dsRNA [80], TLR7 and TLR8 detect viral ssRNA [81], [82] and TLR9 recognizes CpG motifs in viral and non-viral pathogens [83], [84], [85]. Similarly, the cytoplasmic PRRs RIG-I, MDA5 and LGP2 detect dsRNAs, whereas other cytoplasmic DNA sensors recognize dsDNAs. A highly specialized nucleic acid-sensing PRR, TLR13, recognizes a specific sequence in bacterial rRNA, giving it a unique anti-bacterial function [86]. In general, the actions of PRRs are mediated by cell-specific and condition-specific adaptors, leading to different downstream host defence pathways. A summary of the PAMPs recognized by their respective PRRs, the key adaptors involved, and their downstream signalling events, cross-talk and responses, including some representative agonist responses, is provided in Fig. 1.
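As a compact restatement of the receptor-ligand specificities listed above, the following Python sketch stores each PRR with its compartment and ligand class and looks receptors up by ligand. The names `PRR_LIGANDS` and `sensors_for` and the ligand labels are illustrative choices made here, not terms from the reviewed literature, and the assignments simply transcribe this review's text (including the endosomal location of TLR13).

```python
# (compartment, ligand class) per receptor, transcribed from the text.
# The labels are informal strings chosen for this sketch.
PRR_LIGANDS = {
    "TLR3": ("endosomal", "dsRNA"),
    "TLR7": ("endosomal", "ssRNA"),
    "TLR8": ("endosomal", "ssRNA"),
    "TLR9": ("endosomal", "CpG DNA"),
    "TLR13": ("endosomal", "bacterial rRNA"),
    "RIG-I": ("cytoplasmic", "dsRNA"),
    "MDA5": ("cytoplasmic", "dsRNA"),
    "LGP2": ("cytoplasmic", "dsRNA"),
    "cGAS": ("cytoplasmic", "dsDNA"),
    "DAI": ("cytoplasmic", "dsDNA"),
    "IFI16": ("cytoplasmic", "dsDNA"),
    "AIM2": ("cytoplasmic", "dsDNA"),
}


def sensors_for(ligand, compartment=None):
    """Return the sorted receptor names recorded for a ligand class.

    If `compartment` is given, only receptors in that compartment are kept.
    """
    return sorted(
        name
        for name, (where, what) in PRR_LIGANDS.items()
        if what == ligand and (compartment is None or where == compartment)
    )


print(sensors_for("dsRNA"))
print(sensors_for("dsRNA", compartment="cytoplasmic"))
```

A flat dictionary is enough here because each receptor is given a single ligand class in the paragraph; receptors with several reported ligands would need a list-valued mapping instead.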

Fig. 1

Nucleic acids are recognized by the pattern recognition receptors (PRRs) through key adaptors. Specific adaptors propagate the downstream signalling and cross-talk with other proteins, leading to the production of Type-I and Type-II IFNs, inflammatory cytokines and chemokines in the nucleus. Colours of the dashed lines represent different key adaptors, i.e., TRIF (black), MAVS (dark green), MyD88 (brown) and STING (purple). Representative agonists binding to their targets are shown as different shapes (bright green). Common agonists share the same shape.

Although TLRs detect different types of PAMPs, most are found to have a common horseshoe-shaped structure. They are composed of an ectodomain (also called the leucine-rich repeat (LRR) domain), a transmembrane domain and a Toll/IL-1 receptor (TIR) domain (Fig. 2). It is the extracellular LRR domain that directly recognizes PAMPs and other ligands, and hence it is used as a drug target. Typically, the LRR domain is composed of 19–25 tandem copies of LRR motifs that contain the ‘xLxxLxLxx’ as well as ‘xΦxxΦxxxxΦxxLx (Φ: hydrophobic)’ motif sequences. Generally, each motif is 20–30 amino acids long and contains a β-strand and an α-helix linked by loops, causing the horseshoe-like structure of the LRR domain [380]. The other domains, the transmembrane and intracellular TIR domains, are responsible for signal transduction. The domain-wise availability of PDB structures for PRRs in humans and mice is listed in Table 1. Domain structures are listed separately because, although PDB structures are available for PRRs, the reported crystal data cover only a single domain or a fragment of it and do not reveal the complete fold of the PRR. Below, we discuss nucleic acid recognition by specific TLRs as known from the literature.
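To make the ‘xLxxLxLxx’ motif quoted above concrete, here is a minimal Python sketch that scans a protein sequence for it. It assumes that ‘x’ stands for any amino acid and that ‘L’ must be a leucine; both are a literal reading of the motif string, and the second ‘Φ’ motif in the text suggests that real repeats also accept other hydrophobic residues, so a strict scan like this would likely miss genuine repeats. The function name and the toy sequence are hypothetical.

```python
import re

# Literal reading of the 'xLxxLxLxx' motif quoted in the text:
# 'x' is taken to mean any amino acid (an assumption), 'L' is leucine.
# Wrapping the pattern in a lookahead reports overlapping matches too.
LRR_CORE = re.compile(r"(?=([A-Z]L[A-Z]{2}L[A-Z]L[A-Z]{2}))")


def find_lrr_cores(protein_seq):
    """Return (0-based start, 9-residue match) for each motif occurrence."""
    seq = protein_seq.upper()
    return [(m.start(), m.group(1)) for m in LRR_CORE.finditer(seq)]


# Hypothetical toy sequence, not a real TLR ectodomain.
toy = "AALAALALAAGG"
print(find_lrr_cores(toy))  # [(1, 'ALAALALAA')]
```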

Fig. 2

Table 1

List of known PDB structures: (A) for all domains of 10 TLRs in humans and the ectodomain of 13 TLRs in mice; no structure is available for the transmembrane and cytoplasmic domains of any TLR in mice. (B) for specific domains of other PRRs. “_” indicates that data are not available.

(A)	Humans				Mouse
TLRs	Extracellular Domain	Transmembrane Domain	Cytoplasmic Domain	All-Domain	Extracellular Domain
TLR1	6NIH (2.3 Å)	_	1FYV (2.9 Å)	_	2Z81 (1.8 Å)
TLR2	6NIG (2.3 Å)	_	1FYW (3.0 Å)	_	3A7C (2.4 Å)
TLR3	1ZIW (2.1 Å)	2MKA (NMR)	_	7C76 (3.4 Å) EM	3CIG (2.6 Å)
TLR4	3FXI (3.1 Å)	5NAM (NMR)	_	_	3VQ2 (2.4 Å)
TLR5	_	_	_	3J0A (26 Å) EM	_
TLR6	_	_	4OM7 (2.2 Å)	_	3A79 (2.9 Å)
TLR7	_	_	_	7CYN (4.2 Å) EM	_
TLR8	3W3G (2.3 Å)	_	_	_	4QDH (2.3 Å)
TLR9	_	_	_	_	3WPF (1.9 Å)
TLR10	_	_	2J67 (2.2 Å)	_	_
TLR11	_	_	_	_	_
TLR12	_	_	_	_	_
TLR13	_	_	_	_	4Z0C (2.3 Å)
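For readers who want to work with Table 1 programmatically, the sketch below parses cells of the form used in the table (a PDB ID followed by a resolution or method in parentheses, or “_” for missing data) and picks the best-resolved mouse ectodomain entry. The function names are hypothetical, the cell strings are copied from the mouse column above, and treating a non-numeric detail such as “NMR” as having no resolution is an assumption of this sketch.

```python
import re

# A PDB ID (digit plus three alphanumerics), then a parenthesised detail
# such as '2.3 Å' or 'NMR'. Anything after the parenthesis is ignored.
ENTRY = re.compile(r"^([0-9][A-Za-z0-9]{3})\s*\(([^)]+)\)")


def parse_entry(cell):
    """Return (pdb_id, resolution_or_None) for a table cell, or None for '_'."""
    cell = cell.strip()
    if cell == "_":
        return None
    match = ENTRY.match(cell)
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    pdb_id, detail = match.group(1).upper(), match.group(2)
    try:
        resolution = float(detail.split()[0])
    except ValueError:
        resolution = None  # e.g. 'NMR': no resolution quoted
    return pdb_id, resolution


# Mouse extracellular-domain column, transcribed from Table 1 (A).
MOUSE_ECTODOMAIN = {
    "TLR1": "2Z81 (1.8 Å)", "TLR2": "3A7C (2.4 Å)", "TLR3": "3CIG (2.6 Å)",
    "TLR4": "3VQ2 (2.4 Å)", "TLR5": "_", "TLR6": "3A79 (2.9 Å)",
    "TLR7": "_", "TLR8": "4QDH (2.3 Å)", "TLR9": "3WPF (1.9 Å)",
    "TLR13": "4Z0C (2.3 Å)",
}


def best_resolved(cells):
    """Return (receptor, (pdb_id, resolution)) with the lowest resolution value."""
    parsed = {name: parse_entry(cell) for name, cell in cells.items()}
    resolved = {n: p for n, p in parsed.items() if p and p[1] is not None}
    return min(resolved.items(), key=lambda item: item[1][1])


print(best_resolved(MOUSE_ECTODOMAIN))  # ('TLR1', ('2Z81', 1.8))
```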

Schematic representation of TLR structure (assembled structure of TLR3) with highly conserved nucleic acid (dsRNA) sensing LRRs on the TLR surface. A number of leucine-enriched regions, the so-called leucine-rich repeats (LRRs), are observed in the ectodomain (extracellular) of all TLRs. LRR-NT and LRR-CT refer to the N- and C-terminal of the ectodomain.

TLR3, an endosomal PRR, recognizes polyinosinic-polycytidylic acid (poly(I:C)). Earlier reports depicted TLR3 as involved in the recognition of viral dsRNA from viruses such as reovirus [76], respiratory syncytial virus [87], West Nile virus [88], dengue virus [89], influenza A virus [90], Epstein-Barr virus [91], hepatitis C virus (HCV) [92] and herpes simplex virus (HSV) [93]. The TLR3-dsRNA complex has a horseshoe-shaped structure with dsRNA bound to the amino and carboxyl termini on the lateral convex surfaces of the TLR3 ectodomains [94], [95]. Synthetic poly(I:C) is an important ligand used to study this system. A typical structure of poly(I:C) and the TLR3 interacting residues, with their hydrogen-bond and non-bonded contacts to dsRNA (<3.35 Å), is shown in Fig. 3.

Fig. 3

Ligand-bound ectodomain structure of TLR3. The center panel shows the dsRNA bound to TLR3 at the N and C terminals through both its chains (A: brown and B: cyan, respectively). The inset view shows the middle part of the interaction site, whereas residue-wise interactions of the N and C terminals are shown in the left and right-side panels, respectively. (For interpretation of the references to colour in this figure legend, the reader is referred to the web version of this article.)

In contrast to TLR3, which recognizes dsRNA, TLR7 and TLR8 recognize U-rich ssRNA in viruses and certain bacteria [77], [78]. Although TLR7 and TLR8 share high sequence similarity, TLR7 prefers GU-rich ssRNA whereas TLR8 prefers AU-rich ssRNA in humans [96]; interestingly, such behaviour is not observed in mice. TLR7 and TLR8 contain two ligand-binding sites. In TLR8, the first site binds U while the second site binds an oligonucleotide such as UG; both sites are required for activation of signal transduction [97]. TLR7 is a dual receptor as it can bind G- and U-rich ssRNA. The structures of the ssRNA-binding sites of TLR7 and TLR8 differ [98]. TLR7 recognizes a 3-mer UUU motif of a long poly-U ssRNA. The crystal structure of the TLR8-dsRNA complex revealed that it undergoes large conformational changes upon ligand binding, bringing the carboxyl terminals close together to enable dimerization with the TIR domain and stimulate downstream signaling [99]. The third TLR group with a nucleic acid recognition function is TLR9, which recognizes ssDNA having an unmethylated CpG motif in bacteria and viruses [100]. The TLR9 CpG hexamer motif was first described as “RRCGYY”, where R and Y represent purine and pyrimidine, respectively [101]. Following additional fine-tuning by other researchers, the CpG motifs GTCGTT and GACGTT have been proposed to be the optimal TLR9 ligands for humans and mice, respectively [102], [103]. Notably, TLR9 primarily recognizes the hexamer consensus sequence of ssDNA [104].
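The hexamer motifs above translate directly into sequence patterns. The following Python sketch, offered only as an illustration, expands “RRCGYY” with R = A/G and Y = C/T and also searches for the two hexamers reported as optimal for human and mouse TLR9. The function names and the test sequence are hypothetical, and a sequence scan cannot tell whether a CpG is methylated, which the text identifies as decisive for TLR9 recognition.

```python
import re

# RRCGYY with R = purine (A/G) and Y = pyrimidine (C/T).
# The lookahead allows overlapping hits to be reported.
RRCGYY = re.compile(r"(?=([AG][AG]CG[CT][CT]))")

# Hexamers the text reports as optimal TLR9 ligands.
OPTIMAL_HEXAMERS = {"GTCGTT": "human", "GACGTT": "mouse"}


def find_rrcgyy(dna):
    """Return (0-based start, hexamer) for every RRCGYY match."""
    dna = dna.upper()
    return [(m.start(), m.group(1)) for m in RRCGYY.finditer(dna)]


def find_optimal_hexamers(dna):
    """Return sorted (start, hexamer, species) for the reported optimal ligands."""
    dna = dna.upper()
    hits = []
    for hexamer, species in OPTIMAL_HEXAMERS.items():
        start = dna.find(hexamer)
        while start != -1:
            hits.append((start, hexamer, species))
            start = dna.find(hexamer, start + 1)
    return sorted(hits)


# Hypothetical oligonucleotide used only to exercise the functions.
oligo = "TTGACGTTAAGTCGTTCC"
print(find_rrcgyy(oligo))            # [(2, 'GACGTT')]
print(find_optimal_hexamers(oligo))  # [(2, 'GACGTT', 'mouse'), (10, 'GTCGTT', 'human')]
```

Note that GTCGTT has a pyrimidine at its second position and therefore does not match the original RRCGYY consensus, whereas GACGTT does; this is why the two searches are kept separate.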
Water-mediated hydrogen bonding and van der Waals interactions are required for the recognition of the CpG motif by TLR9. TLR9 also recognizes DNA:RNA hybrids, although the ssDNA isolated from DNA:RNA hybrids is unable to activate TLR9 [105]. Furthermore, TLR9 recognizes the CpG motif in viruses, including human papillomavirus (HPV) [106] and herpes simplex virus (HSV) [107], and also in several bacteria such as Salmonella Typhimurium [108] and Mycobacterium tuberculosis (MTB) [109]. Unlike humans, mice express TLR13, an endosomal TLR that recognizes its ligand in a sequence-specific manner, sensing a highly conserved bacterial 23S rRNA sequence that contains 5′-GAAAGACC-3′ [110], [86]. Interestingly, a 13-nt ssRNA derived from 23S rRNA and a viral-derived 16-nt ssRNA containing the same sequence both bind to TLR13 and fold into a stem-loop-like structure that is responsible for activation of TLR13 [111]. Notably, this sequence is found within a region of RNA targeted by certain antibiotics, and clinical isolates of Staphylococcus aureus resistant to these antibiotics are unable to stimulate mouse TLR13 [86]. Overall, TLR13 functions as a sequence- and conformation-specific PRR [112]. Apart from TLRs, the RIG-I, MDA5 and LGP2 receptors are the best-studied cytoplasmic receptors. The domain structure of most RLRs consists of a caspase recruitment domain (CARD) at the N-terminal, a central ATPase/DEAD-box helicase domain and a C-terminal regulatory domain, with some notable omissions in specific groups, as shown in Fig. 4. For example, in contrast to RIG-I, LGP2 lacks an N-terminal CARD domain and hence functions as a regulator in the signalling of RIG-I and MDA5 [113]. Crystal and NMR structures revealed the presence of a groove within the C-terminal domain, which represents the ligand binding site [114], [115]. RIG-I, MDA5 and LGP2 recognize different ligands because of the differences in the residues at the bottom of the groove.
A crystal structure of LGP2-dsRNA indicated that the initial contacts are made with the 3′-OH, whereas RIG-I and LGP2 NMR data pointed to secondary structural elements in ligand binding [114].

Fig. 4

Domain structure of the RIG-I (925 aa), MDA5 (1025 aa) and LGP2 (678 aa) receptors. Both RIG-I and MDA5 have similar structures consisting of a CARD domain, an RNA helicase domain and a C-terminal domain, whereas LGP2 consists of only the RNA helicase and C-terminal domains and lacks the CARD domain.

RIG-I recognizes either viral dsRNA (>200 bp) or a base-paired region of 18–20 nucleotides with a 5′ triphosphate end [116]. Previous studies reported that RIG-I detects the poly-U/UC motif in the 3′ untranslated region of the hepatitis C virus, whereas it detects the 5′-triphosphate in the case of the Hantaan virus [117], [118]. Modifications in the sequences of the RNA ligand stimulate the activation of RIG-I [119], [120]. Next-generation sequencing (NGS) in viral infection has shown that RIG-I and MDA5 prefer binding to AU-rich RNAs in viral genomes [121], [122], [123]. Structural analysis of RIG-I indicated that for the recognition of the 5′-ppp end of RNA, the C-terminal binding site of RIG-I must be acidic. MDA5 detects distinct groups of viral RNAs. It has been shown that some viruses are specifically sensitive to RIG-I or MDA5, while others are sensitive to both RIG-I and MDA5. For example, RIG-I recognizes the Newcastle disease virus, Sendai virus, influenza virus and Japanese encephalitis virus, and MDA5 detects picornaviruses such as encephalomyocarditis virus and Theiler’s virus. Viruses sensitive to both RIG-I and MDA5 include the West Nile virus and the dengue virus [124], [125], [126], [127], [128], [129]. Previous studies had identified LGP2 as a negative regulator of RIG-I and MDA5 signalling [130], [131]. However, more recent studies showed that LGP2 positively regulates these signalling pathways. For example, in LGP2-deficient mice, impaired type-I IFN production was observed, revealing the role of LGP2 as a positive regulator [113].
Structural studies have also suggested that un-liganded RIG-I and MDA5 have a closed conformation [131], but ligand binding induces an open, active conformation that oligomerizes in an ATP-dependent manner [132]. In contrast to RNA PAMPs, it was initially believed that DNA PAMPs are recognized primarily by TLR9. However, recent studies have identified additional cytoplasmic receptors that recognize either microbial DNA or self-DNA during cell damage leading to infection and stimulate production of Type I IFNs, Type III IFNs or IL-1β. For example, DAI [133], [134], leucine-rich repeat flightless-interacting protein 1 (LRRFIP1) [135], RNA polymerase III [136], [137], IFI16 [138], extrachromosomal histone H2B [139], DNA-PK [140] and MRE11 [141] recognize dsDNA to induce type-I IFN production. Similarly, DHX9, DHX36 and DDX41 are involved in the recognition of DNA with different microbial specificities [142]. Finally, AIM2 and IFI16 have also been established as recognizing cytosolic DNA [143], [144]. The cGAS-STING axis is another critical dsDNA-sensing PRR system that provides an innate immune response to infections, inflammation and cancers [145], [146]. The dsDNA interacts with cGAS in a sequence-independent manner [147], promoting a conformational change of cGAS that catalyzes the formation of 2′,3′-cyclic GMP-AMP (cGAMP), a cyclic dinucleotide formed from ATP and GTP that contains both 2′–5′ and 3′–5′ phosphodiester linkages [148]. Activated cGAS and the cGAMP it synthesizes activate STING, which comprises an N-terminal transmembrane domain with four helices (aa 1–154), a central globular domain (aa 155–341) and an acidic C-terminal tail (aa 342–379) [149]. This cytosolic DNA sensing by cGAS-STING can induce type-I IFN production and infiltration of T cells and natural killer (NK) cells [150], [151]. Thus, we conclude that cell surface as well as cytosolic nucleic acid PRRs recognize specific pathogenic genomes.
The primary grouping of these PAMPs and PRRs depends on whether the nucleic acid is single-stranded or double-stranded, RNA or DNA, methylated or unmethylated, and rich in AU, GU or CpG. As we discuss below, these specific attributes of PAMPs are crucial for designing therapeutic strategies against pathogen-specific PAMPs.
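The STING domain boundaries quoted in this section can be captured in a small lookup, which is convenient when annotating residue-level data such as mutation or contact lists. This is a minimal sketch: the residue ranges are taken as stated in the text and treated as inclusive, while the names `STING_DOMAINS` and `sting_domain` are hypothetical.

```python
# Inclusive residue ranges for STING as quoted in the text, ordered N to C.
STING_DOMAINS = [
    (1, 154, "N-terminal transmembrane domain"),
    (155, 341, "central globular domain"),
    (342, 379, "acidic C-terminal tail"),
]


def sting_domain(residue):
    """Return the name of the domain containing a 1-based residue number."""
    for start, end, name in STING_DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the annotated range 1-379")


print(sting_domain(154))  # N-terminal transmembrane domain
print(sting_domain(155))  # central globular domain
print(sting_domain(379))  # acidic C-terminal tail
```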

Target recognition, drug design and resistance against HP-PNI

Invading pathogens often hijack the cellular machinery of the host cells and subvert their immune system leading to disease [152]. Evidence indicates that under many conditions, correcting the aberrant nucleic acid sensors can be a robust therapeutic intervention [153]. However, obstacles such as drug resistance can arise [153]. Nonetheless, nucleic acid sensors remain attractive targets [154]. In Table 2, we have listed some representative agonists and antagonists, their mechanisms and effect on HP-PNI targets. Below, we review the status of designing molecular interventions in nucleic acid PRRs.

Table 2

Mechanisms and effects of pharmacological agents targeting nucleic acid-sensing PRRs: (A) agonists and (B) antagonists. “_” indicates that data are not available.

(A)
Target	Agonist	Agonist mechanism	Agonist effects	Other agonist-bound structures (PDB)
TLR3	Poly(I:C)	Recruits NK cells and tumor-specific CTLs through maturation of DCs; TNF-related apoptosis-inducing ligand (TRAIL)-dependent inhibition of tumor growth	Therapeutic agent for chronic fatigue syndrome; adjuvant to cancer vaccines; antiviral response in human immunodeficiency virus infection	3QOQ (TLR3/C1068); 3ULV (TLR3/TLR3ecd-2); 3ULU (TLR3/TLR3ecd-1); 3ULS (TLR3/Fab12)
TLR3	ARNAX	Combined with an immune checkpoint blocker; promotes cross-priming by DCs	Overcomes resistance to agents targeting programmed cell death 1 in mice
TLR7	Imiquimod (R-837)	Reverses local immunosuppression; induces secretion of the pro-inflammatory cytokines IFN-α, TNF-α and IL-12	Antiviral agent against cytomegalovirus and herpes simplex virus-2; treatment of genital warts, superficial basal cell carcinoma and actinic keratosis	4QC0 (TLR8/compound53); 4QBZ (TLR8/compound9); 3W3K (TLR8/CL075); 3W3J (TLR8/CL097); 6KYA (TLR8/TH1027); 4R07 (TLR8/ORN06); 4R08 (TLR8/ssRNA40); 4R09 (TLR8/ORN06S); 4R0A (TLR8/uridine); 3WN4 (TLR8/DS-877); 6WML (TLR8/GS-9688); 7CRF (TLR8/CU-CPD107); 5AWC (TLR8/MB-564); 5AWA (TLR8/MB-568)
TLR8	Motolimod (VTX-2337)	Improves the ability of NK cells to mediate antibody-dependent cellular cytotoxicity	Used in head and neck cancer and, with chemotherapy, in platinum-resistant ovarian cancer
TLR7/8	Resiquimod (R-848) (PDB: 3W3N)	Stimulates DC maturation via IL-12 and other Th1 cytokines; generates CD8+ T cell responses	Limits viral replication in monocytes isolated from human immunodeficiency virus-1-infected individuals; treatment of herpes simplex virus-2 or hepatitis C virus infection
TLR9	Lefitolimod	Increases the expression of surface markers such as CD86, CD40, HLA-DR, CD169 and CD69, along with the cytokines IL-6 and IL-8	Antiviral agent for human immunodeficiency virus-1; suppresses IL-33-driven airway hyperreactivity in mice	5ZLN (TLR9/CpG DNA)
	CpG-1018	Induces B-cell proliferation and cytokine production	Provides seroprotective responses against hepatitis B virus
	Agatolimod	Develops Th2 and Th17 cell responses; mediates superior immunostimulatory effects	Adjuvant for prophylactic hepatitis B virus vaccination
RIG-I	SB-9200	Culminates in type-I and type-III IFN secretion and IL-1β release	Antiviral agent for hepatitis B virus- and hepatitis C virus-infected patients	_
	BO-112	Release of type-I IFN, IFN-γ and CD8+ T lymphocytes	Induces tumor cell apoptosis; activates systemic immunity against distant lesions
	5′-pppRNA (PDB: 3OG8, 3LRR, 5F9F)	Stimulates the innate antiviral response, including IRF3, IRF7 and STAT1 activation	Provides resistance against RNA (dengue, chikungunya, vesicular stomatitis) and DNA (vaccinia) viruses
cGAS-STING	DMXAA (PDB: 4QXR, 4QXQ, 4QXP, 4QXO, 4LOL)	Potent vascular disrupting agent; induces TNF-α and IFN-β production	Mediates antiviral activity in hepatitis B virus and herpes simplex virus infection	7SSM (hSTING/compound11); 5VDV (cGAS/compoundF3); 5VDU (cGAS/compoundF2)
	2′,3′-cGAMP (PDB: 4LOJ, 4LOH)	Natural STING agonist; enhances type-I IFN signaling, Cxcl10, Ccl5, and T-cell migration	Used in immunotherapy, for example in combination with antigen-specific vaccinations
	ADU-S100 (PDB: 7Q3B)	Promotes PBMCs to generate pro-inflammatory cytokines	Tumour regression in B16 melanoma, CT26 colon, and 4T1 breast cancer murine models
Note: Additional available PDB structures for human and mouse agonists that are not discussed in the text are included in this table.

Note: For antagonists, we list only the available human and mouse PDB structures that are not described in this review.


TLR3 pharmacological agents

Multiple drugs have been proposed for targeting host-pathogen interactions between proteins and nucleic acids. Arguably the most prominent among them is poly(I:C), a synthetic analogue of dsRNA used as a TLR3 agonist (a potent adjuvant) that is locally administered for viral prophylaxis and therapeutic anticancer vaccination [155]. However, abundant preclinical data demonstrated that poly(I:C) is unstable, and early clinical trials in leukaemia patients showed side effects including shock, renal failure and hypersensitivity reactions, together with limited therapeutic efficacy [156], leading to the termination of its clinical development (source: http://www.clinicaltrials.gov). At least two poly(I:C) derivatives have been developed to address some of these issues. The first, poly(I:C12U) (rintatolimod or Ampligen), contains uridylic acid at a 12:1 molar ratio in the poly(C) strand; it has been used for chronic fatigue syndrome [157], shown to reduce the concentration of antiretroviral agents required for human immunodeficiency virus-1 control [158], and tested as an adjuvant to cancer vaccines in mice [159]. The second, poly-ICLC (Hiltonol), is stabilized with poly(L-lysine) and carboxymethylcellulose [153]. Poly(I:C) has been shown to have significant potential to boost the immune system [160], [161], [162] by mediating antitumor NK cell and tumor-specific cytotoxic T lymphocyte (CTL) activities through maturation of DCs [163], [164], [165]. Accordingly, clinical trials have suggested its effectiveness in combination with cancer vaccines and chemotherapeutics for haematological conditions [166], [167], solid malignancies [168], [169], [170] and brain malignancies [171], [172], and in antiviral responses in human immunodeficiency virus-1-positive individuals [173]. Since TLR3 is frequently expressed by various types of malignant cells and can directly trigger tumor cell apoptosis, poly(I:C) has been used to induce potent anti-tumor activity against various tumors [379]. 
Along these lines, it has been reported that poly(I:C) inhibits tumor growth in a TNF-related apoptosis-inducing ligand (TRAIL)-dependent manner [174]. Both derivatives (poly(I:C12U) and poly-ICLC) favour cross-priming in human and mouse experimental systems [176], [177]. Clinically, poly(I:C) has been used as an adjuvant to enhance cancer vaccine protocols [175]. Even though poly(I:C) was downgraded as a candidate drug for TLR3-targeted treatments, it may emerge as a promising candidate in the future if suitably remodeled. Our in silico docking study (previously unpublished) suggested that poly-ICLC can bind to both TLR3 residues and dsRNA bases at four different locations. These are the locations where dsRNA interacts electrostatically with TLR3 [178]. Potential interactions of TLR3 with its ligands are shown in Fig. 5 and Fig. 6, and the detailed results are presented in Table 3. The strong binding energy also suggests that poly-ICLC may stabilize the TLR3-dsRNA complex. Another TLR3 agonist, ARNAX (a novel synthetic DNA–dsRNA hybrid molecule), also promotes robust cross-priming by DCs. ARNAX combined with a cancer vaccine and a programmed cell death 1 ligand 1 (PD-L1)-targeting immune checkpoint blocker overcame resistance to agents targeting programmed cell death 1 in mice [179].

Fig. 5

Fig. 6

The interactions and key residues of TLR3 and important nucleic acids positions of dsRNA at four locations of the binding pocket in TLR3-dsRNA complex. Purple arrows illustrate hydrogen bonds (distance closer than 2.5 Å) between Poly-ICLC and TLR3-dsRNA complex.
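
The distance criterion quoted in this caption (atom pairs closer than 2.5 Å) is simple to apply once atom coordinates are available. The sketch below is only an illustration of that single criterion: the function name, the atom labels and the coordinates in the usage note are hypothetical, and a full hydrogen-bond definition would normally also check donor and acceptor identity and bond angles, which this sketch does not do.

```python
import math


def contacts_within(ligand_atoms, receptor_atoms, cutoff=2.5):
    """Return (ligand_atom, receptor_atom, distance) for pairs closer than cutoff.

    Each atom is given as (name, (x, y, z)) with coordinates in angstroms.
    Results are sorted from the shortest to the longest distance.
    """
    hits = []
    for lig_name, lig_xyz in ligand_atoms:
        for rec_name, rec_xyz in receptor_atoms:
            d = math.dist(lig_xyz, rec_xyz)  # Euclidean distance
            if d < cutoff:
                hits.append((lig_name, rec_name, round(d, 2)))
    return sorted(hits, key=lambda hit: hit[2])
```

With a made-up ligand atom at the origin and two receptor atoms placed 2.0 Å and 3.0 Å away, only the closer pair is reported.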

Table 3

The docking study provides information regarding the binding free energy and interaction of TLR3 agonist poly-ICLC with TLR3-dsRNA complex at four different locations.

S. No	Location of binding pocket	Binding energy (kcal/mol)	Interacting nucleic acid bases	Interacting protein residues
1	C-terminal of Chain B	−5.485	U-D:26; U-D:27	LYS:ChainB:467; ALA:ChainB:519
2	C-terminal of Chain A	−5.76	A-C:26	ASN:ChainA:494; VAL:ChainA:495; ASN:ChainA:520; ALA:ChainA:519; ASN:ChainA:517
3	N-terminal of Chain A	−6.03	None	GLH:ChainA:110; HIS:ChainA:156; LYS:ChainA:182; HIE:ChainA:136
4	N-terminal of Chain B	−5.17	G-D:4	HIE:ChainB:136; LYS:ChainB:182; GLY:ChainB:158
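
To put the binding energies in Table 3 on a more familiar scale, they can be converted to apparent dissociation constants through the standard relation ΔG = RT ln Kd. The sketch below does this and ranks the four pockets. It rests on strong assumptions that are not part of the docking study: the docking score is treated as if it were a true binding free energy, the temperature is taken as 298.15 K, and the function and variable names are illustrative. Docking scores are approximate, so the resulting values should be read as rough orders of magnitude only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

# Binding energies from Table 3 (kcal/mol), keyed by binding-pocket location.
DOCKING_SCORES = {
    "C-terminal of Chain B": -5.485,
    "C-terminal of Chain A": -5.76,
    "N-terminal of Chain A": -6.03,
    "N-terminal of Chain B": -5.17,
}


def apparent_kd(delta_g, temperature=298.15):
    """Apparent Kd in mol/L from a binding free energy in kcal/mol."""
    return math.exp(delta_g / (R_KCAL * temperature))


def rank_sites(scores):
    """Site names ordered from most to least favourable binding energy."""
    return sorted(scores, key=scores.get)
```

Under these assumptions, `rank_sites(DOCKING_SCORES)` places the N-terminal pocket of chain A first, and all four scores correspond to apparent affinities in the micromolar range.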

Fig. 5. The four critical locations on the TLR3-dsRNA complex where the TLR3 agonist poly-ICLC docks (represented by arrows in panel A, with inset views in panel B). The first to fourth locations refer to the C-terminal of chain B, the C-terminal of chain A, the N-terminal of chain A and the N-terminal of chain B of the TLR3-dsRNA complex, respectively. The two-dimensional poly-ICLC structure is shown in yellow at the center. The two strands of dsRNA are colored red and blue, and brown and cyan correspond to chains A and B of TLR3.

The TLR3 antagonist CNTO2424 is a monoclonal antibody (mAb) that recognizes the extracellular domain of human TLR3 in a conformation-dependent manner and down-regulates poly(I:C)-induced production of IL-6, IL-8, MCP-1, and IP-10 in human lung epithelial cells [180]. Additionally, the mAbs CNTO4685 (rat anti-murine TLR3) and CNTO5429 (CDRs from CNTO4685 grafted onto a mouse IgG1 scaffold) worked in a similar manner and reduced poly(I:C)-induced production of CCL2 and CXCL10 in primary mouse embryonic fibroblasts [181]. Compound 4a (CU-CPT4a) is another potent antagonist of TLR3, recognized as a competitive inhibitor of dsRNA binding to TLR3 with high affinity and specificity. After binding to its target, it represses the expression of downstream signaling pathways mediated by the TLR3/dsRNA complex, including TNF-α and IL-1β. 
Docking studies showed that CU-CPT4a forms hydrogen bonds with Asn541, targeting asparagine glycosylation at the TLR3-dsRNA interface, and it prevents dsRNA binding to TLR3 in murine macrophage RAW 264.7 cells with an IC50 of 3.44 μM [182], [183].
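
As a reminder of what the reported IC50 implies, the sketch below evaluates a simple inhibition curve. It is an illustration under stated assumptions, not a model taken from the cited work: it assumes a Hill-type dose-response with a Hill coefficient of 1 and complete inhibition at saturating concentrations, and the function name and parameters are hypothetical.

```python
def fraction_signal_remaining(conc_um, ic50_um=3.44, hill=1.0):
    """Fraction of the uninhibited signal left at a given inhibitor concentration.

    conc_um and ic50_um are in micromolar; the default IC50 is the value
    quoted in the text for CU-CPT4a. hill=1.0 is an assumption.
    """
    if conc_um < 0 or ic50_um <= 0:
        raise ValueError("concentrations must be non-negative and IC50 positive")
    return 1.0 / (1.0 + (conc_um / ic50_um) ** hill)
```

By definition, the remaining signal is one half when the inhibitor concentration equals the IC50, and under this model it falls to roughly one eleventh at ten times that concentration.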

TLR7/8 pharmacological agents

Some TLR7 agonists have been used for developing novel antiviral agents [184]. These are mostly imidazoquinoline and adenine derivatives [185]. Imiquimod (R-837) is a prototypic imidazoquinoline that resembles a nucleoside analogue but lacks the fourth nitrogen atom present in purines. Imiquimod binding to TLR7 induces secretion of pro-inflammatory cytokines, predominantly IFN-α, TNF-α and IL-12. This creates a local cytokine milieu biased towards a Th1-type response, with the generation of cytotoxic effectors [186], [187]. Imiquimod therefore originally attracted attention for its antiviral effects in animal models of cytomegalovirus and herpes simplex virus 2 infection [188], but it failed long-term efficacy trials in patients with standard chemotherapeutic regimens [189], [190]. Imiquimod mediates several therapeutic effects alone or in combination with other drugs. One study reported that imiquimod combined with monobenzone (an ICB targeting cytotoxic T lymphocyte-associated protein 4) reverses the local immunosuppression in nivolumab-resistant melanoma patients [191]. Imiquimod has now been approved for the treatment of some skin conditions, including genital warts, superficial basal cell carcinoma and actinic keratosis [192]. Another TLR7 agonist, S-28463, has been shown to reduce airway resistance, leukocyte infiltration and IgE levels in mouse models of allergic sensitization to ovalbumin [193]. Motolimod (VTX-2337) is an imidazoquinoline-derived agonist that targets TLR8 [194]. Motolimod administration to white blood cells improves the ability of NK cells to mediate antibody-dependent cellular cytotoxicity [195]. It is currently being tested in combination with an ICB specific for PD-1 and/or cetuximab in head and neck cancer, as well as together with durvalumab plus chemotherapy in women with platinum-resistant ovarian cancer. Resiquimod (also known as R-848) and CL075 are other agonists that target both TLR7 and TLR8. 
In vitro studies have demonstrated that resiquimod stimulates DC maturation by inducing a Th1 cytokine profile. This leads to more efficient cross-presentation of exogenous antigens and stronger antigen-specific CD8+ T cell responses [187], [196]. It also favors IL-12 secretion and limits viral replication in monocytes isolated from human immunodeficiency virus-1-infected individuals [197]. Resiquimod is currently in phase II clinical trials and has been proposed for the treatment of patients with herpes simplex virus-2 or hepatitis C virus infection [198]. Accumulating evidence indicates that resiquimod acts as an immunological adjuvant: it can be combined with peptide-based vaccines such as recombinant cancer/testis antigen 1B (CTAG1B or NY-ESO-1) to elicit CD8+ cytotoxic T lymphocyte responses against melanoma, or used to reverse immunosuppression in cervical carcinoma linked to human papillomavirus 16 infection [199]. TLR7 antagonists can be generated through a robust chemical approach that uses 2′-O-ribose methylation in the sense or anti-sense strand [200], or its selective incorporation into siRNA, which abrogates cytokine production without reducing gene silencing activity [201]. Such modified RNA therefore acts as a potent inhibitor of immunostimulatory RNA in both human and murine systems and has been used as a therapeutic tool for the management of SLE [202]. ODN-1411 is a TLR8 antagonist that binds competitively to the TLR8 ectodomain. It limits deregulated cytokine secretion and TNF production in human models of rheumatoid arthritis [203] and decelerates disease progression in mouse models of psoriasis [204].

TLR9 pharmacological agents

TLR9 has been targeted by potent antiviral agents that may act via viral interference or immunostimulation [184], [205]. One such synthetic immunomodulatory oligonucleotide (IMO) is tilsotolimod (IMO-2125), which incorporates cytosine or guanine analogues and shows increased stability, species-independent activity and a clear structure-activity relationship. It has been evaluated in clinical trials for toxicity in severely hepatitis C virus-infected patients resistant to recombinant IFNα [206]. Another TLR9 agonist is lefitolimod (also known as MGN1703), a CpG-rich ODN with a covalently closed, dumbbell-shaped structure. According to structural and preclinical studies, MGN1703 showed limited interactions with molecules outside its target structure [207]. Nevertheless, MGN1703 significantly increased the expression of surface activation markers, such as CD86, CD40, HLA-DR, CD169 and CD69, as well as the release of a variety of cytokines and chemokines, including IFNs, IL-6, and IL-8 [208]. MGN1703 has been investigated in combination with chemotherapy or immunotherapy in cancer patients [209], [210], [211]. Only a single clinical study has tested lefitolimod as an antiviral agent, in human immunodeficiency virus-1-positive patients treated with human immunodeficiency virus-1-specific antibodies. It indicated that the drug is safe while inducing robust virus-specific humoral and cellular immunity and prolonged control of viraemia [212]. Lefitolimod was further confirmed to efficiently suppress IL-33-driven airway hyperreactivity in mice [213]. Furthermore, MGN1703 was evaluated as an adjuvant for the treatment of infectious diseases [187]. One of the most important TLR9 agonists, formulated in a licensed vaccine for hepatitis B (Heplisav-B), is CpG-1018, derived by modifying the nucleotide backbone sequence of CpG-ODN to produce immunostimulatory activity. 
CpG-1018 is a type B CpG-ODN that contains a phosphorothioate backbone throughout its entire sequence, with one or several CpG-hexamer motifs [214], [215]. It induces strong B-cell proliferation and cytokine production, and has some effect on the maturation and activation of plasmacytoid DCs, monocytes, and NK cells [216], [217], [218]. Administration of two doses of Heplisav-B induced higher seroprotective responses against hepatitis B virus, with a faster onset, than three doses of the Engerix-B vaccine, with a similar safety profile [219], [220]. Agatolimod (CpG-7909 or PF-3512676) is another synthetic CpG-rich ODN tested as an adjuvant for prophylactic hepatitis B virus vaccination [221]. It is wrapped with non-agonistic ligands for DC receptors such as C-type lectin domain containing 7A (CLEC7A), which mediate superior immunostimulatory effects and promote Th2 and Th17 cell responses [222]; however, clinical development of agatolimod as an anti-cancer agent has been discontinued (source: https://www.clinicaltrials.gov). The TLR9 antagonist COV08-0064 (MP-3964) limited neurodegeneration in mouse models of Parkinson’s disease [223]. It also selectively blocked the mRNA upregulation of TNF-α, IL-1β, NLRP3 and MCP-1 in macrophages, and of IFN-β in dendritic cells, induced by the TLR9 agonist CpG-ODNT. This leads to inhibition of JNK and ERK phosphorylation. TLR9 signaling inhibition by COV08-0064 may be an effective approach in liver surgeries, including transplantation [224].

RLRs pharmacological agents

Several agonists have been developed to target RLRs. Some have successfully passed clinical trials and were used for the treatment of severe diseases. RIG-I agonists are mostly being explored in a diverse range of cancers. RIG-I activation in cancer patients could stimulate three distinct immune responses: (1) direct activation of tumor cell apoptosis and pyroptosis (a form of programmed necrosis); (2) IFN- and cytokine-mediated activation and maturation of macrophages, DCs and natural killer cells; and (3) increased recruitment and cross-priming of adaptive immune effectors, e.g. CD8+ T lymphocytes, and enhanced activity of APCs [225], [226], [227]. SB-9200 (inarigivir soproxil or GS-9992) is an orally available prodrug of a dinucleotide agonist that targets RIG-I together with the cytosolic PRR nucleotide-binding oligomerization domain-containing protein 2 (NOD2) for the elimination of invading pathogens [228], [229]. Its activity also culminates in type-I and type-III IFN secretion and IL-1β release downstream of inflammasome activation [230]. SB-9200 mediated robust antiviral effects and has been tested in clinical trials as a stand-alone agent or combined with entecavir in patients chronically infected with hepatitis B virus or hepatitis C virus [228], [231]. In addition, poly(I:C) is known to mimic dsRNA and also acts as a RIG-I and MDA5 agonist. BO-112 is another agonist, based on a nanoplexed formulation of poly(I:C) complexed with polyethylenimine. It is used to induce tumour cell apoptosis and to activate systemic immunity against distant lesions via type-I IFN, IFN-γ and CD8+ T lymphocytes [232]. A clinical trial recently commenced testing the activity of BO-112 in adults with aggressive solid tumours (source: https://www.clinicaltrials.gov). The natural RIG-I agonist 5′-triphosphate RNA (5′-pppRNA) confers resistance against infection by RNA viruses such as dengue and chikungunya virus [233] and vesicular stomatitis virus, as well as DNA viruses such as vaccinia virus [234]. 
However, 5′-pppRNA is unstable and unable to cross the plasma membrane. To resolve this issue, short stem–loop RNA molecules that present a single duplex terminus and a triphosphorylated 5′ end (and hence retain strong RIG-I-binding capacity) have been developed. These molecules stimulate the innate antiviral response, including IRF3, IRF7 and STAT1 activation, in human lung epithelial A549 cells [235]. Only a few RIG-I agonists, such as RTG100 and MK-4621, entered the clinical stage, but their development was ultimately terminated [153]. According to a 2017 report, KIN1000 is a benzobisthiazole compound identified as a potent RLR inducer via a high-throughput screening-based approach. It was developed as an immunological adjuvant. Another compound with adjuvant-like activity for RLRs is KIN1148, used in prophylactic vaccination of mice against a pandemic human influenza virus [236]. Among RIG-I antagonists, Ebola virus (EBOV) VP35 is a dsRNA-binding protein that suppresses DC maturation, followed by impaired expression of α/β-IFN and proinflammatory cytokines, abnormal upregulation of costimulatory markers, and inhibition of naive T cell activation. It may be possible to counteract EBOV immune evasion by using treatments that bypass the VP35-imposed block to DC maturation [237]. Another is the picornavirus VPg protein, which blocks the 5′ppp RNA motif associated with RIG-I activation, thus preventing RIG-I recognition and signaling. The V proteins of several paramyxoviruses have been shown to directly bind MDA5 and block its downstream signaling, mainly IFN-β induction, in a range of mammalian cells as well as in avian cells [238], [239].

cGAS-STING pharmacological agents

The most important function of cGAS–STING is to direct cancer cell senescence through the secretion of chemokines, pro-inflammatory cytokines, growth factors, and proteases, thus mediating oncosuppressive effects either by autonomously controlling tumor cells or by stimulating immune cells (CD8+ T cell cross-priming via DCs) against tumors [240], [241]. STING agonists were modelled on their natural partner cGAS. Among them, 5,6-dimethylxanthenone-4-acetic acid (DMXAA, vadimezan or ASA404) was developed as a potent vascular disrupting agent and induces production of cytokines such as TNF-α and IFN-β [242]. Studies suggest that it showed therapeutic efficacy in preclinical models of acute myeloid leukaemia and mammary carcinoma [243], [244], [245], [246], and was shown to be safe in chemotherapy [247], [248]. However, later structural studies suggested that it binds to the STING protein in mice but not in humans [73], [249]. 2′,3′-Cyclic GMP–AMP (2′,3′-cGAMP) and other cyclic dinucleotides (CDNs) of bacterial origin are natural STING agonists that enhanced type-I IFN signaling, Cxcl10 and Ccl5 expression, and T-cell migration into the brain of glioma-bearing mice, and they have been used effectively in immunotherapy, for example in combination with antigen-specific vaccinations [74], [250], in mouse tumor models [251] and in combinatorial therapeutics [252]. Second-generation STING agonists are synthetic CDNs that include 2′-3′-cGSASMP and ADU-S100. Both are potent inducers of IFN-β secretion from THP-1 cells. 2′-3′-cGSASMP is a phosphorothioate analogue of 2′,3′-cGAMP and is ∼40 times more resistant to ENPP1 hydrolysis [382]. ADU-S100 (MIW815 or ML RR-S2 CDA), on the other hand, showed improved stability, lipophilicity, and comparable activity toward mouse and human STING, making it the first candidate to move to early clinical studies [381]. 
ADU-S100 could promote human peripheral blood mononuclear cells (PBMCs) to generate pro-inflammatory cytokines such as IFN-β and resulted in profound tumour regression in B16 melanoma, CT26 colon, and 4T1 breast cancer murine models [110], [253], [254], [255]. Pharmaceutical companies are actively exploring multiple cGAS–STING antagonists [256]. Among the actively pursued compounds are Astin C, C-176 and H-151. C-176 and H-151 occupy the CDN binding site at the transmembrane domain and prevent STING from acquiring an “active” conformation in human and mouse, thus acting as competitive antagonists of STING activators. They bind irreversibly to Cys91 of STING, markedly reduce the elevated levels of type-I IFNs and IL-6, inhibit TBK1 phosphorylation and suppress Cys91 palmitoylation in various cellular assays of Trex1−/− mice. These inhibitory effects can reverse the strong tissue inflammation in these mice, which recapitulates the autoantibody production and aberrant T-cell activation seen in Aicardi-Goutières syndrome (AGS) patients [257], [258], [259]. Another antagonist is Astin C, a cyclopeptide from Aster tataricus that exhibits anti-inflammatory activity and blocks the recruitment of IRF3 to the STING signalosome. It has therefore been explored in STING-mediated cancer and autoimmune diseases [258], [260].
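
The residue-level statements in this section can be cross-checked against the STING domain boundaries quoted earlier in this review (N-terminal transmembrane domain, aa 1–154; central globular domain, aa 155–341; acidic C-terminal tail, aa 342–379). The lookup below is a minimal sketch for that purpose; the function and constant names are illustrative assumptions, and the boundaries are simply those stated in the text.

```python
# STING domain boundaries as quoted in the text (inclusive residue ranges).
STING_DOMAINS = [
    (1, 154, "N-terminal transmembrane domain"),
    (155, 341, "central globular domain"),
    (342, 379, "acidic C-terminal tail"),
]


def sting_domain(residue: int) -> str:
    """Return the name of the STING domain that contains a residue number."""
    for start, end, name in STING_DOMAINS:
        if start <= residue <= end:
            return name
    raise ValueError(f"residue {residue} is outside the annotated range 1-379")
```

For example, `sting_domain(91)` places Cys91, the residue targeted by the covalent antagonists discussed above, in the N-terminal transmembrane domain.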

Common agonists

A novel TLR7, TLR8 and RIG-I agonist (CV8102) was used alone or with doses of a rabies vaccine to test its safety, tolerability and immunogenicity, and in various types of cancer such as melanoma and hepatocellular carcinoma together with a PD-1 blocker [261]. NAB2, a dsRNA molecule isolated from yeast, is an agonist of TLR3 and MDA5. Complexed with a cationic agent, it acts effectively as an adjuvant to a prophylactic cancer vaccine [252]. MDA5 stands out as a promising target for cancer therapy, and its agonists and adenosine deaminase RNA (ADAR) inhibitors may be combinatorial partners in immunotherapy for ICB-resistant tumours [252]. IC31 is an ODN-based poly(I:C) combined with the antimicrobial peptide KLKL5KLK to produce agonist effects on both TLR9 and TLR3. Studies in mice demonstrated that IC31 helped to induce potent antigen-specific CTLs, strong protein-specific humoral responses and T cell proliferation and differentiation [377]. IC31 has been used as a potent adjuvant against infectious disease, mostly in the candidate tuberculosis vaccine H56:IC31. This novel vaccine, consisting of a triple antigen (Ag85B, ESAT-6 and Rv2660c), demonstrated tolerability and immunogenicity, inducing antigen-specific IgG and persistent H56-specific CD4+ T cell responses [378].

Common antagonists

Above, we reviewed drugs that can induce or inhibit the nucleic acid-sensing activity of host PRRs on a one-on-one basis. More recently, antagonists have been used to effectively target multiple nucleic acid-recognizing PRRs. For example, inhibitory ODNs (INH-ODNs) are mixed TLR3 and TLR9 antagonists that mediate therapeutic effects in an experimental model of SLE. IMO-8400 is a combined TLR7, TLR8 and TLR9 antagonist that demonstrated efficacy in a placebo-controlled trial in moderate-to-severe plaque psoriasis [262]. Another TLR7, TLR8 and TLR9 antagonist is CPG-52364, a quinazoline derivative that inhibits the progression of SLE and other autoimmune diseases in animal models [263]. Likewise, IMO-3100 blocks TLR7 and TLR9 and reduces the expression of inflammatory genes such as IL-17A, β-defensin, CXCL1, keratin 16, TNF-α and IFNα [264]. Thus, we conclude that drugs targeting various protein-nucleic acid interactions are a highly pursued field and hold great promise in combating infectious diseases. Among them, multi-target agonists appear promising for multi-disease vaccination and treatments.

PRR agonist or adjuvant mechanism

When PRRs recognize an agonist or an adjuvant, immune responses are promoted within a few hours of the stimulus, and the agonist mechanism is activated through the development of innate immunity [265], [266]. In this first phase, adjuvant-induced, antigen-independent innate immune responses are critical for the subsequent development of antigen-specific immune responses. During this phase, gene expression increases; chemokines and proinflammatory cytokines are released from TLR-expressing cells; and innate immune cells including monocytes, macrophages, DCs, NK cells, and neutrophils are recruited to the site of injection. The expression of cell surface molecules, including cluster of differentiation 80 (CD80), CD86, and molecules of the major histocompatibility complex (MHC), is increased. The APCs at the injection site take up the agonist and antigen and migrate to the lymph node or primary lymphocytes [267], [268]. These TLR agonist-activated early immune responses are followed by a second phase of adaptive immune responses that occurs several days later. During this second phase, activated APCs produce cytokines that shape the differentiation of naïve CD4+ T cells into different T helper (Th) cell subsets. TNF-α, IL-12, and IFNs promote Th1 polarization, whereas IL-1, IL-6, and IL-23 promote Th17 polarization. Th1 cells produce IFN-γ and proinflammatory cytokines, and Th17 cells are the major source of IL-17. The second phase of adaptive immune responses results in the expansion of antigen-specific CD8+ T cells that recruit B cells. In this way, TLRs regulate the development and differentiation of B cells and increase the production of antigen-specific antibodies. Class-switch recombination in B cells further differentiates them into antibody-secreting plasma cells and memory B cells, which are long-lived and provide adaptive immunity later in life [269], [270], [271], [272], [273]. 
Although different TLR ligands share a common mechanism for developing immune responses, their immune-inducing profiles are not entirely the same. TLRs have overlapping but different cell-type expression profiles [274], [275]. Their adjuvant effects are also distinct. TLR adjuvants preferentially utilize different signal transduction pathways and transcription factor activities for controlling gene expression. Studies involving human blood also revealed differences in the cytokine-inducing profiles of TLR7/8 and TLR9 agonists [276], [277]. The specific mechanisms of PRR agonists are noted in Table 2.

Computational approaches to HP-PNIs

Advancements in high-throughput omics technologies have facilitated large-scale mapping of the highly complex and rapidly evolving host-pathogen interactions. Computational methods have emerged as a primary tool for the analysis of the voluminous experimental host-pathogen interaction data. Most of them aim to identify the key pathogen and host targets in infection, generate and test new hypotheses and help develop novel anti-pathogen drugs [278], [279]. PPIs are deemed to be the most important among the host-pathogen interactions and hence have been the most widely studied [280], [281]. Computational systems biology approaches have evolved to investigate, analyze, predict and model host-pathogen PPI networks [207]. These include interface structural mimicry [280], [282], [283], [284], [285], structure-based mimicry or homology [286], [287], [288], [289], [290], [291], [292], [293], [294], [295], [296], and interactome-based and systems biology approaches [278], [297], [298], [299], [300], [301], [302]. In addition to encouraging progress in PPI modeling of host-pathogen interactions, the roles of immune cell receptors in sensing pathogen-derived nucleic acids have also been widely acknowledged. Consequently, there is a growing interest in implementing computational approaches to predict and analyze protein-nucleic acid interactions. Below we review some of these approaches and strategies.

Predicting foreign (pathogenic) nucleotide sequences interacting with host

Knowledge of the molecular interactions between hosts and their pathogens is critical to understanding the mechanisms of infection and identifying potential targets for therapeutics. In that respect, structural approaches not only predict which pathogen protein interacts with which host protein, but also define the drug target. The same rationale holds for interactions between pathogen nucleic acids and host proteins. However, the task is challenging, with a greater likelihood of false-positive predictions. At the same time, the technical challenges of large-scale experimental identification are daunting, emphasizing a pressing need for efficient and powerful computational approaches for the analysis and prediction of host-pathogen interactions [301]. Speedy and accurate sensing of pathogen-derived nucleic acids by host TLRs is the first line of defense for the host innate immune system. The two key aspects of this strategy are (a) identifying the pathogenic nucleic acid sequence and (b) identifying the host elements, i.e., the sequence and structural motifs in host receptors [112], [303], [304]. The initial set of computational approaches focused on distinguishing pathogenic DNA from host DNA, and later evolved into more complex algorithms that pinpoint the patterns discriminating invading pathogen genomes from host genomes [305]. Statistical approaches to identifying sequence and structure motifs, which may inform us about the specificity and viability of PRR-PAMP interactions, have been outperformed by machine learning-based approaches such as semi-supervised [306] and supervised learning [307], including random forest and Support Vector Machine-based classifiers. Some studies have adopted specialized techniques including transfer and multitask learning [308], domain and motif models [309], and sequence homology combined with other methods [310], [311]. 
Other machine learning-based methods such as multiple linear regression partial least squares analysis [312] and Gradient Boosting Regressor [313] have also been reported. More recently, deep learning-based Artificial Neural Network [314] and Convolutional Neural Network [315] methods have been used for binding affinity prediction. Chemical feature-based pharmacophore models used the HypoGen algorithm for TLR7 agonist prediction [316]. Similarly, mouse TLR9 agonists have been predicted through a random forest approach [317]. In summary, machine learning methods have provided a better mapping of the sequence and structural features in PRRs and PAMPs that contribute to the host’s ability to defend against a pathogen. Machine learning and closely related statistical approaches have also provided much data on protein-nucleic acid interactions, which will help in understanding scenarios specific to host-pathogen pairs. Our work in these directions relates to the study of specificity, thermodynamics and clustering patterns [318], [319], sequence-based predictions of protein-nucleic acid interactions [320], [321], patterns of electric moments of nucleic acid-binding proteins [322] and cooperativity between gene expression and sequence [323]. Some of the purely data-driven and computational approaches can provide deep insights into the biological and mechanistic nature of protein-nucleic acid interactions, including host-pathogen interactions. This belief is borne out by the broad literature on the prediction of DNA binding sites and affinities and the key principles involved, recently including cooperativity [324], [325], [326], [327], [328], [329], [330], [331], [332], [333]. On the technical side, the prediction of DNA-binding sites and the proteins that specifically interact with them is a non-trivial task. Strategies include sequence-based DNA-binding site prediction, structure-based DNA-binding site prediction, and homology modelling.
Structurally, protein recognition may take place through positively charged amino acids, primarily lysine and arginine, and hydrophobic residues that interact with the bases. GC-rich regions are often recognized by arginine side chains through hydrogen bonding and cation-π interactions; AT-rich regions are often recognized via minor groove contacts that are sterically excluded by the guanine N2 amino group of GC base pairs. Recognition may involve the major and minor grooves; it may also involve Hoogsteen base pairs. DNA binding specificity arises from all of the above. DNA shape is also a key factor, and recently a database that annotates transcription factor binding sites based on shape was described [334]. Algorithms for predicting RNA binding sites have also been constructed [335]. Large-scale prediction of nucleic acid-host protein interactions, particularly structural predictions, can help identify targets for drug discovery.
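The role of lysine and arginine noted above suggests a very crude sequence-based heuristic, sketched below. It is an illustration only and not one of the cited prediction methods: it simply scans a protein sequence with a sliding window and reports where the Lys/Arg fraction is high. The function names, the window size, the threshold and the example peptide are all assumptions chosen for the sketch.

```python
def basic_residue_profile(protein_seq, window=5):
    """Fraction of Lys (K) and Arg (R) in each sliding window."""
    seq = protein_seq.upper()
    if window < 1 or window > len(seq):
        raise ValueError("window must be between 1 and the sequence length")
    return [
        sum(res in "KR" for res in seq[i:i + window]) / window
        for i in range(len(seq) - window + 1)
    ]


def enriched_windows(protein_seq, window=5, threshold=0.6):
    """0-based start indices of windows whose K/R fraction >= threshold."""
    profile = basic_residue_profile(protein_seq, window)
    return [i for i, frac in enumerate(profile) if frac >= threshold]


# Hypothetical peptide with a basic stretch in the middle (not a real protein).
toy_peptide = "MSDEKRRKRLAGS"
print(enriched_windows(toy_peptide))  # windows overlapping the KRRKR stretch
```

A positively charged patch is neither necessary nor sufficient for nucleic acid binding, which is why the sequence-based and structure-based predictors cited in this section combine many more features, such as evolutionary conservation and DNA shape.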

Gene regulatory networks

A systems-level approach to studying HP-PNIs is based on the modelling of signalling and gene regulatory networks, which can also provide considerable insight into the interactions between the host and pathogen in infectious diseases [336]. This so-called network-based analysis of host-pathogen interactions is highly useful in improving our understanding of pathogenesis and pinpointing novel experimental and drug targets [279]. While signalling networks are driven primarily through PPIs, gene regulatory networks (GRNs), i.e., networks of regulatory relationships between transcription factors and their targets, are also crucial and have recently been gaining more attention. These networks are based on time-series gene expression data and have been particularly effective both in predicting host-pathogen interactions and in understanding the mechanistic basis of the underlying PHI networks [337]. One of the most widely used tools for this purpose is NetGenerator, a computational tool for inferring small-scale GRNs, which has been used to predict PHI networks [337], [338]. More recently, Castro and colleagues proposed Gene regulatory networks on transfer entropy (GRNTE) to examine the transcriptional regulatory network of the plant pathogen Phytophthora infestans as it infected two different host organisms [339].
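To make the notion of transfer entropy on time-series expression data concrete, the sketch below estimates it for a pair of genes. It does not reproduce the GRNTE or NetGenerator implementations. It assumes expression values are binarized around each gene's mean and uses a history length of one time step, i.e. $TE_{X \to Y} = \sum p(y_{t+1}, y_t, x_t) \log_2 \frac{p(y_{t+1} \mid y_t, x_t)}{p(y_{t+1} \mid y_t)}$. The function names and the expression values are hypothetical.

```python
from collections import Counter
from math import log2


def binarize(series):
    """Discretize expression values: 1 if above the series mean, else 0."""
    mean = sum(series) / len(series)
    return [1 if v > mean else 0 for v in series]


def transfer_entropy(source, target):
    """Plug-in estimate of transfer entropy (bits), history length 1."""
    if len(source) != len(target) or len(source) < 2:
        raise ValueError("series must have equal length of at least 2")
    # Each triple is (target at t+1, target at t, source at t).
    triples = list(zip(target[1:], target[:-1], source[:-1]))
    n = len(triples)
    c_full = Counter(triples)
    c_prev_both = Counter((yp, xp) for _, yp, xp in triples)
    c_next_prev = Counter((yn, yp) for yn, yp, _ in triples)
    c_prev = Counter(yp for _, yp, _ in triples)
    te = 0.0
    for (yn, yp, xp), c in c_full.items():
        p_joint = c / n
        p_given_both = c / c_prev_both[(yp, xp)]          # p(y_next | y_prev, x_prev)
        p_given_self = c_next_prev[(yn, yp)] / c_prev[yp]  # p(y_next | y_prev)
        te += p_joint * log2(p_given_both / p_given_self)
    return te


# Hypothetical expression time courses: the "target" follows the
# "regulator" with a delay of one time point.
regulator = [0.1, 0.9, 0.8, 0.2, 0.9, 0.1, 0.2, 0.8, 0.9, 0.1, 0.8, 0.2]
target = [0.1] + regulator[:-1]

x, y = binarize(regulator), binarize(target)
te_forward = transfer_entropy(x, y)   # regulator -> target
te_reverse = transfer_entropy(y, x)   # target -> regulator
print(te_forward > te_reverse)
```

In this toy case the estimate is larger in the regulator-to-target direction, which is the kind of asymmetry a transfer-entropy-based network would encode as a directed edge. With series this short the plug-in estimate is biased, so practical use would need longer time courses and a significance test against shuffled data.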

Databases related to HP-PNIs

Several web-based databases provide quick references on protein-nucleic acid interactions between host and pathogen, either exclusively or as part of a broader theme. The most significant among them are (1) LRRML: a conformational database and an XML description of LRRs [340], (2) TollML: a database of TLR structural motifs [341], (3) PRRDB 2.0: a comprehensive database of pattern-recognition receptors and their ligands [342], and (4) ProNIT: a database that uses structural and quantitative binding interaction data to elucidate the molecular mechanism of protein-nucleic acid recognition [343]. The sequence arrangement of amino acids or base pairs and the stable conformation of protein or DNA/RNA molecules indicate their specificity towards each other.

Current trends and evolving perspectives

With the emergence of genomic technologies, host-pathogen genetic studies have transitioned from single candidate gene studies to whole-genome studies of hosts and pathogens [344]. NGS techniques have enabled us to identify and characterize the regulatory mechanisms through which hosts and pathogens interact with each other in either a diseased or a healthy state. Such studies have pointed out that host-pathogen interactions depend not only on the host and pathogen genomic sequences but also on the ecological, immunological and epigenetic context in which the genomic data are collected [345]. For example, a study of mucus revealed that, in addition to genetic factors, post-translational modification plays a significant role in defence against pathogens [346], [347]. Pathogens represent a major portion of the biomass and diversity of several ecosystems [348], [349], [350], [351], [352]. PCR-based technologies and, more recently, high-throughput technologies have helped in the identification of novel pathogens especially viral sensing pathogens [353], [354], [355], [356], [357], [358]. Metagenomics approaches have further exposed pathogen- and host-associated microbial communities that play a significant role in infection and disease development, suggesting that a pathogen may not occur alone but may belong to a larger community [359], [360], [361], [362]. Several immune gene families including TLRs, IFNs and antimicrobial peptides have gained attention in host-pathogen interaction studies [363], [364], [365], [366], [367], [368], [369], [370]. These studies have highlighted the positive selection and rapid evolution of these immune genes in the innate immune system [371], [372], [373]. Metagenomics studies have also provided key insights into host-pathogen interactions. For example, several metagenomics studies unearthed the links between the gut microbiome and diseases like RA, diabetes, and depression [374], [375], [376].
Thus, multi-target drug discovery, metagenomics, NGS and the identification of better derivatives of natural agonists are likely to remain the focus areas of research in HP-PNIs for the foreseeable future.

Conclusion

Investigations into the detailed interactions and recognition of pathogen nucleic acids by host factors such as TLRs, RLRs and NLRs have yielded common as well as specific insights into the mechanisms underlying such interactions. However, significant challenges remain in deciphering the full spectrum of host-pathogen interactions and their potential implications in countering infection and in therapeutics. A deeper understanding of these interactions will help in the identification of new drug targets and of clinical and therapeutic strategies to manipulate them and counter infectious diseases. Such associations also offer a means to explore alternative approaches, such as targeting host proteins instead of pathogenic components to bypass the vexing pathogenic variability and genetic mutations. Further studies will improve both strategies to inhibit host-pathogen interactions and clinical outcomes.

Declaration of Competing Interest

The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
