
Effects of flanking regions on HDV cotranscriptional folding kinetics.

Yanli Wang¹, Zhen Wang¹, Taigang Liu¹, Sha Gong¹, Wenbing Zhang¹.

Abstract

Hepatitis delta virus (HDV) ribozyme performs the self-cleavage activity through folding to a double pseudoknot structure. The folding of functional RNA structures is often coupled with the transcription process. In this work, we developed a new approach for predicting the cotranscriptional folding kinetics of RNA secondary structures with pseudoknots. We theoretically studied the cotranscriptional folding behavior of the 99-nucleotide (nt) HDV sequence, two upstream flanking sequences, and one downstream flanking sequence. During transcription, the 99-nt HDV can effectively avoid the trap intermediates and quickly fold to the cleavage-active state. It is different from its refolding kinetics, which folds into an intermediate trap state. For all the sequences, the ribozyme regions (from 1 to 73) all fold to the same structure during transcription. However, the existence of the 30-nt upstream flanking sequence can inhibit the ribozyme region folding into the active native state through forming an alternative helix Alt1 with the segments 70-90. The longer upstream flanking sequence of 54 nt itself forms a stable hairpin structure, which sequesters the formation of the Alt1 helix and leads to rapid formation of the cleavage-active structure. Although the 55-nt downstream flanking sequence could invade the already folded active structure during transcription by forming a more stable helix with the ribozyme region, the slow transition rate could keep the structure in the cleavage-active structure to perform the activity.


Keywords: HDV ribozyme; cotranscriptional; folding kinetics; pathway; pseudoknot


Substances:
RNA, Catalytic
RNA, Viral

Year: 2018 PMID: 29954950 PMCID: PMC6097654 DOI: 10.1261/rna.065961.118

Source DB: PubMed Journal: RNA ISSN: 1355-8382 Impact factor: 4.942

INTRODUCTION

RNA can carry out numerous biological functions, such as translating genetic information into proteins (Skog et al. 2008; Mercer et al. 2009), regulating gene expression (Grundy and Henkin 1998; Batey et al. 2004), and catalyzing biochemical processes (Cheah et al. 2007; Neupane et al. 2011; Li and Breaker 2013; Lin and Thirumalai 2013; Reining et al. 2013; Hoffmann et al. 2014) by forming specific secondary and tertiary structures. RNA pseudoknots are examples of minimal structural motifs in structured RNAs with tertiary interactions. They have been found to play important roles in a wide range of biological functions, from ribosomal frameshifting (Gesteland and Atkins 1996; Kim et al. 1999; Giedroc et al. 2000; Plant et al. 2003; Cornish et al. 2005; Plant and Dinman 2005) to human telomerase RNA (hTR) activity (Comolli et al. 2002; Theimer et al. 2003; Chen and Greider 2005; Marrone et al. 2005). Many ribozymes (Lehnert et al. 1996; Zarrinkar and Williamson 1996; Ferré-D'Amaré et al. 1998; Treiber et al. 1998; Pan and Woodson 1999; Russell et al. 2000; Schultes and Bartel 2000) form a well-defined 3D enzymatic shape with pseudoknots. HDV is a small, single-stranded RNA satellite of hepatitis B virus (HBV), which can enhance the virulence of HBV infections. HDV replicates by a double rolling-circle model and the nascent RNA is processed into monomers by self-cleavage of the genomic or antigenomic ribozyme (Lai 1995; Taylor et al. 1996; Taylor 2006; Kapral et al. 2014). The pseudoknot structure is also important for HDV ribozyme function (Perrotta and Been 1990, 1991; Wadkins et al. 1999). A detailed understanding of the folding pathway of this ribozyme may provide insight into replication of HDV and help identify targets for therapeutics. The folding of functional RNA structures is often coupled with the transcription process (Sharma et al. 2010; Ameur et al. 2011; Tilgner et al. 2012; Brugiolo et al. 2013; Hamperl and Cimprich 2014). 
For instance, the functional native structure of the Tetrahymena group I intron may form within the timescale of transcription, which is much faster than the refolding of the complete chain in vitro (Brehm and Cech 1983; Wu and Tinoco 1998; Treiber and Williamson 2001; Heilman-Miller and Woodson 2003). It has been proposed that natural RNAs can effectively avoid the formation of misfolded structures during the cotranscriptional folding process (Kramer and Mills 1981; Treiber and Williamson 1999; Wong et al. 2007), but the mechanism is still not completely clear. In vivo, during transcription elongation, the upstream section has already folded by the time the downstream section emerges, which further influences the folding pathways of the downstream section. Recently, a few experiments have studied the self-cleavage activity of the HDV ribozyme during transcription (Chadalavada et al. 2000, 2002, 2007; Diegelman-Parente and Bevilacqua 2002). A clear and detailed understanding of the kinetic process of RNA folding, including the folding pathways during transcription, is crucial for uncovering the mechanism of RNA functions. Many theoretical methods have been used to study RNA folding kinetics. A few works have attempted to improve the energy parameters of pseudoknots (Cao and Chen 2005; Zhang et al. 2008). Molecular dynamics and Monte Carlo (MC) simulation approaches have been used to study the transition states, kinetic intermediates, and folding trajectories for a few specific sequences (Gultyaev 1991; Flamm et al. 2000; Isambert and Siggia 2000; Sorin et al. 2002; Krasovska et al. 2005; Yingling and Shapiro 2005; Danilova et al. 2006; Lin and Thirumalai 2008; Veeraraghavan et al. 2010), but the methods are limited to short timescales because of limited computational efficiency and incomplete conformational sampling. Though coarse-grained models have already been developed to study RNA folding (Cho et al. 2009; Denesyuk and Thirumalai 2011; Shi et al. 2011), they are still limited to simple H-pseudoknot folding without long-lived intermediate states. Based on coarse-grained kinetic moves, which can effectively reduce the number of conformations, several computational methods have been developed (Mironov et al. 1985; Geis et al. 2008; Tang et al. 2008; Hofacker et al. 2010); however, these simplified kinetic moves would miss important folding pathways for some sequences/structures. Cao and Chen (2006) investigated RNA pseudoknot folding and unfolding kinetics with a combined master equation and kinetic cluster approach, but the method is also limited to short chains with small conformational preequilibrated macro-states. In this study, we combine the methods for predicting the folding kinetics of pseudoknots (Chen et al. 2014), the cotranscriptional folding kinetics theory (Zhao et al. 2011; Gong et al. 2015a) of secondary structures, and the transition node approximation method (Gong et al. 2015b) for long RNAs to establish a systematic method for predicting the cotranscriptional folding kinetics of a long RNA chain with pseudoknots. We then studied how the flanking regions affect the HDV self-cleavage function during transcription.

RESULTS

Cotranscriptional folding for 99-nt HDV ribozyme can avoid forming trap intermediates

To test the validity of the transition node approximation, we first predicted the cotranscriptional folding kinetics of the 99-nt sequence (which contains the 84-nt ribozyme sequence and a 15-nt downstream flanking sequence) for the HDV ribozyme at a transcription rate of 15 nt/sec using both the original theory (Zhao et al. 2011) and the approximation method. Applying the transition node approximation reduces the number of conformations at the end of transcription from 1884 to 142. As shown in Figure 1, from step 1 to step 99 the population kinetics of the main states are almost identical for the two methods. This indicates that the approximation effectively reduces the number of conformations and can reliably predict the cotranscriptional folding kinetics of longer nascent RNA chains. Hence, we used the new method to study the cotranscriptional folding kinetics of HDV sequences of different lengths.

FIGURE 1.

The population kinetics of main states (C0–C8) formed during the transcription (A) at a transcription rate of 15 nt/sec with the original theory, (B) and (C) with the node approximation method at a transcription rate of 15 and 40 nt/sec, respectively. (D) The folded structures and pathways during transcription. Upper case letters denote the ribozyme regions; lower case letters denote the flanking regions. The transition rates (unit sec−1) along the arrow are labeled.

For the 99-nt sequence, as the chain grows, the nascent RNA chain folds to the native state C8 through a series of discrete intermediate states (C0–C8) (Fig. 1): (i) When the first 13 nucleotides (nt) are released by the RNAP, the hairpin structure C1 is formed. (ii) At step 29, structure C1 quickly converts to structure C2 by adding a new hairpin. (iii) From step 37, most of structure C2 quickly transits to structure C3, and a small fraction of structure C2 converts to structure C4. In addition, structure C3 can slowly transit to structure C4. (iv) At step 63, almost all of structure C3 converts to structure C5, which persists for only several steps. (v) From step 64, a part of helix P4 begins to form, so structures C5 and C4 quickly convert to structures C6 and C7, respectively, by adding helix P4. At the same time, structure C6 can slowly transit to structure C7. (vi) From the 81st step, structure C6 quickly converts to the native structure C8, and a small fraction transits to C7. At the end of transcription, the inactive structure C7 occupies only ∼10.18% and the active structure C8 occupies ∼78.37%. The cotranscriptional folding of the wild-type 99-nt HDV ribozyme can thus effectively avoid the metastable intermediate C7, which in the refolding process occupies ∼50% of the population and lasts ∼30 min before transiting to the native state (Chadalavada et al. 2000, 2002). 
The cotranscriptional folding also shows bifurcated folding behavior from step 34: one pathway, C2–C3–C5–C6–C8, folds directly to the native state, whereas the other, C2–C4–C7, folds to the intermediate state. Although structure C2 begins to convert directly to structures C3 and C4 at the same time from step 34, the transition rate from C2 to C3, k(C2→C3) ≈ 1.48 × 10² sec⁻¹, is larger than that from C2 to C4, k(C2→C4) ≈ 3.09 × 10⁰ sec⁻¹, because the latter transition is not a zipping process owing to a bulge loop. Even though structure C4 (ΔG = −21.19 kcal/mol) is more stable than structure C3 (ΔG = −19.31 kcal/mol), when the sequence grows to step 39, most of the population of structure C2 fluxes to structure C3 instead of C4, which indicates that cotranscriptional folding is a nonequilibrium kinetic process. Structure C3 can still transit to C4 through a helix-exchange pathway (helix P1 exchanging with helix AltP1); according to the steady-state approximation, the rates between the two structures obtained through Equation 8 are k(C3→C4) ≈ 0.04 sec⁻¹ and k(C4→C3) ≈ 1.65 × 10⁻³ sec⁻¹. Along the two pathways C3–C5–C6–C8 and C4–C7, the transitions are very fast because a new helix is added at each transition step, so most of the population transits to the native state C8 from C3, and only a small fraction of the population transits to the intermediate state C7 from C4. Structure C6 can also transit to structure C7 in the same way as C3 transits to C4, through helix AltP1 exchanging with helix P1, and with the same transition rates. So from step 39 to 81, at which structure C8 is more stable than C7, the population flow from the upper pathway to the lower pathway at the time window (81 − 39)/υ sec could be approximated as . To further explore the effects of the transcription rate on the cotranscriptional folding kinetics, the cotranscriptional folding behavior at a transcription rate of 40 nt/sec has been studied. 
As shown in Figure 1C, the only difference is that the intermediate state C7 has an even lower population. Because the time window 42/υ sec for flow from the upper pathway to the lower pathway is shorter at a transcription rate of 40 nt/sec than at 15 nt/sec, the population of structure C7 decreases from 10.18% to 3.93%. Although the population of structure C8 at step 99 decreases from ∼78.38% to 53.83%, the remaining population is in structure C6 and would transit to C8 within ∼0.18 sec. The results indicate that accelerating transcription leads to more conformational flux to the native structure. As the transcription rate further increases to 200 nt/sec, the lower pathway almost vanishes, and at the end of transcription the intermediate structure occupies only ∼1.87% (see Fig. 2A). Although the structures on the slow pathway (C4 and C7) have the lowest free energy during transcription until step 81, cotranscriptional folding is a nonequilibrium kinetic process, so even at the slowest transcription rate they occupy only a small portion of the population, far less than their equilibrium population (∼90%). At a higher transcription rate, there is less time to equilibrate, so the intermediate state acquires less population.
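The dependence of the C7 population on the transcription rate can be illustrated with a rough first-order estimate. The sketch below is an assumption, not the expression used in the paper: it treats the flow from the upper to the lower pathway as a single irreversible leak with the quoted rate k(C3→C4) ≈ 0.04 sec⁻¹ acting over the window (81 − 39)/υ sec, and it ignores back-flow and the population that reaches C4 directly from C2. The function name `leaked_fraction` is hypothetical.

```python
import math

K_C3_TO_C4 = 0.04      # sec^-1, slow helix-exchange rate quoted in the text
STEPS_OPEN = 81 - 39   # transcription steps during which the leak is possible


def leaked_fraction(rate_nt_per_sec, k=K_C3_TO_C4, steps=STEPS_OPEN):
    """Assumed first-order leak: fraction transferred within steps/rate seconds."""
    window_sec = steps / rate_nt_per_sec
    return 1.0 - math.exp(-k * window_sec)


for rate in (15, 40, 200):
    window = STEPS_OPEN / rate
    print(f"{rate:>3} nt/sec: window = {window:.2f} sec, "
          f"leaked fraction = {leaked_fraction(rate):.3f}")
```

Under this assumption the leaked fraction falls monotonically as transcription speeds up, which is the same trend as the reported C7 populations.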

FIGURE 2.

The population kinetics of main states (C0–C8) formed during the transcription (A) at a transcription rate of 200 nt/sec, (B) at a transcription rate of 15 nt/sec with pausing (45 sec) at step 39.

Transcription often pauses at U-rich sites (Gusarov and Nudler 1999; Artsimovitch and Landick 2000). Recently, it was found that pausing at a specific site could lead to different folded structures (Helmling et al. 2018). Here, we use the 99-nt HDV ribozyme sequence as a model to test the effects of transcription pausing on folding. With a pause of 45 sec at step 39, the folding results differ from those without pausing (Fig. 2B). At the end of transcription, the inactive intermediate structure C7 occupies ∼80% of the population. Transcription pausing provides extra time for the system to equilibrate; because the intermediate structures (C4 and C7) have the lowest free energy, more of the population goes to the intermediate structure.
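The effect of the pause can be illustrated with a two-state relaxation between C3 and C4, using the rates quoted earlier for step 39 (k(C3→C4) ≈ 0.04 sec⁻¹, k(C4→C3) ≈ 1.65 × 10⁻³ sec⁻¹). This is a minimal sketch under stated assumptions: the two states are treated as isolated from all others during the pause, and the starting C4 fraction is set to zero for simplicity. The function name is hypothetical.

```python
import math

K_FWD = 0.04      # sec^-1, C3 -> C4 (quoted in the text)
K_REV = 1.65e-3   # sec^-1, C4 -> C3 (quoted in the text)


def c4_fraction_after_pause(pause_sec, start_fraction=0.0):
    """Closed-form two-state relaxation C3 <-> C4 (assumed isolated pair)."""
    k_sum = K_FWD + K_REV
    equilibrium = K_FWD / k_sum   # long-time C4 fraction of the pair
    return equilibrium + (start_fraction - equilibrium) * math.exp(-k_sum * pause_sec)


print(f"C4 fraction after a 45-sec pause: {c4_fraction_after_pause(45.0):.2f}")
print(f"C4 fraction after 1/15 sec (no pause): {c4_fraction_after_pause(1 / 15):.4f}")
```

In this simplified picture a 45-sec pause is long compared with 1/(k(C3→C4) + k(C4→C3)) ≈ 24 sec, so most of the pair relaxes toward C4, whereas a single unpaused step transfers almost nothing.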

The 30-nt upstream fragment can lead to HDV ribozyme inactivity by forming an Alt1 helix with the segments 70–90

To explore how the flanking sequence affects genomic HDV self-cleavage during transcription, the cotranscriptional folding kinetics of the 129-nt HDV ribozyme (the −30 to 99 fragment, with a 30-nt upstream segment) were also studied. As the chain grows, the nascent chain folds into a series of discrete intermediate states, and at the end of transcription structure S12 occupies ∼87.1% and structure S13 occupies ∼10.2%. Neither contains the pseudoknot structure (Fig. 3), so the 30-nt upstream sequence leaves the HDV ribozyme in cleavage-inactive states. The experimental results showed that for this sequence the reaction was not complete after 24 h and helix Alt1 was formed (Chadalavada et al. 2000; Diegelman-Parente and Bevilacqua 2002). Our results are consistent with the experiments. Although there are small differences among the folded structures of the 129- and 99-nt sequences before nucleotide 63 is transcribed, the ribozyme regions fold into the same structures, denoted by S10 and C6, respectively, when nucleotides 64–81 are transcribed. For the 99-nt sequence, structure C6 transits to the cleavage-active state C8. For the 129-nt sequence, structure S10 could also transit to the cleavage-active state S14 with the same rate as that from C6 to C8. But once nucleotide 82 is transcribed, the upstream segment is able to form an Alt1 helix with nucleotides 79–86, so structure S10 can also transit to structure S12 by adding the Alt1 helix. Because this transition only requires adding a helix, the transition rate from S10 to S12, k = 4.13 × 10³ sec⁻¹, is much larger than that from S10 to S14. Also, the free energy of S12 (ΔG = −52.70 kcal/mol) is much lower than that of S14 (ΔG = −43.45 kcal/mol). So, most of the population transits to the cleavage-inactive state S12.

FIGURE 3.

(A) The population kinetics of main states (S0–S14) formed during the transcription at a transcription rate of 15 nt/sec for HDV ribozyme with 30-nt upstream segment. (B) The folded structures and the transition pathways during the transcription. Upper case letters denote the ribozyme regions; lower case letters denote the flanking regions. The transition rates (unit sec−1) along the arrow are labeled.

The 54-nt upstream region can restore HDV ribozyme activity by forming a self-structure P(−1) helix

For the 153-nt HDV ribozyme (from −54 to 99, with a 54-nt upstream segment), the nascent chain folds into a series of discrete intermediate states, and at the end of transcription the active structure X12 occupies ∼83.1%, while the inactive structure X13 occupies only 5.63% (Fig. 4). This differs from the result for the 30-nt upstream flanking sequence but is similar to that for the 99-nt sequence. For the 153-, 129-, and 99-nt sequences, when nucleotide 73 is released, the ribozyme regions (from nucleotide 1 to nucleotide 73) of the three sequences fold to the same structures, denoted as X10, S10, and C6, respectively. The main difference is that for the 153-nt sequence, structure X10 includes a stable helix P(−1), which is formed by the upstream sequence (from −54 to −17). Because the early formed P(−1) helix is stable, exchanging this helix with Alt1 is both thermodynamically and kinetically unfavorable. So the P(−1) helix can prevent the 30-nt upstream segment from pairing with nucleotides 79–86 to form an Alt1 helix. Structure X10 transits to the cleavage-active state X12 in a way similar to the transition of structure C6 to structure C8 for the 99-nt sequence. For this sequence, the experiments (Chadalavada et al. 2000; Diegelman-Parente and Bevilacqua 2002) detected that the RNA was nearly completely cleaved during transcription and detected the hairpin P(−1). Our prediction is in agreement with the experimental results.

FIGURE 4.

(A) The population kinetics of main states (X0–X13) formed during the transcription at a transcription rate of 15 nt/sec for 153 nt. (B) The folded structures and the transition pathways during the transcription. Upper case letters denote the ribozyme regions; lower case letters denote the flanking regions. The pathways, the structure of the states, and the transition rates (unit sec−1) along the arrow are labeled.

The cotranscriptional folding is important for the HDV ribozyme with a 55-nt downstream region to perform cleavage activity

To further study the effects of the downstream flanking sequence on the cotranscriptional folding kinetics, we studied the cotranscriptional folding kinetics of the 140-nt sequence (85-nt ribozyme region and 55-nt downstream fragment). When nucleotide 110 is released, it folds into the same structures as the 99-nt sequence: the active structure C8 occupies ∼85% of the population, and the inactive structure C7 occupies ∼12% of the population (Fig. 5). When nucleotide 113 is transcribed, the downstream segment folds to a hairpin AltP5, and C8 transits to structure C10 by adding the helix AltP5. On further elongation to nucleotide 128, the downstream segment folds to a more stable helix P5, and the structure transits from C10 to C14 through helix exchange. Meanwhile, a small portion of the population transits from C7 to C12 in the same way as from C8 to C14. The self-cleavage structure C14 occupies ∼81.5%, and the inactive structure C13 occupies ∼14.7%. When the downstream segment 129 to 132 is transcribed, it can form base pairs with segment 80 to 83 of the ribozyme region, and structure C12 transits to C13. This helix cannot coexist with the P2 helix of the active structure C14. The free energy of structure C13 (ΔG = −56.78 kcal/mol) is lower than that of C14 (ΔG = −53.80 kcal/mol), but because of cotranscriptional folding, C14 occupies most of the population before C13 can be formed. Because the transition rate from C14 to C13, k ≈ 0.04 sec⁻¹, is much slower than the self-cleavage rate, k = 40 min⁻¹ (Diegelman-Parente and Bevilacqua 2002), most of the native structure is self-cleaved before the native structure C14 transits to the alternative structure C13. For this sequence the experimental results showed that the RNA was almost completely cleaved during transcription and inferred the existence of helix P5 (Diegelman-Parente and Bevilacqua 2002; Chadalavada et al. 2007). 
Our prediction is consistent with the experimental results.
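The comparison between the two rates can be made concrete by converting them to the same units. The sketch below assumes that, once C14 is formed, self-cleavage and the C14→C13 transition are the only two competing first-order processes; the function name `fraction_cleaved` is hypothetical and the branching formula is a standard kinetic estimate rather than a calculation from the paper.

```python
K_CLEAVE_PER_MIN = 40.0   # min^-1, self-cleavage rate quoted in the text
K_C14_TO_C13 = 0.04       # sec^-1, helix-invasion rate quoted in the text


def fraction_cleaved(k_cleave_per_sec, k_competing_per_sec):
    """Branching ratio of two competing first-order processes (assumed model)."""
    return k_cleave_per_sec / (k_cleave_per_sec + k_competing_per_sec)


k_cleave = K_CLEAVE_PER_MIN / 60.0   # convert min^-1 to sec^-1
print(f"cleavage rate = {k_cleave:.3f} sec^-1")
print(f"estimated fraction cleaved before invasion = "
      f"{fraction_cleaved(k_cleave, K_C14_TO_C13):.3f}")
```

With both rates in sec⁻¹, cleavage is more than an order of magnitude faster than the competing transition, which is why the C14 population is expected to cleave before it converts to C13.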

FIGURE 5.

(A) The population kinetics of main states (C0–C14) formed during the transcription at a transcription rate of 15 nt/sec for HDV ribozyme with 55-nt downstream segment. (B) The folded structures and the transition pathways during the transcription. C0 to C7 are the same as in Figure 1.

DISCUSSION

HDV, a human pathogen, has genomic and antigenomic versions of a small ribozyme embedded in its 1.7-kb RNA genome (Lai 1995; Lazinski and Taylor 1995; Karayiannis 1998). By virtue of their roles in rolling-circle viral replication, these small ribozymes have evolved to function cotranscriptionally in the presence of flanking sequence. To study the cotranscriptional folding kinetics, we have extended the newly developed RNA cotranscriptional folding kinetics theory (Zhao et al. 2010, 2011; Gong et al. 2015a) and the transition node approximation method (Gong et al. 2015b) to structures with pseudoknots (Chen et al. 2014). In this approach, the elementary steps are (i) formation of a helix stem, (ii) disruption of a helix stem, and (iii) helix formation with concomitant partial melting of an incompatible helix; their transition rates are calculated from the free energy landscape. Based on the creation/disruption and exchange of helices, we investigated the cotranscriptional folding behavior of the HDV ribozyme at discrete steps. The cotranscriptional folding kinetics for the 99-nt HDV ribozyme sequence, two upstream flanking sequences, and one downstream flanking sequence were studied with this new model. Our results indicate that this method is reliable for calculating the cotranscriptional folding kinetics of long mRNA chains. According to our calculations, the cotranscriptional folding kinetics of the 99-nt HDV ribozyme is dramatically different from its refolding kinetics: refolding leads into an intermediate trap state, whereas during transcription the ribozyme can effectively avoid the trap intermediates and quickly fold to the cleavage-active state. However, the existence of the 30-nt upstream flanking sequence can inhibit the ribozyme region folding into the active native state through forming an alternative helix Alt1 with the segments 70–90. 
The longer upstream flanking sequence of 54 nt itself forms a stable hairpin structure that sequesters the formation of the Alt1 helix and leads to rapid formation of the cleavage-active structure. Although the 55-nt downstream flanking sequence could invade the already folded active structure during transcription by forming a more stable helix with the ribozyme region, the slow transition rate could keep the structure in the cleavage-active structure to perform the activity. Our method, based on the transition node approximation, can efficiently reduce the conformation ensemble to study the cotranscriptional folding kinetics of long RNA chains. The transcription rate and transcription pausing, which can be regulated by RNAP, the concentration of ribonucleoside triphosphate (rNTP), proteins, and other cellular environments, could also be incorporated into our methods. However, the current theory has limitations: The free energy parameters of this model are those of RNA in 1 M NaCl solution, and the effect of Mg²⁺ ions on these parameters is neglected. But Mg²⁺ can significantly stabilize the tertiary interactions (Draper 2004; Draper et al. 2005; Grilley et al. 2006; Tan and Chen 2010, 2011) and thus may alter the folding pathways. Furthermore, we could incorporate the tertiary interaction into this method. First, we obtain all the tertiary structures formed during transcription from the predicted secondary structures; a few methods have been developed to predict the tertiary structures from the secondary structures (Das and Baker 2007; Parisien and Major 2008; Popenda et al. 2012; Zhao et al. 2012; Xu et al. 2014). Then we could get the tertiary interaction through a molecular dynamics simulation (Wang et al. 2016). Hence, this method should be improved to enhance its efficiency as well as to incorporate the Mg²⁺ effects and the effects of other cofactors, such as divalent counterion-specific binding and tertiary interaction, into the model.

MATERIALS AND METHODS

Master equation

Assuming that there are Ω states in the RNA conformation space, the population kinetics $P_i(t)$ for each state i in the conformation space at time t can be described by the master equation:

$$\frac{dP_i(t)}{dt} = \sum_{j} \left[ k_{j\to i} P_j(t) - k_{i\to j} P_i(t) \right],$$

where $\sum_j$ denotes the sum over all the conformations, and $k_{i\to j}$ ($k_{j\to i}$) is the rate constant for the transition from state i to state j (state j to state i). The master equation can be written in the following matrix form: $d\mathbf{P}(t)/dt = \mathbf{M}\cdot\mathbf{P}(t)$, where $\mathbf{P}(t) = (P_1(t), P_2(t), \ldots, P_\Omega(t))$ is the fractional population vector, and $\mathbf{M}$ is the Ω × Ω rate matrix with elements $M_{ij} = k_{j\to i}$ for i ≠ j and $M_{ii} = -\sum_{j\neq i} k_{i\to j}$. Solving the master equation yields the population kinetics for t > 0, given the initial folding condition, as the following equation:

$$\mathbf{P}(t) = \sum_{m=1}^{\Omega} C_m \mathbf{n}_m e^{-\lambda_m t},$$

where $-\lambda_m$ and $\mathbf{n}_m$ are the m-th eigenvalue and eigenvector of the rate matrix $\mathbf{M}$, respectively, and $C_m$ is the coefficient that is dependent on the initial condition.
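To make the master equation concrete, the sketch below builds a small rate matrix and propagates the populations over one transcription step. It is illustrative only: it uses a plain-Python matrix exponential (scaling and squaring with a truncated Taylor series) instead of an explicit eigendecomposition, the convention `M[i][j] = k_{j→i}` for the off-diagonal elements is an assumption of this sketch, and the three-state system (C2, C3, C4) uses the rates quoted in the Results with the unquoted reverse rates out of C3 and C4 back to C2 set to zero as a simplification.

```python
def rate_matrix(rates, n):
    """M[i][j] = k_{j->i} for i != j; M[i][i] = -(total rate of leaving state i)."""
    m = [[0.0] * n for _ in range(n)]
    for (src, dst), k in rates.items():
        m[dst][src] += k
        m[src][src] -= k
    return m


def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(m, t, squarings=10, terms=12):
    """exp(M t) by scaling and squaring with a truncated Taylor series."""
    n = len(m)
    scale = t / (2 ** squarings)
    a = [[m[i][j] * scale for j in range(n)] for i in range(n)]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms + 1):
        term = [[x / order for x in row] for row in mat_mul(term, a)]
        result = [[result[i][j] + term[i][j] for j in range(n)] for i in range(n)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def propagate(m, p0, t):
    """Return P(t) = exp(M t) P(0)."""
    u = mat_exp(m, t)
    return [sum(u[i][j] * p0[j] for j in range(len(p0))) for i in range(len(p0))]


# States: 0 = C2, 1 = C3, 2 = C4; rates in sec^-1 as quoted in the Results.
RATES = {(0, 1): 148.0, (0, 2): 3.09, (1, 2): 0.04, (2, 1): 1.65e-3}
M = rate_matrix(RATES, 3)
p_end = propagate(M, [1.0, 0.0, 0.0], 1.0 / 15.0)   # one step at 15 nt/sec
print([round(p, 4) for p in p_end])
```

Because every column of `M` sums to zero, the total population is conserved. For large conformation spaces an eigendecomposition or a library matrix exponential would be the more practical choice.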

Generating conformation space

In our model, the rates for formation ($k_+$) and disruption ($k_-$) of a base stack can be calculated as $k_+ = k_0 e^{-\Delta G_+/k_B T}$ and $k_- = k_0 e^{-\Delta G_-/k_B T}$ (Zhao et al. 2010), respectively, where $k_0$ is a prefactor, $k_B$ is the Boltzmann constant, T is the temperature, and $\Delta G_+$ and $\Delta G_-$ are the energy barriers for the formation and disruption of a stack. Recently, molecular dynamics simulation has validated that the kinetic barrier for the formation of a base stack is the concomitant entropic decrease, $\Delta G_+ = T\Delta S$, and that the barrier for the disruption of a base stack is the enthalpic cost, $\Delta G_- = \Delta H$ (Wang et al. 2016). Thus, the rates for formation and disruption of a base stack (not closing a loop) can be written as follows:

$$k_+ = k_0 e^{-\Delta S_{\mathrm{stack}}/k_B}, \qquad k_- = k_0 e^{-\Delta H_{\mathrm{stack}}/k_B T}.$$

The rates for formation and disruption of a loop-closing stack (together with the loop) are

$$k_+ = k_0 e^{-(\Delta S_{\mathrm{loop}} + \Delta S_{\mathrm{stack}})/k_B}, \qquad k_- = k_0 e^{-\Delta H_{\mathrm{stack}}/k_B T},$$

where $-\Delta S_{\mathrm{loop}}$ is the entropy change of the loop, and $-\Delta S_{\mathrm{stack}}$ and $-\Delta H_{\mathrm{stack}}$ are the entropy and enthalpy changes upon formation of the stack. The prefactor $k_0$ is equal to 6.6 × 10¹² sec⁻¹ for the formation/disruption of a GC base pair and 6.6 × 10¹³ sec⁻¹ for an AU base pair (Zhang and Chen 2006; Zhao et al. 2010, 2011; Chen and Zhang 2012; Chen et al. 2014). Under the folding condition, the rate for the formation of a base stack is usually larger than that for disrupting the stack, except for the loop-closing stack; hence, once the first few stacks in a helix are closed and stabilized, zipping of the subsequent stacks in the helix is fast and the helix quickly reaches its fully folded form (Zhao et al. 2010). This suggests that it is proper to use helices as building blocks for the study of the overall (i.e., slower) folding kinetics. In our model, the conformation space is constructed from helix-based building blocks, and the kinetic move is the addition or deletion of a helix or an exchange between two helices (Chen et al. 2014). In our model (Zhao et al. 2010), RNA structures are constructed from helices, which consist of consecutive base stacks. 
There are three types of relationships between two helices (Fig. 6):

FIGURE 6.

Relationship of two helices, compatible: (A) and (B), partially compatible: (C) and (D), incompatible: (E).

Compatible: The two helices have no overlapping nucleotides with each other (Fig. 6A and B for the secondary structure and the pseudoknot structure, respectively).
Partially compatible: The two helices partially overlap with each other (Fig. 6C and D for the secondary structure and the pseudoknot structure, respectively).
Incompatible: The two helices overlap with each other and cannot coexist in the same structure (Fig. 6E).

Each structure must consist of compatible or partially compatible helices. If helix $H_{m+1}$ is compatible with all the helices $H_i$ (1 ≤ i ≤ m) of a structure containing m helices $\{H_1, H_2, \ldots, H_i, \ldots, H_m\}$, then a new structure with m + 1 helices $\{H_1, H_2, \ldots, H_i, \ldots, H_{m+1}\}$ can be formed by adding the new helix $H_{m+1}$ to the m-helix structure. However, if helix $H_{m+1}$ is partially compatible with helix $H_i$ and compatible with all other helices, adding the helix $H_{m+1}$ would involve an ensemble of (m + 1)-helix conformations, which contain partially melted helix $H_i$ and partially melted helix $H_{m+1}$. Because some base pairs in helix $H_i$ prohibit the formation of certain base pairs in helix $H_{m+1}$, the disruption of such incompatible base pairs in $H_i$ allows the formation of base pairs in $H_{m+1}$. The free energy of conformations without pseudoknots is calculated by the nearest-neighbor model (Xia et al. 1998; Mathews et al. 1999). The free energy of pseudoknot-related conformations is calculated with the model proposed by Chen et al. (2014), which is based on the model of Rivas and Eddy (1999). 
In the model, the stability of the stacks does not change whether or not they are involved in a pseudoknot, while the free energy of a pseudoknot loop is calculated from the free energy of the loops before the pseudoknot is formed, the number of free bases in the pseudoknot, and the number of paired bases in the pseudoknot.
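The base-stack rates defined at the start of this subsection can be evaluated numerically as follows. This is a minimal sketch, assuming the barriers ΔG₊ = TΔS and ΔG₋ = ΔH stated in the text and treating ΔS and ΔH as positive magnitudes; the function names and the thermodynamic values in the example (ΔH = 10 kcal/mol, ΔS = 0.025 kcal/(mol·K), 37°C) are hypothetical illustrations, not parameters from the paper.

```python
import math

K_B = 1.987e-3   # kcal/(mol*K), Boltzmann constant per mole (gas constant)
K0_GC = 6.6e12   # sec^-1, prefactor quoted in the text for a GC base pair


def stack_formation_rate(k0, delta_s_stack, delta_s_loop=0.0):
    """k+ = k0 * exp(-(dS_stack + dS_loop) / kB); entropies as positive magnitudes."""
    return k0 * math.exp(-(delta_s_stack + delta_s_loop) / K_B)


def stack_disruption_rate(k0, delta_h_stack, temperature):
    """k- = k0 * exp(-dH_stack / (kB * T)); enthalpy as a positive magnitude."""
    return k0 * math.exp(-delta_h_stack / (K_B * temperature))


# Hypothetical stack: dH = 10 kcal/mol, dS = 0.025 kcal/(mol*K), T = 310.15 K.
T = 310.15
DH, DS = 10.0, 0.025
k_plus = stack_formation_rate(K0_GC, DS)
k_minus = stack_disruption_rate(K0_GC, DH, T)
print(f"k+ = {k_plus:.3e} sec^-1, k- = {k_minus:.3e} sec^-1")
print(f"k+/k- = {k_plus / k_minus:.1f}")
```

With these forms the ratio k₊/k₋ equals exp(−ΔG/k_BT) with ΔG = −ΔH + TΔS, so the two rates automatically satisfy detailed balance for the stack. Passing a nonzero `delta_s_loop` lowers only the formation rate, which is why a loop-closing stack can form more slowly than it is disrupted.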

Kinetic move set and rate constant calculation

Rate of adding or deleting a helix

Under the folding condition, the zipping pathway is the most probable pathway for forming a helix (Zhao et al. 2010). For the zipping pathway (Fig. 7A), the rate for forming the first loop-closing stack is much smaller than that for adding a new stack to an existing stack, so the formation of the first stack is rate-limiting for formation of the helix, and the free energy landscape shows a downhill profile after the formation of the third base stack. The rate of helix formation (along one specific pathway) can therefore be approximated as the rate for the formation of the three-stack state (Zhao et al. 2010), which is expressed in terms of the transition rates from state i to state j and the forward and reverse probabilities of each state i.

FIGURE 7.

(A) The free energy landscape along the zipping pathway. (B) Multiple pathways for the formation of a helix, the red solid lines denote the newly formed base pairs.

Because the first base stack can form anywhere inside the helix for a given RNA molecule, the rate for formation of a helix is the sum of the rates along all the pathways (Fig. 7B) with different first (nucleation) base stacks. The rate for deleting the helix can be estimated from the detailed balance condition, $k_{\mathrm{delete}} = k_{\mathrm{form}}\, e^{\Delta G/k_B T}$, where ΔG is the free energy difference between the two structures (the structure with the helix minus the structure without it).
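The two statements above, summing over nucleation pathways and obtaining the deletion rate from detailed balance, can be sketched as follows. The sign convention (ΔG taken as the free energy of the structure with the helix minus that of the structure without it, so that it is negative for a stabilizing helix) is an assumption of this sketch, and the pathway rates and ΔG in the example are hypothetical numbers chosen only for illustration.

```python
import math

KBT_37C = 1.987e-3 * 310.15   # kcal/mol, thermal energy at 37 degrees C


def helix_formation_rate(pathway_rates):
    """Total formation rate: sum over pathways with different nucleation stacks."""
    return sum(pathway_rates)


def helix_deletion_rate(k_form, delta_g_helix, kbt=KBT_37C):
    """Detailed balance: k_delete = k_form * exp(dG / kBT), dG = G(with) - G(without)."""
    return k_form * math.exp(delta_g_helix / kbt)


# Hypothetical helix with three nucleation pathways and dG = -3 kcal/mol.
k_form = helix_formation_rate([50.0, 60.0, 38.0])
k_delete = helix_deletion_rate(k_form, -3.0)
print(f"k_form = {k_form:.1f} sec^-1, k_delete = {k_delete:.2f} sec^-1")
```

The more stable the helix (the more negative ΔG), the slower its deletion relative to its formation, which keeps the equilibrium populations of the two structures consistent with their free energies.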

Rate of exchanging between two helices

If two helices are incompatible, they cannot coexist in the same structure. The most probable pathway of the conversion from helix A to helix B is the tunneling pathway (Fig. 8), where after the first two base stacks of helix A are disrupted, in each subsequent step, disruption of a stack in helix A is accompanied by formation of a stack in helix B. Based on the tunneling pathway, the rate for helix exchange can be calculated (Zhao et al. 2010) from the rate constants for the formation and disruption of a base stack in helices A and B and from ΔG, the free energy difference between structure A and structure B.

FIGURE 8.

The free energy landscape for the transition between helices A and B. (Solid line) The tunneling pathway. (Dotted line) Completely unfolding helix A followed by refolding to B. The 5′ end of the RNA chain is denoted by a blue pentagram.

Cotranscriptional folding kinetics

In our model (Zhao et al. 2011), the release of one nucleotide by RNA polymerase (RNAP), after which the nucleotide is free to form possible structures, is regarded as one transcriptional step. If the transcription speed of an RNA sequence is ν nucleotides per second, the (real) time window between the transcription of the M-th nucleotide (nt) and that of the (M+1)-th nucleotide is 1/ν sec. During this window, the M-nt chain samples the conformation space Ω, and its population distribution relaxes from [p1^b(M), p2^b(M), …, pΩ^b(M)] to [p1^e(M), p2^e(M), …, pΩ^e(M)], where pi^b(M) and pi^e(M) are the populations of state i at the beginning and the end of step M, respectively. This is defined as the M-nt step. For each step, the population kinetics is calculated in the same manner. The beginning population of the M-th step is inherited from the ending population of the (M−1)-th step. According to the possible changes of the structures upon extension of the chain by one nucleotide, the structures in the current step can be classified into four types (Fig. 9). Types a and b: the newly transcribed M-th nucleotide does not pair with any nucleotide, and the new M-nt chain retains the same structure as the (M−1)-nt chain. Type c: the newly transcribed nucleotide pairs with an upstream nucleotide to elongate a helix by one base pair. Because the zipping of a new stack (base pair) is much faster than transcribing a nucleotide (Zhao et al. 2011), the two structures can be regarded as “directly inherited” and thus have the same population. Type d: the newly transcribed nucleotide pairs with another nucleotide to form a new helix, which could not be formed in the previous step. All structures in the current step that contain the new helix have a population of zero at the beginning of the M-th step. The population distribution at the beginning of step M therefore follows from these rules: states of types a, b, and c carry over the ending population of their parent state at step M−1, and states of type d start with zero population.
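One transcription step as described above can be sketched in Python. This is an illustration under stated assumptions, not the authors' implementation: it assumes that each state at step M−1 passes its population to exactly one state at step M, that relaxation within the 1/ν window follows a master equation, and that a simple explicit-Euler integrator is accurate enough. The state labels, rates, and transcription speed are hypothetical.

```python
def inherit_populations(prev_end, parent_of):
    """Beginning populations for step M.

    prev_end : dict state -> ending population at step M-1
    parent_of: dict state at step M -> parent state at step M-1
               (types a, b, c), or None for a type-d state whose new
               helix could not form in the previous step.
    """
    parents = [p for p in parent_of.values() if p is not None]
    if len(parents) != len(set(parents)):
        raise ValueError("each parent may pass its population to one state only")
    return {
        state: (prev_end[parent] if parent is not None else 0.0)
        for state, parent in parent_of.items()
    }


def relax(pop, rates, duration, n_steps=10000):
    """Explicit-Euler integration of the master equation
    dp_i/dt = sum_j (k_ji * p_j - k_ij * p_i) over `duration` seconds.

    rates: dict (i, j) -> rate constant k_ij in 1/s for transition i -> j.
    """
    dt = duration / n_steps
    p = dict(pop)
    for _ in range(n_steps):
        flow = {s: 0.0 for s in p}
        for (i, j), k in rates.items():
            moved = k * p[i] * dt
            flow[i] -= moved
            flow[j] += moved
        for s in p:
            p[s] += flow[s]
    return p


# Hypothetical example: two inherited states and one new (type-d) state.
prev_end = {"U": 0.4, "H1": 0.6}
parent_of = {"U+": "U", "H1+": "H1", "H2": None}
start = inherit_populations(prev_end, parent_of)

nu = 10.0  # assumed transcription speed, nt per second
rates = {("U+", "H2"): 5.0, ("H2", "U+"): 1.0}
end = relax(start, rates, duration=1.0 / nu)
print(start)
print(end)
```

The ending populations returned by `relax` would serve as `prev_end` for step M+1, which mirrors the step-by-step inheritance described in the text.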

FIGURE 9.

The population lineage and the transition node approximation at the M-th step. The pentagram and square denote the last nucleotide released by RNAP at steps M−1 and M, respectively. The conformation space at the M-th step is divided into two ensembles: the inherited ensemble, containing the states directly inherited from step M−1; and the new ensemble, containing the states newly formed at step M with zero initial population. The inherited ensemble has a populated sub-ensemble, in which the initial population of each state is larger than 0.03. The new ensemble is split into a low-energy sub-ensemble, in which each state has a free energy ΔG ≤ (maxE + 2) kcal/mol, where maxE is the free energy of the most unstable state in the populated sub-ensemble, and a remainder sub-ensemble containing all other new states. The green line shows the assumed transition pathway between state m and state n; the two intermediate states are denoted by i and j.

Transition node approximation

All the possible structures of the M-th step can be divided into two ensembles (Fig. 9): the inherited ensemble, in which the states are of types a, b, and c and have nonzero initial population; and the new ensemble, in which the states (type d) contain the helix newly formed at this step and have zero initial population. Most of the initial population at each transcriptional step is concentrated in a few stable and metastable states (Zhao et al. 2010), which comprise the populated sub-ensemble of the inherited ensemble. The threshold for the initial population of states in this sub-ensemble is set to 0.03, which ensures that more than 90% of the population is held by these states at each transcription step for HDV. If the new states are much less stable than these populated old states, they are very unlikely to gain population. Let maxE be the free energy of the most unstable state in the populated sub-ensemble. The new ensemble can then be divided into two sub-ensembles: a low-energy sub-ensemble, in which the free energy of each conformation is lower than (maxE + 2) kcal/mol; and a remainder sub-ensemble, which contains the rest of the new conformations. Because the free energy of the structures in the remainder sub-ensemble is at least 2 kcal/mol higher than that of the populated states, they can occupy far less than 1% of the population even at equilibrium. Thus, among the newly formed conformations, only those in the low-energy sub-ensemble can accumulate population. Although states in the remainder sub-ensemble are unlikely to accumulate population, some of them may lie on the main pathways for population flow. The reduced conformation space therefore consists of the structures in the inherited ensemble, the low-energy sub-ensemble, and part of the remainder sub-ensemble. Because the transcriptional time scale of HDV is shorter than 15 sec, almost no RNA molecules fold via a transition pathway whose rate is much lower than 0.0044 sec−1. Hence, transitions along the direction of the population flow that are slower than 0.0044 sec−1 are unlikely to contribute to the folding. We therefore consider only the pathways through states in the remainder sub-ensemble that can transit to states in the low-energy sub-ensemble with corresponding transition rates greater than 0.0044 sec−1. If no such pathway exists between two states, we relax the criterion to nine nodes or more to avoid isolated states in the transition network. After all the possible transitions are searched, the structures in the inherited ensemble and the low-energy sub-ensemble, together with the structures of the remainder sub-ensemble that lie on the saved transition pathways, build up the conformation ensemble used for subsequent calculations.
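The selection rules of the transition node approximation can be sketched in Python as follows. The three thresholds (0.03, 2 kcal/mol, 0.0044 sec−1) come from the text; everything else is an assumption made for illustration: the function names, the example states and energies, and in particular the simplification that a remainder state is kept when it has a single direct transition into the low-energy sub-ensemble above the rate cutoff. The multi-node pathway search and the relaxed "nine nodes" criterion are not modeled here.

```python
POP_THRESHOLD = 0.03   # minimum initial population (from the text)
ENERGY_WINDOW = 2.0    # kcal/mol above maxE (from the text)
RATE_CUTOFF = 0.0044   # 1/s (from the text)


def partition_states(inherited, new_states):
    """Split the conformation space of step M.

    inherited : dict state -> (initial population, free energy in kcal/mol)
    new_states: dict state -> free energy in kcal/mol (zero initial population)
    Returns (populated, low_energy, remainder, max_e).
    """
    populated = {s for s, (p, _) in inherited.items() if p > POP_THRESHOLD}
    # maxE: free energy of the least stable state that still holds population.
    max_e = max(inherited[s][1] for s in populated)
    low_energy = {s for s, g in new_states.items() if g <= max_e + ENERGY_WINDOW}
    remainder = set(new_states) - low_energy
    return populated, low_energy, remainder, max_e


def bridging_states(remainder, low_energy, rates):
    """Remainder states kept because they reach a low-energy new state
    with a rate above the cutoff (single-hop simplification)."""
    return {
        i for (i, j), k in rates.items()
        if i in remainder and j in low_energy and k > RATE_CUTOFF
    }


# Hypothetical example data.
inherited = {"A": (0.73, -12.0), "B": (0.25, -9.5), "C": (0.02, -4.0)}
new_states = {"N1": -11.0, "N2": -8.0, "N3": -3.0, "N4": -2.0}
rates = {("N3", "N1"): 0.5, ("N4", "N2"): 0.001}

populated, low_energy, remainder, max_e = partition_states(inherited, new_states)
kept = bridging_states(remainder, low_energy, rates)
reduced_space = set(inherited) | low_energy | kept
print(sorted(reduced_space))
```

In this example, state N4 is dropped because its only transition into the low-energy sub-ensemble is slower than the cutoff, while N3 is retained as a transition node.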
