Computational analysis of non-coding RNAs in Alzheimer's disease.

Ghulam Md Ashraf1, Magdah Ganash2, Alexiou Athanasios3,4.   

Abstract

Recent studies have shown that long noncoding RNAs (lncRNAs) are a crucial factor in neurodegenerative diseases and potential next-generation therapeutic targets. Advanced computational methods for the analysis of noncoding RNAs mainly address the prediction of RNA and miRNA structures. Problems concerning the representation of specific biological structures, such as secondary structures, are either NP-complete or of high complexity. Numerous algorithms and techniques for the enumeration of sequential terms of biological structures, mostly with exponential complexity, have been constructed to date. The lncRNAs BACE1-AS, NAT-Rad18, 17A, and hnRNP Q have been found to be associated with Alzheimer's disease. In this study, the significance of the best-known β-turn-forming residues in the proteins targeted by these lncRNAs is computationally identified and discussed as a potentially crucial factor in the regulation of folding, aggregation and other intermolecular interactions.

Keywords:  17A; Alzheimer's disease; BACE1-AS; NAT-Rad18; RAD18; hnRNP Q; long noncoding RNAs; secondary structure prediction; strict β-turns; structural alignment

Year:  2019        PMID: 31249438      PMCID: PMC6589468          DOI: 10.6026/97320630015351

Source DB:  PubMed          Journal:  Bioinformation        ISSN: 0973-2063


Background

Noncoding RNAs (ncRNAs) play important roles in many biological mechanisms, offering researchers opportunities for efficient biomarker detection and for disease diagnosis, treatment, prognosis and prevention 1-3. While only 1.5% of the whole genome corresponds to protein-coding genes 1, various long noncoding RNAs (lncRNAs) such as BACE1-AS are closely related to Alzheimer's disease (AD) 4-6, modulating Aβ formation or affecting apoptosis 7-9. Beta-site Amyloid Precursor Protein Cleaving Enzyme 1 - Antisense Transcript (BACE1-AS) enhances BACE1 mRNA stability by protecting it from degradation 9 and is therefore highly correlated with AD development or progression 5,10-12, as is lncRNA-17A, which plays a significant role in Gamma-Aminobutyric Acid Type B Receptor Subunit 2 (GABABR2) signaling and Aβ production 7,9. Additionally, the heterogeneous nuclear ribonucleoprotein (hnRNP) family, which includes hnRNP Q, assists in controlling the maturation of newly formed heterogeneous nuclear RNAs (hnRNAs/pre-mRNAs) into messenger RNAs (mRNAs), stabilizes mRNAs during their cellular transport and controls their translation 13, affecting dendritic development 14-19. Recent studies also reveal a role in AD for NAT-Rad18, the natural antisense transcript of the postreplication repair protein RAD18, which affects the DNA repair system, leading to apoptosis and neurodegeneration 7. Problems concerning the representation of certain biological structures, such as secondary structures, are either NP-complete or of high complexity.
The incompleteness of the corresponding theories makes this a kind of hybrid problem, in which data mining, statistical analysis, biological interpretation, and computational techniques must interact in different phases in order to produce a solution. Numerous algorithms and techniques for the enumeration of sequential terms of biological structures, mostly with exponential complexity, have been constructed through their bijection with alternative representations such as energy models, plane trees and Motzkin numbers, non-crossing set partitions, Motzkin paths and Dyck paths 21. In contrast to protein folding programs, which predict the tertiary structure, the majority of the currently available RNA M-folding algorithms concentrate on the secondary structure of the RNA. The first reason for this difference is a pragmatic one. Current RNA prediction algorithms have a polynomial runtime of O(n^3), where n is the sequence length. This is fast enough to allow genome-wide analysis on current off-the-shelf computers. Considering the tertiary structure, however, leads to a super-polynomial runtime that impedes any large-scale application 22. The second reason is related to the kinetics of RNA folding. Secondary structures form first, producing a set of loops and helices which, once formed, interact to yield the tertiary structure. As a consequence, the determination of the tertiary structure depends strongly on the secondary structure 23. Still, knowledge of the secondary structure alone can be misleading, as two similar tertiary structures can have different secondary structures 20.
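The growth in the number of candidate secondary structures can be illustrated with a short Python sketch. It uses a Waterman-style counting recurrence under simplifying assumptions that are not taken from the paper: any two positions may pair (sequence-independent counting), structures are nested, and at least `min_loop` unpaired positions must lie between paired positions. The names `count_structures` and `min_loop` are illustrative only. With `min_loop=0` the counts coincide with the Motzkin numbers mentioned above.

```python
def count_structures(n, min_loop=1):
    """Count nested (pseudoknot-free) secondary structures on n positions.

    Assumption: any two positions i < j may pair as long as more than
    `min_loop` positions separate them (j - i > min_loop). Base identity
    is ignored, so this is an upper bound for a real sequence.
    """
    counts = [1]  # exactly one structure (the empty one) on 0 positions
    for length in range(1, n + 1):
        # Case 1: the last position is unpaired.
        total = counts[length - 1]
        # Case 2: the last position pairs with position k (1-based).
        # The k-1 positions to the left and the length-k-1 positions
        # enclosed by the pair fold independently.
        for k in range(1, length - min_loop):
            total += counts[k - 1] * counts[length - k - 1]
        counts.append(total)
    return counts[n]


if __name__ == "__main__":
    for n in (10, 20, 40, 80):
        print(n, count_structures(n))
```

Running the loop shows the count growing roughly exponentially with sequence length, which is why exhaustive enumeration is impractical and polynomial-time dynamic programming is preferred.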

Methodology

Recent studies have already revealed correlations between specific lncRNAs and AD pathologies and lesions in brain regions such as the middle temporal gyrus, the prefrontal cortex, the striatum, the cerebellum and the hippocampus, as well as other CNS-related disorders 8, 24-27. This study examines the secondary structures of four AD-related proteins, BACE1, Rad18, GABABR2 and hnRNPQ, which are targeted by the corresponding lncRNAs BACE1-AS, NAT-Rad18, 17A and hnRNP Q 28. A protein statistical analysis was initially performed with the QIAGEN CLC Main Workbench (supplementary material available with authors). For the computational analysis, the entries 6EJ3 (BACE1_HUMAN), 4F12 (GABABR2_HUMAN), 4UX8 (hnRNPQ_HUMAN) and 2Y43 (RAD18_HUMAN) were imported from the Protein Data Bank. This avoided the use of prediction methods for the identification of secondary structure elements, which would introduce additional errors.

Results

The sequences were aligned with the ClustalOmega software, and the alignment was imported into the ESPript 3.0 software to display and analyze the corresponding secondary structures (Figure 1) 29. In the ESPript output, the secondary and primary structures are displayed in separate rows: dots represent gaps, α stands for alpha helix, β for beta strand, TT for strict β-turns and TTT for strict α-turns, while alpha helices are shown as squiggles and β-strands as arrows in the multiple alignment representation (Figure 2, available with authors). This layout makes it possible to identify similarities and patterns between the proteins.
Figure 1

BACE1-AS secondary structure

A few interesting properties are identified at certain positions of BACE1 and GABABR2 (Table 1). At position 65 there is a decrease in hydrophobicity and a simultaneous increase in the antigenicity of BACE1 (Figure 3). At the corresponding aligned positions (66, 67), there is a decrease in both hydrophobicity and antigenicity. Furthermore, strict β-turns occur at positions (64, 65) in both proteins, while positions (61-64) of BACE1 have the same levels of hydrophobicity and antigenicity. The computational analysis indicates that β-turns appear to be part of the surface of spheroproteins (globular proteins) and that their residues are hydrophilic 30. It therefore seems that hydrophobicity is reduced in regions with β-turns, which affects the folding of each protein and changes the direction of the polypeptide chain (Table 2). In this study, the regions with this property are found on the common BACE1 and GABABR2 β-turns. At positions (64, 65) of BACE1 and the corresponding aligned positions of GABABR2, the secondary structures are identically aligned β-turns. In both turns, hydrophobicity is reduced from a stable level, which is consistent with the statements above concerning hydrophobicity. In the same region, BACE1 and GABABR2 switch from positive to negative hydrophobicity (0.06 to -0.22 and 0.14 to -0.28, respectively). Furthermore, in certain β-turns BACE1 contains aspartic acid, while GABABR2 contains lysine and glutamic acid, all of which are hydrophilic residues. Although the β-turns generally consist of different residues, they still affect protein folding in the same way. Several studies since the 1970s have underlined the exceptional role of β-turns, which account for approximately 30% of all protein residues 31-33. These types of secondary structures are strongly related to protein folding mechanisms, depending mainly on their topology, functionality, and stability.
According to their classification, β-turns can initiate folding, and in some cases a substantial destabilization of locally encoded protein features can lead to misfolding 30.
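The kind of hydrophobicity profile discussed above can be sketched as a sliding-window average over a per-residue scale. The sketch below assumes the Kyte-Doolittle hydropathy scale and a window of three residues; the paper does not state which scale or window the CLC Main Workbench analysis used, so the numbers produced here are not expected to reproduce the reported values (0.06 to -0.22 and 0.14 to -0.28). The function name `hydropathy_profile` is illustrative. The input `VEMVDN` is the BACE1 segment listed in Table 1 (positions 61-66).

```python
# Kyte-Doolittle hydropathy values (assumed scale, not confirmed by the paper).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=3):
    """Return the mean hydropathy of every full window along the sequence.

    Raises KeyError for residues that are not one of the 20 standard
    one-letter amino acid codes.
    """
    values = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return [
        sum(values[start:start + window]) / window
        for start in range(len(values) - window + 1)
    ]


if __name__ == "__main__":
    # BACE1 residues at positions 61-66 as listed in Table 1.
    for start, score in enumerate(hydropathy_profile("VEMVDN"), start=61):
        print(f"window starting at {start}: {score:+.2f}")
```

On this short segment the windowed average stays positive until the window is dominated by the hydrophilic residues D and N, where it becomes negative, which mirrors the qualitative switch from positive to negative hydrophobicity described in the text.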
Table 1

Positions of interest with similar properties between BACE1 and GABABR2

Positions | 61 | 62 | 63 | 64 | 65 | 66
BACE1_HUMAN | V | E | M | V | D | N
Positions | 65 | 66 | 67 | 68
GABR2_HUMAN | T | K | E | V
Figure 3

Hydrophobicity and antigenicity plots of BACE1 and GABABR2

Table 2

The numbers correspond to BACE1. In the case of a gap in BACE1, the number corresponds to GABABR2 with an additional (*). If there is a gap in both BACE1 and GABABR2, the number corresponds to hnRNPQ with an additional identifier

Regions of interest | Proteins | Structure | High similarity
64-65 | BACE1_HUMAN, GABR2_HUMAN | Strict β-turn | 65
 | | Strict β-turn |
67 | BACE1_HUMAN | Start of β1 | 67
 | HNRPQ_HUMAN | Start of α1 |
77-81 | HNRPQ_HUMAN | α2 | 79
 | RAD18_HUMAN | η1, the start of α1 (81) |
99 | BACE1_HUMAN | Start of β4 |
 | HNRPQ_HUMAN | Start of α3 |
 | RAD18_HUMAN | End of α1 |
106-109 | BACE1_HUMAN, HNRPQ_HUMAN | β-turn (107-108) | 107
 | RAD18_HUMAN | end of α3 (106) |
 | | strict α- (107-109) | 109
118-120 | BACE1_HUMAN, GABR2_HUMAN | β-turn (119-120) |
 | RAD18_HUMAN | end of β2 (119) |
 | | end of β1 (118) |
136* | GABR2_HUMAN | start of α3 |
 | RAD18_HUMAN | start of strict α-turn |
144* | GABR2_HUMAN | end of α3 |
 | RAD18_HUMAN | start of η2 |
149 | BACE1_HUMAN, GABR2_HUMAN | start of strict α-turn |
 | | end of β4 |
155-156 | BACE1_HUMAN | start of β7 |
 | GABR2_HUMAN | end of η2 (55) |
 | | start of α5 (56) |
 | HNRPQ_HUMAN | start of α4 |
 | RAD18_HUMAN | β3 |
173 | BACE1_HUMAN, GABR2_HUMAN | end of β-turn |
 | | end of α5 |
178 | BACE1_HUMAN, GABR2_HUMAN | start of β8 |
 | HNRPQ_HUMAN | start of β6 |
 | | start of α6 |
211 | BACE1_HUMAN, GABR2_HUMAN | start of β9 |
 | | end of β7 |
216 | BACE1_HUMAN, GABR2_HUMAN | end of β9 |
 | | start of α7 |
237-240 | BACE1_HUMAN, GABR2_HUMAN, HNRPQ_HUMAN | end of β10 (237) | 237
 | RAD18_HUMAN | end of β8 (239) | 239
 | | end of α6 (240) | 240
 | | β-turn (238-239) |
242 | BACE1_HUMAN, GABR2_HUMAN | start of η3 |
 | | start of α8 |
269 | BACE1_HUMAN, GABR2_HUMAN | end of β12 |
 | | start of β-turn |
270 | BACE1_HUMAN, GABR2_HUMAN | start of β-turn |
 | | end of β-turn |
273-274 | BACE1_HUMAN, GABR2_HUMAN | end of β13 (273) |
 | | start of α10 (274) |
285-286 | BACE1_HUMAN, GABR2_HUMAN | start of β14 (286) |
 | | end of α10 (285) |
311-312 | BACE1_HUMAN, GABR2_HUMAN | end of α2 (312) |
 | HNRPQ_HUMAN | end of α11 (311) | 312
 | | start of β-turn (312) |
326-327 | GABR2_HUMAN | end of β11 (326) | 326
 | | start of β-turn (327) | 327
 | HNRPQ_HUMAN | end of α7 (327) |
330-331 | BACE1_HUMAN | end of β16 (330) | 331
 | HNRPQ_HUMAN | start of η1 (331) |
420*-421* | GABR2_HUMAN | end of β-turn (420*) |
 | | end of η1 (420*) |
 | HNRPQ_HUMAN | start of β13 (421*) |
422*-423* | BACE1_HUMAN, GABR2_HUMAN | start of β-turn (423*) | 422
 | | end of β13 (422*) |
334-335 | BACE1_HUMAN, GABR2_HUMAN | end of β-turn (334) | 334
 | | start of β14 (335) | 335
353 | BACE1_HUMAN, GABR2_HUMAN | start of β-turn |
 | | end of β15 |
360-361 | BACE1_HUMAN, GABR2_HUMAN | end of β18 (361) | 360
 | HNRPQ_HUMAN | end of β16 (361) | 361
 | | end of β-turn (360) |

Discussion

A secondary structure $S$ on a sequence $s$ is a set of ordered base pairs $(s_i, s_j)$, where $i < j$, such that: i) if $(s_i, s_j) \in S$ then $\{s_i, s_j\} \in \{\{U,A\}, \{G,C\}, \{G,U\}\}$, where {U,A} and {G,C} are called Watson-Crick pairs and {G,U} is called a wobble pair; ii) if $(s_i, s_j) \in S$ and $(s_i, s_l) \in S$ then $j = l$; iii) if $(s_i, s_j) \in S$, $(s_k, s_l) \in S$ and $i < k$ then $l < j$ or $j < k$. In other words, constraint i) means that only Watson-Crick and wobble ordered base pairs may form. Constraint ii) states that a nucleotide may be involved in at most one ordered base pair. Constraint iii) implies that all ordered base pairs are nested, i.e. that no pseudoknots are allowed in the secondary structure. While these constraints greatly simplify the folding algorithms, none of the above constraints is biologically relevant. Furthermore, pseudoknots appear in many important RNA structures, albeit at a low frequency. For example, in the small ribosomal subunit of E. coli, only 8 of the 447 reported Watson-Crick and wobble ordered base pairs are pseudoknots 34. Any secondary structure generated under these rules can be decomposed into a unique set of loops 35. A loop is a substructure which consists of a closing ordered base pair $(s_i, s_j)$ and all nucleotides that are accessible from this ordered base pair. A nucleotide $s_p$ is accessible from $(s_i, s_j)$ if $i$ [...] children, which correspond to the 5′ and 3′ nucleotides of the ordered base pair, respectively. In a Shapiro-Zhang tree, the different loops and stacked regions are represented explicitly with special labels 39. Arc-annotated sequences represent the sequence as a straight line, with arcs indicating base pairings. A similar representation draws the sequence on a circle, with arcs plotted as curved lines inside the circle. The mountain plot is useful for large RNAs. Plateaus represent unpaired regions; the heights of the mountains are determined by the number of ordered base pairs in which the partial sequences are embedded.
Specifically, the mountain plot representation maps the secondary structure into a 2-dimensional graph where the x-axis represents the position $k$ along the RNA sequence and the y-axis corresponds to the number of ordered base pairs that enclose nucleotide $k$. The dot plot representation maps the structure to a matrix where a dot at position $(i, j)$ represents the ordered base pair $(s_i, s_j)$. The secondary structure of an RNA molecule is the collection of ordered base pairs that occur in its 3D structure. When the 5′-end of one nucleotide fits the 3′-end of another, a p-bond is formed, and the sequence of p-bonds defines the backbone of the molecule. On the other hand, certain ordered base pairs such as {C, G}, {A, U}, and {G, U} form h-bonds, which cause the molecular backbone to fold into a configuration of minimal energy 40. In some cases unusual non-canonical ordered base pairs, such as {G, U}, {G, A} and {C, A}, replace the canonical Watson-Crick ordered base pairs that maintain a stable helical structure. While these non-canonical pairings allow possible hydrogen-bonding interactions and can be treated as neutral evidence for a helical structure, there also seems to be evidence against pairing 41. A secondary structure of size $n$ is closed 40 if there is an h-bond connecting bases 1 and $n$. For integers $n \geq 2$ and $l \geq 0$, there are $S^{(l)}(n-2)$ secondary structures of size $n$ and rank $l$, which also establishes a bijection between the set of all closed secondary structures $Z^{(l)}(n)$ and the set of all plane trees with exactly $n$ leaves $T^{(l)}(n)$. A constraint satisfaction formulation has also been used for the RNA prediction problem, including genetic mapping 42, physical mapping 43 and structure prediction 44. The ultimate goal of structure prediction is to obtain the three-dimensional structure of biomolecules through computation. The key to solving this problem is an appropriate representation of the biological structures.
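The definitions above (the three constraints on base pairs and the mountain plot) can be made concrete with a small Python sketch. It assumes the widely used dot-bracket notation as input, where `(` and `)` mark paired positions and `.` marks unpaired ones; this notation, the 0-based indexing, and the helper names `parse_dot_bracket`, `satisfies_constraints` and `mountain_heights` are illustrative choices, not taken from the paper. Enclosure is taken here as strict ($i < k < j$).

```python
# Unordered pairs allowed by constraint i): Watson-Crick and wobble.
ALLOWED_PAIRS = {frozenset("AU"), frozenset("GC"), frozenset("GU")}


def parse_dot_bracket(structure):
    """Convert dot-bracket notation into a sorted list of (i, j), i < j."""
    stack, pairs = [], []
    for pos, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(pos)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            pairs.append((stack.pop(), pos))
        elif symbol != ".":
            raise ValueError(f"unexpected symbol {symbol!r}")
    if stack:
        raise ValueError("unbalanced opening bracket")
    return sorted(pairs)


def satisfies_constraints(sequence, pairs):
    """Check constraints i)-iii) for base pairs (i, j) with i < j."""
    used = set()
    for i, j in pairs:
        # i) only Watson-Crick and wobble pairs may form
        if frozenset((sequence[i], sequence[j])) not in ALLOWED_PAIRS:
            return False
        # ii) each nucleotide takes part in at most one pair
        if i in used or j in used:
            return False
        used.update((i, j))
    # iii) pairs are nested or disjoint (no pseudoknots)
    for i, j in pairs:
        for k, l in pairs:
            if i < k and not (l < j or j < k):
                return False
    return True


def mountain_heights(length, pairs):
    """Height at position k = number of pairs (i, j) with i < k < j."""
    return [sum(1 for i, j in pairs if i < k < j) for k in range(length)]


if __name__ == "__main__":
    seq, fold = "GGGAAAUCC", "(((...)))"
    pairs = parse_dot_bracket(fold)
    print(pairs, satisfies_constraints(seq, pairs))
    print(mountain_heights(len(seq), pairs))
```

The same list of pairs also gives the dot plot directly: each `(i, j)` is the coordinate of one dot in the matrix. A crossing pair set such as `[(0, 5), (3, 8)]` fails constraint iii), which is how a pseudoknot shows up in this representation.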
An increasing number of researchers have released novel RNA structure analysis and prediction algorithms for comparative approaches to structure prediction, based on the fact that closed RNA structures can be viewed as mathematical objects obtained by abstracting away the topologically non-relevant properties of planar foldings of single-stranded nucleic acids. There are many approaches to this topic, such as dynamic programming algorithms 45, stochastic approaches such as the BioAmbients calculus 46, comparative methods 47, simulated annealing 48, artificial neural network algorithms and, most recently, evolutionary algorithms, which attempt to mimic the natural folding pathway using a population-based approach 49.
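As an illustration of the dynamic programming family listed above, the following Python sketch implements a Nussinov-style recursion that maximizes the number of nested base pairs. It is a simplified stand-in chosen for brevity: it is not the method used in this study, it scores pairs by count rather than by an energy model, and the names `nussinov_max_pairs`, `can_pair` and the default `min_loop=3` are assumptions made for this example. The three nested loops over span, start position and pairing partner give the O(n^3) runtime mentioned in the Background.

```python
ALLOWED_PAIRS = {frozenset("AU"), frozenset("GC"), frozenset("GU")}


def can_pair(a, b):
    """True for Watson-Crick (A-U, G-C) and wobble (G-U) pairs."""
    return frozenset((a, b)) in ALLOWED_PAIRS


def nussinov_max_pairs(sequence, min_loop=3):
    """Maximum number of nested base pairs in `sequence`.

    A pair (k, j) is allowed only if more than `min_loop` positions
    separate k and j, i.e. at least `min_loop` unpaired nucleotides
    lie inside a hairpin loop.
    """
    n = len(sequence)
    if n == 0:
        return 0
    # best[i][j] = maximum number of pairs within sequence[i..j];
    # entries with i > j stay 0 and act as the empty interval.
    best = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            score = best[i][j - 1]  # case: position j is unpaired
            for k in range(i, j - min_loop):  # case: j pairs with k
                if can_pair(sequence[k], sequence[j]):
                    left = best[i][k - 1] if k > i else 0
                    score = max(score, left + 1 + best[k + 1][j - 1])
            best[i][j] = score
    return best[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAAUCC"))
```

Only the optimal score is returned here; recovering the structure itself would require a traceback through `best`, which is omitted to keep the sketch short.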

Conclusion

While specific lncRNAs have already been correlated with certain AD lesions, this study presents a new computational analysis of the proteins BACE1, Rad18, GABABR2 and hnRNPQ. Using the QIAGEN CLC Main Workbench, the ClustalOmega software and the ESPript 3.0 software, a detailed analysis of the corresponding secondary structures was carried out for the entries 6EJ3, 4F12, 4UX8 and 2Y43. The analysis identified common properties at aligned positions: a high similarity score, an identical secondary structure match, increased hydrophilicity, and negative antigenicity. This provides strong evidence that the proteins under consideration may share common functionality in those regions, which regulate folding and aggregation and prevent the binding of immune factors. These conclusions reveal the significance of the best-known β-turn-forming residues, which participate in ligand binding, molecular recognition, protein-protein or protein-nucleic acid interactions and the modulation of protein functions and intermolecular interactions, in proteins commonly linked to AD development or progression.

Conflict of Interest

Authors declare no conflict of interest.