Tatsunori Osone, Naohiro Yoshida.
Abstract
MicroRNAs are genes that play important roles in biological processes. Although their functions have been elucidated, the relationship between microRNA sequence and disease remains insufficiently clear. Clarifying this relationship matters because it could reveal the meaning of the microRNA genetic code, which consists of four nucleobases. Seed theory is sequence-based, so its further development could be expected to reveal the meaning of microRNA sequences; however, the method yields many false positives and false negatives. Searches for disease-related microRNAs based on network analysis, on the other hand, do not use sequences, which makes it difficult to relate sequences to diseases. This study therefore focused on RNA–RNA interactions, which are caused by hydrogen bonding. Calculating the electric field of a microRNA modeled as a torus showed that sequences and diseases are highly correlated. The results also suggested that four diseases belonging to different major classifications can be distinguished. RNA has conventionally been interpreted as a one-dimensional array of four nucleobases; the approach taken in this study can be expected to provide a new perspective on RNA–RNA interactions.
Keywords: disease; hydrogen bond; microRNA; regression analysis
Year: 2019 PMID: 31835885 PMCID: PMC6952923 DOI: 10.3390/cells8121615
Source DB: PubMed Journal: Cells ISSN: 2073-4409 Impact factor: 6.600
Figure 1. Sites where nucleobase hydrogen bonds are located. The three sites are designated the First site (red circle), the Second site (green circle), and the Third site (blue circle). The Third site also includes the adenine C2 hydrogen, which is not used for RNA–RNA interactions.
Figure 2. Placement of nucleobases relative to the circle. (a) Structure most susceptible to hydrogen bonding with other RNA (Structure A). (b) Most stable structure (Structure B). (c) Structure intermediate between A and B (Structure C). (d) Enlarged excerpt of part of Structure C.
Charge at each hydrogen-bonding site of the nucleobases in a base pair.
| Nucleobase | First | Second | Third |
|---|---|---|---|
| Adenine | 0.417 | −0.514 | 0.228 |
| Uracil | −0.506 | 0.408 | −0.471 |
| Guanine | −0.490 | 0.414 | 0.413 |
| Cytosine | 0.379 | −0.558 | −0.476 |
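As a minimal sketch of how the charges in this table could be turned into per-site values for a sequence, the snippet below sums the First, Second, and Third site charges over all nucleobases. This plain summation is an assumption made for illustration and is not claimed to reproduce the VS, Sum, or EV scoring methods listed in the later tables; the names `SITE_CHARGES` and `site_sums` and the example sequence `AUGC` are hypothetical.

```python
# Charges at the three hydrogen-bonding sites, copied from the table above.
SITE_CHARGES = {
    "A": (0.417, -0.514, 0.228),
    "U": (-0.506, 0.408, -0.471),
    "G": (-0.490, 0.414, 0.413),
    "C": (0.379, -0.558, -0.476),
}


def site_sums(sequence):
    """Return (first, second, third) charge sums for an RNA sequence.

    Illustrative assumption: each nucleobase contributes its tabulated
    charge to the corresponding site total, with no geometric weighting.
    """
    totals = [0.0, 0.0, 0.0]
    for base in sequence.upper():
        charges = SITE_CHARGES[base]  # KeyError for symbols other than A/U/G/C
        for i, q in enumerate(charges):
            totals[i] += q
    return tuple(totals)


# Hypothetical four-base sequence, used only to exercise the function.
first, second, third = site_sums("AUGC")
print(f"First={first:.3f} Second={second:.3f} Third={third:.3f}")
```

Because complementary bases carry charges of opposite sign at each site, a sequence with balanced base composition gives sums close to zero, which is one reason a position-aware model may separate sequences better than a plain sum.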
Duplicates in each scoring method.
| Methods | Total Number of Duplicate Values | Total Number of Unique Values | Percentage of Unique Values |
|---|---|---|---|
| VS | 960 | 1696 | 64 |
| Sum | 454 | 2202 | 83 |
| EV_C_A | 832 | 1824 | 69 |
| EV_C_B | 706 | 1950 | 73 |
| EV_C_C | 715 | 1941 | 73 |
| EV_S_A | 615 | 2041 | 77 |
| EV_S_B | 78 | 2578 | 97 |
| EV_S_C | 65 | 2591 | 98 |
| ref* | 24 | 2632 | 99 |
* For reference, the duplicates among the miRNA sequences themselves are shown.
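The percentage column can be checked directly from the two count columns: in every row the duplicate and unique counts add up to 2656 values, and the percentage is the unique count divided by that total. The short sketch below recomputes it; the names `COUNTS` and `percent_unique` are introduced here for illustration only.

```python
# (duplicate values, unique values) per scoring method, copied from the table above.
COUNTS = {
    "VS": (960, 1696),
    "Sum": (454, 2202),
    "EV_C_A": (832, 1824),
    "EV_C_B": (706, 1950),
    "EV_C_C": (715, 1941),
    "EV_S_A": (615, 2041),
    "EV_S_B": (78, 2578),
    "EV_S_C": (65, 2591),
    "ref": (24, 2632),
}


def percent_unique(duplicates, unique):
    """Share of unique values among all values, rounded to a whole percent."""
    return round(100 * unique / (duplicates + unique))


for method, (dup, uniq) in COUNTS.items():
    print(f"{method:7s} total={dup + uniq} unique={percent_unique(dup, uniq)}%")
```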
3D regression line of miRNA in each scoring method.
| Methods | Slope | S.D. | | | Slope | S.D. | | | Slope | S.D. | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VS | 1.04 | 0.01 | 2E−123 | 1.00 | 0.86 | 0.01 | 2E−126 | 1.00 | 1.10 | 0.01 | 4E−133 | 1.00 |
| Sum | −0.65 | 0.07 | 1E−14 | −0.67 | 0.45 | 0.09 | 1E−06 | 0.45 | 0.13 | 0.10 | 2E−01 | 0.11 |
| EV_C_A | 3.39 | 0.01 | 4E−150 | 1.00 | NA | NA | NA | NA | NA | NA | NA | NA |
| EV_C_B | 3.38 | 0.01 | 3E−148 | 1.00 | 0.02 | 0.00 | 9E−07 | 0.46 | 2.54 | 0.50 | 2E−06 | 0.45 |
| EV_C_C | 3.39 | 0.01 | 1E−149 | 1.00 | 0.02 | 0.00 | 1E−12 | 0.63 | 5.36 | 0.67 | 2E−12 | 0.62 |
| EV_S_A | 1.41 | 0.05 | 2E−48 | 0.93 | NA | NA | NA | NA | NA | NA | NA | NA |
| EV_S_B | 3.56 | 0.01 | 1E−149 | 1.00 | 11.08 | 0.01 | 7E−216 | 1.00 | 0.03 | 0.00 | 4E−153 | 1.00 |
| EV_S_C | 3.56 | 0.01 | 6E−150 | 1.00 | 11.12 | 0.03 | 2E−172 | 1.00 | 0.03 | 0.00 | 2E−141 | 1.00 |
NA: The item could not be calculated because there was no value.
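To make the quantities in this table concrete, here is a minimal sketch of how a slope, its standard error, and a correlation coefficient can be obtained for each pair of axes of a 3D point cloud. It uses ordinary least squares on each pairwise projection, which is an assumption: the source does not state how the 3D regression line was fitted, which variable was regressed on which, or whether the reported S.D. is the standard error computed here. The function names and the sample points are hypothetical.

```python
import math


def pairwise_fit(u, v):
    """Least-squares fit of v on u: returns (slope, standard error, Pearson r)."""
    n = len(u)
    mean_u = sum(u) / n
    mean_v = sum(v) / n
    suu = sum((a - mean_u) ** 2 for a in u)
    svv = sum((b - mean_v) ** 2 for b in v)
    suv = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    slope = suv / suu
    intercept = mean_v - slope * mean_u
    # Residual sum of squares around the fitted line.
    rss = sum((b - (intercept + slope * a)) ** 2 for a, b in zip(u, v))
    std_err = math.sqrt(rss / (n - 2) / suu)
    r = suv / math.sqrt(suu * svv)
    return slope, std_err, r


def fit_3d(points):
    """Fit each pairwise projection of (x, y, z) points.

    Assumed direction: "x-y" regresses y on x, "y-z" regresses z on y,
    and "z-x" regresses x on z.
    """
    xs, ys, zs = zip(*points)
    return {
        "x-y": pairwise_fit(xs, ys),
        "y-z": pairwise_fit(ys, zs),
        "z-x": pairwise_fit(zs, xs),
    }


# Hypothetical (First, Second, Third) site scores, not data from the paper.
sample = [(1.0, 2.0, 1.0), (2.0, 4.0, 3.0), (3.0, 6.0, 2.0), (4.0, 8.0, 5.0)]
for pair, (slope, se, r) in fit_3d(sample).items():
    print(f"{pair}: slope={slope:.2f} se={se:.2f} r={r:.2f}")
```

A scoring method for which one axis is constant would make `suu` or `svv` zero and the fit undefined, which is one situation in which a table entry could not be calculated.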
Figure 3. 3D scatter plot of VS calculation results in AD. The X-axis is the First site score, the Y-axis is the Second site score, and the Z-axis is the Third site score. (a) Result using fold change. (b) Enlargement of the 0 to 200 range of (a). (c) Result using log2FC.
3D regression line of AD in each scoring method.
| Methods | Slope | S.D. | | | Slope | S.D. | | | Slope | S.D. | | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| VS | 1.04 | 0.00 | 0E+00 | 1.00 | 0.85 | 0.00 | 0E+00 | 1.00 | 1.13 | 0.00 | 0E+00 | 1.00 |
| Sum | −0.25 | 0.04 | 2E−09 | −0.30 | 0.12 | 0.08 | 1E−01 | 0.08 | 0.16 | 0.04 | 2E−05 | 0.21 |
| EV_C_A | 3.31 | 0.01 | 0E+00 | 1.00 | NA | NA | NA | NA | NA | NA | NA | NA |
| EV_C_B | 3.28 | 0.02 | 0E+00 | 1.00 | 0.02 | 0.00 | 7E−15 | 0.38 | 2.24 | 0.28 | 1E−14 | 0.38 |
| EV_C_C | 3.29 | 0.01 | 0E+00 | 1.00 | 0.02 | 0.00 | 8E−33 | 0.56 | 4.77 | 0.36 | 7E−33 | 0.56 |
| EV_S_A | 1.25 | 0.10 | 5E−33 | 0.56 | NA | NA | NA | NA | NA | NA | NA | NA |
| EV_S_B | 3.52 | 0.01 | 0E+00 | 1.00 | 11.10 | 0.01 | 0E+00 | 1.00 | 0.03 | 0.00 | 0E+00 | 1.00 |
| EV_S_C | 3.53 | 0.01 | 0E+00 | 1.00 | 11.10 | 0.01 | 0E+00 | 1.00 | 0.03 | 0.00 | 0E+00 | 1.00 |
| | | | | | | | | | | | | |
| VS | 1.03 | 0.00 | 0E+00 | 1.00 | 0.87 | 0.00 | 0E+00 | 1.00 | 1.11 | 0.00 | 0E+00 | 1.00 |
| Sum | 0.06 | 0.02 | 5E−05 | 0.20 | −0.64 | 0.14 | 4E−06 | −0.24 | 0.88 | 0.04 | 2E−65 | 0.73 |
| EV_C_A | 3.26 | 0.00 | 0E+00 | 1.00 | NA | NA | NA | NA | NA | NA | NA | NA |
| EV_C_B | 3.26 | 0.00 | 0E+00 | 1.00 | −0.01 | 0.00 | 1E−15 | −0.39 | −5.02 | 0.61 | 2E−15 | −0.39 |
| EV_C_C | 3.26 | 0.00 | 0E+00 | 1.00 | 0.00 | 0.00 | 6E−01 | 0.03 | 0.61 | 1.02 | 5E−01 | 0.04 |
| EV_S_A | 1.39 | 0.00 | 0E+00 | 1.00 | NA | NA | NA | NA | NA | NA | NA | NA |
| EV_S_B | 3.41 | 0.00 | 0E+00 | 1.00 | 11.18 | 0.00 | 0E+00 | 1.00 | 0.03 | 0.00 | 0E+00 | 1.00 |
| EV_S_C | 3.43 | 0.00 | 0E+00 | 1.00 | 10.95 | 0.01 | 0E+00 | 1.00 | 0.03 | 0.00 | 0E+00 | 1.00 |
Comparison of x − y slope of 3D regression line in VS.
| Expression ratio | | COPD (2σ) | ST (2σ) | TB (2σ) | COPD (3σ) | ST (3σ) | TB (3σ) |
|---|---|---|---|---|---|---|---|
| | AD | × | ◯ | × | × | ◯ | × |
| | COPD | | ◯ | × | | ◯ | × |
| | ST | | | ◯ | | | ◯ |
| | | | | | | | |
| | AD | ◯ | ◯ | ◯ | × | ◯ | × |
| | COPD | | ◯ | ◯ | | ◯ | × |
| | ST | | | ◯ | | | ◯ |
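The source does not state here how the 2σ and 3σ comparisons were carried out or what ◯ and × denote. One common reading, assumed purely for illustration, is that two slopes are treated as distinguishable when their intervals of slope ± kσ do not overlap. The sketch below implements that criterion; the function name `distinguishable` and the slope values are hypothetical and are not taken from the paper.

```python
def distinguishable(slope_a, sd_a, slope_b, sd_b, k):
    """True if the intervals slope ± k*sd of the two fits do not overlap.

    Assumed criterion: the gap between the slopes must exceed the sum
    of the two half-widths k*sd_a and k*sd_b.
    """
    return abs(slope_a - slope_b) > k * (sd_a + sd_b)


# Hypothetical slopes and standard deviations for two diseases.
slope_1, sd_1 = 1.00, 0.01
slope_2, sd_2 = 1.05, 0.01

for k in (2, 3):
    verdict = distinguishable(slope_1, sd_1, slope_2, sd_2, k)
    print(f"{k} sigma: distinguishable = {verdict}")
```

Under this reading a pair can be separated at 2σ but not at 3σ, since the wider interval is the stricter test.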
Comparison of slope of the 3D regression line.
| Methods | x − y (2σ) | y − z (2σ) | z − x (2σ) | x − y (3σ) | y − z (3σ) | z − x (3σ) |
|---|---|---|---|---|---|---|
| VS | × | × | × | × | × | × |
| Sum | × | × | × | × | × | × |
| EV_C_A | × | NA | NA | × | NA | NA |
| EV_C_B | × | × | × | × | × | × |
| EV_C_C | × | × | × | × | × | × |
| EV_S_A | × | NA | NA | × | NA | NA |
| EV_S_B | × | × | × | × | × | × |
| EV_S_C | × | × | × | × | × | × |
| | | | | | | |
| VS | ◯ | × | × | × | × | × |
| Sum | × | × | ◯ | × | × | × |
| EV_C_A | ◯ | NA | NA | ◯ | NA | NA |
| EV_C_B | ◯ | × | ◯ | ◯ | × | ◯ |
| EV_C_C | ◯ | ◯ | × | ◯ | × | × |
| EV_S_A | × | NA | NA | × | NA | NA |
| EV_S_B | ◯ | ◯ | ◯ | ◯ | × | ◯ |
| EV_S_C | ◯ | × | ◯ | ◯ | × | × |
Comparison of slope of the 3D regression line without some outliers.
| Methods | x − y (1σ) | y − z (1σ) | z − x (1σ) | x − y (2σ) | y − z (2σ) | z − x (2σ) |
|---|---|---|---|---|---|---|
| VS | × | × | × | × | × | × |
| Sum | × | × | × | × | × | × |
| EV_C_A | ◯ | NA | NA | × | NA | NA |
| EV_C_B | × | × | × | × | × | × |
| EV_C_C | ◯ | × | × | × | × | × |
| EV_S_A | × | NA | NA | × | NA | NA |
| EV_S_B | × | × | × | × | × | × |
| EV_S_C | × | × | × | × | × | × |
| | | | | | | |
| VS | × | ◯ | × | × | × | × |
| Sum | × | ◯ | × | × | × | × |
| EV_C_A | × | NA | NA | × | NA | NA |
| EV_C_B | × | ◯ | ◯ | × | × | × |
| EV_C_C | × | × | × | × | × | × |
| EV_S_A | ◯ | NA | NA | × | NA | NA |
| EV_S_B | × | ◯ | × | × | ◯ | × |
| EV_S_C | × | ◯ | ◯ | × | × | × |
Figure 4. Slope of the 3D regression line for each disease.