Stepan Nersisyan, Maxim Shkurnikov, Andrey Poloznikov, Andrey Turchinovich, Barbara Burwinkel, Nikita Anisimov, Alexander Tonevitsky.
Abstract
One of the main disadvantages of using DNA microarrays for miRNA expression profiling is that expression values cannot be adequately compared across different miRNAs. As a result, many miRNAs receive high scores although they are not actually expressed in the examined samples, i.e., they are false positives. We propose a post-processing algorithm that scores miRNAs in microarray results based on expression values, the time of discovery of the miRNA, and the level of correlation between the expression of the miRNA and that of the corresponding pre-miRNA in the considered samples. The algorithm was validated by comparing the results of its application to miRNA microarray data from breast tumor samples with publicly available miRNA-seq breast tumor data. Additionally, using paired miRNA sequencing and array data, we identified possible reasons why a miRNA can appear as a false positive in a microarray study.

The use of DNA microarrays for estimating miRNA expression profiles is limited by several factors, one of which is the difficulty of comparing expression values of different miRNAs. In this work, we show that the situation can be significantly improved if additional information is taken into account in the comparison.
Keywords: TCGA; miRNA microarrays; miRNome of breast cancer
Year: 2020 PMID: 32059403 PMCID: PMC7072892 DOI: 10.3390/ijms21041228
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 5.923
Figure 1. Number of Gene Expression Omnibus (GEO) accessions corresponding to Affymetrix miRNA Array data.
Figure 2. Distribution of expression values in The Cancer Genome Atlas (TCGA) miRNA-seq data. The figure shows the percentage of total miRNA expression covered by the n most highly expressed miRNAs according to the TCGA miRNA-seq data.
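The coverage curve described in the Figure 2 caption can be illustrated with a minimal sketch. It assumes expression values on a linear scale (for example RPM) stored in a dictionary; the function name `cumulative_coverage` and the toy miRNA names and values are hypothetical and are not taken from the paper or from TCGA.

```python
def cumulative_coverage(expression):
    """Return (miRNA, cumulative % of total expression) pairs,
    ordered from the most to the least expressed miRNA.

    `expression` maps a miRNA name to a non-negative expression value
    on a linear scale (assumed here, e.g. RPM).
    """
    total = sum(expression.values())
    # Rank miRNAs by expression, highest first.
    ranked = sorted(expression.items(), key=lambda kv: kv[1], reverse=True)

    coverage = []
    running = 0.0
    for name, value in ranked:
        running += value
        coverage.append((name, 100.0 * running / total))
    return coverage


# Toy values for illustration only (not real data).
toy = {"miR-A": 50, "miR-B": 30, "miR-C": 20}
print(cumulative_coverage(toy))
# → [('miR-A', 50.0), ('miR-B', 80.0), ('miR-C', 100.0)]
```

The n-th entry of the result gives the share of total expression covered by the n most highly expressed miRNAs, which is the quantity plotted against n in a curve of this kind.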
Figure 3. Distribution of expression values of the 30 most highly expressed miRNAs in the TCGA miRNA-seq data. Logarithm base 2 was used.
Figure 4. Distribution of log2-transformed expression values of hsa-miR-30-5p (a) and hsa-miR-375-3p (b) in samples of four breast cancer molecular subtypes in the TCGA miRNA-seq data.
Results of the Mann–Whitney U-test applied to expressions of miRNAs in The Cancer Genome Atlas (TCGA) miRNA-seq data.
| Class 1 | Class 2 | | |
|---|---|---|---|
| Luminal A | Luminal B | 0.18 | 0.11 |
| Luminal A | Basal | | |
| Luminal A | Her2 | | 0.77 |
| Luminal B | Basal | | |
| Luminal B | Her2 | | 0.79 |
| Basal | Her2 | 0.12 | |
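For readers unfamiliar with the test behind the table above, the following is a minimal sketch of a two-sided Mann–Whitney U-test for two groups of expression values. It is an assumption-laden illustration, not the authors' procedure: it uses the normal approximation without tie or continuity correction, the function names `average_ranks` and `mann_whitney_u` are hypothetical, and the sample values are toy numbers. The record does not state which settings or software the authors used.

```python
from math import sqrt
from statistics import NormalDist


def average_ranks(values):
    """Assign 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def mann_whitney_u(x, y):
    """Return (U, two-sided p-value) using the normal approximation.

    Simplifications (assumed for this sketch): no tie correction and
    no continuity correction, so p-values are approximate.
    """
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    rank_sum_x = sum(ranks[:n1])
    u1 = rank_sum_x - n1 * (n1 + 1) / 2
    u2 = n1 * n2 - u1
    u = min(u1, u2)

    mean_u = n1 * n2 / 2
    sd_u = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u  # z <= 0 because u is the smaller statistic
    p = min(1.0, 2 * NormalDist().cdf(z))
    return u, p


# Toy expression values for two hypothetical subtypes (not real data).
u_stat, p_value = mann_whitney_u([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(u_stat, p_value)
```

For real analyses with small groups or many ties, an established implementation with exact or tie-corrected p-values is preferable to this simplified version.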
Figure 5. Median miRNA-seq expression vs. median microarray expression. (a) Scatter plot showing the joint distribution of median expression values for all miRNAs from microarray and miRNA-seq data; (b) the same plot with miRNAs filtered by a score threshold of 0.75. Logarithm base 2 was used.
Figure 6. MIMAT number vs. expression level. (a) Scatter plot showing the relation between MIMAT number and log2-transformed median expression value for all miRNAs from TCGA miRNA-seq data with an RPM value greater than 10; (b) the same plot for all miRNAs from microarray data. Point color indicates the score: green points correspond to high scores and black points to low scores.