Literature DB >> 27784000

Which statistical significance test best detects oncomiRNAs in cancer tissues? An exploratory analysis.

Wei Tang¹, Zhijun Liao^2,3, Quan Zou^3,4.

Abstract

MicroRNAs(miRNAs) often exert their oncogenic and tumor suppressor functions by suppressing protein-coding genes expressions in cancers and thus have a strong association with cancers' generation, development and metastasis. Through comprehensively understanding differentially expressed miRNAs (oncomiRNA) in tumor tissues, we can elucidate the underlying molecular mechanisms in tumorigenesis and develop novel strategies for cancer diagnosis and treatment. The differential expression of miRNAs can now be analyzed through numerous statistical significance tests based on different principles, which are also available in various R packages. However, the results can be notably different. In this study, we compared miRNAs obtained from 6 common significance tests/R packages (t-test, Limma, DESeq, edgeR, LRT and MARS) with the miRNAs archived in two databases; HMDD 2.0 database, which collects experimentally validated differentially expressed miRNAs, and Infer microRNA-disease association database, which contains the potential disease-associated miRNAs by network forecasting. Finally, we sought the MARS method in DEGseq package more effectively searched out differentially expressed miRNAs than other common methods.

Entities: Chemical Disease Gene Species

Keywords: MARS; differential expression; microRNA; oncomiRNA; statistical significance test

Mesh：

Substances：

Year: 2016 PMID： 27784000 PMCID： PMC5356763 DOI： 10.18632/oncotarget.12828

Source DB: PubMed Journal: Oncotarget ISSN： 1949-2553

INTRODUCTION

MicroRNAs (miRNAs) are short (18-25-nucleotide) non-coding RNAs that function as posttranscriptional gene regulators by binding to the 3’UTR of mRNAs, consequently, either repress translation or initiate mRNA degradation [1, 2]. Since their discovery [3, 4], miRNAs have been implicated in the control of various cellular processes [5, 6], including cell proliferation [1], cell death [7-12] and differentiation [3, 13]. Therefore, many miRNAs could function as oncogenic miRNAs (oncomiRNAs), which cause cancer by down-regulating genes through both translational repression and mRNA destabilization mechanisms [14, 15], such as breast tumors [16, 17], esophageal carcinoma [18, 19] and lung cancer [1, 3]. miRNAs are also potential prognostic markers of chronic lymphocytic leukemia [20], colon tumors [15, 21], pancreatic cancer [22], and neuroblastoma [23]. Associations between differentially expressed (DE) miRNAs and the cancer occurrence have been the focus of intense cancer biology investigation [24-28]. Next Generation Sequencing (NGS) technology can rapidly and accurately perform large-scale DNA/RNA sequencing through a series of high-throughput technologies. These technologies facilitate genomic research and are increasingly replacing microarrays with gene expressing profiling of epigenetics and transcriptomics (RNA-seq) [29, 30]. Transcriptomic sequencing includes mRNA, small RNA and non-coding RNA (ncRNA), of which miRNAs are among the most important components [31-33]. Aided by the advantages of NGS, molecular biology has acquired a vast number of large-scale sequence data, which has also posed many challenges for high-throughput analysis. These challenges include finding suitable statistical tests for large data and affirming their statistical assumptions by biological experiments, such as quantitative RT-PCR, northern blot, and overcoming the shortcomings of genetic sequencing technologies through statistical methods, which should fully uncover the essence of biology. Selective miRNAs expression profiling based on high-throughput test can strongly support the prognosis prediction of various cancers [34, 35]. Therefore, a significance test or R package which can efficiently screen out DE miRNAs in tumor tissues will guide the subsequent validation by low-throughput experiments. Various normalizations and statistical hypotheses have been incorporated into statistical significance tests and R packages which can detect DE miRNAs in cancer tissues. For example, the t-test (Student's t-test) is widely used for comparing independent samples by statistical hypothesis test. This test examines whether the expressions of certain miRNAs significantly differ among different parent population samples. A 2011 study compared the miRNAs expressions in 20 patients with glioblastoma and other 20 age- and sex-matched healthy controls [11]. The researchers identified 52 significant DE miRNAs among 1158 tested miRNAs in glioblastoma tissues, however, only 2 miRNAs (miR-128 and miR-342-3p, which are up- and down-regulated respectively) of these 52 miRNAs were validated by low-throughput real-time PCR experiments, which means only two miRNAs were suitable biomarkers for blood-derived glioblastoma-associated characteristic miRNA fingerprints [36]. The Limma package analyses gene expression data obtained from microarrays or RNA-seq technologies. The core capability of this package is the evaluation of differential expression in multifactor-designed experiments by linear modeling [37]. Sun [38] used the Limma package to screen out numerous DE miRNAs in ductal carcinoma in situ compared with normal controls . The DESeq [39] and edgeR [40] package solve the overdispersion problem in RNA sequencing data by applying the negative binomial distribution. Hamfjord [41] used both tools to statistically test the miRNA expression differences in read counts per miRNA between two samples. In DESeq, they treated the tumor and normal samples as independent groups; in edgeR, they considered paired information. According to their results, 37 miRNAs were identified to be DE miRNAs (19 up-regulated and 18 down-regulated) in colorectal cancer both in DESeq and edgeR [41], however, among these miRNAs, 16 miRNAs had not been validated in previously documented experiments. Thus, DESeq and edgeR both have limited screening ability for DE miRNAs. In 2010, Wang [42] proposed MA-plot-based method with random sampling (MARS) in DEGseq package. This method incorporates the random sampling method and is based on MA plot, which is widely used to detect and visualize the intensity-dependent ratios in microarray data [43]. DEGseq package also includes another commonly used method called Likelihood Ratio Test (LRT) [44]. Despite the wide range of statistical significance tests and R packages for detecting DE miRNAs in RNA expression profiles, few studies have considered which method gives the most accurate result. Along with the flourishing development of bioinformatics and applications of machine learning [24, 26, 45, 46], literature mining [47, 48] has greatly assisted biomedicine and genomics research. The HMDD 2.0 database (http://www.cuilab.cn/hmdd) collects experimental evidences of DE miRNAs and disease associations through literature mining [47]. The Infer microRNA-disease association database (http://lab.malab.cn/soft/ifmda) [49] predicts the underlying interactions between miRNAs and disease for further confirmation of biological experiments. The Infer method constructs a heterogeneous network that connects the disease similarity subnetwork to the miRNA similarity subnetwork by validated experimental miRNA-disease associations. The HMDD 2.0 database contains the experimentally validated DE miRNAs, and the Infer microRNA-disease association database offers miRNAs that are potentially associated with diseases. In this study, we selected 5 miRNA expression profiles of cancers (BRCA, ESCA, LUAD, PAAD and THCA), and their corresponding controls miRNA profiles. We then screened the DE miRNAs in the cancer tissues by the six abovementioned methods (t-test, Limma, DESeq, edgeR, LRT and MARS), then compared and classified the six sets of results with the miRNAs in HMDD 2.0 and Infer microRNA-disease association. By calculating the Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC), we identified the method with the best screening results. Among the six methods, MARS delivered the highest performance.

RESULTS

Expression analysis of miRNAs in cancer vs normal groups

After applying the six methods to the five datasets (Figure 1 and Table 1), we acquired the significant DE miRNAs in cancer tissues (P or P < 0.05) among the miRNAs. The numbers of DE miRNAs returned by the six methods have huge differences (Figure 2).

Figure 1

The flowchart to analyze miRNA and to compare with the other two databases in this study

Note: DE: differentially expressed. Database 1: HMDD 2.0 database. Database 2: Infer microRNA-disease association database

Table 1.1

Number of samples of selected miRNA sequencing datasets from the TCGA database

Cancers	NT samples	TN samples
BRCA	90	90
ESCA	13	13
LUAD	45	45
PAAD	4	4
THCA	57	57

Figure 2

The histogram of the number of DE miRNAs obtained from 6 methods on 5 datasets

The DE miRNAs were obtained from 6 methods(t-test, Limma, DESeq, edgeR, LRT and MARS) and the threshold: P or P < 0.05

Note: NT: normal sample; TN: tumor sample; Database 1: HMDD 2.0; Database 2: Infer microRNA-disease association; BRCA: Breast invasive carcinoma ESCA: Esophageal carcinoma LUAD: Lung adenocarcinoma PAAD: Pancreatic adenocarcinoma THCA: Thyroid carcinoma. The DE miRNAs in BRCA obtained by those 6 methods were compared in a Venny distribution using the Venny web server 2.0 (http://bioinfogp.cnb.csic.es/tools/venny/index.html) [50] (Figure 3). The miRNAs obtained from the six methods were also very different. For example, in BRCA dataset, hsa-mir-4482 was classified as a DE miRNA by Limma and edgeR, but as a normally expressed miRNA in DESeq.

Figure 3

The venn diagrams of BRCA dataset

BRCA dataset: The DE miRNAs were obtained from 6 methods(t-test, Limma, DESeq, edgeR, LRT and MARS) and the threshold P or P< 0.05, since venn diagrams based on 6 sets looks not intuitionistic, so we choose every 4 sets in 6 sets to draw venn diagrams.

These two figures indicated the large differences on the DE miRNAs obtained from those 6 methods. The differences not only lie in the number of those miRNAs but also in the predicted DE miRNAs.

The flowchart to analyze miRNA and to compare with the other two databases in this study

Note: DE: differentially expressed. Database 1: HMDD 2.0 database. Database 2: Infer microRNA-disease association database

The histogram of the number of DE miRNAs obtained from 6 methods on 5 datasets

The DE miRNAs were obtained from 6 methods(t-test, Limma, DESeq, edgeR, LRT and MARS) and the threshold: P or P < 0.05

The venn diagrams of BRCA dataset

Compared with the miRNAs in HMDD 2.0 and Infer microRNA-disease association

The total number of DE miRNAs can be varied by adjusting the threshold (P or P) of the six methods. Here, we classified the miRNAs obtained by the six methods with miRNAs in the HMDD 2.0 and Infer microRNA-disease association database. As these classifications are binary classifications (Table 2), we can estimate the screening DE miRNA performance of these methods by plotting the ROCs and computing their AUCs. When integrating the miRNAs from HMDD 2.0 and Infer microRNA-disease association, we incremented the k-value from 0.5 to 1 in 0.05 steps as a weight factor (Figure 4, Figures S1-S4; Table S1). When individually considering these two databases, we constructed separate ROCs for the miRNA comparisons between each statistical method and HMDD 2.0, and between each method and Infer microRNA-disease association (Figrue 5, Figures S5-S8; Table S2).

Table 2

The binary classifier

	True class
Hypothesized class	TP(True Positives)	FP(False Positives)
Hypothesized class	FN(False Negatives)	TN(True Negatives)

Note:

TP: true positive, the number of predicted miRNAs by statistical methods and also that appear in HMDD 2.0 or Infer microRNA-disease association

FP: false positive, the number of predicted miRNAs by statistical methods but not appear in HMDD 2.0 and Infer microRNA-disease association

TN: true negative, the number of not predicted miRNAs by statistical methods and also not appear in HMDD 2.0 and Infer microRNA-disease association

FN: false negative, the number of not predicted miRNAs by statistical methods but still appear in HMDD 2.0 or Infer microRNA-disease association

Hypothesized class: the predicted miRNA by those 6 methods

True class: the miRNA achieved in those 2 databases, HMDD 2.0 and Infer microRNA-disease association

Figure 4

The ROC of 6 methods on BRCA dataset based on integrated HMDD 2.0 and Infer microRNA-disease association

These ROC are obtained from classification of miRNAs obtained from 6 methods (t-test, Limma, DESeq, edgeR, LRT and MARS) on 5 datasets based on the true class in integrated HMDD 2.0 and Infer microRNA-disease association and k-value is the weighting coefficient, which is arithmetic progression from 0.5-1 with the step size equaling to 0.05

Figure 5

The ROC of 6 methods on BRCA dataset based on independently HMDD 2.0 and Infer microRNA-disease association

These ROC are obtained from classification of miRNAs obtained from 6 methods (t-test, Limma, DESeq, edgeR, LRT and MARS) on 5 datasets based on the true class in independently HMDD 2.0 and Infer microRNA-disease association.

Note: TP: true positive, the number of predicted miRNAs by statistical methods and also that appear in HMDD 2.0 or Infer microRNA-disease association FP: false positive, the number of predicted miRNAs by statistical methods but not appear in HMDD 2.0 and Infer microRNA-disease association TN: true negative, the number of not predicted miRNAs by statistical methods and also not appear in HMDD 2.0 and Infer microRNA-disease association FN: false negative, the number of not predicted miRNAs by statistical methods but still appear in HMDD 2.0 or Infer microRNA-disease association Hypothesized class: the predicted miRNA by those 6 methods True class: the miRNA achieved in those 2 databases, HMDD 2.0 and Infer microRNA-disease association

The ROC of 6 methods on BRCA dataset based on integrated HMDD 2.0 and Infer microRNA-disease association

The ROC of 6 methods on BRCA dataset based on independently HMDD 2.0 and Infer microRNA-disease association

DISCUSSION

Abnormal mRNA expression are induced by DE miRNAs, which prevents the mRNA from executing its regular biological functions [51-53], which is a primary cause of cancer. Altered miRNA expression will likely contribute to the initiation and progression of human cancers [10, 11, 13, 14, 16, 47], and the relationship between miRNAs and cancers has become a major focus in cancer research. Vast numbers of miRNA expression profiles have been generated throughout the past decade, as rapid NGS development has continuously lowered the gene sequencing cost. Although DE miRNAs in tumor tissues can be detected by various available methods, the accuracy of these methods remains a critical issue. DE genes have been ubiquitously detected by the t-test, which is popular for its simple calculation and easily understandable characteristics. Even though the standard error in the t-test is based on a small sample size, some miRNAs with miniscule standard error will still inevitably exist among the great number of miRNAs. Consequently, the t-test will increase the false positives prediction for these miRNAs [54, 55]. Examining the ROC of PAAD dataset which has only 4 cancer samples and 4 control samples, we could observe clearly that the t-test cannot selectively screen the DE miRNAs in this dataset (Figure S3, Figure S7). In addition, the performance of t-test method on PAAD dataset was worse than any other methods when we compare the results with both integrated and independent considerations of HMDD 2.0 and Infer microRNA-disease association. To improve the estimates stability in the traditional t-test, Limma introduces a prior distribution which can strengthen the sample variance estimation. The results of ROCs clearly showed that Limma delivered much better performance than the t-test in the small sample case, such as ESCA and PAAD datasets (Figure S1, Figure S3, Figure S5, Figure S7; Table S2). Limma also outperformed the t-test in the remaining datasets. The DESeq and edgeR packages are based on the negative binomial (NB) distribution. The NB model corrects the overdispersion problem in RNA sequence data by an additional term in the variance of the Poisson model. The variance parameter is estimated differently in DESeq and edgeR; DESeq estimates the mean-dependent dispersion by a local regression method, whereas edgeR assumes that the mean and variance are related and thus share a single common estimate of the dispersion parameter across the read counts. edgeR also weakens each miRNA's the dispersion degree through an empirical Bayes method [40]. We note that many statistical methods cannot properly handle the small sample sizes which are very common in RNA sequencing experiments, for example, in DESeq, which are based on generalized linear models, small sample sizes consequently violated the assumptions of its statistical tests [39, 40]. As a result, DESeq was almost the worst performer in our experiments (Figures S1-S8). The edgeR method also inflated the type I error rates in the simulations. The results of both DESeq and edgeR deviated largely in relatively small samples, such as the PAAD (4) and ESCA (13) dataset (Table 1.1). The AUCs of both methods were also quite different (Tables S1-S2); in the ESCA dataset, the AUC in edgeR was 0.18-0.2 larger than in DESeq, and in the PAAD dataset, the DESeq was even ineffective due to the very small sample size. In other three datasets, which has relatively larger sample sizes than PAAD and ESCA, reduced the differences between performances in edgeR and DESeq (Tables S1-S2). Our simulations also confirmed a higher computational speed of edgeR than DESeq, moreover, the latter could even incur memory leakage at relatively large sample sizes. According to the technical characteristic of RNA-Seq, Wang [42] proposed the MARS method in DEGseq, which can detects DE genes from MA plots and its test hypothesis is based on a random sampling model [42]. The DEGseq package includes LRT as well. Both methods were demonstrated higher DE miRNA screening performance in all five datasets than the other four methods (t-test, Limma, DESeq and edgeR). The results showed some AUCs of MARS and LRT were close to 0.90 and some were much higher than 0.9, suggesting that the both methods are very effective on detecting DE miRNAs (Figures S1-S8, Tables S1-S2). Moreover, the AUCs of the PAAD and ESCA dataset confirmed that MARS and LRT can also effectively identify DE miRNAs even in small samples. Although both MARS and LRT performed well, MARS achieved a higher True Positive Rate (TPR) at low False Positive Rate (FPR) than LRT in the BRCA, ESCA, LUAD and THCA datasets, in PAAD dataset, MARS and LRT achieved nearly identical TPR at the same low FPR (Figures S3 and S7), which indicated that MARS could correctly identify DE miRNAs with fewer misidentified miRNAs than LRT (Figures S1-S8). In summary, among the six tested methods, MARS could most accurately detecte the DE miRNAs in cancer tissues. The detection of DE miRNAs in cancers (oncomiRNAs) has always been among the most important issue in cancer biology research. Prior accurate computational detection of DE miRNAs will effectively reduce the cost of clinical experiments. Current computational analyses focus on statistical significance tests [56], literature mining and networking prediction [57]. However, few works have considered all of these approaches. In the present study, we integrated these three approaches to maximize the uniformity of the results to see which method could most accurately detect DE miRNAs. The DE miRNAs detected by the best performer (MARS) were highly consistent with the miRNAs extracted from literature mining and network prediction. This supports our inference that MARS outperforms other statistical significance tests (such as t-tests) in DE miRNAs detection. Complex genetic regulatory mechanisms in high-level organisms is considered to be achieved through controlled and coordinated miRNAs networks. The associations between miRNAs and disease are not only conducive to develop novel therapeutic applications for cancer patients by miRNA delivery and inhibition, but also help to construct the RNA network which is crucial to understand the underlying mechanisms of genetic network. In future work, we will continue to focus on improving the accuracy of oncomiRNA detection through integrating significance tests, literature mining and network prediction with including more data resources and also involving machine learning methods modify the MARS method.

MATERIALS AND METHODS

Flowchart

Figure 1 is a flowchart of the present study. After a comprehensive analysis of miRNA expression in cancerous and normal tissue samples, the results were compared with the miRNAs in HMDD 2.0 and the Infer microRNA-disease associations. Finally, we identified the best DE miRNA detection method among the six statistical significance tests/R packages.

Source data and sequence expression analysis

In this study, we selected five deep sequencing miRNA datasets from The Cancer Genome Atlas (TCGA) pilot project (https://tcga-data.nci.nih.gov/tcga/). All of these data were sequenced by the BCGSC (IlluminaHiSeq_miRNAseq) sequencing platform. HMDD 2.0 database collects DE miRNAs in various cancer types which were validated by biological experiments. However, many of the cancer types in HMDD 2.0 have not collected sufficiently experimentally validated miRNAs, only 5 cancer types not only collect more than 50 validated DE miRNAs in HMDD 2.0, included the miRNA sequences of breast invasive carcinoma (BRCA), esophageal carcinoma (ESCA), lung adenocarcinoma (LUAD), pancreatic adenocarcinoma (PAAD) and thyroid carcinoma (THCA), but also have corresponding miRNAs in Infer microRNA-disease association database. So in order to assure the statistical significance, we selected these 5 datasets (BRCA, ESCA, LUAD, PAAD and THCA) for further analysis. As the sample sizes differed between tumor and normal samples, we randomly selected corresponding tumor samples with the same and similar characteristics (Table 1). After obtaining the miRNA expression profiles from TCGA (the original sequencing data had been subjected to mapping analysis), we analyzed the miRNAs through abovementioned statistical methods (t-test, Limma, DESeq, edgeR, LRT and MARS), and computed the corresponding P or P (the associated FDR (False Discovery Rate)) value of each miRNA in the five datasets. Results were deemed significant at the P or P = 0.05 level.

Statistical comparison with HMDD 2.0 and Infer microRNA-disease association

If the P or P value was below 0.05, the miRNA expression between the cancer and normal samples was considered as statistically significant, and the miRNA was assumed as a DE miRNA in the tumor tissue. By varying P or P, we can vary the numbers of miRNAs that pass the hypothesis test. All of the tested methods generated a P or P value for each miRNA in the miRNAs differential expression analysis. Therefore, the comparison between the miRNAs obtained from these six methods and those in HMDD 2.0 database and Infer microRNA-disease association database can be regarded as a binary classification process. Here, the predicted outputs are the miRNAs obtained from the significance tests/R packages, and the true classes are the miRNAs of the corresponding cancers in HMDD 2.0 and Infer microRNA-disease association. The matrix so constructed is the basis of many common metrics (Table 2) The performance of a binary classification could be characterized by two basic measurements; the recall rate and precision rate. However, these measurements are of limited usefulness due to their single-valued feature [58], we instead computed the True Positive Rate (TPR) and False Positive Rate (FPR), then plotted the ROC, which could be quantified by the AUC. In biomedical applications, the ROC is commonly used to judge the performance of a discriminant across varying decision thresholds [59, 60], and it has become increasingly important in the classification of unequally distributed categories, since its unique attributes can handle unequal costs incurred by classification errors [59, 60]. Thus, the ROC could provide a better metric than the accuracy measure in certain classifiers [61, 62]. In the binary classification of our DE miRNA analysis, we specified a threshold for the obtained outcome, such as 0.01, and assigned respectively all instances above and below this value as negative (no differential expression) and positive (differential expression). Increasing the threshold to 0.05 will increase the number of true positive instances, and thereby the proportion of true positives among all positive instances (the true positive rate, or TPR) increases. However, a higher threshold also classifies more negative instances as positive, increasing the false positive rate (FPR). The AUC of the ROC is another indicator of the classifier performance [63, 64]. The TPR and FPR are respectively calculated as: Where True Positive (TP) and False Positive (FP) denote the numbers of positive and negative samples that are classified as positive, respectively, and True Negative (TN) and False Negative (FN) represent the corresponding values of the negative samples. Because the AUC is a portion of a unit square area, its value always lies in the range 0-1.0. Another important statistical property of AUC is that the value equals the probability that the classifier will rank a randomly chosen positive instance higher than a randomly chosen negative instance. A higher AUC shifts the ROC toward the upper-left of the square, indicating higher performance of the classifier [60]. As the miRNAs in HMDD 2.0 were validated by biological experiments and those miRNAs in Infer microRNA-disease association were predicted by network, we must assign a weighting coefficient when considering the true classes in an integrated manner. Here, we applied a simple linear function. Specifically, we modified the TP and FN by introducing a parameter k: In Eqs. (3) and (4), sum1 and sum2 are the sums of the miRNAs appearing in HMDD 2.0 (correctly classified positives in HMDD 2.0) and Infer microRNA-disease association (correctly classified positives in the Infer database), respectively, and sum3 is the total sum of the miRNAs obtained by all methods. As the miRNAs obtained from Infer microRNA-disease association might not be associated with cancers, they should be weighted less heavily than those in HMDD 2.0, whose associations with cancer are confirmed. Hence, the k-value was varied as 0.5 ≤ k ≤ 1

Table 1.2

The number of selected miRNA from the database 1 and database 2

Cancers	TCGA	Database 1	Database 2
BRCA	1881	243	100
ESCA	1881	83	100
LUAD	1881	157	100
PAAD	1881	117	100
THCA	1881	56	100

Note:

NT: normal sample; TN: tumor sample;

Database 1: HMDD 2.0; Database 2: Infer microRNA-disease association;

BRCA: Breast invasive carcinoma

ESCA: Esophageal carcinoma

LUAD: Lung adenocarcinoma

PAAD: Pancreatic adenocarcinoma

THCA: Thyroid carcinoma.

60 in total

1. ncRDeathDB: A comprehensive bioinformatics resource for deciphering network organization of the ncRNA-mediated cell death system.

Authors: Deng Wu; Yan Huang; Juanjuan Kang; Kongning Li; Xiaoman Bi; Ting Zhang; Nana Jin; Yongfei Hu; Puwen Tan; Lu Zhang; Ying Yi; Wenjun Shen; Jian Huang; Xiaobo Li; Xia Li; Jianzhen Xu; Dong Wang
Journal: Autophagy Date: 2015 Impact factor: 16.016

2. Inferring microRNA-disease associations by random walk on a heterogeneous network with multiple data sources.

Authors: Yuansheng Liu; Xiangxiang Zeng; Zengyou He; Quan Zou
Journal: IEEE/ACM Trans Comput Biol Bioinform Date: 2016-04-05 Impact factor: 3.710

3. MicroRNA expression patterns to differentiate pancreatic adenocarcinoma from normal pancreas and chronic pancreatitis.

Authors: Mark Bloomston; Wendy L Frankel; Fabio Petrocca; Stefano Volinia; Hansjuerg Alder; John P Hagan; Chang-Gong Liu; Darshna Bhatt; Cristian Taccioli; Carlo M Croce
Journal: JAMA Date: 2007-05-02 Impact factor: 56.272

4. MicroRNAs modulate hematopoietic lineage differentiation.

Authors: Chang-Zheng Chen; Ling Li; Harvey F Lodish; David P Bartel
Journal: Science Date: 2003-12-04 Impact factor: 47.728

5. Mammalian microRNAs predominantly act to decrease target mRNA levels.

Authors: Huili Guo; Nicholas T Ingolia; Jonathan S Weissman; David P Bartel
Journal: Nature Date: 2010-08-12 Impact factor: 49.962

6. Differential expression of miRNAs in colorectal cancer: comparison of paired tumor tissue and adjacent normal mucosa using high-throughput sequencing.

Authors: Julian Hamfjord; Astrid M Stangeland; Timothy Hughes; Martina L Skrede; Kjell M Tveit; Tone Ikdahl; Elin H Kure
Journal: PLoS One Date: 2012-04-17 Impact factor: 3.240

7. ViRBase: a resource for virus-host ncRNA-associated interactions.

Authors: Yanhui Li; Changliang Wang; Zhengqiang Miao; Xiaoman Bi; Deng Wu; Nana Jin; Liqiang Wang; Hao Wu; Kun Qian; Chunhua Li; Ting Zhang; Chunrui Zhang; Ying Yi; Hongyan Lai; Yongfei Hu; Lixin Cheng; Kwong-Sak Leung; Xiaobo Li; Fengmin Zhang; Kongning Li; Xia Li; Dong Wang
Journal: Nucleic Acids Res Date: 2014-10-01 Impact factor: 16.971

8. Identification of real microRNA precursors with a pseudo structure status composition approach.

Authors: Bin Liu; Longyun Fang; Fule Liu; Xiaolong Wang; Junjie Chen; Kuo-Chen Chou
Journal: PLoS One Date: 2015-03-30 Impact factor: 3.240

9. FMLNCSIM: fuzzy measure-based lncRNA functional similarity calculation model.

Authors: Xing Chen; Yu-An Huang; Xue-Song Wang; Zhu-Hong You; Keith C C Chan
Journal: Oncotarget Date: 2016-07-19

10. edgeR: a Bioconductor package for differential expression analysis of digital gene expression data.

Authors: Mark D Robinson; Davis J McCarthy; Gordon K Smyth
Journal: Bioinformatics Date: 2009-11-11 Impact factor: 6.937

17 in total

1. MiR-93-5p Promotes Cell Proliferation through Down-Regulating PPARGC1A in Hepatocellular Carcinoma Cells by Bioinformatics Analysis and Experimental Verification.

Authors: Xinrui Wang; Zhijun Liao; Zhimin Bai; Yan He; Juan Duan; Leyi Wei
Journal: Genes (Basel) Date: 2018-01-22 Impact factor: 4.096

2. Integrated analysis of dosage effect lncRNAs in lung adenocarcinoma based on comprehensive network.

Authors: Yunzhen Wei; Zichuang Yan; Cheng Wu; Qiang Zhang; Yinling Zhu; Kun Li; Yan Xu
Journal: Oncotarget Date: 2017-08-03

3. Comparative analysis of genes frequently regulated by drugs based on connectivity map transcriptome data.

Authors: Xinhua Liu; Pan Zeng; Qinghua Cui; Yuan Zhou
Journal: PLoS One Date: 2017-06-02 Impact factor: 3.240

4. Identification of molecular targets for esophageal carcinoma diagnosis using miRNA-seq and RNA-seq data from The Cancer Genome Atlas: a study of 187 cases.

Authors: Jiang-Hui Zeng; Dan-Dan Xiong; Yu-Yan Pang; Yu Zhang; Rui-Xue Tang; Dian-Zhong Luo; Gang Chen
Journal: Oncotarget Date: 2017-05-30

5. Selection of reference genes for microRNA analysis associated to early stress response to handling and confinement in Salmo salar.

Authors: Eduardo Zavala; Daniela Reyes; Robert Deerenberg; Rodrigo Vidal
Journal: Sci Rep Date: 2017-05-11 Impact factor: 4.379

10. Clinical roles of the aberrantly expressed lncRNAs in lung squamous cell carcinoma: a study based on RNA-sequencing and microarray data mining.

Authors: Wen-Jie Chen; Rui-Xue Tang; Rong-Quan He; Dong-Yao Li; Liang Liang; Jiang-Hui Zeng; Xiao-Hua Hu; Jie Ma; Shi-Kang Li; Gang Chen
Journal: Oncotarget Date: 2017-05-22