
JAMI: fast computation of conditional mutual information for ceRNA network analysis.

Andrea Hornakova, Markus List, Jilles Vreeken, Marcel H Schulz.

Abstract

Motivation: Genome-wide measurements of paired miRNA and gene expression data have enabled the prediction of competing endogenous RNAs (ceRNAs). It has been shown that the sponge effect mediated by protein-coding as well as non-coding ceRNAs can play an important regulatory role in the cell in health and disease. Therefore, many computational methods for identifying ceRNAs have been suggested. In particular, methods based on Conditional Mutual Information (CMI) have shown promising results. However, the currently available implementation is slow and cannot be used to perform computations on a large scale.
Results: Here, we present JAMI, a Java tool that uses a non-parametric estimator for CMI values from gene and miRNA expression data. We show that JAMI speeds up the computation of ceRNA networks by a factor of ∼70 compared to currently available implementations. Further, JAMI supports multi-threading to make use of common multi-core architectures for further performance gain. Requirements: Java 8. Availability and implementation: JAMI is available as open-source software from https://github.com/SchulzLab/JAMI. Supplementary information: Supplementary data are available at Bioinformatics online.

Year:  2018        PMID: 29659721      PMCID: PMC6129307          DOI: 10.1093/bioinformatics/bty221

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

MicroRNAs (miRNAs) are ∼23 nt long RNAs that play an important role in the regulation of transcript abundance in mammalian cells. They are estimated to regulate at least half of the genes in the human genome (Friedman ) and thus affect important biological processes and show deregulation in many diseases (Jiang ). Several miRNAs often regulate the same transcript in a combinatorial fashion and many transcripts are regulated by the same miRNAs, leading to complex genome-wide networks of co-regulation (Tsang ). In these competing endogenous RNA (ceRNA) networks, ceRNA genes that carry binding sites for the same miRNA(s) compete over the limited pool of available miRNA molecules (Arvey ; Salmena ; Tay ). Several examples of ceRNA crosstalk have already been verified, including many genes involved in cancer such as PTEN (Poliseno ). This evidence has sparked interest in developing systematic methods for inferring ceRNA interactions from gene and miRNA expression data, reviewed in (Le ). With the emergence of large-scale studies providing gene and miRNA expression data for hundreds of samples, it has become possible to infer ceRNA interactions computationally, and several approaches have been suggested to achieve this. Sumazin et al. proposed the use of conditional mutual information in their method HERMES (Sumazin ), which was later implemented as part of the CUPID software package (CUPID step III) (Chiu ). While this method was applied successfully for inferring ceRNA networks for approximately 450 000 gene pairs (Chiu ), the current implementation is very slow and poses a bottleneck for the construction of large-scale networks. This issue has motivated other researchers to design faster alternative approaches, for example methods based on linear correlation (Liu ; Paci ; Wang ). However, unlike the CMI-based approach in CUPID, these methods assume linearity, which limits their accuracy (Le ).
We thus sought to speed up the computation of CMI values, the only known non-linear alternative, to facilitate the efficient construction of large-scale ceRNA networks.
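For reference, the quantity being estimated can be written down explicitly. The following is the standard textbook definition of conditional mutual information for discrete (or discretized) variables, not a formula taken from this paper. The symbols X, Y and Z are generic placeholders; which of them stands for the miRNA and which for the two genes of a triplet is left open here, since the text above does not specify the assignment.

```latex
% Conditional mutual information of X and Y given Z
I(X;Y \mid Z) \;=\; \sum_{x,y,z} p(x,y,z)\,
    \log \frac{p(z)\,p(x,y,z)}{p(x,z)\,p(y,z)}

% Equivalent chain-rule form
I(X;Y \mid Z) \;=\; I(X;Y,Z) \;-\; I(X;Z)
```

The value is zero exactly when X and Y are independent given Z, and it captures non-linear as well as linear dependence, which is the property that distinguishes it from correlation-based measures.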

2 Results and discussion

Here, we present JAMI, a novel implementation of the CMI computation step of CUPID (Chiu ). Like CUPID, JAMI uses adaptive partitioning for estimating CMI values (Darbellay and Vajda, 1999). This non-parametric estimator is consistent and makes no assumptions about the distribution of the data, so it can be used with expression data from any technology. JAMI uses efficient data structures in Java to implement the three-dimensional data partitioning for the computation of CMI values. In contrast to CUPID, JAMI was carefully designed to support multi-threading (Supplementary Fig. S1). In Figure 1, we show that JAMI achieves a substantially better single-threaded runtime compared to CUPID implemented in either Matlab or Java. For the latter comparison, we carefully re-implemented the original CUPID method in Java.
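To make the idea of a rank-based, partition-based CMI estimate concrete, here is a minimal Java sketch. It is not JAMI's code and it does not implement the adaptive partitioning of Darbellay and Vajda: as a simplification it uses a fixed grid of equal-frequency bins on the ranks and a plug-in estimate over the resulting three-dimensional cells. The class name CmiSketch, the method names and the bins parameter are all hypothetical. The last method only illustrates that triplets are independent of one another and can therefore be evaluated in parallel; how JAMI actually organizes its threads is not described here.

```java
import java.util.Arrays;
import java.util.stream.IntStream;

/** Illustrative sketch only: fixed equal-frequency binning, NOT adaptive partitioning. */
class CmiSketch {

    /** Maps each value to an equal-frequency bin in [0, bins) according to its rank. */
    static int[] rankBins(double[] values, int bins) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        // Tied values end up with distinct ranks here (no tie handling in this sketch).
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));
        int[] bin = new int[n];
        for (int rank = 0; rank < n; rank++) {
            bin[order[rank]] = (int) ((long) rank * bins / n);
        }
        return bin;
    }

    /** Plug-in estimate of I(X;Y|Z) in nats from three samples of equal length. */
    static double cmi(double[] x, double[] y, double[] z, int bins) {
        int n = x.length;
        int[] bx = rankBins(x, bins), by = rankBins(y, bins), bz = rankBins(z, bins);
        int[][][] cxyz = new int[bins][bins][bins];
        int[][] cxz = new int[bins][bins];
        int[][] cyz = new int[bins][bins];
        int[] cz = new int[bins];
        for (int i = 0; i < n; i++) {
            cxyz[bx[i]][by[i]][bz[i]]++;
            cxz[bx[i]][bz[i]]++;
            cyz[by[i]][bz[i]]++;
            cz[bz[i]]++;
        }
        double sum = 0.0;
        for (int a = 0; a < bins; a++) {
            for (int b = 0; b < bins; b++) {
                for (int c = 0; c < bins; c++) {
                    int count = cxyz[a][b][c];
                    if (count == 0) continue; // empty cells contribute nothing
                    // p(x,y,z) p(z) / (p(x,z) p(y,z)); the sample size cancels in the ratio
                    double ratio = (double) count * cz[c] / ((double) cxz[a][c] * cyz[b][c]);
                    sum += ((double) count / n) * Math.log(ratio);
                }
            }
        }
        return sum;
    }

    /** Evaluates many triplets in parallel; entry i uses xs[i], ys[i] and zs[i]. */
    static double[] cmiForTriplets(double[][] xs, double[][] ys, double[][] zs, int bins) {
        return IntStream.range(0, xs.length).parallel()
                .mapToDouble(i -> cmi(xs[i], ys[i], zs[i], bins))
                .toArray();
    }
}
```

A fixed grid keeps the sketch short but wastes cells in sparse regions of the data, which is the problem adaptive partitioning is meant to address by refining only where the samples are unevenly distributed.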
Fig. 1. Performance comparison between JAMI, CUPID (Matlab) and CUPID (Java). (a) Process user time in seconds. (b) Peak memory usage.

Both JAMI and CUPID rank expression values before the CMI computation. In CUPID, all expression values of 0 are assigned different ranks. This introduces bias and results in positive CMI values even if genes are not expressed in any sample. To avoid this, we extended JAMI to be zero expression aware, and demonstrate that this has a considerable effect on the results (Supplementary Figs S2–S5). Preparing the input for CUPID is tedious and requires separate expression and miRNA interaction files as input for every gene pair of interest. In contrast, JAMI accepts two expression matrices as input, one for gene and one for miRNA expression, and filters these automatically for the data needed. In addition, JAMI offers great flexibility with regard to defining the triplets of interest, making it much more convenient to use JAMI in settings where several genes are of interest. JAMI output files can be directly imported into network analysis tools such as Cytoscape (Shannon ). Moreover, JAMI does not require an expensive Matlab® license like CUPID, making it available to a broader audience. To make sure that JAMI can also be used conveniently in a scripting language, we implemented the RJAMI wrapper package for R (http://github.com/SchulzLab/RJAMI). We illustrate the potential of JAMI by constructing a ceRNA interaction network from the TCGA breast cancer data set (TCGA, 2012) for known ceRNAs (Tay ) (Supplementary Fig. S6, see user manual for a step-by-step guide). The resulting network appears to be much denser than what is reported in the literature, emphasizing the importance of robust tools for ceRNA network inference from widely available expression data. An open question in the field is whether linear or non-linear methods are better suited for ceRNA network inference (Le ).
Answering this question was thus far impeded by the lack of a fast tool for computing CMI values. JAMI overcomes this research barrier and facilitates comparisons with correlation-based methods such as sensitivity correlation (Paci ) (Supplementary Fig. S7). In conclusion, JAMI is a fast, freely available and well-documented (http://jami.readthedocs.io/) tool primarily targeted at the inference of ceRNA networks. However, its implementation is general and may be used to study other modulators of gene–gene interactions, e.g. transcription factors (Flores ).
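The ranking issue discussed in this section, where zero expression values receive distinct ranks, can be illustrated with a short Java sketch. The text does not say how JAMI treats zeros internally, so the approach below is an assumption chosen for illustration: tied values, and therefore all zeros, share the average of the ranks they span. The class and method names are hypothetical.

```java
import java.util.Arrays;

/** Illustrative sketch only: average-rank tie handling so that all zeros share one rank. */
class ZeroAwareRanks {

    /** Returns 1-based ascending ranks; tied values (including zeros) get their average rank. */
    static double[] rank(double[] values) {
        int n = values.length;
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) order[i] = i;
        Arrays.sort(order, (a, b) -> Double.compare(values[a], values[b]));

        double[] ranks = new double[n];
        int start = 0;
        while (start < n) {
            // Find the end of the run of values equal to the one at position 'start'.
            int end = start;
            while (end + 1 < n && values[order[end + 1]] == values[order[start]]) {
                end++;
            }
            // Average of the 1-based ranks start+1 .. end+1.
            double shared = (start + end) / 2.0 + 1.0;
            for (int k = start; k <= end; k++) {
                ranks[order[k]] = shared;
            }
            start = end + 1;
        }
        return ranks;
    }
}
```

With this scheme a gene that is zero in every sample has the same rank everywhere, so the ranks carry no artificial ordering that a partition-based estimator could pick up as signal.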
  17 in total

1.  Cytoscape: a software environment for integrated models of biomolecular interaction networks.

Authors:  Paul Shannon; Andrew Markiel; Owen Ozier; Nitin S Baliga; Jonathan T Wang; Daniel Ramage; Nada Amin; Benno Schwikowski; Trey Ideker
Journal:  Genome Res       Date:  2003-11       Impact factor: 9.043

2.  An extensive microRNA-mediated network of RNA-RNA interactions regulates established oncogenic pathways in glioblastoma.

Authors:  Pavel Sumazin; Xuerui Yang; Hua-Sheng Chiu; Wei-Jen Chung; Archana Iyer; David Llobet-Navas; Presha Rajbhandari; Mukesh Bansal; Paolo Guarnieri; Jose Silva; Andrea Califano
Journal:  Cell       Date:  2011-10-14       Impact factor: 41.582

3.  Genome-wide dissection of microRNA functions and cotargeting networks using gene set signatures.

Authors:  John S Tsang; Margaret S Ebert; Alexander van Oudenaarden
Journal:  Mol Cell       Date:  2010-04-09       Impact factor: 17.970

4.  Computational methods for identifying miRNA sponge interactions. (Review)

Authors:  Thuc Duy Le; Junpeng Zhang; Lin Liu; Jiuyong Li
Journal:  Brief Bioinform       Date:  2017-07-01       Impact factor: 11.622

5.  A ceRNA hypothesis: the Rosetta Stone of a hidden RNA language?

Authors:  Leonardo Salmena; Laura Poliseno; Yvonne Tay; Lev Kats; Pier Paolo Pandolfi
Journal:  Cell       Date:  2011-07-28       Impact factor: 41.582

6.  Most mammalian mRNAs are conserved targets of microRNAs.

Authors:  Robin C Friedman; Kyle Kai-How Farh; Christopher B Burge; David P Bartel
Journal:  Genome Res       Date:  2008-10-27       Impact factor: 9.043

7.  The multilayered complexity of ceRNA crosstalk and competition. (Review)

Authors:  Yvonne Tay; John Rinn; Pier Paolo Pandolfi
Journal:  Nature       Date:  2014-01-16       Impact factor: 49.962

8.  Cupid: simultaneous reconstruction of microRNA-target and ceRNA networks.

Authors:  Hua-Sheng Chiu; David Llobet-Navas; Xuerui Yang; Wei-Jen Chung; Alberto Ambesi-Impiombato; Archana Iyer; Hyunjae Ryan Kim; Elena G Seviour; Zijun Luo; Vasudha Sehgal; Tyler Moss; Yiling Lu; Prahlad Ram; José Silva; Gordon B Mills; Andrea Califano; Pavel Sumazin
Journal:  Genome Res       Date:  2014-11-05       Impact factor: 9.043

9.  Cancer-Related Triplets of mRNA-lncRNA-miRNA Revealed by Integrative Network in Uterine Corpus Endometrial Carcinoma.

Authors:  Chenglin Liu; Yu-Hang Zhang; Qinfang Deng; Yixue Li; Tao Huang; Songwen Zhou; Yu-Dong Cai
Journal:  Biomed Res Int       Date:  2017-02-08       Impact factor: 3.411

10.  Gene regulation, modulation, and their applications in gene expression data analysis.

Authors:  Mario Flores; Tzu-Hung Hsiao; Yu-Chiao Chiu; Eric Y Chuang; Yufei Huang; Yidong Chen
Journal:  Adv Bioinformatics       Date:  2013-03-13
  6 in total

1.  Illuminating lncRNA Function Through Target Prediction.

Authors:  Hua-Sheng Chiu; Sonal Somvanshi; Ting-Wen Chen; Pavel Sumazin
Journal:  Methods Mol Biol       Date:  2021

2.  MethReg: estimating the regulatory potential of DNA methylation in gene transcription.

Authors:  Tiago C Silva; Juan I Young; Eden R Martin; X Steven Chen; Lily Wang
Journal:  Nucleic Acids Res       Date:  2022-05-20       Impact factor: 19.160

3.  miRspongeR: an R/Bioconductor package for the identification and analysis of miRNA sponge interaction networks and modules.

Authors:  Junpeng Zhang; Lin Liu; Taosheng Xu; Yong Xie; Chunwen Zhao; Jiuyong Li; Thuc Duy Le
Journal:  BMC Bioinformatics       Date:  2019-05-10       Impact factor: 3.169

4.  Integrated Bioinformatic Analysis of a Competing Endogenous RNA Network Reveals a Prognostic Signature in Endometrial Cancer.

Authors:  Leilei Xia; Ye Wang; Qi Meng; Xiaoling Su; Jizi Shen; Jing Wang; Haiwei He; Biwei Wen; Caihong Zhang; Mingjuan Xu
Journal:  Front Oncol       Date:  2019-05-29       Impact factor: 6.244

5.  Large-scale inference of competing endogenous RNA networks with sparse partial correlation.

Authors:  Markus List; Azim Dehghani Amirabad; Dennis Kostka; Marcel H Schulz
Journal:  Bioinformatics       Date:  2019-07-15       Impact factor: 6.937

6.  ceRNAR: An R package for identification and analysis of ceRNA-miRNA triplets.

Authors:  Yi-Wen Hsiao; Lin Wang; Tzu-Pin Lu
Journal:  PLoS Comput Biol       Date:  2022-09-09       Impact factor: 4.779
