
CPSS 2.0: a computational platform update for the analysis of small RNA sequencing data.

Changlin Wan, Jianing Gao, Huan Zhang, Xiaohua Jiang, Qiguang Zang, Rongjun Ban, Yuanwei Zhang, Qinghua Shi

Abstract

SUMMARY: Next-generation sequencing has been widely applied to understand the complexity of non-coding RNAs (ncRNAs) in the last decades. Here, we present CPSS 2.0, an updated version of CPSS 1.0 for small RNA sequencing data analysis, with the following improvements: (i) a substantial increase of supported species from 10 to 48; (ii) improved strategies applied to detect ncRNAs; (iii) more ncRNAs can be detected and profiled, such as lncRNA and circRNA; (iv) identification of differentially expressed ncRNAs among multiple samples; (v) enhanced visualization interface containing graphs and charts in detailed analysis results. The new version of CPSS is an efficient bioinformatics tool for users in non-coding RNA research.
AVAILABILITY AND IMPLEMENTATION: CPSS 2.0 is implemented in PHP + Perl + R and can be freely accessed at http://114.214.166.79/cpss2.0/.
CONTACT: zyuanwei@ustc.edu.cn or qshi@ustc.edu.cn.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
© The Author(s) 2017. Published by Oxford University Press.

Year:  2017        PMID: 28177064      PMCID: PMC5860027          DOI: 10.1093/bioinformatics/btx066

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 Introduction

The development of large-scale sequencing technology has yielded extensive small RNA (sRNA) sequencing data. Many small RNA analysis tools have been developed accordingly, such as Chimira (Vitsios and Enright, 2015), wapRNA (Zhao et al.), Oasis (Capece et al.), sRNAtoolbox (Rueda et al.), mirTools 2.0 (Wu et al.), MAGI (Kim et al.), ISRNA (Luo et al.) and our published work, CPSS 1.0 (Zhang et al.). With increasing evidence that small RNAs function within complex regulatory networks (Bracken et al.), a platform for the systematic interpretation of sRNA data is still in great demand. However, current tools provide simple small RNA profiling rather than a systematic analysis. Their main limitations are as follows: (i) Their analyses are not both comprehensive and in-depth. For instance, Chimira specializes in detecting miRNA modifications. Although tools such as sRNAtoolbox and Oasis offer multiple modules, the modules are not integrated with one another, so users must make unnecessary intermediate submissions. (ii) Most existing tools lack graphical presentation of results. Even where plenty of graphs are provided, unclear illustration and poor layout do little to make them attractive to users. (iii) Because the analysis report is fixed, users cannot adjust parameters after the analysis has completed.

To meet this demand, we updated CPSS 1.0 to CPSS 2.0 with the following improvements: (i) Within a single submission, CPSS 2.0 delivers an analysis report that spans ncRNA quantification through miRNA target prediction and annotation, for single and multiple datasets. With lncRNA and circRNA added to the system, CPSS 2.0 brings together the widest range of ncRNA modules. The number of supported species has increased substantially from 10 to 48. All databases and software integrated in CPSS 2.0 have been updated to the latest versions. (ii) CPSS 2.0 classifies all results into two main categories, ‘General Results’ and ‘Functional Analysis’. Each has several subcategories that present results in graphs and charts, giving users an intuitive understanding of the statistical data. (iii) On each detailed result page, CPSS 2.0 provides a search function for specific terms or values. On the GO, pathway and protein domain detail pages, users can modify the default parameters, P-value and enrichment fold. Taken together, CPSS 2.0 is the most comprehensive web server among the currently available tools. A detailed comparison of the modules we deemed essential or important is provided in Supplementary Table S1. We believe that CPSS 2.0 can assist users in a comprehensive and effective manner.

2 Workflow

The overall workflow of CPSS 2.0 is shown in Supplementary Figure S1. Users can submit input data as FASTA files, either uncompressed or compressed in *.tar.gz format. After genome alignment with Bowtie (Langmead et al.), CPSS 2.0 matches the genome-mapped reads against several sets of reference sequences using Bowtie or Blast in the following order: precursor miRNA, mature miRNA, piRNA, circRNA, lncRNA, Rfam, repeats and mRNA. It then classifies them into known miRNAs, known piRNAs, mRNAs, repeat-associated RNAs, circRNAs, lncRNAs and other types of ncRNAs (sRNA, tRNA, snRNA and snoRNA). The expression of the classified RNAs is normalized based on the absolute counts of mapped reads and displayed as reads per million (RPM). Sequences that map to the reference genome but cannot be assigned to any of the above categories (defined as unclassified sequences) are used to predict novel miRNAs with Mireap (https://sourceforge.net/projects/mireap/). Secondary structures of the predicted miRNAs are drawn with RNAfold (Lorenz et al.). CPSS 2.0 uses miRanda (John et al.) to identify targets of known and novel miRNAs: for a single sample, prediction is run on the most abundant known/novel miRNA; for multiple samples or two experimental groups, it is run on all differentially expressed miRNAs. To further explore the potential biological functions of the predicted miRNA targets, CPSS 2.0 annotates them with Gene Ontology (GO), pathway and protein-domain information and extracts enriched annotation terms. Genes included in the enriched terms are matched against the STRING database (Szklarczyk et al.) to retrieve protein-protein interaction (PPI) information. The PPI network is visualized with the vis.js library (http://visjs.org/). Currently, CPSS 2.0 supports 48 reference genomes across vertebrates, insects, deuterostomes, nematodes and plants. It is ready to use for most users with preset analysis parameters; users with advanced needs can modify the parameters of each analysis step.
CPSS 2.0 can complete a group analysis of 10 samples with default parameters within 3 hours; single or paired samples take less time. Detailed user instructions as well as materials and methods are provided in the Supplementary Information.
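The ordered classification and RPM normalization described in this section can be illustrated with a minimal Python sketch. It is not the CPSS 2.0 implementation (which is written in PHP, Perl and R). The function names, the toy reads, and the choice of total mapped reads as the RPM denominator are assumptions made for illustration only.

```python
from collections import Counter

# Reference categories in the order listed in the text; the first match wins.
PRIORITY = [
    "precursor miRNA", "mature miRNA", "piRNA", "circRNA",
    "lncRNA", "Rfam", "repeats", "mRNA",
]


def classify(read_hits):
    """Return the first category in PRIORITY that the read matched.

    read_hits is a set of category names (hypothetical input; in a real
    pipeline it would be derived from Bowtie or Blast output).
    """
    for category in PRIORITY:
        if category in read_hits:
            return category
    return "unclassified"


def rpm(counts, total_mapped):
    """Scale raw counts to reads per million mapped reads (assumed denominator)."""
    return {name: n * 1_000_000 / total_mapped for name, n in counts.items()}


# Toy data: read id -> (copy number, set of categories the read matched).
reads = {
    "r1": (600, {"mature miRNA", "precursor miRNA"}),
    "r2": (300, {"piRNA", "repeats"}),
    "r3": (100, set()),
}

counts = Counter()
for copies, hits in reads.values():
    counts[classify(hits)] += copies

total = sum(counts.values())
normalized = rpm(counts, total)

for name, value in normalized.items():
    print(f"{name}\t{value:.1f}")
```

In this sketch, reads that match no category are kept as "unclassified", mirroring the pool of sequences that the workflow passes on to novel miRNA prediction.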

3 Main additions

3.1 Workflow modification

Owing to decreasing sequencing costs, researchers can sequence multiple samples within a single dataset to reach more solid scientific conclusions. However, CPSS 1.0 can only analyze small RNA sequencing data from single or paired samples and does not support more than two samples. CPSS 2.0 meets the demand for handling multi-sample datasets. When multiple samples are processed, ncRNAs that are differentially expressed between groups or among samples are retrieved. ncRNAs whose expression satisfies the given P-value and fold-change criteria are marked as statistically significant. All significantly differentially expressed miRNAs are selected for further functional analysis, including target prediction and GO, pathway, protein domain and PPI annotation. To better understand the underlying biological processes, enrichment analysis is also performed on the annotation terms to identify terms that are significantly enriched among the targets. Because CPSS 2.0 focuses mainly on ncRNA detection, quantification and functional analysis of predicted miRNA targets, the detection of miRNA modification and editing has been removed from the current workflow (users interested in the detection and annotation of miRNA isoforms can use our DeAnniso (Zhang et al.)). Modules for the detection and quantification of circRNA and lncRNA are provided as additional categories for ncRNA classification.
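The P-value and fold-change criteria can be sketched as a simple filter. The text does not specify which statistical test CPSS 2.0 uses, so the P-values below are treated as precomputed inputs. The record layout, the pseudocount, the cutoffs (0.05 and 2-fold) and the miRNA names are hypothetical and are not CPSS 2.0 defaults.

```python
import math
from dataclasses import dataclass


@dataclass
class NcRNA:
    name: str
    rpm_a: float    # mean RPM in group A
    rpm_b: float    # mean RPM in group B
    p_value: float  # assumed to come from an upstream test not shown here


def log2_fold_change(rpm_a, rpm_b, pseudocount=1.0):
    """log2(B/A), with a pseudocount so that zero counts do not divide by zero."""
    return math.log2((rpm_b + pseudocount) / (rpm_a + pseudocount))


def significant(records, max_p=0.05, min_abs_log2fc=1.0):
    """Keep records that pass both the P-value and the fold-change cutoff."""
    return [
        r for r in records
        if r.p_value <= max_p
        and abs(log2_fold_change(r.rpm_a, r.rpm_b)) >= min_abs_log2fc
    ]


# Toy records with hypothetical names and values.
records = [
    NcRNA("miR-x", rpm_a=99.0, rpm_b=399.0, p_value=0.001),   # 4-fold, low P
    NcRNA("miR-y", rpm_a=100.0, rpm_b=110.0, p_value=0.001),  # small change
    NcRNA("miR-z", rpm_a=10.0, rpm_b=1000.0, p_value=0.2),    # high P
]

selected = [r.name for r in significant(records)]
print(selected)
```

Only records that satisfy both criteria are kept, which corresponds to the set of miRNAs that would be forwarded to target prediction and functional annotation.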

3.2 Software and database update

CPSS 1.0 can only analyze small RNA sequencing data from 10 animal species, which is insufficient for research in more specialized areas. CPSS 2.0 integrates 38 more species, bringing the total to 48: 17 plants (such as Populus trichocarpa, Prunus persica and Zea mays) and 31 animals (such as Danio rerio, Drosophila melanogaster and Caenorhabditis elegans). Eleven of the animals are mammals, such as Mus musculus and Bos taurus, and 5 of the 11 mammalian species are primates, including Homo sapiens, Gorilla gorilla and Macaca mulatta. The reference sequences used for read alignment have been updated to the latest versions for all species integrated in CPSS 2.0, as have the databases used for functional annotation of predicted miRNA targets. Detailed information is shown in Supplementary Table S2. CPSS 2.0 also removes some redundant software from the CPSS 1.0 workflow. It employs Bowtie to map sequencing reads to the reference genome and to part of the ncRNA databases. Unnecessary miRNA target prediction tools have also been removed: these tools are published as databases and cannot be updated frequently enough to cover newly discovered miRNAs identified from sequencing data. CPSS 2.0 uses miRanda for target prediction because of its wide acceptance and high efficiency.

3.3 Strengthened interaction and visualization

CPSS 2.0 has a brand-new, user-friendly interface (Fig. 1). Taking group analysis as an example, users select files and assign samples to two groups, then either adjust the parameters of each analysis step or simply keep the defaults. After the ‘Submit’ button is clicked, the job is uploaded and a progress bar reflects the real-time status of the submitted job. When an analysis step is completed, the progress bar refreshes automatically and the general results of that step are displayed. Moreover, clicking the green text under the green ‘completed’ logo on the progress bar takes users to the results of that part. Clicking the ‘HERE’ button in each result section opens the detailed results in a new page. Each detailed result page includes a search function for finding specific terms or values. For GO, pathway and protein domain analysis, users can adjust parameters and rerun the analysis on each detailed result page. If an email address is provided, a reminder is sent by email once the job is finished. Users can also retrieve the results of stored jobs using the unique ID that the server randomly generates for each job.
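Re-filtering enriched terms with user-chosen P-value and enrichment-fold cutoffs might look like the sketch below. The text does not state which enrichment test CPSS 2.0 applies. A one-sided hypergeometric test is assumed here because it is a common choice for GO and pathway enrichment. All names, term identifiers and numbers are hypothetical.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when n genes are drawn from N, of which K carry the term."""
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / comb(N, n)


def enrichment(term_sizes, hits, n_targets, n_background):
    """Return term -> (enrichment fold, P-value).

    term_sizes: term -> number of background genes annotated with the term
    hits:       term -> number of predicted target genes annotated with the term
    """
    results = {}
    for term, K in term_sizes.items():
        k = hits.get(term, 0)
        # Fold = (k / n_targets) / (K / n_background), rearranged to one division.
        fold = (k * n_background) / (K * n_targets)
        results[term] = (fold, hypergeom_sf(k, n_background, K, n_targets))
    return results


def filter_terms(results, max_p=0.05, min_fold=2.0):
    """Apply the cutoffs; calling this again with new values mimics a rerun."""
    return sorted(
        term for term, (fold, p) in results.items()
        if p <= max_p and fold >= min_fold
    )


# Toy example: 100 background genes, 10 predicted targets.
term_sizes = {"GO:A": 10, "GO:B": 50}
hits = {"GO:A": 5, "GO:B": 5}
results = enrichment(term_sizes, hits, n_targets=10, n_background=100)

print(filter_terms(results))                             # stricter cutoffs
print(filter_terms(results, max_p=1.0, min_fold=1.0))    # relaxed cutoffs
```

Separating the computation (`enrichment`) from the thresholding (`filter_terms`) means that changing the cutoffs only repeats the cheap filtering step, which is one plausible way to support parameter changes on a result page.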
Fig. 1. Parameter and summary result of CPSS 2.0
  14 in total

1.  CPSS: a computational platform for the analysis of small RNA deep sequencing data.

Authors:  Yuanwei Zhang; Bo Xu; Yifan Yang; Rongjun Ban; Huan Zhang; Xiaohua Jiang; Howard J Cooke; Yu Xue; Qinghua Shi
Journal:  Bioinformatics       Date:  2012-05-09       Impact factor: 6.937

2.  mirTools 2.0 for non-coding RNA discovery, profiling, and functional annotation based on high-throughput sequencing.

Authors:  Jinyu Wu; Qi Liu; Xin Wang; Jiayong Zheng; Tao Wang; Mingcong You; Zhong Sheng Sun; Qinghua Shi
Journal:  RNA Biol       Date:  2013-05-29       Impact factor: 4.652

3.  ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data.

Authors:  Guan-Zheng Luo; Wei Yang; Ying-Ke Ma; Xiu-Jie Wang
Journal:  Bioinformatics       Date:  2013-12-03       Impact factor: 6.937

4.  wapRNA: a web-based application for the processing of RNA sequences.

Authors:  Wenming Zhao; Wanfei Liu; Dongmei Tian; Bixia Tang; Yanqing Wang; Caixia Yu; Rujiao Li; Yunchao Ling; Jiayan Wu; Shuhui Song; Songnian Hu
Journal:  Bioinformatics       Date:  2011-09-06       Impact factor: 6.937

5.  Speeding up the Consensus Clustering methodology for microarray data analysis.

Authors:  Raffaele Giancarlo; Filippo Utro
Journal:  Algorithms Mol Biol       Date:  2011-01-14       Impact factor: 1.405

6.  STRING v10: protein-protein interaction networks, integrated over the tree of life.

Authors:  Damian Szklarczyk; Andrea Franceschini; Stefan Wyder; Kristoffer Forslund; Davide Heller; Jaime Huerta-Cepas; Milan Simonovic; Alexander Roth; Alberto Santos; Kalliopi P Tsafou; Michael Kuhn; Peer Bork; Lars J Jensen; Christian von Mering
Journal:  Nucleic Acids Res       Date:  2014-10-28       Impact factor: 16.971

7.  sRNAtoolbox: an integrated collection of small RNA research tools.

Authors:  Antonio Rueda; Guillermo Barturen; Ricardo Lebrón; Cristina Gómez-Martín; Ángel Alganza; José L Oliver; Michael Hackenberg
Journal:  Nucleic Acids Res       Date:  2015-05-27       Impact factor: 16.971

8.  Chimira: analysis of small RNA sequencing data and microRNA modifications.

Authors:  Dimitrios M Vitsios; Anton J Enright
Journal:  Bioinformatics       Date:  2015-06-20       Impact factor: 6.937

9.  Oasis: online analysis of small RNA deep sequencing data.

Authors:  Vincenzo Capece; Julio C Garcia Vizcaino; Ramon Vidal; Raza-Ur Rahman; Tonatiuh Pena Centeno; Orr Shomroni; Irantzu Suberviola; Andre Fischer; Stefan Bonn
Journal:  Bioinformatics       Date:  2015-02-19       Impact factor: 6.937

10.  MAGI: a Node.js web service for fast microRNA-Seq analysis in a GPU infrastructure.

Authors:  Jihoon Kim; Eric Levy; Alex Ferbrache; Petra Stepanowsky; Claudiu Farcas; Shuang Wang; Stefan Brunner; Tyler Bath; Yuan Wu; Lucila Ohno-Machado
Journal:  Bioinformatics       Date:  2014-06-06       Impact factor: 6.937
