MIRPIPE: quantification of microRNAs in niche model organisms.

Carsten Kuenne, Jens Preussner, Mario Herzog, Thomas Braun, Mario Looso

Abstract

MicroRNAs (miRNAs) represent an important class of small non-coding RNAs regulating gene expression in eukaryotes. Present algorithms typically rely on genomic data to identify miRNAs and require extensive installation procedures. Niche model organisms lacking genomic sequences cannot be analyzed by such tools. Here we introduce the MIRPIPE application, which enables rapid and simple browser-based miRNA homology detection and quantification. MIRPIPE features automatic trimming of raw RNA-Seq reads originating from various sequencing instruments, processing of isomiRs and quantification of detected miRNAs versus public or user-uploaded reference databases.
AVAILABILITY AND IMPLEMENTATION: The Web service is freely available at http://bioinformatics.mpi-bn.mpg.de. MIRPIPE was implemented in Perl and integrated into Galaxy. An offline version for local execution is also available from our Web site.
© The Author 2014. Published by Oxford University Press.

Year:  2014        PMID: 25165094      PMCID: PMC4816158          DOI: 10.1093/bioinformatics/btu573

Source DB:  PubMed          Journal:  Bioinformatics        ISSN: 1367-4803            Impact factor:   6.937


1 INTRODUCTION

MicroRNAs (miRNAs) are ∼22 nucleotides long and belong to the class of small non-coding RNAs. miRNAs serve numerous roles in the downregulation of gene expression (transcript degradation and sequestering, translational suppression). In general, miRNAs are assumed to regulate multiple targets, although the effects on most targets are relatively mild (Ameres and Zamore, 2013). Isoforms of miRNAs resulting from imperfect digestion by Drosha and Dicer or from RNA editing by specialized enzymes represent a challenge when determining correct read counts following RNA-Seq. miRNA variants might be ‘silent’ (3′ modification = isomiR) or target different mRNAs when changes occur in the 5′ regions responsible for complementary binding. Sequence differences between taxa hamper quantification, especially if no genomic or miRNA data for the studied organism are available, as in the case of niche model organisms. Sequencing errors can further complicate the identification of miRNAs. These effects should ideally be addressed on multiple levels, including (i) isomiR handling, (ii) enforcement of a minimum read copy number, (iii) clustering of similar miRNAs, (iv) removal of relatively low abundance reads and (v) optional fallback to the miRNA family level. A set of applications in the field attempts to cover these features, but a Web-based tool that unifies all of these functionalities and can be applied to any organism is critically missing (An et al.; Giurato et al.; Wen et al.).

2 WORKFLOW AND FEATURES

MIRPIPE uses open-source binary tools including the FASTX-Toolkit (Pearson et al.), Cutadapt (Martin, 2011) and BLASTN (Boratyn et al.) for data processing. The pipeline was integrated into a Galaxy-based Web platform (Goecks et al.) but is also available for download and local execution. A detailed explanation of the algorithm can be found in Supplementary File S1.

The workflow starts with the upload of a compressed FASTQ/FASTA read file using the Web interface or the MIRPIPE FTP server. MIRPIPE can fully process raw reads originating from Illumina, 454, IonTorrent or Sanger sequencing instruments, including adapter trimming. A reference FASTA database bearing mature target miRNAs can either be selected from the current miRBase release (Griffiths-Jones et al.) or be uploaded by the user.

The raw reads are processed to optionally remove an adapter sequence and to trim for a minimum quality (default Q20). Only reads of the desired size range are selected to limit the pool to mature miRNAs. Duplicate reads are collapsed to decrease the number of necessary homology searches, and only those sequences represented by a minimum count are kept for further analyses. This measure is intended to remove unique reads, which frequently denote sequencing errors or miRNA variations that are expressed near the detection limit, preventing reliable quantification. Read counts from isomiRs of the same miRNA are combined. These isomiR read sequences may only differ at the 3′ end and are thus putatively encoded by the same gene. Only one nucleotide may differ between two sequences for them to be counted as isoforms of the same miRNA, and only the longest sequence is used in the next step to further reduce the number of homology searches. The remaining read sequences are used for a sequence similarity search versus the chosen reference database of miRNAs.
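The read-reduction steps described above (size selection, duplicate collapsing, minimum copy number and isomiR merging) can be illustrated with a minimal Python sketch. This is not the MIRPIPE implementation, which is written in Perl and documented in Supplementary File S1. The function names, the size range of 18 to 28 nt and the minimum count of 2 are illustrative assumptions rather than MIRPIPE defaults, and the isomiR rule is one possible reading of the text: two reads are merged if they share the same 5′ sequence and differ by at most one nucleotide at the 3′ end.

```python
from collections import Counter


def collapse_reads(reads, min_len=18, max_len=28, min_count=2):
    """Keep reads in the size range, collapse duplicates, drop rare sequences.

    min_len, max_len and min_count are illustrative values, not MIRPIPE defaults.
    """
    counts = Counter(r for r in reads if min_len <= len(r) <= max_len)
    return {seq: n for seq, n in counts.items() if n >= min_count}


def is_isomir(a, b):
    """Assumed rule: same 5' sequence, at most one differing nucleotide at the 3' end."""
    short, long_ = sorted((a, b), key=len)
    if len(long_) - len(short) > 1:
        return False
    if len(long_) != len(short):
        # One extra 3' nucleotide on the longer read.
        return long_.startswith(short)
    # Equal length: only the last nucleotide may differ.
    return a[:-1] == b[:-1]


def merge_isomirs(counts):
    """Sum isomiR counts onto the longest representative sequence."""
    merged = {}
    # Longest (then most abundant) sequences become representatives first.
    for seq in sorted(counts, key=lambda s: (-len(s), -counts[s], s)):
        for rep in merged:
            if is_isomir(rep, seq):
                merged[rep] += counts[seq]
                break
        else:
            merged[seq] = counts[seq]
    return merged


# Toy input: a 22-nt read, two 3' variants, one too-short read, one singleton.
reads = (
    ["TGAGGTAGTAGGTTGTATAGTT"] * 5
    + ["TGAGGTAGTAGGTTGTATAGT"] * 3
    + ["TGAGGTAGTAGGTTGTATAGTA"] * 2
    + ["ACGT"] * 10
    + ["TTTTTTTTTTTTTTTTTTTT"]
)
collapsed = collapse_reads(reads)
representatives = merge_isomirs(collapsed)
# representatives == {"TGAGGTAGTAGGTTGTATAGTT": 10}
```

In this sketch only the representative sequences would be passed on to the similarity search, which is what reduces the number of homology searches.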
Mature reference miRNAs and their precursors are optionally collated by name on the family level to remove redundancy introduced by organism prefixes and precursor suffixes (e.g. bta-miR-200a, oan-miR-200a-3p > miR-200a). For each read, the detected reference miRNA families are scored based on the minimum number of mismatches. If a read matches equally well versus multiple miRNA families, the respective families are joined by single linkage clustering. This permits the inclusion of reads that cannot be matched uniquely, as well as the exact measurement of the fraction of ambiguously matching reads and thereby the reliability of the match. By default, only those read sequences that are at least 5% as abundant as the most abundant sequence per miRNA family cluster are reported, to reduce the impact of sequencing errors and increase robustness. Counts per miRNA family and cluster are presented for download. Currently, MIRPIPE can complete a job within 0.5–2 h, depending on the file size and the selected reference database. MIRPIPE quantification results can be directly used for differential expression analysis using other tools on our Web site (Supplementary File S1).
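The family-level collation, the single linkage clustering of ambiguously matched families and the 5% abundance filter can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the Perl code of MIRPIPE: the input format (a list of reference name and mismatch count pairs per read), the function names and the name-parsing regular expression are all hypothetical, and the regular expression only covers names of the miR/mir/let/lin type.

```python
import re

# Optional organism prefix, then the family core; arm or copy suffixes are dropped.
_NAME = re.compile(r"^(?:[a-z0-9]{3,5}-)?(?P<core>(?:miR|mir|let|lin)-[^-]+)")


def family_name(name):
    """Map e.g. 'bta-miR-200a' and 'oan-miR-200a-3p' to 'miR-200a'."""
    match = _NAME.match(name)
    if not match:
        return name
    return match.group("core").replace("mir-", "miR-", 1)


def best_families(hits):
    """Families reached with the minimum number of mismatches for one read."""
    best = min(mm for _, mm in hits)
    return sorted({family_name(ref) for ref, mm in hits if mm == best})


def quantify(read_hits, read_counts, min_fraction=0.05):
    """Join families sharing a read (single linkage), then sum filtered counts."""
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    best = {r: best_families(h) for r, h in read_hits.items() if h}
    for fams in best.values():
        root = find(fams[0])
        for other in fams[1:]:
            parent[find(other)] = root
            root = find(root)

    members = {}
    for fam in list(parent):
        members.setdefault(find(fam), set()).add(fam)

    per_cluster = {}
    for read, fams in best.items():
        key = tuple(sorted(members[find(fams[0])]))
        per_cluster.setdefault(key, {})[read] = read_counts[read]

    totals = {}
    for key, reads in per_cluster.items():
        # Keep reads at least min_fraction as abundant as the cluster's top read.
        cutoff = min_fraction * max(reads.values())
        totals[key] = sum(n for n in reads.values() if n >= cutoff)
    return totals


# Toy hits: (reference name, mismatches) per collapsed read.
read_hits = {
    "r1": [("bta-miR-200a", 0), ("oan-miR-200a-3p", 0), ("hsa-miR-141", 2)],
    "r2": [("hsa-miR-200b", 1), ("hsa-miR-200c", 1)],
    "r3": [("mmu-miR-200c-3p", 0), ("hsa-miR-429", 0)],
    "r4": [("hsa-let-7a-5p", 0)],
    "r5": [("hsa-let-7a-5p", 1)],
}
read_counts = {"r1": 100, "r2": 40, "r3": 10, "r4": 1000, "r5": 20}
totals = quantify(read_hits, read_counts)
```

In the toy data, reads r2 and r3 link miR-200b, miR-200c and miR-429 into one cluster because they share miR-200c, while read r5 falls below 5% of the most abundant let-7a read and is excluded from the let-7a total.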

3 BENCHMARK

To demonstrate congruent results for MIRPIPE, we compared the results with an miRNA analysis based on a genomic mapping of Illumina HiSeq reads (Lawless et al.). We identified 96% of the published miRNAs (Supplementary File S2). Furthermore, we compared our tool with a similar approach that does not require a genome sequence by analyzing a public dataset (Zhang et al.) with the CLC Genomics Workbench. In this case, 84% of the miRNAs were identical (Supplementary File S2). Finally, we checked the predictive efficiency of our tool for niche models based on a human RNA-Seq dataset (Lappalainen et al.). Here, we ran MIRPIPE versus a reference database bearing (i) the complete miRBase, (ii) miRBase excluding human miRNAs and (iii) miRBase excluding miRNAs of all primates. The absence of closely related reference sequences resulted in only a marginal loss of sensitivity for MIRPIPE, indicating its aptitude for the analysis of niche model organisms (Fig. 1, Supplementary File S2).
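A reduced reference database of the kind used in this benchmark could be derived from a miRBase-style FASTA file by dropping all records of selected organisms. The following Python sketch assumes that each header starts with the organism prefix followed by a hyphen (e.g. `>hsa-let-7a-5p`); the function name, the toy records and the list of excluded prefixes are illustrative, and the prefix set shown is not a complete list of primates.

```python
def filter_fasta(fasta_text, excluded_prefixes):
    """Return FASTA text without records whose organism prefix is excluded."""
    kept = []
    keep = False
    for line in fasta_text.splitlines():
        if line.startswith(">"):
            # Organism prefix is the part of the identifier before the first hyphen.
            prefix = line[1:].split("-", 1)[0]
            keep = prefix not in excluded_prefixes
        if keep and line.strip():
            kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# Toy reference with four organisms (records are illustrative only).
reference = """>hsa-let-7a-5p
UGAGGUAGUAGGUUGUAUAGUU
>ptr-let-7a
UGAGGUAGUAGGUUGUAUAGUU
>mmu-let-7a-5p
UGAGGUAGUAGGUUGUAUAGUU
>dre-let-7a
UGAGGUAGUAGGUUGUAUAGUU
"""

without_human = filter_fasta(reference, {"hsa"})
# Illustrative, incomplete set of primate prefixes.
without_primates = filter_fasta(reference, {"hsa", "ptr"})
```

The filtered text can then be saved and uploaded as a user-supplied reference database in place of the full miRBase set.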
Fig. 1. (A) Comparison of MIRPIPE prediction on two gold standard (GS) datasets using full miRBase and reduced miRBase as reference set. (B) Spearman correlation of absolute counts of GS and MIRPIPE. (C) The large number of GS-specific miRNA identifications is caused by low counts, filtered out by MIRPIPE default parameters.

Funding: Excellence Cluster Cardio-Pulmonary System (ECCPS); MPI for Heart and Lung Research.

Conflict of interest: none declared.
  11 in total

Review 1.  Diversifying microRNA sequence and function.

Authors:  Stefan L Ameres; Phillip D Zamore
Journal:  Nat Rev Mol Cell Biol       Date:  2013-06-26       Impact factor: 94.444

2.  Comparison of DNA sequences with protein sequences.

Authors:  W R Pearson; T Wood; Z Zhang; W Miller
Journal:  Genomics       Date:  1997-11-15       Impact factor: 5.736

3.  miREvo: an integrative microRNA evolutionary analysis platform for next-generation sequencing experiments.

Authors:  Ming Wen; Yang Shen; Suhua Shi; Tian Tang
Journal:  BMC Bioinformatics       Date:  2012-06-21       Impact factor: 3.169

4.  miRBase: microRNA sequences, targets and gene nomenclature.

Authors:  Sam Griffiths-Jones; Russell J Grocock; Stijn van Dongen; Alex Bateman; Anton J Enright
Journal:  Nucleic Acids Res       Date:  2006-01-01       Impact factor: 16.971

5.  iMir: an integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq.

Authors:  Giorgio Giurato; Maria Rosaria De Filippo; Antonio Rinaldi; Adnan Hashim; Giovanni Nassa; Maria Ravo; Francesca Rizzo; Roberta Tarallo; Alessandro Weisz
Journal:  BMC Bioinformatics       Date:  2013-12-13       Impact factor: 3.169

6.  BLAST: a more efficient report with usability improvements.

Authors:  Grzegorz M Boratyn; Christiam Camacho; Peter S Cooper; George Coulouris; Amelia Fong; Ning Ma; Thomas L Madden; Wayne T Matten; Scott D McGinnis; Yuri Merezhuk; Yan Raytselis; Eric W Sayers; Tao Tao; Jian Ye; Irena Zaretskaya
Journal:  Nucleic Acids Res       Date:  2013-04-22       Impact factor: 16.971

7.  miRDeep*: an integrated application tool for miRNA identification from RNA sequencing data.

Authors:  Jiyuan An; John Lai; Melanie L Lehman; Colleen C Nelson
Journal:  Nucleic Acids Res       Date:  2012-12-04       Impact factor: 16.971

8.  Next generation sequencing reveals the expression of a unique miRNA profile in response to a gram-positive bacterial infection.

Authors:  Nathan Lawless; Amir B K Foroushani; Matthew S McCabe; Cliona O'Farrelly; David J Lynn
Journal:  PLoS One       Date:  2013-03-05       Impact factor: 3.240

9.  High-efficiency RNA cloning enables accurate quantification of miRNA expression by deep sequencing.

Authors:  Zhaojie Zhang; Jerome E Lee; Kent Riemondy; Emily M Anderson; Rui Yi
Journal:  Genome Biol       Date:  2013       Impact factor: 13.583

10.  Transcriptome and genome sequencing uncovers functional variation in humans.

Authors:  Tuuli Lappalainen; Michael Sammeth; Marc R Friedländer; Peter A C 't Hoen; Jean Monlong; Manuel A Rivas; Mar Gonzàlez-Porta; Natalja Kurbatova; Thasso Griebel; Pedro G Ferreira; Matthias Barann; Thomas Wieland; Liliana Greger; Maarten van Iterson; Jonas Almlöf; Paolo Ribeca; Irina Pulyakhina; Daniela Esser; Thomas Giger; Andrew Tikhonov; Marc Sultan; Gabrielle Bertier; Daniel G MacArthur; Monkol Lek; Esther Lizano; Henk P J Buermans; Ismael Padioleau; Thomas Schwarzmayr; Olof Karlberg; Halit Ongen; Helena Kilpinen; Sergi Beltran; Marta Gut; Katja Kahlem; Vyacheslav Amstislavskiy; Oliver Stegle; Matti Pirinen; Stephen B Montgomery; Peter Donnelly; Mark I McCarthy; Paul Flicek; Tim M Strom; Hans Lehrach; Stefan Schreiber; Ralf Sudbrak; Angel Carracedo; Stylianos E Antonarakis; Robert Häsler; Ann-Christine Syvänen; Gert-Jan van Ommen; Alvis Brazma; Thomas Meitinger; Philip Rosenstiel; Roderic Guigó; Ivo G Gut; Xavier Estivill; Emmanouil T Dermitzakis
Journal:  Nature       Date:  2013-09-15       Impact factor: 49.962

