
piRBase: a web resource assisting piRNA functional study.

Peng Zhang, Xiaohui Si, Geir Skogerbø, Jiajia Wang, Dongya Cui, Yongxing Li, Xubin Sun, Li Liu, Baofa Sun, Runsheng Chen, Shunmin He, Da-Wei Huang.

Abstract

piRNAs are a class of small RNAs that are most abundantly expressed in the animal germ line. Substantial research is under way to reveal the functions of piRNAs in the epigenetic and post-transcriptional regulation of transposons and genes. A piRNA database for the collection, annotation and structuring of these data will be a valuable contribution to the field, and we have therefore developed the piRBase platform, which integrates various piRNA-related high-throughput data. piRBase has the largest collection of piRNAs among existing databases, and at present contains 77 million piRNA sequences from nine organisms. Repeat-derived and gene-derived piRNAs, which possibly participate in the regulation of the corresponding elements, have been given particular attention. Furthermore, epigenetic data and reported piRNA targets were also collected. To our knowledge, this is the first piRNA database that systematically integrates epigenetic and post-transcriptional regulation data to support piRNA functional analysis. We believe that piRBase will contribute to a better understanding of piRNA functions. Database URL: http://www.regulatoryrna.org/database/piRNA/
© The Author(s) 2014. Published by Oxford University Press.

Year:  2014        PMID: 25425034      PMCID: PMC4243270          DOI: 10.1093/database/bau110

Source DB:  PubMed          Journal:  Database (Oxford)        ISSN: 1758-0463            Impact factor:   3.451


Introduction

piRNAs are a recently discovered class of small RNAs that bind to PIWI proteins. piRNAs are mainly expressed in the germline, although expression is also observed in somatic cells. In most species, piRNAs range in size between 24 and 33 nt, whereas in Caenorhabditis elegans, the small RNAs corresponding to piRNAs are 21 nt in length and are commonly called 21U RNAs. piRNAs share a strong preference for a 5′-uridine residue. Genomic mapping has shown that piRNAs mostly originate from a limited number of clustered loci, each spanning several kilobases, in which piRNAs may be encoded by one or both strands (1, 2). The amount of publicly available piRNA data is increasing rapidly. piRNAs were first shown to function in post-transcriptional regulation of transposons. Reuter et al. (3) discovered that extensive complementarity between a piRNA and the targeted transposon transcript was required for target cleavage in male germ cells by the protein MIWI, the mouse homologue of PIWI, and that the cleavage position was located 10 nt downstream of the 5′-end of the guide piRNA. L1- and IAP-derived piRNAs enriched in mouse testes similarly showed a 10-nt distance between the 5′-ends of sense and antisense partners (4). Kiuchi et al. (5) found a 10-nt overlap between piRNAs derived from the Fem and Masc mRNAs in silkworm embryos, suggesting that piRNAs might participate in post-transcriptional silencing of coding genes by cleaving the corresponding mRNAs. In addition, piRNAs appear to induce mRNA deadenylation and decay in mouse elongating spermatids (6) and in the Drosophila embryo (7). Epigenetic roles for piRNAs have also been discovered. In the fruit fly, PIWI binds to heterochromatin protein 1a (HP1a), which, upon methylation of histone H3K9, maintains the heterochromatin state of specific chromosomal regions (8, 9).
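The 10-nt offset between sense and antisense partners described above can be checked computationally. The following is a minimal sketch, not a method from this paper: it assumes piRNAs have already been mapped and are given as hypothetical (chromosome, start, end) tuples with 0-based, half-open coordinates, one list per strand. The function name and data layout are illustrative only.

```python
from collections import Counter


def five_prime_overlaps(plus_reads, minus_reads, max_overlap=30):
    """Count 5'-to-5' overlap lengths between reads on opposite strands.

    Each read is (chrom, start, end), 0-based and half-open.
    The 5' end of a plus-strand read is `start`; the 5' end of a
    minus-strand read is `end - 1`. Two reads overlap by k nt at their
    5' ends when the minus-strand 5' end lies k - 1 positions
    downstream of the plus-strand 5' end.
    """
    minus_5p = Counter((chrom, end - 1) for chrom, _start, end in minus_reads)
    overlaps = Counter()
    for chrom, start, end in plus_reads:
        # The overlap cannot be longer than the plus-strand read itself.
        for k in range(1, min(max_overlap, end - start) + 1):
            n = minus_5p.get((chrom, start + k - 1), 0)
            if n:
                overlaps[k] += n
    return overlaps


# Toy data (invented coordinates, for illustration only).
plus_reads = [("chr1", 100, 126)]
minus_reads = [("chr1", 84, 110), ("chr1", 90, 105)]
overlap_counts = five_prime_overlaps(plus_reads, minus_reads)
```

In real data, a pronounced peak of `overlap_counts` at 10 would be consistent with the sense/antisense pairing reported in (4) and (5).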
It has also been reported that the PIWI protein can reactivate the euchromatin state of some chromosomal regions (10). Upon mutation of the PIWI proteins in mouse testes, the DNA methylation of retrotransposon genes is lost and the elements show increased expression (4, 11–13). In addition, the levels of the histone modification H3K9me3 on sequences flanking full-length L1-A copies were reduced in Miwi2 knockout spermatogonia (14). These results indicate that piRNAs function in the establishment of DNA methylation and H3K9me3 marks on retrotransposons. Another report indicated that the Piwi/piRNA complex from the Aplysia central nervous system facilitates methylation of a conserved CpG island in the promoter of the breast cancer-related CERB2 gene (15). The varied roles and rapidly increasing numbers of piRNAs underscore the need for a web analysis platform for piRNAs. In RNAcentral (16), the main database for RNA sequences, piRNABank (17) is the only piRNA database. Outside RNAcentral, the piRNAQuest database (18) also focuses on piRNAs. As both piRNABank and piRNAQuest contain only limited amounts of piRNA data (Table 1) and annotations, and barely touch on the functions of the piRNAs, we have developed a new database named piRBase. piRBase has assembled a larger amount of piRNA data than the presently existing databases, and is the only database that includes epigenetic data and experimentally or computationally generated piRNA target data.
Table 1.

Current numbers of unique piRNA sequences in piRBase and other piRNA databases

| Species | piRNABank | piRNAQuest | piRBase |
| Human | 32194 | 41749 | 32826 |
| Mouse | 72878 | 890078 | 51664769 |
| Rat | 62713 | 66758 | 63182 |
| D. melanogaster | 44417 a | 0 | 21027419 |
| C. elegans | 0 | 0 | 28219 |
| Zebrafish | 356550 a | 0 | 1330692 |
| Chicken | 0 | 0 | 508437 |
| X. tropicalis | 0 | 0 | 6142904 |
| Silkworm | 0 | 0 | 1174963 |
| Platypus | 147 a | 0 | 0 |

a The number of unique piRNAs may be less than shown.

Currently, piRBase contains 77 million piRNA sequences from nine organisms (Figure 1A), including piRNAs from worm (C. elegans), chicken, frog (Xenopus tropicalis) and silkworm (Bombyx mori), which had not previously been collected by other piRNA databases. The number of piRNA sequences derived from mouse, fruit fly and zebrafish is also much larger than in the other two databases (Table 1). More details on distinct piRNAs are provided, such as the experimental method by which the piRNA was obtained, the tissues expressing the piRNA, and annotations of the piRNA loci.
Figure 1.

Overview of piRNA sequences in piRBase. (A) The percentage of unique piRNAs from each species in piRBase. (B) The amount of piRNA sequences obtained by different experimental methods. (C) Length distribution of unique piRNA sequences in piRBase. Caenorhabditis elegans piRNAs are not included as all reported sequences are 21 nt long.


Construction and content of the piRBase database

More than 77 million piRNA sequences and their corresponding annotations have been collected by piRBase. The data were collected from the literature and external databases. Processed piRNA sequences (txt or fasta files) have been preferred to raw sequencing data (sra or fastq files). We have put much effort into harvesting piRNA datasets from the literature and in verifying that these sequences were regarded as piRNAs by the authors of the respective papers. The piRNAs presently assembled in piRBase were mainly obtained by four experimental methods: (i) small RNA sequencing, (ii) immunoprecipitation of Piwi or Piwi-associated proteins, (iii) Piwi protein crosslinking-immunoprecipitation and (iv) chromatography. The amounts of piRNA sequences obtained by each method are displayed in Figure 1B. Figure 1C shows the length distribution of unique piRNA sequences in piRBase. After mapping the piRNAs to the genome, we took particular care to identify piRNAs that are derived from repeat elements and from coding genes, as these piRNAs might participate in the post-transcriptional regulation of the corresponding elements. In addition, piRBase also collected information on predicted and experimentally verified piRNA targets, DNA methylation data of tissues expressing piRNAs, and H3K9me3 data that may be related to piRNA function. The data collection and processing steps are illustrated in Figure 2 and in the Supplementary computational procedures.
Figure 2.

Database construction pipeline. The database was constructed in three major steps: manual literature mining, data annotation and analysis and data storage in a MySQL relational database with a Web interface.


piRNA annotation

We regard the piRNA sequences from each separate RNA library as one dataset in piRBase. The piRNAs in piRBase are thus derived from more than 130 datasets (Supplementary Table S1). For every distinct piRNA sequence, we provide information including the piRBase piRNA name, NCBI and RNAdb piRNA aliases, NCBI piRNA accession number, organism of origin, sequence, sequence length, information on the datasets reporting the piRNA, PubMed id of the corresponding literature and the experimental method by which the piRNA was obtained. The piRBase piRNA name is unique for each piRNA record, and identical piRNA sequences from the same organism are combined into a single record. In order to ascertain the origin of every piRNA sequence, we have mapped all piRNAs collected in piRBase to their corresponding genomes using bowtie (19). No more than one mismatch was allowed, and only the best hits were reported (see Supplementary computational procedures for more detailed information).
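The rule that identical sequences from the same organism are combined into one record can be illustrated with a short sketch. This is not the piRBase implementation; the input layout (organism, sequence, dataset id) and the normalisation of case and of U to T are assumptions made for the example.

```python
from collections import defaultdict


def collapse_records(reads):
    """Merge identical sequences from the same organism into one record.

    `reads` is an iterable of (organism, sequence, dataset_id) tuples.
    Returns {(organism, sequence): sorted list of dataset ids}.
    """
    merged = defaultdict(set)
    for organism, sequence, dataset_id in reads:
        # Assumption: sequences are compared case-insensitively and in
        # the DNA alphabet, so that U and T are treated as equal.
        key = (organism, sequence.upper().replace("U", "T"))
        merged[key].add(dataset_id)
    return {key: sorted(ids) for key, ids in merged.items()}


# Invented example reads; dataset ids are placeholders.
reads = [
    ("mouse", "TGACGTAGCTAGGATCCA", "ds1"),
    ("mouse", "UGACGUAGCUAGGAUCCA", "ds2"),  # same sequence, RNA alphabet
    ("rat", "TGACGTAGCTAGGATCCA", "ds3"),    # same sequence, other organism
]
records = collapse_records(reads)
```

Here the two mouse reads collapse into one record that lists both datasets, while the rat read stays a separate record.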

Data supporting functional analysis

Repeat/gene-derived piRNAs

Based on the mapping results described above, piRNAs mapping to RefSeq genes (20) or to repeat elements annotated by RepeatMasker (21) were identified. In piRBase, these piRNAs are referred to as gene-derived and repeat-derived piRNAs, respectively.
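As an illustration of this classification step, a minimal interval-overlap sketch follows. It assumes that any overlap between a piRNA locus and an annotated feature is enough to assign the label and that strand is ignored; the paper does not state the exact criteria, so both choices are assumptions, as are the function names and coordinates.

```python
def intervals_overlap(a_start, a_end, b_start, b_end):
    """True if two 0-based, half-open intervals share at least one base."""
    return a_start < b_end and b_start < a_end


def classify_locus(locus, genes, repeats):
    """Label a mapped piRNA locus as gene-derived and/or repeat-derived.

    `locus` and every entry of `genes` / `repeats` are
    (chrom, start, end) tuples.
    """
    chrom, start, end = locus
    labels = set()
    for label, features in (("gene-derived", genes), ("repeat-derived", repeats)):
        if any(c == chrom and intervals_overlap(start, end, s, e)
               for c, s, e in features):
            labels.add(label)
    return labels


# Invented annotation intervals, for illustration only.
genes = [("chr1", 1000, 5000)]
repeats = [("chr1", 4900, 5200), ("chr2", 0, 100)]
both = classify_locus(("chr1", 4950, 4976), genes, repeats)
neither = classify_locus(("chr1", 6000, 6026), genes, repeats)
```

A linear scan is fine for a toy example; for millions of loci an interval index or a sorted sweep would be the usual choice.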

Post-transcriptional regulation data

Potential piRNA target genes with evidence of post-transcriptional regulation in mouse elongating spermatids (6) and in fruit fly embryos (7) were mined from the literature. For each piRNA–mRNA pair, we have recorded the piRNA, the region of the gene targeted by the piRNA and the piRNA functional mechanism. Experimentally verified piRNA–target relationships were noted. Thus far, this type of information only extends to mouse and fruit fly piRNA targets.

Epigenetic data

DNA methylation data for tissues expressing piRNAs were collected from the UCSC and GEO databases (22–25). The tissues include human brain, human testis, mouse brain, mouse testis, mouse spermatocytes, mouse spermatids, chicken testis, zebrafish testis and Xenopus tropicalis testis. Two forms of DNA methylation data have been collected: percentages of DNA methylation levels at the single-nucleotide scale, and non-methylated islands. H3K9me3 ChIP-seq data for Miwi2 Het and Miwi2 KO mouse germ cells have been downloaded from the NCBI database to facilitate analysis of piRNA function in histone modification (14). The data supporting the functional analysis are listed in Table 2.
Table 2.

List of data supporting the functional analysis in piRBase

| Species | Repeat-/gene-derived piRNAs | Post-transcriptional piRNA target | DNA methylation data | H3K9me3 data |
| Human | Testis | - | Testis, brain | - |
| Mouse | Testis, germ cell | Germ cell | Testis, germ cell, brain | Germ cell |
| Rat | Testis | - | - | - |
| D. melanogaster | Testis, ovary | Embryo | - | - |
| C. elegans | Whole worm | - | - | - |
| Zebrafish | Ovary, testis | - | Testis | - |
| Chicken | Embryo | - | Testis | - |
| X. tropicalis | Egg, gastrula | - | Testis | - |
| Silkworm | - | - | - | - |

Data storage

In order to store the piRNA data and to facilitate piRNA function analysis, we constructed the piRBase database and established a user-friendly Web interface. piRBase is a MySQL relational database. The Web interface is built on PHP and JavaScript. For interactive data visualization, we have installed the UCSC Genome Browser (26). Alternatively, users can access the piRBase data from a download page and perform their own analyses.
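The actual piRBase schema is not given in the text. Purely as an illustration of how piRNA records, datasets and their many-to-many relationship could be laid out relationally, here is a sketch that uses Python's built-in sqlite3 module as a stand-in for MySQL. All table names, column names and inserted values are hypothetical.

```python
import sqlite3

# Hypothetical schema: one row per unique piRNA, one row per dataset,
# and a link table because a piRNA can be reported by several datasets.
SCHEMA = """
CREATE TABLE pirna (
    name      TEXT PRIMARY KEY,
    organism  TEXT NOT NULL,
    sequence  TEXT NOT NULL,
    length    INTEGER NOT NULL,
    UNIQUE (organism, sequence)
);
CREATE TABLE dataset (
    dataset_id INTEGER PRIMARY KEY,
    pubmed_id  TEXT,
    method     TEXT
);
CREATE TABLE pirna_dataset (
    pirna_name TEXT REFERENCES pirna(name),
    dataset_id INTEGER REFERENCES dataset(dataset_id),
    PRIMARY KEY (pirna_name, dataset_id)
);
"""


def build_demo_db():
    """Create an in-memory database with one placeholder record."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    seq = "TAGCTAGCTAGCTAGCTAGCTAGCTA"  # invented 26-nt sequence
    conn.execute("INSERT INTO pirna VALUES (?, ?, ?, ?)",
                 ("demo-piR-1", "mouse", seq, len(seq)))
    conn.execute("INSERT INTO dataset VALUES (?, ?, ?)",
                 (1, "00000000", "small RNA sequencing"))  # placeholder PubMed id
    conn.execute("INSERT INTO pirna_dataset VALUES (?, ?)", ("demo-piR-1", 1))
    return conn


def methods_for(conn, pirna_name):
    """Return the experimental methods of all datasets reporting a piRNA."""
    rows = conn.execute(
        "SELECT d.method FROM pirna_dataset AS pd "
        "JOIN dataset AS d ON d.dataset_id = pd.dataset_id "
        "WHERE pd.pirna_name = ? ORDER BY d.dataset_id",
        (pirna_name,),
    )
    return [row[0] for row in rows]


conn = build_demo_db()
demo_methods = methods_for(conn, "demo-piR-1")
```

The unique constraint on (organism, sequence) mirrors the rule that identical sequences from the same organism form a single record.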

Web interface

Browse and search piRNA annotations

Browsing piRNAs and datasets

Users can browse the piRNAs by organism (Figure 3A) or browse the piRNAs of each individual dataset (Figure 3B). While browsing, clicking on a piRNA name displays detailed information on that piRNA. The detailed information page lists general information on the piRNA, the datasets containing the piRNA, its location in the genome and the literature reporting it. The piRNA locus can be viewed in the Genome Browser via the link in the detailed information page. Users can also view the piRNA description in NCBI by clicking on the accession number (Figure 3C).
Figure 3.

Screenshots of the browse and search pages. (A) The ‘Browse piRNAs’ page. (1) The drop down list box enables the users to browse piRNAs of a specific organism. (2 and 3) Clicking on the links delivers the detailed information page of the corresponding piRNA in piRBase and NCBI, respectively. (B) The ‘Browse Datasets’ page. The datasets can be filtered according to organism (4) and users can also browse piRNAs in a particular dataset (5). To learn more about the datasets, external links are provided (6). (C) In the piRNA ‘Detailed Information’ page, links to piRNA loci in the Genome Browser (7) and the annotations for these positions (8) are given. An online Bowtie tool (9) and a link to UCSC Blat (10) are also available for more alignment results. (D) In the piRNA target search result page, links to the piRNA ‘Detailed Information’ page of piRBase (11) and the target location in the Genome Browser (12) are available. (E and F) The locus information of the piRNAs, H3K9me3 marks, DNA methylation and piRNA target sites are shown in the Genome Browser. RefSeq genes and RepeatMasker annotations are also displayed. Screenshots of the Genome Browser show the piRNA target sites (13) and the H3K9me3 levels at a LINE1 locus (chr3:123735167-123741052 of mm9 genome) in Miwi2 HET and Miwi2 KO spermatogonia (14).


Searching piRNAs

Using the web interface, the database can be searched by sequence, piRBase name, NCBI accession number and RNAdb name (27). Searching by sequence requires the complete piRNA sequence and allows up to two mismatches.
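A sequence search of this kind can be sketched as a Hamming-distance comparison. The sketch below assumes that "complete piRNA sequence" means the query must have the same length as the stored sequence, so only substitutions are counted; this reading and all names and sequences are assumptions, not the documented behaviour of the piRBase search.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def search_by_sequence(query, records, max_mismatches=2):
    """Return names of stored piRNAs within `max_mismatches` of the query.

    `records` maps a piRNA name to its sequence. Only sequences with the
    same length as the query are compared (full-length matching).
    """
    query = query.upper().replace("U", "T")
    hits = []
    for name, seq in records.items():
        if len(seq) != len(query):
            continue
        distance = hamming(query, seq)
        if distance <= max_mismatches:
            hits.append((distance, name))
    # Best matches first, ties broken by name.
    return [name for _distance, name in sorted(hits)]


# Invented records, for illustration only.
records = {
    "demo-1": "TGACGTAGCTAGGATCCA",  # exact match
    "demo-2": "TGACGTAGCTAGGATCGG",  # two mismatches
    "demo-3": "AGACGTAGCTAGGATCGG",  # three mismatches
    "demo-4": "TGACGTAGCTAGGATCC",   # one base shorter
}
hits = search_by_sequence("TGACGTAGCTAGGATCCA", records)
```

With these toy records, only the exact match and the two-mismatch sequence are returned.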

Searching for data supporting functional analysis

Searching for repeat-/gene-derived piRNA

Search options for piRNAs derived from genes or repeats are also provided. The result pages are similar to the Browse result pages.

Searching for post-transcriptional regulation data

In order to support piRNA functional analysis, predicted and experimentally verified piRNA targets were collected. The web interface provides a piRNA target search module that users can use to search piRNA–target pairs by the name of functional piRNA, the target gene symbol or the RefSeq accession number. In the result page, a table is displayed that lists the basic information on functional piRNAs and target transcripts. In addition to the link to the detailed piRNA information, there is also a link to the Genome Browser showing the piRNA target sites in the genome (Figure 3D).
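The three search keys of the target module (piRNA name, gene symbol, RefSeq accession) suggest a simple record structure. The sketch below is illustrative only: the field names, the exact-match behaviour and the placeholder values are assumptions, not the piRBase data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TargetPair:
    """One piRNA-target relationship (hypothetical field layout)."""
    pirna_name: str
    gene_symbol: str
    refseq_id: str
    target_region: str  # region of the gene targeted by the piRNA
    mechanism: str      # e.g. "deadenylation"
    verified: bool      # True if experimentally verified


def search_targets(pairs, term):
    """Return pairs whose piRNA name, gene symbol or RefSeq id equals `term`.

    Matching is case-insensitive and exact (an assumption).
    """
    term = term.lower()
    return [
        pair for pair in pairs
        if term in (pair.pirna_name.lower(),
                    pair.gene_symbol.lower(),
                    pair.refseq_id.lower())
    ]


# Placeholder entries; none of these identifiers are real records.
pairs = [
    TargetPair("demo-piR-1", "GeneA", "NM_000000", "3'UTR", "deadenylation", True),
    TargetPair("demo-piR-2", "GeneA", "NM_000000", "3'UTR", "deadenylation", False),
    TargetPair("demo-piR-1", "GeneB", "NM_000001", "CDS", "cleavage", False),
]
by_gene = search_targets(pairs, "genea")
by_pirna = search_targets(pairs, "demo-piR-1")
```

Keeping the `verified` flag on each pair makes it straightforward to separate predicted targets from experimentally confirmed ones in the result table.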

Searching for epigenetic data

Users can view DNA methylation levels and H3K9me3 levels at selected chromosome positions via an Epigenetics search module, and the DNA methylation levels of specific genes in the UCSC Genome Browser.
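For users who download the single-nucleotide methylation data and want a regional summary, a minimal sketch follows. It assumes the data are available as hypothetical (chromosome, position, percent methylation) tuples and that an unweighted mean over the sites in the region is an acceptable summary; neither the layout nor the summary statistic is prescribed by piRBase.

```python
def mean_methylation(sites, chrom, start, end):
    """Average percent methylation of sites within chrom:start-end.

    `sites` is an iterable of (chrom, position, percent) tuples with
    1-based positions; `start` and `end` are inclusive.
    Returns None when no site falls inside the region.
    """
    values = [pct for c, pos, pct in sites
              if c == chrom and start <= pos <= end]
    if not values:
        return None
    return sum(values) / len(values)


# Invented methylation calls, for illustration only.
sites = [
    ("chr1", 100, 80.0),
    ("chr1", 150, 60.0),
    ("chr1", 300, 10.0),
]
region_mean = mean_methylation(sites, "chr1", 100, 200)
```

Weighting each site by its read coverage would be a natural refinement when coverage information is available.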

The UCSC genome browser

Selected data are visualized in the Genome Browser in order to facilitate visual exploration (26), and can be accessed from each result page. These include the piRNA locus, piRNA target sites and H3K9me3 and DNA methylation levels in specific tissues (28). In addition, some basic annotations from external databases, such as RepeatMasker annotations and RefSeq genes, are included (Figure 3E and F). For example, to study the regulation of mRNA elimination by piRNAs, users can search piRBase by entering the organism and piRNA name in the target mRNA search module. Detailed information on the piRNA–mRNA pair and a link to the Genome Browser will be displayed in the search result (Figure 3D), and the genomic positions corresponding to the piRNA-binding sites can be viewed in the Genome Browser by clicking on the link (Figure 3E). Pezic et al. (14) found that piRNAs target active LINE1s to establish repressive H3K9me3 marks in mouse spermatogonia. One of the reported LINE1s is located at chr3:123735167-123741052 of the mouse genome (mm9). The H3K9me3 level of this region is higher in Miwi2 HET spermatogonia than in Miwi2 KO spermatogonia, as shown in the Genome Browser (Figure 3F).
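Position strings such as chr3:123735167-123741052 are convenient to handle programmatically when working with downloaded tracks. The helper below is a small sketch that assumes the coordinates are 1-based and inclusive, as positions are usually displayed in the Genome Browser; the function names are illustrative.

```python
import re

# chrom:start-end, with optional thousands separators in the numbers.
_POSITION = re.compile(r"^(\w+):([\d,]+)-([\d,]+)$")


def parse_position(text):
    """Parse 'chrom:start-end' into (chrom, start, end) with int coordinates."""
    match = _POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid position string: {text!r}")
    chrom, start, end = match.groups()
    start = int(start.replace(",", ""))
    end = int(end.replace(",", ""))
    if start > end:
        raise ValueError("start must not exceed end")
    return chrom, start, end


def region_length(text):
    """Length in bp, assuming 1-based inclusive coordinates."""
    _chrom, start, end = parse_position(text)
    return end - start + 1


line1_locus = parse_position("chr3:123735167-123741052")
line1_length = region_length("chr3:123735167-123741052")
```

Under the inclusive-coordinate assumption, the LINE1 locus from the example above spans 5886 bp.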

Downloading

The Download module provides two ways to download datasets. Users can either choose to download specific packages, or they can download piRNA data by submitting the piRBase piRNA name.

Future directions

The number of reported piRNAs is increasing rapidly. We will therefore update piRBase and integrate more information supporting piRNA functional analysis at intervals that depend on the rate at which new data appear, and expect to issue new versions of the database about once every half year. In the future, we will also integrate piRNA datasets that provide only raw sequencing data. We will continue to develop the piRNA target prediction software, and special attention will be paid to the possibility of constructing piRNA-gene regulatory networks and elucidating piRNA action in distinct environments.

Supplementary data

Supplementary data are available at Database Online.
  28 in total

1.  Repbase update: a database and an electronic journal of repetitive elements.

Authors:  J Jurka
Journal:  Trends Genet       Date:  2000-09       Impact factor: 11.639

2.  A novel class of small RNAs bind to MILI protein in mouse testes.

Authors:  Alexei Aravin; Dimos Gaidatzis; Sébastien Pfeffer; Mariana Lagos-Quintana; Pablo Landgraf; Nicola Iovino; Patricia Morris; Michael J Brownstein; Satomi Kuramochi-Miyagawa; Toru Nakano; Minchen Chien; James J Russo; Jingyue Ju; Robert Sheridan; Chris Sander; Mihaela Zavolan; Thomas Tuschl
Journal:  Nature       Date:  2006-06-04       Impact factor: 49.962

3.  A germline-specific class of small RNAs binds mammalian Piwi proteins.

Authors:  Angélique Girard; Ravi Sachidanandam; Gregory J Hannon; Michelle A Carmell
Journal:  Nature       Date:  2006-06-04       Impact factor: 49.962

4.  Drosophila PIWI associates with chromatin and interacts directly with HP1a.

Authors:  Brent Brower-Toland; Seth D Findley; Ling Jiang; Li Liu; Hang Yin; Monica Dus; Pei Zhou; Sarah C R Elgin; Haifan Lin
Journal:  Genes Dev       Date:  2007-09-15       Impact factor: 11.361

5.  An epigenetic activation role of Piwi and a Piwi-associated piRNA in Drosophila melanogaster.

Authors:  Hang Yin; Haifan Lin
Journal:  Nature       Date:  2007-10-21       Impact factor: 49.962

6.  Heterochromatic silencing and HP1 localization in Drosophila are dependent on the RNAi machinery.

Authors:  Manika Pal-Bhadra; Boris A Leibovitch; Sumit G Gandhi; Madhusudana Rao Chikka; Madhusudana Rao; Utpal Bhadra; James A Birchler; Sarah C R Elgin
Journal:  Science       Date:  2004-01-30       Impact factor: 47.728

7.  Developmentally regulated piRNA clusters implicate MILI in transposon control.

Authors:  Alexei A Aravin; Ravi Sachidanandam; Angelique Girard; Katalin Fejes-Toth; Gregory J Hannon
Journal:  Science       Date:  2007-04-19       Impact factor: 47.728

8.  RNAdb 2.0--an expanded database of mammalian non-coding RNAs.

Authors:  Ken C Pang; Stuart Stephen; Marcel E Dinger; Pär G Engström; Boris Lenhard; John S Mattick
Journal:  Nucleic Acids Res       Date:  2006-12-01       Impact factor: 16.971

9.  NCBI Reference Sequence (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins.

Authors:  Kim D Pruitt; Tatiana Tatusova; Donna R Maglott
Journal:  Nucleic Acids Res       Date:  2005-01-01       Impact factor: 16.971

10.  piRNAQuest: searching the piRNAome for silencers.

Authors:  Arijita Sarkar; Ranjan Kumar Maji; Sudipto Saha; Zhumur Ghosh
Journal:  BMC Genomics       Date:  2014-07-04       Impact factor: 3.969
