
RNAcentral: a hub of information for non-coding RNA sequences.


Abstract

RNAcentral is a comprehensive database of non-coding RNA (ncRNA) sequences, collating information on ncRNA sequences of all types from a broad range of organisms. We have recently added a new genome mapping pipeline that identifies genomic locations for ncRNA sequences in 296 species. We have also added several new types of functional annotations, such as tRNA secondary structures, Gene Ontology annotations, and miRNA-target interactions. A new quality control mechanism based on Rfam family assignments identifies potential contamination, incomplete sequences, and more. The RNAcentral database has become a vital component of many workflows in the RNA community, serving as both the primary source of sequence data for academic and commercial groups, as well as a source of stable accessions for the annotation of genomic and functional features. These examples are facilitated by an improved RNAcentral web interface, which features an updated genome browser, a new sequence feature viewer, and improved text search functionality. RNAcentral is freely available at https://rnacentral.org.


Year:  2019        PMID: 30395267      PMCID: PMC6324050          DOI: 10.1093/nar/gky1034

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

RNAcentral is a comprehensive database of ncRNA sequences from a broad range of species (1). Launched in 2014 (2), RNAcentral provides unified access to the data from 28 different RNA resources, known as Expert Databases (Figure 1).
Figure 1.

A diagram showing the 28 Expert Databases imported into RNAcentral as of September 2018, organised according to their contents. The full list of databases is available at https://rnacentral.org/expert-databases.

The primary objective of the RNAcentral database is to provide a comprehensive set of high-quality ncRNA sequences to the widest possible audience. Unique RNA sequences are assigned ‘URS’ identifiers, which become the primary entities around which all information is stored, integrated from multiple different sources, and presented. Relevant data (e.g. accessions, genomic locations, functional annotations) for each ncRNA are then displayed within individual sequence pages on the website.

The RNAcentral website has four main functionalities:
Text search: allows users to search and compare ncRNA sequences from different databases.
Sequence search (powered by nhmmer (3)): users may search any nucleotide sequence for similarity to known ncRNA sequences from contributing databases.
Genome browser: allows users to view ncRNA annotations in a genomic region of interest.
Bulk data download: all data are accessible via the FTP archive, and programmatic access is available via an API (https://rnacentral.org/downloads).

In this paper we discuss the recent improvements and changes that have expanded RNAcentral's abilities to serve scientists with various backgrounds and data needs. Since the last RNAcentral publication (1), the database has provided 5 releases (versions 6–10) and now imports ncRNA data from 28 databases (seven additional databases since 2017). In addition to increasing the number of sequences, we have added several new data types, including: (1) genomic locations for sequences in selected model organisms, (2) quality control information for all sequences, (3) functional and structural annotations, and (4) miRNA targets. These new data and the improvements in website functionality are described in detail below.
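The ‘URS’ identifiers introduced above are quoted throughout this paper in a species-specific form such as URS0000A59F5E_7227. The following minimal Python sketch splits such an identifier into its two parts. The pattern (the prefix URS, ten hexadecimal characters, and an optional numeric suffix that looks like an NCBI taxonomy identifier) is an assumption inferred from the identifiers quoted in this paper, not a documented specification, and the names `UrsId` and `parse_urs` are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, inferred from identifiers quoted in this paper:
# "URS" + 10 hexadecimal characters, optionally followed by "_<taxid>".
URS_PATTERN = re.compile(r"^(URS[0-9A-F]{10})(?:_(\d+))?$")


class UrsId(NamedTuple):
    """A parsed identifier (hypothetical helper type)."""
    urs: str               # sequence-level identifier
    taxid: Optional[int]   # numeric suffix, if present


def parse_urs(identifier: str) -> UrsId:
    """Split an identifier such as 'URS0000A59F5E_7227' into its parts."""
    match = URS_PATTERN.match(identifier.strip().upper())
    if match is None:
        raise ValueError(f"not a recognisable URS identifier: {identifier!r}")
    urs, taxid = match.groups()
    return UrsId(urs, int(taxid) if taxid is not None else None)
```

Under these assumptions, `parse_urs("URS0000A59F5E_7227")` separates the sequence identifier from the suffix 7227, while an identifier without a suffix, such as URS000075C808, yields `taxid=None`.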

COMPREHENSIVE GENOME MAPPING

The genomic context of a particular ncRNA can provide important clues pertaining to its function. For example, the location of both long and short RNAs in the Hox cluster of bilaterian animals implicates those RNAs in key developmental processes (4,5). Genomic context can also reveal potential antisense RNAs to their targets. Previously, RNAcentral provided genomic locations for sequences only if the expert databases submitted the coordinates. However, many databases do not capture genomic coordinates on the latest genome assemblies, or indeed at all. Because of this limitation, in release 9, 23.6% of human ncRNA sequences had no annotated genomic location. To overcome this limitation, we have developed a comprehensive approach to map RNAcentral sequences to their genome locations. We downloaded 296 genomes from Ensembl (6) and all Ensembl Genomes (7) divisions except Bacteria (due to scale). For each species, all RNAcentral sequences that did not already have a genome mapping were mapped to the corresponding genome using blat (8). Exact matches, defined as alignments with an edit distance of zero, were stored. For sequences that did not have an exact match (∼14% of all ncRNAs across all genomes), hits with at least 95% sequence identity were retained. To minimise the chance of spurious hits, we limit the length of insertions for sequences shorter than 100 nucleotides. We evaluated this pipeline by mapping all ncRNA sequences from Ensembl and found that it recovered the correct location for >99% of them, indicating that it is accurate. After applying this mapping pipeline, the number of sequences with reported genome mappings has increased by a factor of 10 across all species, and genomic locations are now available for >95% of the sequences from many important model organisms (Table 1). The mapping will be updated with each RNAcentral release using the latest genome versions from Ensembl, ensuring that these mappings are always up-to-date.
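The hit selection rule described above (keep exact matches; otherwise keep hits with at least 95% identity) can be sketched in Python as follows. This is an illustration under stated assumptions, not the RNAcentral pipeline itself: the `Hit` record and `select_hits` function are hypothetical, identity is assumed to be the number of identical aligned bases divided by the query length, and the insertion-length limit for short sequences is omitted because the text does not give its value.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """One alignment of an ncRNA to a genome (hypothetical record type)."""
    chromosome: str
    start: int
    matches: int        # number of identical aligned bases
    edit_distance: int  # mismatches plus inserted and deleted bases


def select_hits(hits: List[Hit], query_length: int,
                min_identity: float = 0.95) -> List[Hit]:
    """Keep exact matches if any exist, otherwise hits above the threshold."""
    if query_length <= 0:
        raise ValueError("query_length must be positive")
    # Exact matches are alignments with an edit distance of zero.
    exact = [h for h in hits if h.edit_distance == 0]
    if exact:
        return exact
    # Assumption: identity = identical bases / query length.
    return [h for h in hits if h.matches / query_length >= min_identity]
```

Returning all exact matches rather than a single best hit is consistent with Table 1, which reports that a fraction of sequences have more than one mapping.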
Table 1.

The increase in the number of sequences with genome coordinates across key species

| Species | Genome assembly | Sequence count | Percent mapped | Improvement | Percent of sequences with more than one mapping |
| --- | --- | --- | --- | --- | --- |
| Caenorhabditis elegans | WBcel235 | 27 137 | 99.7% | 4.9% | 2.4% |
| Dictyostelium discoideum | dicty_2.7 | 167 | 99.4% | 9.6% | 25.3% |
| Homo sapiens | GRCh38.p12 | 204 847 | 99.1% | 23.6% | 7.2% |
| Rattus norvegicus | Rnor_6.0 | 131 013 | 97.7% | 70.6% | 15.0% |
| Mus musculus | GRCm38.p6 | 182 379 | 96.4% | 47.1% | 9.4% |
| Schizosaccharomyces pombe | ASM294v2 | 2196 | 96.4% | 15.8% | 12.2% |
| Drosophila melanogaster | BDGP6 | 6630 | 96.1% | 34.4% | 22.0% |
Users can explore genomic mapping in the context of Ensembl genes and transcripts either on sequence report pages or by navigating to any genome location using the RNAcentral genome browser (https://rnacentral.org/genome-browser). Additionally, GFF3 and BED files can be downloaded from the RNAcentral FTP archive. Genome mapping of ncRNAs within RNAcentral can identify inconsistencies between data sources and thereby facilitate improvements in ncRNA annotations across expert databases. For example, 10 out of 11 novel D. melanogaster snoRNAs described in a recent paper (9) and submitted to the INSDC (10) do not overlap with existing annotated ncRNAs (Figure 2), making these features candidates for review by databases such as FlyBase (11), snoPY (12) and Rfam (13). RNAcentral is developing a pipeline for systematic identification of such annotation anomalies and alerting the relevant databases.
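The overlap check behind the snoRNA example above can be illustrated with a short Python sketch. It is not the pipeline RNAcentral is developing: the `Region` type and both functions are hypothetical, coordinates are assumed to follow the BED convention (0-based start, exclusive end), and requiring the same strand for an overlap is an assumption that the text does not specify.

```python
from typing import Iterable, NamedTuple


class Region(NamedTuple):
    """A genomic interval (hypothetical type, BED-style coordinates)."""
    chromosome: str
    start: int   # 0-based, inclusive
    end: int     # exclusive
    strand: str  # "+" or "-"


def overlaps(a: Region, b: Region) -> bool:
    """True if the two intervals share at least one base on the same strand."""
    return (a.chromosome == b.chromosome
            and a.strand == b.strand  # assumption: strand-aware comparison
            and a.start < b.end
            and b.start < a.end)


def is_unannotated(candidate: Region, annotations: Iterable[Region]) -> bool:
    """True if a mapped sequence overlaps none of the existing annotations."""
    return not any(overlaps(candidate, known) for known in annotations)
```

Sequences for which `is_unannotated` returns true would be the ones flagged for review by the relevant expert databases.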
Figure 2.

Novel snoRNA Me18S-G1506 (URS0000A59F5E_7227) found in the ENA database that was mapped with 100% sequence identity to the Drosophila melanogaster genome using the new genome mapping pipeline.


QUALITY CONTROLS USING RFAM

RNAcentral aims to provide a comprehensive and high-quality set of ncRNA sequences. To accomplish this, we have developed a pipeline that implements quality checks based on the Rfam classification of RNA families (13). All RNAcentral sequences were searched against all Rfam families using the Infernal software (14). Although Rfam does not include piRNAs, full-length lncRNAs, and several other ncRNA types (13), the majority of RNAcentral sequences (80%) are matched by one or more Rfam families, demonstrating that classification by Rfam provides broad quality control coverage. The remaining 20% of sequences in RNAcentral that do not match an Rfam family are primarily (60%) from RNA types that Rfam does not model (piRNAs, mature miRNAs, lncRNAs) or from generic biotypes such as other or miscellaneous RNA. This analysis produces a series of warnings, which are displayed in search results and on sequence pages. Currently, RNAcentral provides three types of warnings. (i) Potential contamination: triggered when a eukaryotic sequence matches an Rfam family that is only found in bacteria, which could indicate either bacterial contamination or taxonomic misclassification. (ii) Incomplete sequences: triggered when an RNAcentral sequence matches only a small part (<50%) of an Rfam model. (iii) Potential misannotations: triggered when either an rRNA or tRNA sequence does not match the corresponding Rfam families. The distribution of warnings by type is shown in Table 2. The majority (60%) of sequences do not have any warnings. Of those with warnings, most (34% of all sequences) are incomplete sequences. The majority of incomplete sequences are partial rRNAs (5 070 967 or 99%), followed by tRNAs (<1%) and other RNA types (<1%).
Table 2.

The number of sequences with and without Rfam warnings

| Warning type | Number of sequences |
| --- | --- |
| No problems detected | 9 055 240 (60%) |
| Incomplete sequence | 5 074 317 (34%) |
| Potential misannotation | 778 974 (5%) |
| Potential contamination | 162 562 (1%) |
The warnings provided by this quality control are searchable on the browse page, using the ‘QC warning found’ filter on the lower left. For details on searching please refer to the RNAcentral search help at: https://rnacentral.org/help/text-search. Additionally, RNAcentral provides a flat file of all Rfam annotations in the FTP archive (ftp://ftp.ebi.ac.uk/pub/databases/RNAcentral/current_release/rfam/). The file can be used by expert databases to add Rfam links or validate existing RNA annotations by checking if they match the expected Rfam families. It is important to interpret the results of this automatic quality control analysis with caution. For example, eukaryotic sequences found in organelles are expected to match bacterial Rfam models, so the warnings are only a guide to potential issues.
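The three warning types described in this section can be summarised as a small rule set. The Python sketch below is illustrative only: the `RfamHit` record, its field names, and the `qc_warnings` function are hypothetical, and the handling of sequences with several Rfam hits (a warning is raised if any hit triggers it) is an assumption that the text does not specify.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class RfamHit:
    """One match between a sequence and an Rfam family (hypothetical)."""
    family_rna_type: str   # RNA type of the family, e.g. "rRNA" or "tRNA"
    bacteria_only: bool    # family is only found in bacteria
    model_fraction: float  # fraction of the Rfam model covered (0.0 to 1.0)


def qc_warnings(rna_type: str, is_eukaryote: bool,
                hits: List[RfamHit]) -> List[str]:
    """Return the warnings raised for one sequence, in a fixed order."""
    warnings = []
    # (i) A eukaryotic sequence matching a bacteria-only family.
    if is_eukaryote and any(h.bacteria_only for h in hits):
        warnings.append("potential contamination")
    # (ii) A match covering less than 50% of an Rfam model.
    if any(h.model_fraction < 0.5 for h in hits):
        warnings.append("incomplete sequence")
    # (iii) An rRNA or tRNA with no match to a family of the same type.
    if rna_type in ("rRNA", "tRNA") and not any(
            h.family_rna_type == rna_type for h in hits):
        warnings.append("potential misannotation")
    return warnings
```

As the text cautions, a rule such as (i) will also fire for organellar sequences, so the output is best treated as a prompt for review rather than a verdict.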

NEW DATA AND FUNCTIONAL ANNOTATIONS

In this section we highlight the new data RNAcentral has imported since the last publication. These data include not only more ncRNA sequences but also new types of information, such as tRNA secondary structures and high-quality miRNA/mRNA interactions.

New expert databases

Since our last publication, RNAcentral has imported ncRNA data from seven new databases, including three Model Organism Databases (MODs): FlyBase (11), Mouse Genome Informatics (MGI) (15), and the Rat Genome Database (RGD) (16). The MODs contribute high-quality, manually reviewed ncRNAs for the species they represent, thereby adding significant value to RNAcentral. Additionally, we have imported ncRNA data from Ensembl (6), GENCODE (17), HGNC (HUGO Gene Nomenclature Committee) (18), and TarBase (19). Ensembl provides automated RNA gene annotations for over 62 vertebrate genomes, predicted by the Ensembl ncRNA and lincRNA pipeline (20), while GENCODE provides high-quality manual annotations for lncRNAs found in the human and mouse genomes.

tRNA secondary structures imported from GtRNAdb

Following a major upgrade of the tRNAscan-SE software, the Genomic tRNA Database (GtRNAdb) (21) now provides a much broader range of tRNA sequences, including tRNAs with possible introns. RNAcentral has imported bacterial, archaeal, fungal, as well as human, rat and mouse sequences from GtRNAdb, increasing the coverage from 382 species to 4239. RNAcentral also displays RNA secondary structures provided by GtRNAdb using Forna (22) (Figure 3). This is the first secondary structure dataset integrated into RNAcentral.
Figure 3.

Example secondary structure of tRNA-Ser-TGA-1-2 from Bacillus subtilis (URS000043457D_1423) visualized using Forna. The nucleotides in the Forna diagram are colored by secondary structure element, with helixes in green, hairpin loops in blue and red otherwise.


miRNA target interactions from TarBase

RNAcentral also imported its first intermolecular interactions data from TarBase v8 (19). TarBase provides hundreds of thousands of experimentally supported microRNA (miRNA) targets derived from >30 experimental methodologies applied to ∼600 cell type/tissues. The integrated dataset incorporates 1507 distinct miRNAs from human and mouse, annotating 559 000 miRNA:gene pairs, corresponding to 33 858 protein coding targets. The interactions are displayed on sequence report pages (Figure 4) and can be queried using the text search.
Figure 4.

New section of the sequence report pages showing target proteins for miRNA hsa-miR-612 (URS0000759916_9606). The table provides links to Ensembl genes and TarBase summary pages and shows experimental methods.


OTHER IMPROVEMENTS

In addition to new data, RNAcentral has also improved several aspects of the website based upon extensive user feedback. Here, we discuss improvements to the search interface, sequence descriptions, a new sequence feature viewer, as well as a JSON-based submission pipeline.

More informative search results

The text search interface has been substantially improved based on user feedback. First, the search now features a text autocomplete functionality. Secondly, the search supports more filtering options. These include: length, which helps identify only complete sequences; and the new quality checks, which allow users to limit search results to only those sequences without warnings. Finally, it is also now possible to sort search results by length in ascending or descending order (Figure 5A).
Figure 5.

(A) New text search interface options for filtering sequences by length (bottom) or changing the order of the results (top). (B) Structured snippets in text search results. The string matching the query (‘mir-100’) is highlighted in light-gray. The logos of expert databases annotating the sequence are displayed and additional information about the databases can be viewed on mouse hover. (C) A search result showing a sequence with a quality check failure. Here the red warning symbol indicates the sequence has an error, along with the type of issue detected, incomplete sequence here.

Search results now provide structured snippets (Figure 5B, C). These snippets are a concise summary of the matched sequence showing the gene symbols, sequence length, and a list of databases providing annotation for the entry, as well as any quality check issues. The snippet also explains why the entry is shown by highlighting the matched text (Figure 5B).

Improved sequence descriptions

RNAcentral provides descriptions for all sequences, which are displayed in search results as a summary and on sequence pages. Informative descriptions help to quickly identify sequences of interest among other search results (Figure 5B). RNAcentral has created a rule-based system that takes expert database annotations into account to select an informative description for each sequence. In some cases, RNAcentral generates new descriptions to better represent the data from specific databases. For example, sequence URS000075A3E2_9606 is a miRNA encoded at four genome locations, which correspond to four different descriptions from miRBase. Picking a single description for the unique sequence would not accurately summarize the different locations, so the following description is generated: ‘Homo sapiens (human) microRNA hsa-mir-6859 precursor (hsa-mir-6859 1 to 4)’. This description shows the full range of precursors that are part of this sequence and is more informative than any one description. The generation of descriptions is done automatically on an ongoing basis.
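To make the hsa-mir-6859 example concrete, the Python sketch below shows one way such a merged description could be generated. It is not RNAcentral's rule-based system: the function name is hypothetical, the input names (hsa-mir-6859-1 to hsa-mir-6859-4) are assumed to be the precursor names behind the four miRBase descriptions, and the sketch assumes the numeric suffixes form one contiguous range.

```python
import re
from typing import List


def merged_precursor_description(species: str, names: List[str]) -> str:
    """Merge numbered precursor names sharing one stem into a description."""
    parsed = []
    for name in names:
        # Split e.g. "hsa-mir-6859-3" into stem "hsa-mir-6859" and number 3.
        match = re.fullmatch(r"(.+)-(\d+)", name)
        if match is None:
            raise ValueError(f"name has no numeric suffix: {name!r}")
        parsed.append((match.group(1), int(match.group(2))))
    stems = {stem for stem, _ in parsed}
    if len(stems) != 1:
        raise ValueError("all names must share the same stem")
    stem = stems.pop()
    numbers = sorted(number for _, number in parsed)
    # Assumption: the suffixes are contiguous, so "first to last" is accurate.
    return (f"{species} microRNA {stem} precursor "
            f"({stem} {numbers[0]} to {numbers[-1]})")
```

Called with the species label ‘Homo sapiens (human)’ and the four assumed names, the function reproduces the description quoted above.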

Displaying sequence features

RNAcentral now contains a sequence feature viewer. This viewer is used to display modifications and Rfam annotations (Figure 6), replacing our previous sequence display with a more informative and accessible summary of annotations on the sequence.
Figure 6.

New section of the sequence report showing Rfam annotations. In this example, an ENA sequence URS00005B7DD8_9606, originally annotated as miscellaneous RNA (misc_RNA), matches a conserved domain of the MALAT1 Rfam family (RF01871) and MEN beta RNA (RF01684). The locations of the Rfam matches are shown in the feature viewer.


Automatic assignment of GO terms

RNAcentral sequences are automatically annotated with GO terms, propagated from the matching Rfam covariance models. When an ncRNA sequence is matched to one or more Rfam families, the GO terms associated with those Rfam families are transitively assigned to the ncRNA sequence. More than 10 million of these annotations are available through QuickGO (23) (https://www.ebi.ac.uk/QuickGO/annotations?assignedBy=RNAcentral), as well as in the Gene Ontology Annotation (GOA) Database (https://www.ebi.ac.uk/GOA). RNAcentral is the largest source of GO annotations for ncRNA sequences. Additionally, RNAcentral identifiers (URS) are used as the basis for GO annotations in GOA and QuickGO. These identifiers were chosen because they provide a stable, precise, and comprehensive method for referring to ncRNA sequences (24).
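The propagation step described above amounts to taking the union of the GO terms attached to every Rfam family a sequence matches. A minimal Python sketch follows; the function name and the shape of the lookup table are assumptions, and any family or term labels passed in are supplied by the caller rather than taken from Rfam.

```python
from typing import Dict, Iterable, Set


def propagate_go_terms(matched_families: Iterable[str],
                       family_go_terms: Dict[str, Set[str]]) -> Set[str]:
    """Collect the GO terms of every Rfam family matched by one sequence."""
    terms: Set[str] = set()
    for family in matched_families:
        # Families without GO terms contribute nothing.
        terms |= family_go_terms.get(family, set())
    return terms
```

Using a set means that a term shared by two matched families is assigned to the sequence only once.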

New JSON-based submission process

In order to streamline the submission of data to RNAcentral, we defined a new exchange format and validation software. The new system results in a clear and unambiguous protocol for the preparation, validation and submission of ncRNA data and metadata by the expert databases to RNAcentral, and has made the submission process faster, more reliable and flexible. The current version is based on a corresponding effort by the Alliance of Genome Resources (https://github.com/alliance-genome/agr_schemas), and has been developed with extensive feedback from FlyBase (11), miRBase (25), LNCipedia (26), GtRNAdb (21) and TarBase (19). The schema and a JSON schema validator are available at https://github.com/RNAcentral/rnacentral-data-schema.

USE OF RNAcentral DATA

Here we describe how being part of the RNAcentral Consortium has helped two expert databases to improve their resources. We also present examples of RNAcentral usage by the research community.

HGNC canonical human ncRNA gene set

The HGNC (18) is the only international resource that has the authority to approve gene symbols and names for human genes. HGNC began approving symbols for human small non-coding RNA genes in the 1980s, starting with mitochondrial tRNA genes. Since the identification of many new classes of RNA, the naming of ncRNA genes has become one of HGNC’s core activities. HGNC collaborates with several RNAcentral expert databases to name specific classes of small ncRNAs, such as miRBase for miRNAs and GtRNAdb for tRNAs. HGNC also names long non-coding RNA (lncRNA) genes by working directly with research groups and genome annotators. The lncRNA gene names are based on reported function wherever possible, and on genomic location where the function is unknown. Due to its relative completeness, the HGNC ncRNA set was chosen to be the canonical human gene set in RNAcentral, meaning HGNC is promoted above other sources of human data. Each HGNC entry is matched to one RNAcentral sequence through cross references to RefSeq, Ensembl, GtRNAdb and other databases that are manually curated by the HGNC. For example, the HGNC entry for HOTAIR corresponds to RefSeq accession NR_003716, which is found in RNAcentral under the identifier URS000075C808. RNAcentral has helped HGNC by performing quality control checks on its data. This enabled HGNC to check the mappings between its gene symbols and Ensembl gene annotations and lncRNAdb, and also to augment its links to RefSeq transcript sequences. Following a list sent to HGNC from RNAcentral and a resulting discussion with RefSeq gene annotators, HGNC withdrew the gene symbol HPVC1 (gene name: human papillomavirus (type 18) E5 central sequence-like 1) because there was a lack of evidence for transcription at this locus. RefSeq also withdrew their gene entry for HPVC1.

Functional annotation of miRNAs

Functional annotation of gene products using the Gene Ontology has proven vital for interpretation of scientific studies, especially for large-scale studies where functions and roles of many gene products need to be analysed (27). However, this type of high-quality functional annotation has been lacking for many classes of ncRNAs. There is an abundance of published information about the targets and the functional roles of individual miRNAs in the literature, but that information is not curated or systematically available in any database. Researchers therefore commonly infer functional roles of miRNAs by mining lists of predicted targets (28–30). However, this has been shown to lead to biased and unreliable interpretations of miRNA function (29,30). The Functional Gene Annotation Team at University College London (UCL) started curating experimentally verified GO terms for mature miRNAs in 2014. However, any slight change in a miRNA sequence can mean that it targets different mRNAs for silencing, and potentially different biological processes and pathways. Therefore, to ensure GO annotations are associated with the correct mature miRNA sequence, stable species-specific database identifiers were required (24). The provision of RNAcentral identifiers has allowed the UCL curators to identify miRNA sequences reported in specific publications unambiguously. Since it is common practice for authors to display an alignment of the mRNA with the targeting miRNA sequence in reverse orientation (3′ to 5′), RNAcentral implemented a ‘reverse sequence and search again’ option into the sequence similarity search tool to assist finding the correct miRNA identifier. Occasionally, authors will only show a partial miRNA sequence in a publication. In these cases, a text search in RNAcentral for the miRNA name will return all ncRNA matches, allowing the biocurator to manually cross-check with the published sequence to determine the correct sequence for GO term assignment. 
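The ‘reverse sequence and search again’ option mentioned above corresponds to a simple string operation: the sequence printed 3′ to 5′ in a publication is read backwards to recover the conventional 5′ to 3′ orientation, without complementing. The Python sketch below illustrates the idea; it is not the RNAcentral implementation, and the function name is hypothetical.

```python
def reverse_sequence(sequence: str) -> str:
    """Return the sequence read in the opposite direction.

    Only the order of the bases changes. The bases are NOT complemented,
    because the goal is to re-orient a sequence that was printed 3' to 5'.
    """
    # Drop any whitespace copied from the publication and normalise case.
    cleaned = "".join(sequence.split()).upper()
    return cleaned[::-1]
```

A curator could then submit both the original and the reversed string to the sequence search and keep whichever orientation returns the annotated miRNA.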
As discussed above, ncRNA annotations in QuickGO are based upon RNAcentral identifiers, which makes distributing UCL GO annotations simple. UCL annotations are provided to several high-profile knowledgebases such as Ensembl, NCBI Gene, miRBase, as well as the GO Consortium. Additionally, the experimentally validated interactions between the mature miRNA and its targets are provided as a PSICQUIC web service (http://www.ebi.ac.uk/Tools/webservices/psicquic/view/main.xhtml), named ‘EBI-GOA-miRNA’, enabling these data to be used in interaction networks (28). The UCL team has created over 5,000 GO annotations for over 570 miRNAs from human, mouse and rat. These GO annotations are now displayed in the RNAcentral entries for the mature miRNAs (Figure 7), with links out to the QuickGO browser (23) for the full annotation records. The consistent use of RNAcentral identifiers for functional annotation has also facilitated the import and display of miRNA functional data in the miRBase database (25).
Figure 7.

RNAcentral visualization of GO annotations for miRNA hsa-mir-126 (URS0000759B6D_9606) that is involved in heart development.


Use of RNAcentral by the wider research community

We monitor RNAcentral usage by analyzing paper citations and engaging with the users online and at conferences. One of the main uses of RNAcentral is as a source of reference data. In several studies RNAcentral sequences from an organism or RNA type of interest are downloaded and then the novel ncRNAs are compared against the RNAcentral sequences to classify them or to determine if the ncRNAs have been observed before. For example, RNAcentral data were used to study miRNA expression in breast cancer (31), to annotate the sea anemone genome with ncRNAs and study miRNA-mediated modulation of the host transcriptome in cnidarian-dinoflagellate symbiosis (32), and to understand the physiological regulation of reproduction in goats (33). Additionally, Ensembl regularly imports identifiers and descriptions from RNAcentral. Currently, in Ensembl, there are 579,783 RNAcentral related entries for over 112 species. RNAcentral data are also used in the private sector where the sequences have been used to build a reference database for metagenomics analysis using the MG7 pipeline by a company called Era7 Bioinformatics (https://era7bioinformatics.com/en/page.cfm?id=464). More use cases can be found on a dedicated web page (https://rnacentral.org/use-cases).

Future plans

We are currently working on several improvements such as computing and displaying standardized secondary structures using TRAVeLer (34), a faster release procedure, and more extensive quality controls. We expect RNAcentral to continue growing in utility and reach as more features are added and more databases join the consortium. For example, we plan to extend our genome mapping to include Ensembl Bacteria. We are always open to feedback and our contact information is available at https://rnacentral.org/contact.

DATA AVAILABILITY

RNAcentral is an open source project with all code available in the GitHub organization: https://github.com/rnacentral/.
  34 in total

1.  Gene ontology: tool for the unification of biology. The Gene Ontology Consortium.

Authors:  M Ashburner; C A Ball; J A Blake; D Botstein; H Butler; J M Cherry; A P Davis; K Dolinski; S S Dwight; J T Eppig; M A Harris; D P Hill; L Issel-Tarver; A Kasarskis; S Lewis; J C Matese; J E Richardson; M Ringwald; G M Rubin; G Sherlock
Journal:  Nat Genet       Date:  2000-05       Impact factor: 38.330

2.  BLAT--the BLAST-like alignment tool.

Authors:  W James Kent
Journal:  Genome Res       Date:  2002-04       Impact factor: 9.043

3.  Functional demarcation of active and silent chromatin domains in human HOX loci by noncoding RNAs.

Authors:  John L Rinn; Michael Kertesz; Jordon K Wang; Sharon L Squazzo; Xiao Xu; Samantha A Brugmann; L Henry Goodnough; Jill A Helms; Peggy J Farnham; Eran Segal; Howard Y Chang
Journal:  Cell       Date:  2007-06-29       Impact factor: 41.582

4.  Infernal 1.1: 100-fold faster RNA homology searches.

Authors:  Eric P Nawrocki; Sean R Eddy
Journal:  Bioinformatics       Date:  2013-09-04       Impact factor: 6.937

5.  RNAcentral: an international database of ncRNA sequences.

Authors:  Anton I Petrov; Simon J E Kay; Richard Gibson; Eugene Kulesha; Dan Staines; Elspeth A Bruford; Mathew W Wright; Sarah Burge; Robert D Finn; Paul J Kersey; Guy Cochrane; Alex Bateman; Sam Griffiths-Jones; Jennifer Harrow; Patricia P Chan; Todd M Lowe; Christian W Zwieb; Jacek Wower; Kelly P Williams; Corey M Hudson; Robin Gutell; Michael B Clark; Marcel Dinger; Xiu Cheng Quek; Janusz M Bujnicki; Nam-Hai Chua; Jun Liu; Huan Wang; Geir Skogerbø; Yi Zhao; Runsheng Chen; Weimin Zhu; James R Cole; Benli Chai; Hsien-Da Huang; His-Yuan Huang; J Michael Cherry; Artemis Hatzigeorgiou; Kim D Pruitt
Journal:  Nucleic Acids Res       Date:  2014-10-28       Impact factor: 16.971

6.  QuickGO: a web-based tool for Gene Ontology searching.

Authors:  David Binns; Emily Dimmer; Rachael Huntley; Daniel Barrell; Claire O'Donovan; Rolf Apweiler
Journal:  Bioinformatics       Date:  2009-09-10       Impact factor: 6.937

7.  Structure, evolution and function of the bi-directionally transcribed iab-4/iab-8 microRNA locus in arthropods.

Authors:  Jerome H L Hui; Antonio Marco; Suzanne Hunt; Janet Melling; Sam Griffiths-Jones; Matthew Ronshaugen
Journal:  Nucleic Acids Res       Date:  2013-01-17       Impact factor: 16.971

8.  nhmmer: DNA homology search with profile HMMs.

Authors:  Travis J Wheeler; Sean R Eddy
Journal:  Bioinformatics       Date:  2013-07-09       Impact factor: 6.937

9.  miRBase: annotating high confidence microRNAs using deep sequencing data.

Authors:  Ana Kozomara; Sam Griffiths-Jones
Journal:  Nucleic Acids Res       Date:  2013-11-25       Impact factor: 16.971

10.  snOPY: a small nucleolar RNA orthological gene database.

Authors:  Maki Yoshihama; Akihiro Nakao; Naoya Kenmochi
Journal:  BMC Res Notes       Date:  2013-10-23