Abstract
With the rising interest in the regulatory functions of long non-coding RNAs (lncRNAs) in complex human diseases such as cardiovascular diseases, there is an increasing need for public databases offering comprehensive and integrative data on all aspects of these versatile molecules. Recently, a variety of public data repositories specializing in lncRNAs have been developed, which make use of large volumes of high-throughput data, particularly from next-generation sequencing (NGS) approaches. Here, we provide an overview of current lncRNA databases covering basic and functional annotation, lncRNA expression and regulation, interactions with other biomolecules, and genomic variants influencing the structure and function of lncRNAs. The prominent lncRNA antisense noncoding RNA in the INK4 locus (ANRIL), which has been unequivocally associated with coronary artery disease through genome-wide association studies (GWAS), serves as an example to demonstrate the features of each individual database.
Keywords: ANRIL; Cardiovascular disease; Database; Gene regulation; Non-coding; lncRNA
Year: 2016 PMID: 27049585 PMCID: PMC4996844 DOI: 10.1016/j.gpb.2016.03.001
Source DB: PubMed Journal: Genomics Proteomics Bioinformatics ISSN: 1672-0229 Impact factor: 7.691
Table 1 Overview of current lncRNA databases
Note: Included data types are indicated by colored dots. The color code grades the databases into three equally sized groups based on the number of citations per year for the initial and update database publications (red: upper third with the most-cited databases; blue: middle third; green: lower third; citations were retrieved from the Scopus database in March 2016). Some of the ‘green’ databases were published very recently and therefore have not yet been cited. Data types are grouped in functional categories indicated by background colors. Gray: basic genomic annotation; blue: lncRNA expression; green: molecular interactions; yellow: sequence variants. Accessible web links to all databases are given in Table 2. Hsa, Homo sapiens; Mmu, Mus musculus; Dre, Danio rerio; Ath, Arabidopsis thaliana; Cel, Caenorhabditis elegans.
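The three-group color coding described in the note can be sketched as a simple rank-based split. This is only an illustration: the function name `citation_tertiles`, the placeholder database names, and the citation values are hypothetical and are not taken from the article, which does not state how ties or uneven group sizes were handled.

```python
from typing import Dict


def citation_tertiles(citations_per_year: Dict[str, float]) -> Dict[str, str]:
    """Assign each database to 'red', 'blue', or 'green' by citation rank.

    Assumption: ties are broken alphabetically by database name.
    """
    # Rank databases from most to least cited per year.
    ranked = sorted(
        citations_per_year,
        key=lambda name: (-citations_per_year[name], name),
    )
    n = len(ranked)
    colors = ("red", "blue", "green")
    groups: Dict[str, str] = {}
    for rank, name in enumerate(ranked):
        # rank * 3 // n maps positions 0..n-1 onto 0, 1, 2 in near-equal blocks.
        groups[name] = colors[rank * 3 // n]
    return groups


# Hypothetical citation rates, for illustration only (not from the article).
example = {
    "db_a": 40.0,
    "db_b": 12.5,
    "db_c": 3.0,
    "db_d": 0.0,
    "db_e": 25.0,
    "db_f": 7.0,
}
groups = citation_tertiles(example)
```

With these placeholder values, the two most-cited entries fall in the red group, the next two in blue, and the remaining two (including the uncited one) in green.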
Table 2 Web links and data content of the presented lncRNA databases
| Database | Data content |
| --- | --- |
| ChIPBase 1.1 | 543 ChIP-Seq datasets |
| C-It-Loci | 119 RNA-Seq datasets |
| Co-LncRNA | 241 RNA-Seq datasets |
| DIANA-LncBase | CLIP-validated and predicted miRNA targets on lncRNAs |
| GermlncRNA | Germ cell-related expression data |
| Linc2go | Functional annotation of predicted RNA interactions |
| LincSNP | 5000 lincRNAs and 140,000 disease-associated SNPs |
| lnCeDB | >25,000 lncRNA transcripts from GENCODE |
| LNCipedia 3.1 | 111,685 lncRNA transcripts from literature and public databases |
| LncReg | 1081 manually curated lncRNA interactions |
| lncRNA2Function | 9625 lncRNAs and RNA-Seq data from 19 tissues |
| LncRNA2Target | lncRNA–target associations from knockdown/overexpression |
| lncRNAdb v2.0 | 295 lncRNA genes curated from literature |
| LncRNADisease | >1000 literature-extracted lncRNA–disease annotations |
| lncRNAMap | RNA-Seq data from GEO and SRA |
| lncRNASNP | Predicted effects of SNPs in >30,000 lncRNA transcripts |
| lncRNAtor | 243 RNA-Seq studies from public databases |
| LncRNAWiki | lncRNAs of GENCODE, NONCODE, LNCipedia, and lncRNAdb |
| lncRNome | >17,000 lncRNAs from public databases |
| NONCODE 2016 | 527,336 lncRNA transcripts from literature and public databases |
| NPInter 3.0 | About 500,000 validated molecular interactions |
| NRED | Expression profiles from 8 microarray platforms |
| PhyloNONCODE | Conservation annotation for >135,000 lncRNAs |
| PLncDB | >13,000 lncRNAs from tiling arrays and RNA-Seq |
| SNP@lincTFBS | 5835 lincRNAs, 690 ChIP-Seq datasets, and 140,000 SNPs from dbSNP |
| starBase v2.0 | 111 CLIP-Seq datasets |
| TF2LncRNA | 22,531 lncRNA transcripts and 425 ChIP-Seq datasets |
| zflncRNApedia | ChIP-Seq and RNA-Seq data for 2267 zebrafish lncRNAs |
Figure 1 Types of information curated in lncRNA databases
The available data have been grouped into four categories: basic genomic annotation, lncRNA expression, molecular interactions, and sequence variants. Databases for these kinds of information are listed in Table 1. lncRNA, long non-coding RNA; TFBS, transcription factor-binding site; miRNA, microRNA.