
LncRNADisease: a database for long-non-coding RNA-associated diseases.

Geng Chen, Ziyun Wang, Dongqing Wang, Chengxiang Qiu, Mingxi Liu, Xing Chen, Qipeng Zhang, Guiying Yan, Qinghua Cui.

Abstract

In this article, we describe a long-non-coding RNA (lncRNA) and disease association database (LncRNADisease), which is publicly accessible at http://cmbi.bjmu.edu.cn/lncrnadisease. In recent years, a large number of lncRNAs have been identified and increasing evidence shows that lncRNAs play critical roles in various biological processes. Therefore, the dysfunctions of lncRNAs are associated with a wide range of diseases. It thus becomes important to understand lncRNAs' roles in diseases and to identify candidate lncRNAs for disease diagnosis, treatment and prognosis. For this purpose, a high-quality lncRNA-disease association database would be extremely beneficial. Here, we describe the LncRNADisease database that collected and curated approximately 480 entries of experimentally supported lncRNA-disease associations, including 166 diseases. LncRNADisease also curated 478 entries of lncRNA interacting partners at various molecular levels, including protein, RNA, miRNA and DNA. Moreover, we annotated lncRNA-disease associations with genomic information, sequences, references and species. We normalized the disease name and the type of lncRNA dysfunction and provided a detailed description for each entry. Finally, we developed a bioinformatic method to predict novel lncRNA-disease associations and integrated the method and the predicted associated diseases of 1564 human lncRNAs into the database.

Year:  2012        PMID: 23175614      PMCID: PMC3531173          DOI: 10.1093/nar/gks1099

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

A surprising finding of human transcriptome analysis is that protein-coding sequences account for only a small portion of genome transcripts (1). The majority of human genome transcripts are non-coding RNAs, in particular long-non-coding RNAs (lncRNAs) (2). lncRNAs tend to be less conserved across species and often show low expression levels and high tissue specificity (3–5). Thus, when they were first found, lncRNAs were often considered to be transcriptional noise (5). In recent years, accumulating studies have revealed that a number of lncRNAs are not transcriptional noise but have important functions, for example, affecting gene transcription, targeting RNA polymerase II, regulating splicing and taking part in epigenetic regulation (6). Moreover, according to the theory of competing endogenous RNA (7), lncRNAs may functionally interact with a broad range of RNA molecules by competitively binding microRNAs (miRNAs), suggesting that lncRNAs may have critical roles in a wide range of biological processes. Previous studies have produced a large amount of lncRNA-related data, including sequences, expression profiles and functions. Organizing and annotating these data is therefore important for a better understanding of lncRNAs. Several lncRNA databases already support such studies (8–10). For example, NRED is a database of lncRNA expression data (10). The lncRNAdb database provides detailed lncRNA information, including sequences, functions, expression, associated proteins and cellular locations (8). Although the NONCODE database is not specific to lncRNAs, its third version (NONCODE v3.0) curates the sequences, functions, expression and cellular locations of lncRNAs (9). More recently, researchers have attempted to understand the relationships between lncRNAs and diseases.
Studies have reported that lncRNA dysfunctions are associated with a broad range of diseases (5), including cancers (11), cardiovascular diseases (12) and neurodegenerative diseases (13). For example, the lncRNA PCA3 is a highly prostate cancer-specific molecule, and a PCA3 score has the potential to be a biomarker for prostate cancer aggressiveness (14). The up-regulation of the lncRNA HOTAIR is an independent prognostic factor of tumor recurrence in hepatocellular carcinoma patients after liver transplantation (15). A study confirmed the high specificity and sensitivity of lncRNA UCA1 from urinary sediments in the diagnosis of bladder cancer, suggesting that UCA1 is a potential biomarker for bladder cancer diagnosis (16). Godinho et al. (17) revealed that the lncRNA BCAR4 can be a potential target for antiestrogen-resistant breast cancer treatment because its forced expression in breast cancer cells leads to cell proliferation in the presence of various antiestrogens and in the absence of estrogen. These studies indicate that lncRNAs may help to explain diseases and to find candidate molecules for disease diagnosis, treatment and prognosis. The study of lncRNA–disease associations is therefore becoming one of the most important topics in lncRNA and disease research. A high-quality lncRNA–disease association database would help in studying the roles of lncRNAs in diseases, but none is yet available. To build such a database, we manually curated lncRNA–disease relations experimentally reported in the literature and created a database, LncRNADisease. We included detailed annotation information for each entry. Moreover, we curated and annotated experimentally supported lncRNA interacting partners. In addition, we developed a bioinformatic method to predict novel lncRNA–disease associations and integrated this method and its predicted results into the database.

DATA SOURCES AND IMPLEMENTATION

First, we downloaded PubMed data, information on non-protein-coding RNA genes, and data on gene–PubMed associations from the National Center for Biotechnology Information. Second, we curated the data manually and retrieved lncRNA–disease pairs. All lncRNA–disease pairs were double-checked by different researchers. Hyperlinks to the original articles in the PubMed database were provided. We also annotated the sequence and species information. We further normalized the names of lncRNAs and diseases. In total, we curated 166 diseases, of which cancer (39.8%), cardiovascular disease (10.8%) and neurodegenerative disease (8.4%) were the top three classes (Figure 1A). Moreover, we provided detailed descriptions for the associations of lncRNAs and diseases and curated the dysfunction type for each entry. For example, if an entry’s dysfunction evidence is derived from expression data, the dysfunction type of this entry is recorded as ‘Expression’. The distribution of the dysfunction types is shown in Figure 1B. Aside from lncRNA–disease association data, we also curated experimentally supported lncRNA interactions and cataloged them according to the interacting molecules and the characteristics of the interactions. For example, at the RNA level, lncRNAs may interact with proteins (18), RNAs (19), lncRNAs (20) and miRNAs (21). These interactions may involve binding, regulation or co-expression. At the DNA level, promoters of lncRNA genes may bind transcription factors (TFs) and be regulated by TFs (22).
Figure 1.

Statistics and distributions of diseases (A) and dysfunction types (B) of lncRNAs in the LncRNADisease database.

All data were organized in the ‘LncRNADisease’ database using SQLite, a lightweight database management system. The website was developed based on Django, a Python web framework. The database is available at http://cmbi.bjmu.edu.cn/lncrnadisease.
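To make the storage layout concrete, the following is a minimal sketch of how the curated fields described above could be held in SQLite. It is an illustration only: the table names, column names and helper functions (`build_database`, `add_association`, `pubmed_link`) are assumptions and are not taken from the actual LncRNADisease schema. The sketch uses Python's standard `sqlite3` module directly rather than Django models.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative assumptions
# that mirror the annotation fields named in the text.
SCHEMA = """
CREATE TABLE association (
    id               INTEGER PRIMARY KEY,
    lncrna_name      TEXT NOT NULL,
    disease_name     TEXT NOT NULL,
    dysfunction_type TEXT,      -- e.g. 'Expression'
    description      TEXT,
    species          TEXT,
    pubmed_id        INTEGER    -- NULL when not recorded
);
CREATE TABLE interaction (
    id           INTEGER PRIMARY KEY,
    lncrna_name  TEXT NOT NULL,
    partner_name TEXT NOT NULL,
    partner_type TEXT,          -- protein, RNA, miRNA or DNA
    interaction  TEXT           -- binding, regulation or co-expression
);
"""


def build_database(path=":memory:"):
    """Create the two illustrative tables and return the connection."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_association(conn, lncrna, disease, dysfunction, species,
                    pubmed_id=None, description=""):
    """Insert one curated lncRNA-disease entry."""
    conn.execute(
        "INSERT INTO association (lncrna_name, disease_name, dysfunction_type,"
        " description, species, pubmed_id) VALUES (?, ?, ?, ?, ?, ?)",
        (lncrna, disease, dysfunction, description, species, pubmed_id),
    )
    conn.commit()


def pubmed_link(pubmed_id):
    """Build the hyperlink to the original article in PubMed."""
    return f"https://pubmed.ncbi.nlm.nih.gov/{pubmed_id}/"


if __name__ == "__main__":
    conn = build_database()
    # The PubMed ID is left empty here rather than guessed.
    add_association(conn, "HOTAIR", "hepatocellular carcinoma",
                    "Expression", "Homo sapiens")
    print(conn.execute("SELECT * FROM association").fetchall())
    print(pubmed_link(23175614))  # PMID of this article
```

Keeping associations and interactions in separate tables reflects the two kinds of curated data described in this section; a production schema would likely normalize lncRNA and disease names into their own tables.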

PREDICTING NOVEL LNCRNA–DISEASE ASSOCIATIONS

LncRNADisease was designed not only as a resource for experimentally supported lncRNA–disease association data, but also as a platform for predicting novel lncRNA–disease associations. In this study, we present a method to predict novel lncRNA–disease associations based on the genomic context of a given lncRNA. We previously showed that miRNAs located close to each other in the genome (particularly miRNAs within 2 kb) tend to be associated with similar diseases (23,24). Here, we investigated whether lncRNAs tend to be associated with the same diseases as their genomic neighbor genes. We first identified the protein-coding genes and miRNAs within 2 kb of any lncRNA with reported disease associations. We then identified the lncRNAs that share an associated disease with their neighbor genes/miRNAs and found 33 such lncRNAs. To evaluate the significance, we randomly permuted the diseases associated with the lncRNAs 10 000 times and, for each permutation, counted the number of lncRNAs associated with the same disease as their neighbor genes/miRNAs. None of the permuted counts was greater than 33, and the expected number was 9, indicating that lncRNAs and their neighbor genes/miRNAs tend to be associated with the same disease (P < 1 × 10^−4, randomization test; Figure 2). This result suggests that potential associated diseases of an lncRNA can be predicted from the diseases associated with its neighbor genes/miRNAs. Based on this observation, we developed a tool to predict novel lncRNA–disease associations and used it to identify potential associated diseases for all lncRNAs identified in the human genome. Finally, we integrated the tool and the predicted results into the LncRNADisease database.
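The neighbor-based prediction described above can be sketched as follows. This is a simplified illustration, not the authors' tool: the `Feature` record, the function names and the toy coordinates are hypothetical, and the sketch assumes that distance is measured between feature boundaries on the same chromosome, ignoring strand.

```python
from collections import namedtuple

# Hypothetical record for a genomic feature (lncRNA, gene or miRNA).
Feature = namedtuple("Feature", "name chrom start end")

WINDOW = 2000  # 2 kb neighborhood, as stated in the text


def gap(a, b):
    """Distance in nt between two features; 0 if they overlap."""
    return max(0, max(a.start, b.start) - min(a.end, b.end))


def neighbors(lncrna, features, window=WINDOW):
    """Features on the same chromosome within `window` nt of the lncRNA."""
    return [
        f for f in features
        if f.chrom == lncrna.chrom
        and f.name != lncrna.name
        and gap(lncrna, f) <= window
    ]


def predict_diseases(lncrna, features, disease_map, window=WINDOW):
    """Union of the diseases annotated to the lncRNA's genomic neighbors."""
    predicted = set()
    for f in neighbors(lncrna, features, window):
        predicted.update(disease_map.get(f.name, ()))
    return predicted


if __name__ == "__main__":
    # Toy data with placeholder names; not taken from the database.
    lnc = Feature("lnc_X", "chr1", 1000, 2000)
    feats = [
        Feature("gene_A", "chr1", 3500, 4000),  # 1500 nt away: neighbor
        Feature("gene_B", "chr1", 5000, 6000),  # 3000 nt away: too far
        Feature("gene_C", "chr2", 1500, 1800),  # different chromosome
    ]
    diseases = {"gene_A": {"disease_1"}, "gene_B": {"disease_2"}}
    print(predict_diseases(lnc, feats, diseases))
```

A linear scan over all features is adequate for a sketch; for genome-scale input, sorting features by chromosome and start position or using an interval index would avoid the quadratic cost.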
Figure 2.

Significance of lncRNAs sharing a common disease with their neighbor genes/miRNAs. Blue triangles indicate the distributions of numbers of lncRNAs associated with the same disease as their neighbor genes/miRNAs in random cases. The red arrow indicates the real number of lncRNAs associated with the same disease as their neighbor genes/miRNAs.
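The randomization test summarized in Figure 2 can be illustrated with a short sketch. It makes simplifying assumptions that are not stated in the article: each lncRNA carries exactly one disease label, labels are shuffled among lncRNAs, and a permuted count is tallied as extreme when it is greater than or equal to the observed count. The function names and toy data are hypothetical.

```python
import random


def count_shared(lncrna_disease, neighbor_diseases):
    """Number of lncRNAs whose disease also occurs among their neighbors."""
    return sum(
        1 for lnc, disease in lncrna_disease.items()
        if disease in neighbor_diseases.get(lnc, set())
    )


def randomization_test(lncrna_disease, neighbor_diseases,
                       n_perm=10_000, seed=0):
    """Return (observed count, mean permuted count, number of permutations
    whose count is >= the observed count)."""
    rng = random.Random(seed)
    observed = count_shared(lncrna_disease, neighbor_diseases)
    names = list(lncrna_disease)
    labels = [lncrna_disease[n] for n in names]
    total = 0
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(labels)  # reassign diseases to lncRNAs at random
        permuted = count_shared(dict(zip(names, labels)), neighbor_diseases)
        total += permuted
        if permuted >= observed:
            extreme += 1
    return observed, total / n_perm, extreme


if __name__ == "__main__":
    # Toy data with placeholder names; not taken from the database.
    lnc_dis = {"l1": "d1", "l2": "d2", "l3": "d3", "l4": "d4"}
    nbr_dis = {"l1": {"d1"}, "l2": {"d2"}, "l3": {"d3"}, "l4": {"d4"}}
    observed, expected, extreme = randomization_test(lnc_dis, nbr_dis, n_perm=2000)
    print(observed, expected, extreme)
```

When no permutation reaches the observed count, the empirical P-value can only be bounded as P < 1/n_perm, which for 10 000 permutations corresponds to the P < 1 × 10^−4 reported in the text.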


QUERYING THE DATABASE

We provide users with several ways to query the LncRNADisease database. First, users can browse LncRNADisease by lncRNA name or disease name. When a user clicks an lncRNA or disease on the ‘Browse’ page, LncRNADisease returns a list of matching entries. Second, the ‘Search’ page offers a ‘fuzzy search’ function that finds entries by the full or partial names of lncRNAs or diseases. The search is case insensitive. We also provide a page with tools for predicting novel lncRNA–disease associations. Moreover, all data in the database, including lncRNA–disease associations, predicted lncRNA–disease associations and lncRNA interactions, can be downloaded. Users can also submit novel data to the database. In addition, a detailed tutorial on using the database is available on the ‘Help’ page.
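A case-insensitive search on full or partial names can be sketched with SQLite's `LIKE` operator, which is case insensitive for ASCII characters by default. This is an assumption about one plausible way to implement such a search, not a description of the site's actual code; the table layout and function names are hypothetical, and the demo rows reuse only the lncRNA–disease pairs mentioned in the Introduction.

```python
import sqlite3


def demo_connection():
    """In-memory table with the four example pairs from the Introduction."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE association (lncrna_name TEXT, disease_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO association VALUES (?, ?)",
        [
            ("PCA3", "prostate cancer"),
            ("HOTAIR", "hepatocellular carcinoma"),
            ("UCA1", "bladder cancer"),
            ("BCAR4", "breast cancer"),
        ],
    )
    return conn


def fuzzy_search(conn, term):
    """Match `term` as a substring of the lncRNA name or the disease name."""
    # Escape LIKE wildcards so that user input is matched literally.
    escaped = (
        term.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    )
    pattern = f"%{escaped}%"
    return conn.execute(
        "SELECT lncrna_name, disease_name FROM association "
        "WHERE lncrna_name LIKE ? ESCAPE '\\' "
        "OR disease_name LIKE ? ESCAPE '\\' "
        "ORDER BY lncrna_name, disease_name",
        (pattern, pattern),
    ).fetchall()


if __name__ == "__main__":
    conn = demo_connection()
    print(fuzzy_search(conn, "hot"))     # partial, lower-case lncRNA name
    print(fuzzy_search(conn, "CANCER"))  # upper-case disease fragment
```

Passing the pattern as a bound parameter, rather than formatting it into the SQL string, keeps user input from being interpreted as SQL.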

FUTURE EXTENSIONS

The LncRNADisease database represents the first step in this project, and further extensions will be developed. The experimentally supported lncRNA–disease association data will be updated every 2 months. Meanwhile, new tools for analyzing lncRNA–disease association data are being developed and will be integrated into the LncRNADisease database. For example, we are developing expression profile-based and interacting partner-based methods to predict novel lncRNA–disease associations and expect to integrate these methods into the database in the near future.

DISCUSSION AND CONCLUSION

A growing number of studies have shown that lncRNAs have important functions and are associated with a broad range of diseases. lncRNAs are emerging as potential molecules for disease diagnosis, treatment and prognosis. In this article, we describe an lncRNA and disease association database, LncRNADisease. The LncRNADisease database integrates several types of data: experimentally supported lncRNA–disease association data, experimentally supported lncRNA interaction data and predicted lncRNA–disease association data. Moreover, we developed a bioinformatic method to predict potential associated diseases for a novel lncRNA based on its genomic context and integrated this method into LncRNADisease. The important roles of lncRNAs in disease are attracting more biomedical researchers. Therefore, more experimentally supported lncRNA–disease associations are expected to be published in the future, and these data will be integrated into the LncRNADisease database. More importantly, although thousands of lncRNAs have been identified, only a limited number have been reported to be associated with diseases. There is thus a growing need to predict potential associated diseases for lncRNAs through bioinformatic methods. Another major aim of LncRNADisease is therefore to develop and integrate more bioinformatic methods for analyzing and predicting lncRNA–disease associations. Finally, we believe that LncRNADisease is useful for studies of lncRNAs and diseases and will become more helpful as it integrates more data and tools in the future.

FUNDING

National Basic Research program of China [2012CB517500]; National Natural Science Foundation of China [31000585 and 11021161]. Funding for open access charge: National Basic Research program of China [2012CB517500]. Conflict of interest statement. None declared.
  24 in total

1.  RNA maps reveal new RNA classes and a possible function for pervasive transcription.

Authors:  Philipp Kapranov; Jill Cheng; Sujit Dike; David A Nix; Radharani Duttagupta; Aarron T Willingham; Peter F Stadler; Jana Hertel; Jörg Hackermüller; Ivo L Hofacker; Ian Bell; Evelyn Cheung; Jorg Drenkow; Erica Dumais; Sandeep Patel; Gregg Helt; Madhavan Ganesh; Srinka Ghosh; Antonio Piccolboni; Victor Sementchenko; Hari Tammana; Thomas R Gingeras
Journal:  Science       Date:  2007-05-17       Impact factor: 47.728

2.  Specific expression of long noncoding RNAs in the mouse brain.

Authors:  Tim R Mercer; Marcel E Dinger; Susan M Sunkin; Mark F Mehler; John S Mattick
Journal:  Proc Natl Acad Sci U S A       Date:  2008-01-09       Impact factor: 11.205

Review 3.  Evolution and functions of long noncoding RNAs.

Authors:  Chris P Ponting; Peter L Oliver; Wolf Reik
Journal:  Cell       Date:  2009-02-20       Impact factor: 41.582

4.  The relationship between Prostate CAncer gene 3 (PCA3) and prostate cancer significance.

Authors:  Hein van Poppel; Alexander Haese; Markus Graefen; Alexandre de la Taille; Jacques Irani; Theo de Reijke; Mesut Remzi; Michael Marberger
Journal:  BJU Int       Date:  2011-08-26       Impact factor: 5.588

5.  Global identification of human transcribed sequences with genome tiling arrays.

Authors:  Paul Bertone; Viktor Stolc; Thomas E Royce; Joel S Rozowsky; Alexander E Urban; Xiaowei Zhu; John L Rinn; Waraporn Tongprasit; Manoj Samanta; Sherman Weissman; Mark Gerstein; Michael Snyder
Journal:  Science       Date:  2004-11-11       Impact factor: 47.728

6.  Genetic variants at the 9p21 locus contribute to atherosclerosis through modulation of ANRIL and CDKN2A/B.

Authors:  Ada Congrains; Kei Kamide; Ryousuke Oguro; Osamu Yasuda; Keishi Miyata; Eiichiro Yamamoto; Tatsuo Kawai; Hiroshi Kusunoki; Hiroko Yamamoto; Yasushi Takeya; Koichi Yamamoto; Miyuki Onishi; Ken Sugimoto; Tomohiro Katsuya; Nobuhisa Awata; Kazunori Ikebe; Yasuyuki Gondo; Yuichi Oike; Mitsuru Ohishi; Hiromi Rakugi
Journal:  Atherosclerosis       Date:  2011-11-19       Impact factor: 5.162

7.  Expression of a noncoding RNA is elevated in Alzheimer's disease and drives rapid feed-forward regulation of beta-secretase.

Authors:  Mohammad Ali Faghihi; Farzaneh Modarresi; Ahmad M Khalil; Douglas E Wood; Barbara G Sahagan; Todd E Morgan; Caleb E Finch; Georges St Laurent; Paul J Kenny; Claes Wahlestedt
Journal:  Nat Med       Date:  2008-06-29       Impact factor: 53.440

8.  NONCODE v3.0: integrative annotation of long noncoding RNAs.

Authors:  Dechao Bu; Kuntao Yu; Silong Sun; Chaoyong Xie; Geir Skogerbø; Ruoyu Miao; Hui Xiao; Qi Liao; Haitao Luo; Guoguang Zhao; Haitao Zhao; Zhiyong Liu; Changning Liu; Runsheng Chen; Yi Zhao
Journal:  Nucleic Acids Res       Date:  2011-12-01       Impact factor: 16.971

9.  NRED: a database of long noncoding RNA expression.

Authors:  Marcel E Dinger; Ken C Pang; Tim R Mercer; Mark L Crowe; Sean M Grimmond; John S Mattick
Journal:  Nucleic Acids Res       Date:  2008-10-01       Impact factor: 16.971

10.  An analysis of human microRNA and disease associations.

Authors:  Ming Lu; Qipeng Zhang; Min Deng; Jing Miao; Yanhong Guo; Wei Gao; Qinghua Cui
Journal:  PLoS One       Date:  2008-10-15       Impact factor: 3.240

