MSDD: a manually curated database of experimentally supported associations among miRNAs, SNPs and human diseases.

Ming Yue, Dianshuang Zhou, Hui Zhi, Peng Wang, Yan Zhang, Yue Gao, Maoni Guo, Xin Li, Yanxia Wang, Yunpeng Zhang, Shangwei Ning, Xia Li.

Abstract

The MiRNA SNP Disease Database (MSDD, http://www.bio-bigdata.com/msdd/) is a manually curated database that provides comprehensive experimentally supported associations among microRNAs (miRNAs), single nucleotide polymorphisms (SNPs) and human diseases. SNPs in miRNA-related functional regions such as mature miRNAs, promoter regions, pri-miRNAs, pre-miRNAs and target gene 3'-UTRs, collectively called 'miRSNPs', represent a novel category of functional molecules. miRSNPs can lead to dysregulation of a miRNA and its target gene, resulting in susceptibility to or onset of human diseases. A curated collection and summary of miRSNP-associated diseases is essential for a thorough understanding of the mechanisms and functions of miRSNPs. Here, we describe MSDD, which currently documents 525 associations among 182 human miRNAs, 197 SNPs, 153 genes and 164 human diseases through a review of more than 2000 published papers. Each association incorporates information on the miRNAs, SNPs, miRNA target genes and disease names, SNP locations and alleles, the miRNA dysfunctional pattern, experimental techniques, a brief functional description, the original reference and additional annotation. MSDD provides a user-friendly interface to conveniently browse, retrieve and download data and to submit novel data. MSDD will significantly improve our understanding of miRNA dysfunction in disease and thus has the potential to serve as a timely and valuable resource.
© The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2018        PMID: 29106642      PMCID: PMC5753252          DOI: 10.1093/nar/gkx1035

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

MicroRNAs (miRNAs) are a class of small, endogenous, non-coding RNAs of ∼22 nt in length that post-transcriptionally regulate the cleavage of target mRNAs or participate in translational repression (1,2). To date, increasing evidence has shown that miRNAs are involved in various physiological processes and play important roles in various human diseases (3,4). Among these studies, the role of miRNA-related single nucleotide polymorphisms (SNPs) is gaining increasing attention. Human miRNA biogenesis and function is a multi-step process. First, an miRNA gene is transcribed to produce a primary miRNA (pri-miRNA) that is then processed into a precursor miRNA (pre-miRNA) and subsequently into a mature miRNA, which ultimately binds to the 3′-UTR of the target messenger RNA (mRNA) (5). Emerging studies have shown that SNPs in pri-miRNAs, pre-miRNAs, mature miRNAs and 3′-UTRs of target mRNA may function as a novel class of regulatory SNPs (commonly called ‘miRSNPs’), which can modify miRNA biogenesis and/or target binding and lead to diverse human diseases (6–8). However, most of these studies have each identified one or several disease-associated miRSNPs, so a wealth of information on experimentally supported miRSNPs is buried among the published literature and is not easily accessible. Thus, the need to develop a database to collect and store the latest experimentally supported miRNA–SNP disease associations is urgent. Due to the important roles and functions of miRSNPs in regulating many aspects of cellular processes relating to disease development, several databases and web tools such as miRNASNP (9), MirSNP (10), PolymiRTS (11), Patrocles (12), SubmiRine (13), MicroSNiPer (14), miRNA–SNiPer (15), Mirsnpscore (16) and mrSNP (17) have been developed. These databases are useful for identifying functional miRSNP candidates. 
However, most of these databases focus only on predicting SNP effects on putative miRNA targets or RNA secondary structure, or collect human SNPs in predicted miRNA–mRNA binding sites. Other databases, miRNASNP v2.0 (9) and miRdSNP (18), map phenotype-associated SNPs from genome-wide association studies to predicted or experimentally validated miRNA targets and do not provide miRNA dysfunctional patterns or experimental methods. To date, no database has been designed to capture the experimentally supported relationships among miRNAs, SNPs and genes. A manually curated database of experimentally supported miRSNPs that are associated with various human diseases will be useful for researchers and can serve as a ‘gold standard’ data set for accuracy tests, especially for further experimental designs and verification. To bridge this gap, we have developed the MiRNA SNP Disease Database (MSDD), a manually curated database, to collect and integrate experimentally supported disease-associated miRSNPs into a high-quality, comprehensive resource (Figure 1). The current version of MSDD documents 525 manually curated relationships among 182 human miRNAs, 197 SNPs, 153 genes and 164 human diseases. We expect that this database, specifically designed for miRNAs, SNPs and human diseases, will serve as an important catalyst for future research.
Figure 1.

Data sources and the structure of MSDD.

DATA COLLECTION AND DATABASE CONTENT

To ensure the quality of the database, all MSDD entries were manually collected following steps similar to those used to assemble the databases miRTarBase (19), HMDD v2.0 (20) and Lnc2Cancer (21). First, we searched the PubMed database (22) with a list of keywords, such as ‘SNP miRNA,’ ‘polymorphism miRNA,’ ‘SNP microRNA’ and ‘polymorphism microRNA.’ All published literature on miRNA–SNP interactions that described associations with human diseases or traits was downloaded to extract the key information. In this step, 2387 published articles were downloaded from the PubMed database (22) (before May 2017). Second, we extracted experimentally supported miRNA–SNP disease associations by manually curating information from published papers. All selected studies were reviewed by at least two researchers. In this step, we retrieved the following items: the miRNA, SNP, miRNA target gene and disease name, with all identifiers manually annotated using controlled vocabularies (e.g. the controlled vocabulary for ncRNA classes, MeSH); the population, number of samples and minor allele frequency of disease-associated SNPs; the SNP location (relative position of the SNP, such as within the 3′-UTR or pre-miRNA) and allele; the miRNA dysfunctional pattern (the effect of the miRNA on the expression of the target gene, e.g. increase, decrease, loss or gain); the experimental methods used (e.g. western blot, quantitative reverse transcriptase-polymerase chain reaction (qRT-PCR)); the experimental samples (cell line and/or tissue); hyperlinks to the PubMed database; and a brief functional description of the miRNA–SNP disease regulation mechanism from the original study. We referred to previous studies and selected miRNA–SNP disease associations for manual curation using strict criteria (23).
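The keyword search described above can be sketched as a small query builder. This is only an illustration under stated assumptions: the text does not say how the keywords were combined or submitted, so the AND/OR logic, the use of the NCBI E-utilities `esearch` endpoint, the cut-off date `2017/04/30` (standing in for "before May 2017") and the helper names `build_term` and `build_esearch_url` are all hypothetical choices. The code only builds a URL and performs no network request.

```python
from urllib.parse import urlencode

# Keywords quoted from the data-collection description.
KEYWORDS = ["SNP miRNA", "polymorphism miRNA", "SNP microRNA", "polymorphism microRNA"]

# NCBI E-utilities search endpoint (assumed access route, not stated in the text).
ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_term(keywords):
    """Turn each two-word keyword into an AND clause and OR the clauses together."""
    clauses = ["(" + " AND ".join(k.split()) + ")" for k in keywords]
    return " OR ".join(clauses)


def build_esearch_url(keywords, max_date="2017/04/30", retmax=5000):
    """Build an esearch URL restricted by publication date (assumed cut-off)."""
    params = {
        "db": "pubmed",
        "term": build_term(keywords),
        "datetype": "pdat",       # filter on publication date
        "mindate": "1900/01/01",  # mindate and maxdate must be given together
        "maxdate": max_date,
        "retmax": retmax,
        "retmode": "json",
    }
    return ESEARCH + "?" + urlencode(params)


if __name__ == "__main__":
    print(build_term(KEYWORDS))
    print(build_esearch_url(KEYWORDS))
```

A query of this kind only yields candidate papers; the manual review by at least two researchers described in the text is what decides which associations are kept.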
We only collected high-quality associations with multiple lines of strong experimental evidence, including confirmation by genotyping, western blot, qRT-PCR or luciferase reporter assays. After completing this process, a total of 525 associations among 182 human miRNAs, 197 SNPs, 153 genes and 164 human diseases were manually curated from 397 published papers. Each curated association was given a unique accession number (accession ID: MSDD00###). Moreover, we extracted the annotation information of each miRNA from miRBase (24), including the genome context, stem–loop structure and sequence of the mature miRNA. We also downloaded the annotation information of each SNP from dbSNP (25), including the ancestral allele (wild-type) and contextual information. Finally, all data in MSDD are stored and managed using MySQL (version 5.7.18). The web interfaces were built in JSP. The data processing programs are written in Java (version 1.8.0), and the web services are built using Apache Tomcat. The MSDD database is freely available at http://www.bio-bigdata.com/msdd/ and http://www.bio-bigdata.net/msdd/.
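To make the shape of a curated entry concrete, the sketch below models one association as a Python record. It is not the MSDD schema, which the text does not give: the field names, the five-digit zero-padded reading of the accession pattern MSDD00###, and the assumption that every SNP carries a dbSNP-style `rs` identifier are illustrative guesses. The example instance pairs miR-146a with rs2910164 in the pre-miRNA, as mentioned in the article, but its accession number and disease value are placeholders.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Assumed formats: "MSDD" plus five digits, and dbSNP-style rs identifiers.
ACCESSION_RE = re.compile(r"^MSDD\d{5}$")
RS_ID_RE = re.compile(r"^rs\d+$")


def make_accession(n: int) -> str:
    """Format a running number as an MSDD-style accession (assumed 5 digits)."""
    if not 1 <= n <= 99999:
        raise ValueError(f"accession number out of range: {n}")
    return f"MSDD{n:05d}"


@dataclass
class Association:
    """One curated miRNA-SNP-disease association (hypothetical field names)."""
    accession: str
    mirna: str
    snp_id: str
    disease: str
    target_gene: Optional[str] = None
    snp_location: Optional[str] = None         # e.g. "3'-UTR", "pre-miRNA"
    allele: Optional[str] = None
    dysfunction_pattern: Optional[str] = None  # e.g. "increase", "decrease"
    methods: List[str] = field(default_factory=list)
    samples: Optional[str] = None              # cell line and/or tissue
    pmid: Optional[int] = None
    description: str = ""

    def __post_init__(self):
        # Reject records whose identifiers do not match the assumed formats.
        if not ACCESSION_RE.match(self.accession):
            raise ValueError(f"bad accession: {self.accession}")
        if not RS_ID_RE.match(self.snp_id):
            raise ValueError(f"bad SNP id: {self.snp_id}")


# Placeholder instance: the accession number and disease are not real MSDD values.
example = Association(
    accession=make_accession(1),
    mirna="miR-146a",
    snp_id="rs2910164",
    disease="(placeholder disease)",
    snp_location="pre-miRNA",
)
```

Validating identifiers at construction time mirrors the role of controlled vocabularies in curation: malformed entries fail early instead of reaching the database.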

USER INTERFACE

MSDD provides a user-friendly web interface that enables users to browse, search and retrieve all miRNA–SNP disease associations in the database (Figure 2). In the ‘Browse’ page, users can click on a specific gene, miRNA, SNP or disease name, and a list of matched entries is returned. In the ‘Search’ page, MSDD allows users to search by gene name, miRNA name, SNP ID, disease name or combinations of these categories. MSDD offers fuzzy keyword searching, which returns the closest matching records. MSDD also provides an option in the ‘Search’ page that allows users to filter associations by experimental method and SNP position. In addition, MSDD offers a submission page that enables researchers to submit novel experimentally supported miRNA–SNP disease associations. Once approved by the submission review committee, the submitted record will be included in an updated release. All data in the database can be downloaded from the ‘Download’ page. MSDD also provides two visualization maps on the ‘Download’ page that enable users to download data by clicking on the appropriate area. One is a human body map that classifies data by organ, and the other is a tag cloud that displays hotspot data. Finally, a detailed tutorial showing users how to use MSDD is available on the ‘Help’ page.
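The fuzzy search and filtering behaviour described above can be approximated in a few lines. This is a sketch, not MSDD's implementation (the site is built with JSP and MySQL, and its matching algorithm is not described): it uses Python's standard `difflib.get_close_matches` as a stand-in, and the helper names, the 0.8 similarity cut-off and the two toy records are assumptions. The toy records reuse the miRNA, SNP and location combinations mentioned in the article and omit all other fields.

```python
from difflib import get_close_matches

# Toy records for illustration only; they are not exported MSDD entries.
RECORDS = [
    {"mirna": "miR-146a", "snp": "rs2910164", "location": "pre-miRNA"},
    {"mirna": "miR-34a/b/c", "snp": "rs4938723", "location": "promoter"},
]


def fuzzy_search(query, field, records=RECORDS, cutoff=0.8):
    """Return records whose `field` value is close to `query` (case-insensitive)."""
    by_key = {}
    for rec in records:
        by_key.setdefault(rec[field].lower(), []).append(rec)
    # get_close_matches ranks candidates by similarity and drops those below cutoff.
    keys = get_close_matches(query.lower(), list(by_key), n=5, cutoff=cutoff)
    return [rec for key in keys for rec in by_key[key]]


def filter_by_location(records, location):
    """Keep only records whose SNP location matches exactly (case-insensitive)."""
    return [r for r in records if r["location"].lower() == location.lower()]


if __name__ == "__main__":
    # A query with the hyphen missing still finds the miR-146a record.
    print(fuzzy_search("mir146a", "mirna"))
    print(filter_by_location(RECORDS, "promoter"))
```

A filter on experimental method would follow the same pattern as `filter_by_location`, given a field listing the methods for each record.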
Figure 2.

A schematic workflow of MSDD.

FUTURE EXTENSIONS

More recently, high-throughput technologies, such as next-generation sequencing, have produced extensive data on human disease biology, and the number of validated disease-associated miRNA–SNP interactions will continue to increase in the future. These advances in research will provide an opportunity to further extend MSDD. We will continue to manually curate newly validated miRNA–SNP disease associations and update the database every 2 months. We will incorporate new tools and functional annotations as well as more data sources to improve the utility and content coverage of this database.

DISCUSSION AND CONCLUSION

Over time, studies have increasingly indicated that variants in miRNA sequences may trigger disease by altering the expression or maturation of miRNAs or by interfering with miRNA interactions with mRNA (26,27). In the past few years, many databases have been published to aid researchers in exploring the impact of SNPs on miRNA genes and their targets. For SNPs in miRNA genes, it is convenient to map SNPs to miRNA regions from resources like miRBase (24). For SNPs in miRNA targets, identifying potential binding sites for a given miRNA in genomic sequences is important. However, these studies and databases emphasize the importance of prediction tools in the identification of potential miRNA–SNP relationships (Supplementary Table S1). To the best of our knowledge, none of these resources was developed specifically to collect experimentally supported miRNA–SNP association data in various human diseases. Thus, we developed MSDD, a disease-association database that provides a comprehensive resource on miRNA dysregulation modulated by SNPs. In addition to collecting a large number of experimentally supported miRNA–SNP disease associations, MSDD may provide mechanistic insight and experimental evidence for future research. For example, by searching MSDD using ‘miR-34,’ a well-known human miRNA gene family, we found that the curated studies demonstrate that the SNP rs4938723, located in the promoter region of miR-34a/b/c, can significantly affect miRNA expression in various human cancers. This miRNA–SNP association has the potential to be an effective biomarker for different human cancers. In another example, by searching MSDD using ‘gastric cancer’, one of the most common cancer types worldwide, we found that several SNPs in miRNA genes or target sites have been shown to be associated with this cancer by affecting miRNA-mediated regulatory function.
More importantly, some functional miRNA–SNP associations, such as the SNP rs2910164 in the pre-miRNA of miR-146a, are supported both in blood and in tissue, which may be especially useful for cancer specialists who focus on circulating miRNA cancer biomarkers. Data from MSDD can facilitate the understanding of important principles and future research trends. For example, although miRSNPs in many functional regions, such as the promoter regions of miRNAs, are associated with diseases, we found that most of the disease-associated miRSNPs are located in four main functional regions: pri-miRNAs, pre-miRNAs, mature miRNAs and miRNA target genes (Supplementary Figure S1A). Additionally, we list the top 10 miRNAs, SNPs and diseases in the database (Supplementary Figure S2A–C). Remarkably, we found that miR-146a and the SNP rs2910164 make up the most common miRNA–SNP pair in MSDD, which is involved in 52 diseases. The disease with the highest connectivity is hepatocellular carcinoma, which is associated with 26 miRNAs and 25 SNPs, thus potentially providing further insight into this disease. Finally, we quantified the number of published papers each year that reported miRNA–SNP disease associations and found that the number of publications has generally been increasing (Supplementary Figure S1B). In particular, from 2012 to 2014, the number of publications increased in an exponential manner, suggesting that research on miRNA–SNP disease associations has become a hot topic in recent years and highlighting the timeliness of developing a special-purpose repository to document these valuable data. In summary, MSDD not only provides a comprehensive miRNA–SNP disease database with experimental support but also presents a more global view of miRNA functions in human diseases. MSDD will serve as a valuable resource for researchers interested in determining the role of miRNA-related SNPs in human diseases.
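Summary statistics of the kind reported above (the most common miRNA–SNP pair and the connectivity of each disease) can be derived from a flat list of associations. The following is a minimal sketch assuming each association is reduced to a `(miRNA, SNP, disease)` tuple; the function names are hypothetical and the input data are placeholders invented purely to exercise the code, not values taken from MSDD.

```python
from collections import Counter, defaultdict


def top_pairs(associations, n=10):
    """Rank miRNA-SNP pairs by the number of distinct diseases they are linked to."""
    pair_to_diseases = defaultdict(set)
    for mirna, snp, disease in associations:
        pair_to_diseases[(mirna, snp)].add(disease)  # sets ignore duplicate triples
    counts = Counter({pair: len(ds) for pair, ds in pair_to_diseases.items()})
    return counts.most_common(n)


def disease_connectivity(associations):
    """Map each disease to (number of distinct miRNAs, number of distinct SNPs)."""
    mirnas, snps = defaultdict(set), defaultdict(set)
    for mirna, snp, disease in associations:
        mirnas[disease].add(mirna)
        snps[disease].add(snp)
    return {d: (len(mirnas[d]), len(snps[d])) for d in mirnas}


# Placeholder data: names are made up and carry no biological meaning.
TOY = [
    ("miR-A", "rs1", "disease X"),
    ("miR-A", "rs1", "disease Y"),
    ("miR-B", "rs2", "disease X"),
    ("miR-A", "rs1", "disease X"),  # duplicate triple, counted once
]

if __name__ == "__main__":
    print(top_pairs(TOY))
    print(disease_connectivity(TOY))
```

Counting distinct diseases per pair, rather than raw rows, keeps a pair reported by several papers for the same disease from being over-weighted; whether the article's figures were computed this way is not stated.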
  27 in total

Review 1.  MicroRNAs in stress signaling and human disease.

Authors:  Joshua T Mendell; Eric N Olson
Journal:  Cell       Date:  2012-03-16       Impact factor: 41.582

2.  SubmiRine: assessing variants in microRNA targets using clinical genomic data sets.

Authors:  Evan K Maxwell; Joshua D Campbell; Avrum Spira; Andreas D Baxevanis
Journal:  Nucleic Acids Res       Date:  2015-03-26       Impact factor: 16.971

3.  Functional SNP in 3'-UTR MicroRNA-Binding Site of ZNF350 Confers Risk for Age-Related Cataract.

Authors:  Shanshan Gu; Han Rong; Guowei Zhang; Lihua Kang; Mei Yang; Huaijin Guan
Journal:  Hum Mutat       Date:  2016-09-16       Impact factor: 4.878

Review 4.  MicroRNAs in development and disease.

Authors:  Danish Sayed; Maha Abdellatif
Journal:  Physiol Rev       Date:  2011-07       Impact factor: 37.312

5.  Association of miRNA-related genetic polymorphisms and prognosis in patients with esophageal squamous cell carcinoma.

Authors:  Pei-Wen Yang; Ya-Chuan Huang; Ching-Yueh Hsieh; Kuo-Tai Hua; Yu-Ting Huang; Tzu-Hsuan Chiang; Jin-Shing Chen; Pei-Ming Huang; Hsao-Hsun Hsu; Shuenn-Wen Kuo; Min-Liang Kuo; Jang-Ming Lee
Journal:  Ann Surg Oncol       Date:  2014-04-26       Impact factor: 5.344

6.  miRdSNP: a database of disease-associated SNPs and microRNA target sites on 3'UTRs of human genes.

Authors:  Andrew E Bruno; Li Li; James L Kalabus; Yuzhuo Pan; Aiming Yu; Zihua Hu
Journal:  BMC Genomics       Date:  2012-01-25       Impact factor: 3.969

7.  miRBase: annotating high confidence microRNAs using deep sequencing data.

Authors:  Ana Kozomara; Sam Griffiths-Jones
Journal:  Nucleic Acids Res       Date:  2013-11-25       Impact factor: 16.971

8.  HMDD v2.0: a database for experimentally supported human microRNA and disease associations.

Authors:  Yang Li; Chengxiang Qiu; Jian Tu; Bin Geng; Jichun Yang; Tianzi Jiang; Qinghua Cui
Journal:  Nucleic Acids Res       Date:  2013-11-04       Impact factor: 16.971

9.  PolymiRTS Database 3.0: linking polymorphisms in microRNAs and their target sites with human diseases and biological pathways.

Authors:  Anindya Bhattacharya; Jesse D Ziebarth; Yan Cui
Journal:  Nucleic Acids Res       Date:  2013-10-24       Impact factor: 16.971

10.  mrSNP: software to detect SNP effects on microRNA binding.

Authors:  Mehmet Deveci; Umit V Catalyürek; Amanda Ewart Toland
Journal:  BMC Bioinformatics       Date:  2014-03-15       Impact factor: 3.169

