Ricardo Lebrón, Cristina Gómez-Martín, Pedro Carpena, Pedro Bernaola-Galván, Guillermo Barturen, Michael Hackenberg, José L Oliver.
Abstract
The 2017 update of NGSmethDB stores whole-genome methylomes generated from short-read data sets obtained by whole-genome bisulfite sequencing (WGBS). To generate high-quality methylomes, stringent quality controls were integrated with third-party software, and a two-step mapping process was added to exploit the advantages of the new genome assembly models. All samples were profiled under constant parameter settings, thus enabling comparative downstream analyses. Besides a significant increase in the number of samples, NGSmethDB now includes two additional data types, which are a valuable resource for the discovery of methylation epigenetic biomarkers: (i) differentially methylated single-cytosines; and (ii) methylation segments (i.e. genome regions of homogeneous methylation). The NGSmethDB back-end is now based on MongoDB, a NoSQL hierarchical database using JSON-formatted documents and dynamic schemas, thus accelerating comparative analyses between samples. Besides conventional database dumps, track hubs were implemented, which improve database access, visualization in genome browsers and comparative analyses against third-party annotations. In addition, the database can also be accessed through a RESTful API. Lastly, a Python client and a multiplatform virtual machine allow for program-driven access from the user's desktop. This way, private methylation data can be compared to NGSmethDB without the need to upload them to public servers. Database website: http://bioinfo2.ugr.es/NGSmethDB.
Year: 2016 PMID: 27794041 PMCID: PMC5210667 DOI: 10.1093/nar/gkw996
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Number of methylomes by species and sequence context stored in NGSmethDB
| Species | Reference genome assembly | Sequence context | No. of methylomes |
|---|---|---|---|
| | hg19 | CG | 57 |
| | | CHG | 54 |
| | hg38 | CG | 35 |
| | panTro4 | CG | 5 |
| | | CHG | 5 |
| | rheMac3 | CG | 6 |
| | | CHG | 6 |
| | mm10 | CG | 41 |
| | | CHG | 41 |
| | sl2.50 | CG | 8 |
| | | CHG | 8 |
| | | CHH | 8 |
| | sl2.50 | CG | 2 |
| | | CHG | 2 |
| | | CHH | 2 |
| | tair10 | CG | 129 |
| | | CHG | 129 |
| | | CHH | 129 |
| TOTAL | | | 667 |
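The counts in the table above can be cross-checked with a short tally. The following is a minimal Python sketch, not part of NGSmethDB: the `ROWS` list and the `totals_by_context` helper are illustrative names, and the values are copied by hand from the table (assembly, sequence context, number of methylomes).

```python
from collections import defaultdict

# (assembly, sequence context, number of methylomes), copied from the table.
# The two sl2.50 groups are separate rows in the table and are kept separate here.
ROWS = [
    ("hg19", "CG", 57), ("hg19", "CHG", 54),
    ("hg38", "CG", 35),
    ("panTro4", "CG", 5), ("panTro4", "CHG", 5),
    ("rheMac3", "CG", 6), ("rheMac3", "CHG", 6),
    ("mm10", "CG", 41), ("mm10", "CHG", 41),
    ("sl2.50", "CG", 8), ("sl2.50", "CHG", 8), ("sl2.50", "CHH", 8),
    ("sl2.50", "CG", 2), ("sl2.50", "CHG", 2), ("sl2.50", "CHH", 2),
    ("tair10", "CG", 129), ("tair10", "CHG", 129), ("tair10", "CHH", 129),
]


def totals_by_context(rows):
    """Sum the number of methylomes for each sequence context."""
    totals = defaultdict(int)
    for _assembly, context, count in rows:
        totals[context] += count
    return dict(totals)


by_context = totals_by_context(ROWS)
grand_total = sum(by_context.values())

print(by_context)   # per-context totals
print(grand_total)  # should match the TOTAL row of the table
```

The per-context sums (CG, CHG, CHH) add up to the 667 methylomes reported in the TOTAL row, which suggests that each sequence context of a sample is counted as a separate methylome.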
Information stored in NGSmethDB for each single-cytosine. All fields are described as shown in the results of the NGSmethDB API client. Each row corresponds to a field, with its description and an example value for one single-cytosine. For more information, see the NGSmethDB manual.
| Field | Description | Example |
|---|---|---|
| | Chromosome | chr22 |
| | Chromosome position | 25174338 |
| | Genotype of methylation context | YG |
| | Methylation context in which the cytosine is located | CG |
| | Number of reads in which this cytosine is methylated (Watson-strand only) | 22 |
| | Number of reads in which this cytosine is methylated (Crick-strand only) | 27 |
| | Number of reads in which this cytosine is methylated (both strands) | 49 |
| | Number of reads mapped at this chromosome position (Watson-strand only) | 26 |
| | Number of reads mapped at this chromosome position (Crick-strand only) | 33 |
| | Number of reads mapped at this chromosome position (both strands) | 59 |
| | Methylated reads ratio at this chromosome position (Watson-strand only) | 0.85 |
| | Methylated reads ratio at this chromosome position (Crick-strand only) | 0.82 |
| | Methylated reads ratio at this chromosome position (both strands) | 0.83 |
| | Average sequencing quality score at this chromosome position (Watson-strand only) | 39 |
| | Average sequencing quality score at this chromosome position (Crick-strand only) | 37 |
| | Average sequencing quality score at this chromosome position (both strands) | 38 |
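The example values in the table are mutually consistent if each methylation ratio is the number of methylated reads divided by the number of mapped reads, and if the both-strand values pool the Watson and Crick counts. The Python sketch below reproduces the example row under that assumption. The `StrandCounts` class and its attribute names are hypothetical and are not the field names used by the NGSmethDB API client.

```python
from dataclasses import dataclass


@dataclass
class StrandCounts:
    """Read counts at one cytosine position (hypothetical container)."""
    methylated: int  # reads in which the cytosine is methylated
    coverage: int    # reads mapped at the position

    def ratio(self) -> float:
        """Methylated reads ratio; undefined when no read covers the position."""
        if self.coverage == 0:
            raise ValueError("no reads mapped at this position")
        return self.methylated / self.coverage


# Example values taken from the table (chr22:25174338, context CG).
watson = StrandCounts(methylated=22, coverage=26)
crick = StrandCounts(methylated=27, coverage=33)

# Assumption: the both-strand entry pools the reads of the two strands.
both = StrandCounts(
    methylated=watson.methylated + crick.methylated,
    coverage=watson.coverage + crick.coverage,
)

for name, counts in (("Watson", watson), ("Crick", crick), ("both", both)):
    print(f"{name}: {counts.methylated}/{counts.coverage} = {counts.ratio():.2f}")
```

Pooling the counts weights each strand by its coverage, so the both-strand ratio is in general not the plain average of the two strand ratios, although the two happen to round to the same value in this example.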
Figure 1. Data flow diagram for NGSmethDB indicating the source of primary data, the different types of extracted data and the different ways of data access.
Figure 2. NGSmethDB data shown at the UCSC Genome Browser. A genome region of chromosome 19 (chr19:2,248,838-2,256,966) encompassing three genes (AMH, encoding the anti-Mullerian hormone; MIR4321, encoding the microRNA 4321; and JSRP1, encoding a junctional sarcoplasmic reticulum protein) is shown. The three main types of NGSmethDB data are shown for different tissues: (i) methylation levels at single-cytosines; (ii) differentially methylated cytosines; and (iii) methylation segments. Third-party annotations are also shown: genes from the RefSeq database (63), the strict set of CpG islands predicted by CpGcluster (64,65) and gene expression levels from the NIH Genotype-Tissue Expression (GTEx) project (66). Online image: https://goo.gl/ElXE4t.
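Genome browser positions such as the chromosome 19 region in Figure 2 are often handled programmatically. The Python sketch below parses such a position string into chromosome, start and end, tolerating thousands separators and stray spaces. The `parse_region` helper is illustrative and is not part of the NGSmethDB client. The length calculation assumes the 1-based, inclusive coordinates used in UCSC Genome Browser position strings.

```python
import re

# chrom:start-end, where start and end may contain commas or spaces as separators.
REGION_RE = re.compile(r"^(?P<chrom>[^:]+):(?P<start>[\d,\s]+)-(?P<end>[\d,\s]+)$")


def parse_region(text):
    """Parse 'chr19:2,248,838-2,256,966' into ('chr19', 2248838, 2256966)."""
    match = REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a valid region: {text!r}")
    # Drop thousands separators and whitespace before converting to integers.
    start = int(re.sub(r"[,\s]", "", match["start"]))
    end = int(re.sub(r"[,\s]", "", match["end"]))
    if start > end:
        raise ValueError(f"start is greater than end in region: {text!r}")
    return match["chrom"], start, end


chrom, start, end = parse_region("chr19:2,248,838-2,256,966")
# Assumption: coordinates are 1-based and inclusive, so both ends are counted.
length = end - start + 1

print(chrom, start, end, length)
```

Under the inclusive-coordinate assumption, the region shown in Figure 2 spans 8,129 bp.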