Kevin Chiang, Jiang Shu, Janos Zempleni, Juan Cui.
Abstract
With the advent of high-throughput technology, a huge amount of microRNA information has been added to the growing body of knowledge on non-coding RNAs. Here we present the Dietary MicroRNA Databases (DMD), the first repository for archiving and analyzing the published and novel microRNAs discovered in dietary resources. The database currently includes fifteen types of dietary species, such as apple, grape, cow milk, and cow fat, originating from 9 plant and 5 animal species. The annotation for each entry, a mature microRNA indexed as DM0000*, covers the mature sequence, genome location, hairpin structure of the parental pre-microRNA, cross-species sequence comparison, disease relevance, and experimentally validated gene targets. Furthermore, several functional analyses, including target prediction, pathway enrichment, and gene network construction, have been integrated into the system. These enable users to generate functional insights by viewing the functional pathways and building the protein-protein interaction networks associated with each microRNA. Another unique feature of DMD is a feature generator that can calculate a total of 411 descriptive attributes for any given microRNA based on its sequence and structure. DMD should be particularly useful for research groups studying microRNA regulation from a nutrition point of view. The database can be accessed at http://sbbi.unl.edu/dmd/.
Year: 2015 PMID: 26030752 PMCID: PMC4451068 DOI: 10.1371/journal.pone.0128089
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. DMD construction workflow and the outline of data content.
List of features available for generation.
| Feature Details | Feature Dimensions |
|---|---|
| Single Nucleotide Frequency | 4 × 3 |
| Pairwise Nucleotide Frequency | 16 × 3 |
| Triplet Nucleotide Frequency | 64 × 3 |
| Quadruplet Nucleotide Frequency | 256 × 3 |
| A + U Frequency | 1 × 3 |
| G + C Frequency | 1 × 3 |
| G + U Frequency | 1 × 3 |
| Number of Palindromes in Sequence | 1 × 3 |
| Length | 1 × 3 |
| Pairs of A-U in Premature microRNA | 1 |
| Pairs of G-C in Premature microRNA | 1 |
| Pairs of G-U in Premature microRNA | 1 |
| Nucleotide to RNAfold | 32 |
| Minimum Free Energy, Normalized Minimum Free Energy, Frequency of Minimum Free Energy Structures | 3 |
| Ensemble Free Energy, Normalized Ensemble Free Energy | 2 |
| Stem Statistics (Stems, Average Stem Length, Maximum Stem Length, Stem containing AU, Stem containing GC, Stem containing GU) | 6 |
| Minimum Free Energy Statistics (mfe/G+C frequency, mfe/stems, mfe/unpaired nucleotides, mfe/paired nucleotides, difference between mfe and efe, and ensemble diversity) | 6 |
| Percentage of the sequence composed of pairs | 1 |
| Frequency of nucleotides that occur outside of UA, GU, and GC pairs | 4 |
| Predicted shape type probability based on RNAshapes | 5 |
| STOAT | 4 |
1. These features may be calculated for the premature sequence, mature sequence, and seed region sequence.
2. RNAfold is an external tool that is run with the -p option to generate the partition function and base pairing probability.
3. RNAshapes is an external tool that is run with the -t option to specify 5 different shape types.
4. STOAT is an external tool that is run with the -x 31 option to signify 31 character states and the -v option to produce verbose output that is easier to parse.
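The dimensions listed in the table can be cross-checked against the 411 attributes quoted in the abstract. The sketch below is one reading of the table, not the authors' own accounting: it assumes each row is counted once (ignoring the × 3 multiplier for premature, mature, and seed sequences). The split into two dictionaries and the shortened row names are hypothetical and used only for illustration.

```python
# Cross-check of the feature table against the "411 attributes" in the abstract.
# Assumption: each table row is counted once, ignoring the "x 3" multiplier.
# The grouping into two dictionaries is illustrative, not taken from the source.

# k-mer frequency rows: there are 4**k possible k-mers over A, C, G, U.
kmer_dims = {k: 4 ** k for k in (1, 2, 3, 4)}  # 4, 16, 64, 256

sequence_rows = {
    "single nucleotide frequency": kmer_dims[1],
    "pairwise nucleotide frequency": kmer_dims[2],
    "triplet nucleotide frequency": kmer_dims[3],
    "quadruplet nucleotide frequency": kmer_dims[4],
    "A+U frequency": 1,
    "G+C frequency": 1,
    "G+U frequency": 1,
    "number of palindromes": 1,
    "length": 1,
    "A-U pairs in premature microRNA": 1,
    "G-C pairs in premature microRNA": 1,
    "G-U pairs in premature microRNA": 1,
}

structure_rows = {
    "nucleotide to RNAfold": 32,
    "minimum free energy (3 values)": 3,
    "ensemble free energy (2 values)": 2,
    "stem statistics": 6,
    "minimum free energy statistics": 6,
    "percentage of sequence in pairs": 1,
    "nucleotides outside UA/GU/GC pairs": 4,
    "RNAshapes shape type probabilities": 5,
    "STOAT": 4,
}

sequence_total = sum(sequence_rows.values())
structure_total = sum(structure_rows.values())
total = sequence_total + structure_total

print(sequence_total, structure_total, total)
```

Under this reading the rows add up to the 411 attributes stated in the abstract, which suggests that the × 3 multiplier describes how many sequence types a feature can be computed for rather than additional entries in the total.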
Statistics of microRNAs and species in DMD.
Fig 2. Illustration of searching "bta-mir-29b" and the search results.
Fig 3. The DMD entry for "bta-mir-29b", which contains sequence information, precursor annotation, and homologous sequences and their associated targets and diseases.
Fig 4. Clockwise from top: "bta-mir-29b" shows 645 targets; the protein-protein network visualization; and the gene enrichment analysis and pathway information.
Fig 5. The feature generation page showing the entry of "bta-mir-29b" in FASTA format and its output in tab-separated values (TSV) format.
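To make the FASTA-in, TSV-out workflow of the feature generation page concrete, the following minimal sketch computes a small subset of the sequence features from the feature list (length, single and pairwise nucleotide frequencies, and A+U, G+C, G+U frequencies). It is not the DMD implementation: the function names, column names, and the toy input record are hypothetical, and the frequency definition (k-mer count divided by the number of sequence windows) is an assumption.

```python
import csv
import io
from itertools import product

ALPHABET = "ACGU"


def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA-formatted text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            # Normalise to upper-case RNA (a DNA-style T becomes U).
            chunks.append(line.upper().replace("T", "U"))
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def kmer_frequencies(seq, k):
    """Frequency of every possible k-mer over ACGU (4**k values)."""
    windows = max(len(seq) - k + 1, 0)
    counts = {"".join(p): 0 for p in product(ALPHABET, repeat=k)}
    for i in range(windows):
        kmer = seq[i:i + k]
        if kmer in counts:  # windows with other symbols (e.g. N) are skipped
            counts[kmer] += 1
    return {kmer: (n / windows if windows else 0.0) for kmer, n in counts.items()}


def feature_row(header, seq):
    """Build one row of features for a single sequence."""
    row = {"id": header, "length": len(seq)}
    single = kmer_frequencies(seq, 1)
    row.update(single)
    row.update(kmer_frequencies(seq, 2))
    row["A+U"] = single["A"] + single["U"]
    row["G+C"] = single["G"] + single["C"]
    row["G+U"] = single["G"] + single["U"]
    return row


def to_tsv(rows):
    """Serialise feature rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=list(rows[0]), delimiter="\t", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Toy input: a made-up record, not a real DMD entry.
fasta_text = ">toy-example\nACGUACGU\n"
rows = [feature_row(header, seq) for header, seq in parse_fasta(fasta_text)]
tsv_text = to_tsv(rows)
print(tsv_text)
```

The structure-based features in the list depend on external tools (RNAfold, RNAshapes, STOAT) and are deliberately left out of this sketch.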