Eli J Draizen, Alexey K Shaytan, Leonardo Mariño-Ramírez, Paul B Talbert, David Landsman, Anna R Panchenko.
Abstract
Compaction of DNA into chromatin is a characteristic feature of eukaryotic organisms. The core (H2A, H2B, H3, H4) and linker (H1) histone proteins are responsible for this compaction through the formation of nucleosomes and higher-order chromatin aggregates. Moreover, histones are intricately involved in chromatin functioning and provide a means for genome dynamic regulation through specific histone variants and histone post-translational modifications. 'HistoneDB 2.0--with variants' is a comprehensive database of histone protein sequences, classified by histone types and variants. All entries in the database are supplemented by rich sequence and structural annotations with many interactive tools to explore and compare sequences of different variants from various organisms. The core of the database is a manually curated set of histone sequences grouped into 30 different variant subsets with variant-specific annotations. The curated set is supplemented by an automatically extracted set of histone sequences from the non-redundant protein database using algorithms trained on the curated set. The interactive web site supports various searching strategies in both datasets: browsing of phylogenetic trees; on-demand generation of multiple sequence alignments with feature annotations; classification of histone-like sequences; and browsing of the taxonomic diversity for every histone variant. HistoneDB 2.0 is a resource for the interactive comparative analysis of histone protein sequences and their implications for chromatin function. Database URL: http://www.ncbi.nlm.nih.gov/projects/HistoneDB2.0. Published by Oxford University Press 2015. This work is written by US Government employees and is in the public domain in the United States.
Year: 2016 PMID: 26989147 PMCID: PMC4795928 DOI: 10.1093/database/baw014
Source DB: PubMed Journal: Database (Oxford) ISSN: 1758-0463 Impact factor: 3.451
Figure 1. Schematic representation of nucleosome structure and its composition of different histone variants. The nucleosome core is formed by 147 bp of DNA and an octamer of H3, H4, H2A and H2B histones (depicted in blue, green, yellow and red, respectively). The H1 linker histone (magenta) is associated with the nucleosome core near the DNA entry and exit points. Selected histone variant names for each histone type are shown.
Figure 2. Information flow, main data types and widgets in HistoneDB 2.0 (see text for description).
A summary table of histone variants in HistoneDB 2.0
| Variant | # Curated sequences | # Automatically extracted sequences | # Features | Taxonomic span |
|---|---|---|---|---|
| Canonical H3 | 26 | 1606 | 11 | Eukaryotes |
| cenH3 | 14 | 276 | 15 | Eukaryotes |
| H3.3 | 17 | 541 | 12 | Eukaryotes |
| H3.5 | 2 | 1135 | 11 | Hominids |
| H3.Y | 8 | 89 | 12 | Primates |
| TS H3.4 | 2 | 2110 | 11 | Mammals |
| Canonical H4 | 14 | 7498 | 10 | Eukaryotes |
| Canonical H2A | 39 | 4096 | 17 | Eukaryotes |
| H2A.1 | 2 | 846 | 17 | Mammals |
| H2A.B | 15 | 139 | 21 | Mammals |
| H2A.L | 17 | 186 | 17 | Certain mammals |
| H2A.P | 11 | 95 | 17 | Placentalia |
| H2A.W | 9 | 870 | 19 | Plants |
| H2A.X | 22 | 1142 | 19 | Eukaryotes except nematode |
| H2A.Z | 25 | 2609 | 20 | Eukaryotes |
| macroH2A | 10 | 1215 | 19 | Vertebrates(?) |
| Canonical H2B | 27 | 6633 | 9 | Eukaryotes |
| H2B.1 | 4 | 443 | 9 | Mammals |
| H2B.W | 6 | 245 | 10 | Mammals |
| H2B.Z | 3 | 208 | 9 | Apicomplexa |
| Sperm H2B | 5 | 56 | 10 | Echinoidea(?) |
| subH2B | 11 | 86 | 11 | Primates, rodents, marsupials, and bovids |
| Generic H1 | 18 | 4340 | 7 | Eukaryotes |
| H1.0 | 15 | 681 | 7 | Metazoa |
| H1.10 | 6 | 146 | 7 | Vertebrates |
| OO H1.8 | 2 | 250 | 7 | Mammals |
| scH1 | 2 | 404 | 13 | Saccharomyces(?) |
| TS H1.6 | 8 | 474 | 7 | Mammals |
| TS H1.7 | 2 | 144 | 7 | Mammals |
| TS H1.9 | 4 | 101 | 7 | Mammals |
For each histone variant, the number of sequences in the curated and automatically extracted data sets, the number of annotated sequence features and the inferred taxonomic span are given. Question marks denote ambiguous taxonomic spans.
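As a small worked example of reading the table, the sketch below sums the curated and automatically extracted sequence counts for the six H3-type rows. The counts are copied from the table above; the names `H3_ROWS` and `totals` are illustrative only and are not part of HistoneDB 2.0.

```python
from typing import List, Tuple

# (variant, curated sequences, automatically extracted sequences),
# copied from the H3-type rows of the summary table.
H3_ROWS: List[Tuple[str, int, int]] = [
    ("Canonical H3", 26, 1606),
    ("cenH3", 14, 276),
    ("H3.3", 17, 541),
    ("H3.5", 2, 1135),
    ("H3.Y", 8, 89),
    ("TS H3.4", 2, 2110),
]


def totals(rows: List[Tuple[str, int, int]]) -> Tuple[int, int]:
    """Return (total curated, total automatically extracted) for the given rows."""
    curated = sum(row[1] for row in rows)
    automatic = sum(row[2] for row in rows)
    return curated, automatic


curated_total, automatic_total = totals(H3_ROWS)
print(f"H3 type: {curated_total} curated, {automatic_total} automatically extracted")
```

The same function can be applied to the rows of any other histone type to compare the sizes of the curated and automatically extracted sets.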
Figure 3. A summary of the HistoneDB 2.0 web site: (a) an information page for a given histone type, (b) a summary page for a typical histone variant, (c) the 'Curated sequences' tab of the histone variant page, (d) a table view of HMMER scores used to classify the selected sequences from the automatically extracted set (accessed via the 'Advanced' menu, 'Score against all HMMs' button).
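Panel (d) shows each sequence scored against all variant HMMs. One plausible way to turn such a score table into a variant assignment is sketched below. The source states only that HMMER scores are used for classification, so the decision rule here (take the highest-scoring model, and accept it only if it exceeds a threshold) is an assumption. The function name, the threshold value and the example scores are hypothetical.

```python
from typing import Dict, Optional


def classify_by_best_score(
    scores: Dict[str, float], threshold: float = 0.0
) -> Optional[str]:
    """Pick the variant whose model gives the highest score.

    Assumed rule, not taken from the source: return the top-scoring
    variant only if its score is above `threshold`, otherwise None.
    """
    if not scores:
        return None
    best_variant = max(scores, key=lambda name: scores[name])
    if scores[best_variant] > threshold:
        return best_variant
    return None


# Hypothetical scores of one sequence against three variant models.
example_scores = {"canonical_H3": 120.5, "H3.3": 180.2, "cenH3": 35.0}
print(classify_by_best_score(example_scores))
```

Returning `None` when no model clears the threshold keeps unclassifiable sequences separate from confident assignments, which is useful when screening a large non-redundant protein set.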