
NPRD: Nucleosome Positioning Region Database.

Victor G Levitsky, Aleksey V Katokhin, Olga A Podkolodnaya, Dagmara P Furman, Nikolay A Kolchanov.

Abstract

The Nucleosome Positioning Region Database (NPRD), which compiles the available experimental data on the locations and characteristics of nucleosome formation sites (NFSs), is the first curated NFS-oriented database. The unit of the database is a single NFS, described in an individual entry. When annotating the results of experimental NFS mapping, we pay special attention to several important functional characteristics: the relationship between the type of gene activity and nucleosome positioning; the influence of non-histone proteins on nucleosome formation; the type of nucleosome positioning (translational or rotational); the tissue types and states of cell activity; the experimental methods used and the accuracy of nucleosome position determination; and the results of applying theoretical and computer methods to the analysis of contextual and conformational DNA properties. At present, the NPRD contains 438 entries and integrates the data described in 124 original papers. The database is available at http://srs6.bionet.nsc.ru/srs6/ (click the 'Databank' button and open the NUCLEOSOME link).


Year:  2005        PMID: 15608285      PMCID: PMC540003          DOI: 10.1093/nar/gki049

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Nucleosomes are the major structural elements of chromatin. Each nucleosome is formed by a 147 bp DNA fragment wrapped around an octamer composed of pairs of histone molecules of four types. In addition to compacting DNA, the most important function of nucleosomes is their interaction with the molecular components of the nuclear machinery involved in DNA replication, repair and recombination. Nucleosomes also play a key role in gene transcription (1,2). Chromatin is, as a rule, represented by regular arrays of nucleosomes. The factors determining the regularity of nucleosome location in vivo and in vitro are, first, sequence-directed nucleosome positioning, determined by the interaction of the sequences of nucleosome formation sites (NFSs) with the histone octamer (3) and, second, the interaction of these sequences with various non-histone proteins (4,5). So far, the relative role of these factors in nucleosome positioning remains unclear. Presumably, this is due first and foremost to the insufficient volume of experimental data, which prevents their systematization and formalization. The database NUCLEOSOMAL DNA by Ioshikhes and Trifonov (6), comprising 143 entries, compiled information only about the positions of nucleosomal centers and the techniques used for nucleosomal mapping; information on the types of nucleosome positioning and their relationship to transcription regulation and the state of gene activity was absent. In addition, the volume of this database and its coverage of nucleosome organization in the genomes of higher eukaryotes were evidently insufficient for large-scale research on the nucleosome organization of genomes and, in particular, for estimating the ability of individual genomic regions to position nucleosomes. To solve these problems, we are developing a curated NFS-oriented Nucleosome Positioning Region Database (NPRD).
Along with a detailed description of NFS localization, including mapping relative to the borders of genes and their structural elements, the database contains important functional characteristics: the relationship between the type of gene activity and nucleosome positioning; the influence of non-histone proteins on nucleosome formation; the occurrence of translational or rotational nucleosome positioning; the tissue types and states of cell activity; the experimental methods used and the accuracy of nucleosome position determination; and the results of applying theoretical and computer methods to the analysis of contextual and conformational DNA properties. The database makes it possible to describe formally, and to assess, the contributions of the factors listed above to nucleosome positioning, taking into account the available information about biologically significant characteristics of the regions considered. We think that our database is important both for developers of new computer methods of nucleosomal DNA analysis and recognition and for experimenters investigating transcription regulation and chromatin structure.

FORMAT OF THE NPRD DATABASE

The NPRD format allows miscellaneous experimental data on the locations and characteristics of NFSs, together with detailed information about the other factors influencing nucleosome positioning, to be accumulated, integrated and systematized from published sources. Each NPRD entry corresponds to one annotated NFS. The fields are described below in the order of their appearance (Table 1).
Table 1. Description of the NPRD fields

Field name | Description
AccessionNumber | Identifier of an entry. This is the only unique identifier of an entry. It follows the pattern NXXXXX, where N is a letter and X is a number
Number | The tag used in original paper
Annotator | Annotator name and date of the last editing
Taxon | Organism classification
Method | Type of experiment (in vivo, in vitro)
Species | Organism species
DNABankLink | Link to EMBL/GenBank
Description | List of names of genes and their products
Gene | Source of the sequence: ‘gene’ if sequence is located within a gene, ‘genomic’ if otherwise
Region | Gene region. Possible values: ‘5′region’, ‘3′region’, ‘exon’, ‘intron’, ‘5′UTR’, ‘3′UTR’ and ‘CDS’
Function | Gene region function. The format is: ‘Function’; ‘TRRD link’ (if available). Possible ‘Function’ values are: ‘promoter’ and ‘enhancer’
MainBounds | Main position of nucleosome center. The format is: ‘AC (EMBL/GenBank accession number)’; ‘start point name (e.g. ST, transcription start; SS, beginning of the sequence; and SR, translation start)’: ‘start point position in EMBL/GenBank entry’; ‘main position of nucleosome center relative to start point’; and [‘probability of main position of nucleosome center (if this position is referenced)’; ‘error in determining the main position of nucleosome center in base pairs’]. Examples: ‘Z46939; ST: 737; –298;(80%,10);’ and, if information on probability and error is absent, ‘Z46939; ST: 737; –298;();’
VarBounds | Variable positions of nucleosome center. The format is similar to that of MainBounds field
MainBounds | Main position of nucleosome borders. The format is similar to that of MainBounds field
VarBounds | Variable positions of nucleosome borders. The format is similar to that of MainBounds field
Comments | Comments on nucleosome site
PositioningFactor (a) | The name(s) of transcription factor(s) or structural non-histone protein(s) influencing nucleosome formation. The result of the factor action (positive effect; negative effect; and no effect)
Comments (a) | Comments on positioning factor
Sequence | Nucleotide sequence in the simple format
Disorder_Position (a) | Discrepancies between the data on nucleotide sequence in the paper annotated and the corresponding sequence in EMBL/GenBank
PosEvidence | Experimental evidence of nucleosome positioning: cell type or source, DNA/histone source and experiment type
NegEvidence (a) | Experimental evidence of the absence of nucleosome positioning: cell type or source, DNA/histone source and experiment type
ConditionEffect (a) | Effects of various physical and chemical factors (salts, temperature, etc.) on nucleosome formation
KeyWords | Key words
Reference | Complete bibliographic reference to the annotated paper with a link to PubMed

(a) Denotes an optional field.
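
As an illustration of the MainBounds format described in Table 1, the following minimal Python sketch splits such a value into its parts and checks an accession number against the NXXXXX pattern. It is not part of the NPRD software: the names `Bounds`, `parse_bounds` and `is_valid_accession` are hypothetical, the pattern is assumed to mean one letter followed by five digits, and only the single-position form shown in the two examples of Table 1 is handled.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumption: "NXXXXX" means one letter followed by exactly five digits.
ACCESSION_RE = re.compile(r"^[A-Za-z]\d{5}$")


@dataclass
class Bounds:
    """Hypothetical container for one MainBounds-style value."""
    accession: str                    # EMBL/GenBank accession number
    start_name: str                   # e.g. ST, SS or SR
    start_pos: int                    # start point position in the EMBL/GenBank entry
    rel_pos: int                      # position relative to the start point
    probability_pct: Optional[float]  # None when not given
    error_bp: Optional[int]           # None when not given


def is_valid_accession(text: str) -> bool:
    """Check an NPRD accession number against the assumed pattern."""
    return ACCESSION_RE.match(text) is not None


def _to_int(text: str) -> int:
    # Typeset sources often use an en dash or minus sign instead of "-".
    return int(text.strip().replace("\u2013", "-").replace("\u2212", "-"))


def parse_bounds(value: str) -> Bounds:
    """Parse e.g. 'Z46939; ST: 737; -298;(80%,10);' into a Bounds record."""
    parts = [part.strip() for part in value.split(";")]
    if len(parts) < 4:
        raise ValueError(f"expected at least four ';'-separated parts: {value!r}")
    accession, start, rel, extra = parts[:4]
    start_name, start_pos = start.split(":")
    probability_pct = None
    error_bp = None
    inner = extra.strip("()").strip()
    if inner:  # "()" means probability and error are absent
        prob_text, err_text = inner.split(",")
        probability_pct = float(prob_text.strip().rstrip("%"))
        error_bp = int(err_text)
    return Bounds(accession, start_name.strip(), int(start_pos),
                  _to_int(rel), probability_pct, error_bp)


print(parse_bounds("Z46939; ST: 737; \u2013298;(80%,10);"))
print(parse_bounds("Z46939; ST: 737; \u2013298;();"))
```

Keeping probability and error as optional values mirrors the empty ‘()’ used in the table when this information is absent.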

An example of an entry representing the nucleosome organization of the HNF-4-alpha gene proximal promoter is given in Table 2. In cells where the gene is active, the promoter and enhancer of this gene are occupied by positioned nucleosomes, unlike in non-expressing cell lines, where nucleosome positioning is random. According to the model of Hatzis and Talianidis (7), formation of an active pre-initiation complex occurs in a step-by-step fashion: (i) poised or committed state (enhancer and promoter are occupied by the cognate DNA-binding proteins); (ii) recruitment of CBP and P/CAF (histone acetyltransferases) and Brg-1 (chromatin remodeling protein) to the enhancer region and assembly of the RNA pol-II holoenzyme at the proximal promoter region; (iii) unidirectional movement of the DNA–protein complex formed on the enhancer along the intervening sequences and spreading of histone hyperacetylation; and (iv) formation of a stable enhancer–promoter complex, hyperacetylation of nucleosomes located at the promoter, remodeling of the nucleosome located at the transcription start site and release of RNA pol-II from the promoter. Information on nucleosome positioning and its correlation with gene activity is presented in the following fields (Table 2): ‘Function’, ‘MainBounds’, ‘Comments’, ‘PosEvidence’, ‘NegEvidence’ and ‘KeyWords’.
Table 2. An example entry in the NPRD database

AccessionNumber | N00955
Number | 1
Annotator | V. G. Levitsky June 9, 2004
Method | in vivo
Species | Homo sapiens (human)
Taxon | Eukaryota; Metazoa; Chordata; Vertebrata; Mammalia; Eutheria; Primates; Catarrhini; Hominidae; Homo
DNABankLink | GenBank; HS1013A22; AL132772;
Description | Hepatocyte nuclear factor 4-alpha gene, HNF-4-alpha gene, transcription factor HNF-4 gene, transcription factor 14 gene, HNF4A, NR2A1, TCF14, HNF4
Gene | Gene
Region | 5′-region;
Function | promoter, 5′-UTR, CDS, TRRDUNITS4:P02102;
MainBounds | AL132772; SN:1997; −180 to −1; ();
Comments | The analysis revealed that the proximal promoter region was occupied by an array of positioned nucleosomes, evidenced by the regularly spaced (about 150 bp) ladder of bands … efficient cross-linking of the HNF-4 enhancer and promoter DNAs suggests that the two regions must be in close proximity at the time of transcription initiation and onward
Sequence | cagaggctaggccaagactcccagcagatcttcccagaggacggtttgaaaggaaggcagagagggcactgggaggaggcagtgggagggcggagggcgggggccttcggggtgggcgcccagggtagggcaggtggccgcggcgtggaggcagggagaatgcgactctccaaaaccctc
PosEvidence | Human differentiating CaCo-2 cells (gene active): MNase digestion, restriction enzyme hypersensitivity assay;
NegEvidence | ovarian A2780 carcinoma cells (gene repressed): MNase digestion;
Keywords | enhancer–promoter communication, histone hyperacetylation;
Authors | Hatzis, P. and Talianidis, I.
Title | Dynamics of enhancer–promoter communication during differentiation-induced gene activation
Source | Mol. Cell.
Year | 2002
Volume | 10
Issue | 6
Pages | 1467–1477
PubMed | 12504020

An NFS in the human hepatocyte nuclear factor 4-alpha gene promoter is crucial for determining expression status.
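
The MainBounds value of the example entry gives the nucleosome borders as a range (‘−180 to −1’) rather than as a single position. The short Python sketch below shows one way such a range could be read and its length computed. The helper names `parse_relative_range` and `span_length` are hypothetical, and the sketch assumes that both borders lie on the same side of the start point and are counted as consecutive integer positions.

```python
import re
from typing import Tuple

# Map the typographic minus sign and the en dash to an ASCII hyphen-minus.
_DASHES = str.maketrans({"\u2212": "-", "\u2013": "-"})


def parse_relative_range(text: str) -> Tuple[int, int]:
    """Parse a 'first to last' range of positions relative to the start point."""
    match = re.fullmatch(r"\s*(-?\d+)\s+to\s+(-?\d+)\s*", text.translate(_DASHES))
    if match is None:
        raise ValueError(f"not a 'first to last' range: {text!r}")
    return int(match.group(1)), int(match.group(2))


def span_length(first: int, last: int) -> int:
    """Number of positions covered, counting both borders.

    Assumption: both borders lie on the same side of the start point.
    """
    return abs(last - first) + 1


# MainBounds value copied from the example entry in Table 2.
value = "AL132772; SN:1997; \u2212180 to \u22121; ();"
fields = [part.strip() for part in value.split(";")]
first, last = parse_relative_range(fields[2])
print(first, last, span_length(first, last))  # prints: -180 -1 180
```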

ACCESS TO THE NPRD AND DATA RETRIEVAL TOOLS

The database is available at http://srs6.bionet.nsc.ru/srs6/: click the ‘Databank’ button and open the NUCLEOSOME link. The Sequence Retrieval System (SRS) is used as the basic software tool for accessing the NPRD via the Internet; it provides efficient linking to the nucleotide sequences (EMBL/GenBank) and to the literature sources (PubMed). A system of hyperlinks integrates the NPRD with the TRRD information system (8) (e.g. Table 2, field ‘Function’). This gives quick access both to the experimental information on the regulation of expression of the gene in which an NFS is located and to the programs for computer analysis of the regulatory sequences of that gene, giving users further opportunities for data mining.

DATABASE STATISTICS

The database contains 438 entries integrating the data of 124 original papers. At present, the Internet-accessible version contains 122 entries, somewhat less than one-third (about 28%) of the database volume. The representation of species is shown in Figure 1: sequences of higher eukaryotes constitute over 40% of the database; of these, 33% are from mammalian genomes.
Figure 1. Representation of species in the NPRD database.

Figure 2 illustrates the representation in the NPRD of gene regions and other functional components of genomes. Note that promoter regions constitute about half of these regions, presumably reflecting the increased interest of researchers in the relationship between nucleosome organization and transcription regulation.
Figure 2. Representation of gene and genomic regions in the NPRD database.

RESEARCH BASED ON DATA STORED IN THE NPRD

We previously designed an integrated information system, Nucleosomal DNA Organization (9) (http://wwwmgs.bionet.nsc.ru/mgs/gnw/nucleosom/), comprising, along with the NPRD, the RECON program for predicting nucleosome formation potential (10) and a program for NFS recognition by DNA conformational and physicochemical properties (9). Both programs were trained on NFS data later included in the NPRD dataset. Using the RECON software package, we have carried out computer analyses of several classes of genomic sequences: promoters of various classes, enhancers, coding and non-coding parts of genes, splice site regions and insertion sites of mobile genetic elements (10–15). The volume of information compiled in the NPRD makes it possible to study the contextual characteristics of NFSs, to increase the efficiency of their identification in genomic sequences and to investigate the nucleosome organization of genomes in more detail.

FUTURE DEVELOPMENTS

The NPRD is under continuous development and is being supplemented with new experimental data from the available literature. We plan to update the database annually. Concurrently, its integration with existing databases and its search capabilities are being improved.
REFERENCES

1. R D Kornberg, Y Lorch. Twenty-five years of the nucleosome, fundamental particle of the eukaryote chromosome. Cell, 1999-08-06. (Review)
2. O A Podkolodnaia, V G Levitskiĭ, N L Podkolodnyĭ. [Locus-controlling regions: description in the LCR-TRRD data base]. Mol Biol (Mosk), 2001 Nov-Dec.
3. V G Levitsky, O A Podkolodnaya, N A Kolchanov, N L Podkolodny. Nucleosome formation potential of exons, introns, and Alu repeats. Bioinformatics, 2001-11.
4. N A Kolchanov, E V Ignatieva, E A Ananko, O A Podkolodnaya, I L Stepanenko, T I Merkulova, M A Pozdnyakov, N L Podkolodny, A N Naumochkin, A G Romashchenko. Transcription Regulatory Regions Database (TRRD): its status in 2002. Nucleic Acids Res, 2002-01-01.
5. Alexey S Kutsenko, Rinat Z Gizatullin, Ali N Al-Amin, Fuli Wang, Sergei M Kvasha, Raf M Podowski, Yuri G Matushkin, Anita Gyanchandani, Olga V Muravenko, Viktor G Levitsky, Nikolay A Kolchanov, Alexei I Protopopov, Vladimir I Kashuba, Lev L Kisselev, Wyeth Wasserman, Claes Wahlestedt, Eugene R Zabarovsky. NotI flanking sequences: a tool for gene discovery and verification of the human genome. Nucleic Acids Res, 2002-07-15.
6. Pantelis Hatzis, Iannis Talianidis. Dynamics of enhancer-promoter communication during differentiation-induced gene activation. Mol Cell, 2002-12.
7. Peter B Becker. Nucleosome sliding: facts and fiction. EMBO J, 2002-09-16. (Review)
8. Victor G Levitsky. RECON: a program for prediction of nucleosome formation potential. Nucleic Acids Res, 2004-07-01.
9. J Widom. Structure, dynamics, and function of chromatin in vitro. Annu Rev Biophys Biomol Struct, 1998. (Review)
10. E N Trifonov. [Genetic level of DNA sequences is determined by superposition of many codes]. Mol Biol (Mosk), 1997 Jul-Aug.