Regina Z Cer, Duncan E Donohue, Uma S Mudunuri, Nuri A Temiz, Michael A Loss, Nathan J Starner, Goran N Halusa, Natalia Volfovsky, Ming Yi, Brian T Luke, Albino Bacolla, Jack R Collins, Robert M Stephens.
Abstract
The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance.
Year: 2012 PMID: 23125372 PMCID: PMC3531222 DOI: 10.1093/nar/gks955
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Non-B DB v2.0 motif counts by organism
| Species | A-phased repeat | G-quadruplex | Z-DNA | Direct repeat | Inverted repeat | Mirror repeat | STR |
|---|---|---|---|---|---|---|---|
| Horse 2.2 | 351 711 | 332 283 | 340 218 | 512 314 | 4 823 814 | 894 099 | 1 599 735 |
| Human 37.1 | 404 289 | 361 419 | 412 600 | 1 501 567 | 6 365 102 | 1 895 545 | 3 025 648 |
| Mouse 37.1 | 324 077 | 482 833 | 877 676 | 2 393 352 | 5 244 776 | 2 801 083 | 3 649 837 |
| Rat 4.2 | 318 619 | 417 278 | 953 455 | 2 265 975 | 4 895 917 | 2 570 060 | 3 118 948 |
| Dog 2.1 | 310 071 | 487 193 | 375 567 | 1 519 410 | 5 362 953 | 1 977 051 | 3 602 105 |
| Chimp 2.1 | 390 907 | 326 097 | 389 608 | 1 355 391 | 6 150 067 | 1 761 969 | 2 849 252 |
| Macaque 1.1 | 373 161 | 293 036 | 387 648 | 1 358 285 | 5 746 000 | 1 760 255 | 2 827 111 |
| Orangutan 1.2 | 386 971 | 314 395 | 378 166 | 1 297 200 | 6 038 798 | 1 700 296 | 2 800 053 |
| Cow 4.1 | 284 375 | 3 9 834 | 355 796 | 751 541 | 5 131 526 | 979 843 | 1 960 468 |
| Pig 4.1 | 308 331 | 414 700 | 342 753 | 1 077 302 | 4 856 703 | 1 518 336 | 2 737 708 |
| Platypus 1.1 | 47 493 | 74 309 | 35 765 | 118 410 | 718 314 | 149 526 | 451 324 |
| Arabidopsis thaliana | 24 909 | 1219 | 6299 | 33 311 | 283 522 | 62 634 | 125 949 |
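
The counts in the table use spaces as thousands separators, so they need normalizing before any comparison across species. The following is a minimal Python sketch of that step, not part of non-B DB itself: the `ROWS` dictionary copies only a few cells from the table, and the names `parse_count` and `species_with_most` are hypothetical helpers introduced for illustration. The parser rejects cells whose digit grouping is inconsistent (for example `3 9 834`) rather than guessing at a value.

```python
# Illustrative only: a handful of cells copied from the table above.
ROWS = {
    "Human 37.1": {"Z-DNA": "412 600", "STR": "3 025 648"},
    "Mouse 37.1": {"Z-DNA": "877 676", "STR": "3 649 837"},
    "Rat 4.2": {"Z-DNA": "953 455", "STR": "3 118 948"},
    "Platypus 1.1": {"Z-DNA": "35 765", "STR": "451 324"},
}


def parse_count(text: str) -> int:
    """Convert a count such as '1 501 567' (or '1219') to an integer.

    Raises ValueError if the digit groups are not a valid
    space-separated thousands layout, so damaged cells are flagged.
    """
    groups = text.split()
    if not groups or not all(g.isdigit() for g in groups):
        raise ValueError(f"not a count: {text!r}")
    if len(groups) > 1:
        # First group has 1-3 digits; every later group has exactly 3.
        if len(groups[0]) > 3 or any(len(g) != 3 for g in groups[1:]):
            raise ValueError(f"inconsistent digit grouping: {text!r}")
    return int("".join(groups))


def species_with_most(motif: str, rows=ROWS) -> str:
    """Return the species label with the largest count for one motif type."""
    return max(rows, key=lambda species: parse_count(rows[species][motif]))
```

Calling `species_with_most("Z-DNA")` on these four rows compares the parsed integers rather than the raw strings, which would otherwise sort incorrectly.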
Figure 1. A screenshot of the visualization page in non-B DB. The top left panel displays clickable anchor text links to all the available genomes in non-B DB, whereas the bottom panel displays the Circos plot for the human genome. The motifs are color coded as shown in the panel. Clicking on the Circos plots takes the user to the chromosome-wide non-B DNA motif histograms on the top right panel. Users are able to choose the chromosome and non-B DNA motif of interest and compare with available genomic features such as exons, genes and percent GC content. The histograms are available in 100-, 500- and 1000-kb bin sizes. Chromosome 1 and chromosome X are compared side by side as an example. The bottom panel displays the PolyBrowse tracks showing subset motifs for a region of chromosome 1. In ‘direct repeats’ tracks, the main motifs are in green, whereas the subset slipped motifs are in purple. In ‘inverted repeats’ tracks, the main motifs are in pink, whereas the subset cruciform motifs are in brown (not shown). Similarly, in ‘mirror repeats’ tracks, the main motifs are in yellow, whereas the subset triplexes are in blue.
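The chromosome-wide histograms described in the Figure 1 caption amount to counting motifs in fixed-width bins. As a rough sketch of that idea in Python (the database's own implementation is not described here), the function below assumes motif start positions are given as 0-based base-pair coordinates. The name `bin_counts` and the sample positions in the usage note are hypothetical.

```python
from collections import Counter

# Bin sizes named in the figure caption, in kilobases.
BIN_SIZES_KB = (100, 500, 1000)


def bin_counts(positions, bin_kb):
    """Count motif start positions per fixed-width bin.

    positions: iterable of 0-based start coordinates in base pairs (assumed).
    bin_kb:    bin width in kb; must be one of BIN_SIZES_KB.
    Returns a dict mapping bin index -> number of motifs in that bin.
    """
    if bin_kb not in BIN_SIZES_KB:
        raise ValueError(f"unsupported bin size: {bin_kb} kb")
    bin_bp = bin_kb * 1000
    return dict(Counter(pos // bin_bp for pos in positions))
```

For example, `bin_counts([50_000, 150_000, 160_000, 1_200_000], 100)` places the four hypothetical motifs into bins 0, 1, 1 and 12. Larger bin sizes merge neighbouring bins and give a smoother histogram.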
Figure 2. Search by Feature Attributes page flow. The improved graphical interface of the query result page in the top right panel shows the overall summary at the top followed by details on three different genes, KRAS, MYC and PTEN. Each gene has a separate section. Preview data for each motif give the top results for each query in the bottom left panel.
Figure 3. Search by Feature Attributes page with G-quadruplex motif as an example. Search by features allows for multiple filters for each feature. In the case of G-quadruplexes, users can filter the results based on base composition, sequence, number of G islands, number of G runs and the largest G-quadruplex that can be formed. Each filter can use one or more comparison conditions, such as ‘equal’, ‘not equal’, ‘less than’ and ‘greater than’, allowing flexibility in the filtering process.
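The filtering described in the Figure 3 caption can be pictured as a list of (attribute, comparison, value) conditions that a motif must all satisfy. The Python sketch below illustrates that idea only; it is not the non-B DB query code. The `Motif` record, its field names, the `apply_filters` function and the attribute values in `MOTIFS` are all hypothetical. Only the four comparison names come from the caption.

```python
import operator
from dataclasses import dataclass

# Comparison names taken from the caption, mapped to Python operators.
OPS = {
    "equal": operator.eq,
    "not equal": operator.ne,
    "less than": operator.lt,
    "greater than": operator.gt,
}


@dataclass
class Motif:
    """Hypothetical record for one predicted G-quadruplex motif."""
    sequence: str
    g_islands: int
    g_runs: int


def matches(motif, filters):
    """True if the motif satisfies every (field, comparison, value) filter."""
    return all(OPS[op](getattr(motif, field), value)
               for field, op, value in filters)


def apply_filters(motifs, filters):
    """Keep only the motifs that pass all filters."""
    return [m for m in motifs if matches(m, filters)]


# Placeholder data: attribute values are invented for the example.
MOTIFS = [
    Motif("GGGTTAGGGTTAGGGTTAGGG", g_islands=4, g_runs=4),
    Motif("GGGAGGGCGGG", g_islands=3, g_runs=5),
    Motif("GGGGTGGGGTGGGGTGGGGTGGGGTGGGG", g_islands=6, g_runs=7),
]
```

Combining two conditions, such as `[("g_islands", "greater than", 3), ("g_runs", "less than", 6)]`, narrows the placeholder list to its first entry. Conditions are joined with a logical AND in this sketch; whether the web interface combines them the same way is an assumption.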