David S H Chew, Kwok Pui Choi, Ming-Ying Leung.
Abstract
Many empirical studies show that there are unusual clusters of palindromes and closely spaced direct and inverted repeats around the replication origins of herpesviruses. In this paper, we introduce two new scoring schemes to quantify the spatial abundance of palindromes in a genomic sequence. Based on these scoring schemes, a computational method to predict the locations of replication origins is developed. When our predictions are compared with 39 known or annotated replication origins in 19 herpesviruses, close to 80% of the replication origins lie within 2% of the genome length of a predicted location. A list of predicted locations of replication origins in all the known herpesviruses with complete genome sequences is reported.
Year: 2005 PMID: 16141192 PMCID: PMC1197138 DOI: 10.1093/nar/gni135
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. A palindrome of length 10.
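In the DNA sense, a palindrome is a stretch that reads the same as its own reverse complement, so a palindrome of length 10 is 5 bases followed by their reverse complement. The Python sketch below illustrates that definition under stated assumptions: the function names, the minimum length of 10 and the example sequence used for testing are hypothetical and are not taken from the paper.

```python
# Hypothetical helpers: a sketch of the reverse-complement definition of a
# DNA palindrome, not the authors' implementation.
COMPLEMENT = str.maketrans("ACGT", "TGCA")
PAIRS = {("A", "T"), ("T", "A"), ("C", "G"), ("G", "C")}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def is_palindrome(seq: str) -> bool:
    """True if seq has even length and equals its own reverse complement."""
    return len(seq) % 2 == 0 and seq == reverse_complement(seq)


def find_palindromes(seq: str, min_len: int = 10):
    """Yield (start, length) for maximal palindromes of at least min_len bases.

    start is 0-based. Each palindrome is grown outward from a centre lying
    between two adjacent bases, so shorter palindromes nested around the
    same centre are not reported separately.
    """
    n = len(seq)
    for centre in range(1, n):
        left, right = centre - 1, centre
        # Extend while the two flanking bases are complementary.
        while left >= 0 and right < n and (seq[left], seq[right]) in PAIRS:
            left -= 1
            right += 1
        length = right - left - 1
        if length >= min_len:
            yield left + 1, length
```

For instance, `list(find_palindromes(genome))` would give the positions and lengths that a scoring scheme could then weight.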
The list of herpesviruses to be analyzed
| Virus | Abbreviation | Accession | Length | Base composition (A, C, G, T) |
|---|---|---|---|---|
| Alcelaphine herpesvirus 1 | AlHV1 | NC_002531 | 130 608 | (0.27, 0.24, 0.22, 0.26) |
| Ateline herpesvirus 3 | AtHV3 | NC_001987 | 108 409 | (0.32, 0.19, 0.17, 0.31) |
| Bovine herpesvirus 1 | BoHV1 | NC_001847 | 135 301 | (0.14, 0.36, 0.37, 0.14) |
| Bovine herpesvirus 4 | BoHV4 | NC_002665 | 108 873 | (0.30, 0.21, 0.20, 0.29) |
| Bovine herpesvirus 5 | BoHV5 | NC_005261 | 138 390 | (0.12, 0.37, 0.38, 0.13) |
| Callitrichine herpesvirus 3 | CalHV3 | NC_004367 | 149 696 | (0.26, 0.25, 0.25, 0.25) |
| Cercopithecine herpesvirus 1 | CeHV1 | NC_004812 | 156 789 | (0.13, 0.37, 0.38, 0.13) |
| Cercopithecine herpesvirus 15 | CeHV15 | NC_006146 | 171 096 | (0.18, 0.31, 0.31, 0.20) |
| Cercopithecine herpesvirus 17 | MMRV | NC_003401 | 133 719 | (0.24, 0.27, 0.26, 0.23) |
| Cercopithecine herpesvirus 2 | CeHV2 | NC_006560 | 150 715 | (0.12, 0.38, 0.38, 0.12) |
| Cercopithecine herpesvirus 8 | CeHV8 | NC_006150 | 221 454 | (0.26, 0.25, 0.24, 0.25) |
| Cercopithecine herpesvirus 9 | CeHV7 | NC_002686 | 124 138 | (0.29, 0.21, 0.20, 0.30) |
| Equid herpesvirus 1 | EHV1 | NC_001491 | 150 224 | (0.22, 0.29, 0.28, 0.22) |
| Equid herpesvirus 2 | EHV2 | NC_001650 | 184 427 | (0.22, 0.29, 0.28, 0.21) |
| Equid herpesvirus 4 | EHV4 | NC_001844 | 145 597 | (0.25, 0.25, 0.25, 0.25) |
| Gallid herpesvirus 1 | GaHV1 | NC_006623 | 148 687 | (0.26, 0.24, 0.24, 0.26) |
| Gallid herpesvirus 2 | GaHV2 | NC_002229 | 174 077 | (0.28, 0.22, 0.22, 0.28) |
| Gallid herpesvirus 3 | GaHV3 | NC_002577 | 164 270 | (0.23, 0.27, 0.27, 0.23) |
| Human herpesvirus 1 | HSV1 | NC_001806 | 152 261 | (0.16, 0.34, 0.34, 0.16) |
| Human herpesvirus 2 | HSV2 | NC_001798 | 154 746 | (0.15, 0.35, 0.35, 0.15) |
| Human herpesvirus 3 | VZV | NC_001348 | 124 884 | (0.27, 0.23, 0.23, 0.27) |
| Human herpesvirus 4 | EBV | NC_001345 | 172 281 | (0.20, 0.30, 0.29, 0.20) |
| Human herpesvirus 5 strain AD169 | HCMV | NC_001347 | 230 287 | (0.22, 0.28, 0.29, 0.21) |
| Human herpesvirus 5 strain Merlin | HCMV-M | NC_006273 | 235 645 | (0.21, 0.29, 0.29, 0.21) |
| Human herpesvirus 6 | HHV6 | NC_001664 | 159 321 | (0.29, 0.22, 0.21, 0.29) |
| Human herpesvirus 6B | HHV6B | NC_000898 | 162 114 | (0.29, 0.22, 0.21, 0.29) |
| Human herpesvirus 7 | HHV7 | NC_001716 | 153 080 | (0.32, 0.20, 0.17, 0.31) |
| Human herpesvirus 8 | HHV8 | NC_003409 | 137 508 | (0.24, 0.27, 0.26, 0.23) |
| Ictalurid herpesvirus 1 | IcHV1 | NC_001493 | 134 226 | (0.21, 0.28, 0.28, 0.22) |
| Meleagrid herpesvirus 1 | MeHV1 | NC_002641 | 159 160 | (0.26, 0.24, 0.24, 0.26) |
| Murid herpesvirus 1 | MCMV | NC_004065 | 230 278 | (0.20, 0.29, 0.30, 0.21) |
| Murid herpesvirus 2 | RCMV | NC_002512 | 230 138 | (0.19, 0.30, 0.31, 0.20) |
| Murid herpesvirus 4 | MuHV4 | NC_001826 | 119 450 | (0.27, 0.24, 0.23, 0.26) |
| Ostreid herpesvirus 1 | OsHV1 | NC_005881 | 207 439 | (0.31, 0.19, 0.19, 0.30) |
| Pongine herpesvirus 4 | CCMV | NC_003521 | 241 087 | (0.19, 0.31, 0.31, 0.19) |
| Psittacid herpesvirus 1 | PsHV1 | NC_005264 | 163 025 | (0.19, 0.31, 0.30, 0.20) |
| Saimiriine herpesvirus 2 | SaHV2 | NC_001350 | 112 930 | (0.33, 0.18, 0.16, 0.32) |
| Suid herpesvirus 1 | SHV1 | NC_006151 | 143 461 | (0.13, 0.37, 0.37, 0.13) |
| Tupaiid herpesvirus 1 | THV | NC_002794 | 195 859 | (0.17, 0.33, 0.34, 0.17) |
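The base composition column lists the relative frequencies of A, C, G and T in each genome. A minimal sketch of how such a tuple could be computed from a sequence string is given below; the function name is hypothetical, and ignoring ambiguous symbols such as N is an assumption of this sketch rather than something stated in the source.

```python
from collections import Counter


def base_composition(seq: str, ndigits: int = 2):
    """Return the (A, C, G, T) frequencies of seq, rounded to ndigits.

    Assumption: symbols other than A, C, G and T (for example N) are
    ignored when computing the total.
    """
    counts = Counter(seq.upper())
    total = sum(counts[b] for b in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return tuple(round(counts[b] / total, ndigits) for b in "ACGT")
```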
Figure 2. Sliding window plots of HCMV and HSV1 using PCS, PLS and BWS0. The first window spans the first through the wth bases, the second the ()th to ()th bases, and so on. The score of a window is the total of the scores of all the palindromes occurring in this window according to PCS, PLS or BWS0.
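The window scoring described in this caption can be sketched in Python as follows. This is an illustration under assumptions, not the authors' code: the function names are hypothetical, the `step` between successive windows is left as a parameter because the window offsets are not recoverable from the caption, a palindrome is assigned to every window that contains its start position, and the default score of 1 per palindrome is a simple count-based placeholder. Any other per-palindrome score can be passed in as `score`.

```python
def window_scores(palindromes, genome_length, w, step, score=lambda length: 1.0):
    """Score sliding windows of width w that advance by step bases.

    palindromes: iterable of (start, length) pairs, start 0-based.
    score: maps a palindrome length to its score (placeholder default: 1).
    Assumption: a palindrome belongs to each window containing its start.
    Returns a list of (window_start, total_score) pairs.
    """
    if w <= 0 or step <= 0:
        raise ValueError("w and step must be positive")
    pals = list(palindromes)
    results = []
    start = 0
    while start + w <= genome_length:
        total = sum(score(length) for pos, length in pals
                    if start <= pos < start + w)
        results.append((start, total))
        start += step
    return results


def top_windows(scores, k=3):
    """Return the k highest scoring windows, highest first."""
    return sorted(scores, key=lambda item: item[1], reverse=True)[:k]
```

Calling `top_windows(window_scores(pals, n, w, step), k=3)` gives the kind of "top 3 scoring windows" list that the prediction tables refer to.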
High scoring windows of PLS and BWS1
Regions with significant clusters of palindromes as found by the PCS
| Virus | Region |
|---|---|
| AlHV1 | 113 456–113 759 |
| AtHV3 | 95 350–100 098 |
| BoHV1 | 77 155–77 168, 102 895–106 948, 113 462–113 636, 124 582–124 756, 131 268–135 221 |
| CalHV3 | 21 899–23 918, 115 406–117 660, 133 180–133 587 |
| CCMV | 88 376–93 659, 206 555–207 582 |
| CeHV1 | 112 833–113 219 |
| CeHV8 | 147 015–147 280, 158 953–164 225 |
| CeHV15 | 5182–10 840, 32 483–36 810, 137 852–139 781, 150 277–152 289 |
| EBV | 6772–11 675, 49 460–54 858 |
| EHV1 | 115 125–119 096, 144 064–148 035 |
| EHV2 | 4911–9106, 147 228–147 250, 171 785–175 980 |
| GaHV3 | 10 409–11 952, 104 965–105 067, 121 153–123 174, 138 321–138 935, 158 536–159 150 |
| HCMV | 90 515–95 115, 195 962–196 203 |
| HCMV-M | 90 881–96 835, 175 177–176 003, 201 246–201 487 |
| HHV6b | 88 469–94 716 |
| HHV7 | 124 985–128 653 |
| HHV8 | 21 913–23 705 |
| MCMV | 92 621–93 412, 142 118–142 186 |
| MeHV1 | 116 644–116 667 |
| MMRV | 3464–3517, 130 148–132 723 |
| MuHV4 | 96 755–105 094 |
| PsHV1 | 128 677–131 155, 151 017–153 495 |
| RCMV | 74 134–76 485, 118 126–118 854 |
| SHV1 | 36 683–41 606 |
| THV | 10 089–11 213 |
For example, for the virus EBV, the region 6772–11 675 bp (and 49 460–54 858 bp) is deemed to contain a high concentration of palindromes. BoHV4, BoHV5, CeHV2, CeHV7, EHV4, GaHV1, GaHV2, HHV6, HSV1, HSV2, IcHV1, OsHV1, SaHV2 and VZV have no significant clusters of palindromes.
Prediction performance of the scoring schemes PCS, PLS and BWS1; the PLS and BWS1 results are based on the top 3 scoring windows
| Virus | Known ORI | Name | PCS | PLS | BWS1 |
|---|---|---|---|---|---|
| BoHV1 | 111 080–111 300 | (oriS) | 1.75 mu | 1.63 mu | 1.63 mu |
| | 126 918–127 138 | (oriS) | 1.61 mu | 1.87 mu | 1.87 mu |
| BoHV4 | 97 143–98 850 | (oriLyt) | — | — | — |
| BoHV5 | 113 206–113 418 | (oriLyt) | — | — | 0.06 mu |
| | 129 595–129 807 | (oriLyt) | — | — | 0.07 mu |
| CeHV1 | 61 592–61 789 | (oriL1) | — | 0.057 mu | 0.057 mu |
| | 61 795–61 992 | (oriL2) | — | 0.18 mu | 0.18 mu |
| | 132 795–132 796 | (oriS1) | — | 0.13 mu | 0.13 mu |
| | 132 998–132 999 | (oriS2) | — | 0.0016 mu | 0.0016 mu |
| | 149 425–149 426 | (oriS2) | — | 0.016 mu | 0.016 mu |
| | 149 628–149 629 | (oriS1) | — | 0.11 mu | 0.11 mu |
| CeHV2 | 61 445–61 542 | (oriL) | — | 0.07 mu | 0.07 mu |
| | 129 452–129 623 | (oriS) | — | 0.02 mu | 0.02 mu |
| | 144 386–144 557 | (oriS) | — | 0.17 mu | 0.17 mu |
| CeHV7 | 109 627–109 646 | | — | — | — |
| | 118 613–118 632 | | — | — | — |
| EBV | 7315–9312 | (oriP) | contains ori | 0.41 mu | 0.41 mu |
| | 52 589–53 581 | (oriLyt) | contains ori | 0.067 mu | 0.067 mu |
| EHV1 | 126 187–126 338 | | — | — | — |
| EHV4 | 73 900–73 919 | (oriL) | — | — | — |
| | 119 462–119 481 | (oriS) | — | — | — |
| | 138 568–138 587 | (oriS) | — | — | — |
| GaHV1 | 24 738–25 005 | (oriL) | — | — | — |
| HCMV | 93 201–94 646 | (oriLyt) | contains ori | 0.055 mu | 0.055 mu |
| HHV6 | 67 617–67 993 | (oriLyt) | — | — | — |
| HHV6b | 68 740–69 581 | (oriLyt) | — | 0.024 mu | — |
| HHV7 | 66 685–67 298 | | — | — | — |
| HSV1 | 62 475 | (oriL) | — | 0.11 mu | 0.11 mu |
| | 131 999 | (oriS) | — | 1.41 mu | 1.41 mu |
| | 146 235 | (oriS) | — | 1.42 mu | 1.42 mu |
| HSV2 | 62 930 | (oriL) | — | — | — |
| | 132 760 | (oriS) | — | — | — |
| | 148 981 | (oriS) | — | — | — |
| RCMV | 75 666–78 970 | (oriLyt) | overlaps ori | 0.62 mu | 0.62 mu |
| SHV1 | 63 848–63 908 | (oriL) | — | — | — |
| | 114 393–115 009 | (oriS) | — | — | — |
| | 129 593–130 209 | (oriS) | — | — | — |
| VZV | 110 087–110 350 | | — | 0.094 mu | 0.094 mu |
| | 119 547–119 810 | | — | 0.22 mu | 0.22 mu |
The table shows the distance from each known origin to the nearest significant palindrome cluster for PCS, or to the nearest high scoring window for PLS and BWS1, provided the center of the cluster or window is within 2 map units (mu) of the origin. For example, one of the top 3 scoring windows under the PLS (and BWS1) for RCMV is 0.62 mu away from the RCMV oriLyt.
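A distance in map units can be sketched as below. Two assumptions are made and should be checked against the paper: one map unit is taken to be 1% of the genome length (consistent with the abstract's "2% of the genome length" and the 2 mu cutoff here), and an annotated origin given as an interval is reduced to its midpoint before measuring. The function names are hypothetical.

```python
def distance_in_mu(window_start, window_end, ori_start, ori_end, genome_length):
    """Distance between a window centre and an origin centre, in map units.

    Assumptions of this sketch: 1 mu = 1% of the genome length, and an
    origin annotated as an interval is represented by its midpoint.
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    window_centre = (window_start + window_end) / 2
    ori_centre = (ori_start + ori_end) / 2
    return abs(window_centre - ori_centre) / genome_length * 100


def is_close(window_start, window_end, ori_start, ori_end, genome_length,
             threshold_mu=2.0):
    """True if the window centre lies within threshold_mu of the origin."""
    return distance_in_mu(window_start, window_end, ori_start, ori_end,
                          genome_length) <= threshold_mu
```

With made-up numbers, a window spanning 10 000 to 11 000 and an origin at 11 400 to 11 600 in a 100 000 bp genome have centres 1000 bp apart, which this sketch reports as 1 mu.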
Figure 3. Sensitivity and positive predictive values of the PLS and BWS. In our context, sensitivity is the percentage of known origins that are close to the regions suggested by the prediction, and positive predictive value is the percentage of identified regions that are close to the known origins. The sensitivity and positive predictive value of the PCS are 15% and 25%, respectively.
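The two measures defined in this caption can be sketched as follows. The sketch assumes that "close" means within 2 mu, matching the cutoff used in the preceding table note, and that origins and predicted regions have each been reduced to a single position in map units; the function name and the example values are hypothetical.

```python
def sensitivity_and_ppv(origins_mu, predictions_mu, threshold_mu=2.0):
    """Return (sensitivity, positive predictive value) as percentages.

    origins_mu: positions of known origins, in map units.
    predictions_mu: positions of predicted regions, in map units.
    Assumption: "close" means an absolute distance of at most threshold_mu.
    """
    def near(x, others):
        return any(abs(x - y) <= threshold_mu for y in others)

    if not origins_mu or not predictions_mu:
        raise ValueError("both lists must be non-empty")
    hit_origins = sum(near(o, predictions_mu) for o in origins_mu)
    hit_predictions = sum(near(p, origins_mu) for p in predictions_mu)
    sensitivity = 100 * hit_origins / len(origins_mu)
    ppv = 100 * hit_predictions / len(predictions_mu)
    return sensitivity, ppv
```

The two numbers can differ because several predictions may fall near one origin, or one prediction may lie near several origins.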
Figure 4. Sensitivity and positive predictive values of local BWS.