Abstract
The herpes simplex virus dsDNA genome is distinguished by an unusually high G+C nucleotide content. HSV-1 and HSV-2, for instance, have GC contents of 68% and 70% respectively, while that of the host (human) genome is 41%. To determine how GC content varies with genome location, GC content was measured separately in coding and intergenic regions of HSV-1 DNA. The results showed that the 75 genes constitute a uniform population with a mean GC content of 66.9 ± 4.1%. In contrast, intergenic regions fell into two non-overlapping populations, one with a mean GC content (69.3 ± 4.6%, n=32) similar to that of the coding regions and another with a lower GC content (56.0 ± 4.9%, n=30). Compared to other regions of the genome, intergenic regions with reduced GC content were found to be enriched in local GC minima, CACACA sequences and a primary target sequence (TTAAAA) for retrotransposition events. The results are interpreted to suggest that a high GC content is part of the way HSV-1 protects its genes from invasion by mobile genetic elements active during cell differentiation in the nervous system.
Keywords: CA repeats; DNA sequence; G+C content; Herpes simplex virus; L1 retrotransposition; intergenic DNA
Year: 2007 PMID: 19543363 PMCID: PMC2606590 DOI: 10.2174/1874091X00701010033
Source DB: PubMed Journal: Open Biochem J ISSN: 1874-091X
GC Content of Coding and Intergenic Regions in the HSV-1 Genome
| Gene | Coding Length (bp) | Coding %GC | Intergenic (rightward) Length (bp) | Intergenic (rightward) %GC | Gene | Coding Length (bp) | Coding %GC | Intergenic (rightward) Length (bp) | Intergenic (rightward) %GC |
|---|---|---|---|---|---|---|---|---|---|
| RL1 | 747 | 82.9 | 1003 | 68.1 | UL36 | 9495 | 71.3 | 170 | 60.6 |
| RL2 (ex 2) | 1604 | 77.2 | 5588 | 67.8 | UL37 | 3373 | 69.3 | 448 | 68.3 |
| UL1 | 675 | 58.1 | Overlap | | UL38 | 1398 | 71.3 | 517 | 61.5 |
| UL2 | 1005 | 66.1 | 70 | 48.6 | UL39 | 3414 | 65.7 | 70 | 74.3 |
| UL3 | 708 | 63.4 | 160 | 46.3 | UL40 | 1023 | 61.4 | 221 | 53.4 |
| UL4 | 600 | 64.8 | 62 | 69.4 | UL41 | 1470 | 62.7 | 477 | 62.7 |
| UL5 | 2649 | 62.1 | Overlap | | UL42 | 1526 | 66.8 | 111 | 55.0 |
| UL6 | 2031 | 68.3 | Overlap | | UL43 | 1305 | 72.4 | 280 | 62.1 |
| UL7 | 890 | 66.1 | 200 | 55.0 | UL44 | 1516 | 67.8 | 187 | 57.8 |
| UL8 | 2253 | 70.4 | 238 | 72.7 | UL45 | 519 | 68.6 | 247 | 56.7 |
| UL9 | 2555 | 63.5 | Overlap | | UL46 | 2157 | 71.1 | 84 | 67.9 |
| UL10 | 1445 | 65.3 | 154 | 53.2 | UL47 | 2082 | 73.1 | 492 | 63.2 |
| UL11 | 291 | 66.6 | Overlap | | UL48 | 1473 | 65.1 | 370 | 64.3 |
| UL12 | 1881 | 68.5 | 60 | 66.7 | UL49 | 906 | 70.5 | Overlap | |
| UL13 | 1557 | 64.0 | Overlap | | UL49A | 1276 | 68.9 | 18 | 50.0 |
| UL14 | 660 | 65.9 | 106 | 73.6 | UL50 | 1116 | 66.7 | 153 | 48.4 |
| UL15 (ex 1) | 1029 | 64.3 | 127 | 60.6 | UL51 | 735 | 68.4 | 38 | 76.5 |
| UL16 | 1122 | 68.3 | 92 | 81.5 | UL52 | 3177 | 66.1 | Overlap | |
| UL17 | 2112 | 70.1 | 139 | 66.2 | UL53 | 1017 | 61.1 | 540 | 64.6 |
| UL15 (ex 2) | 1179 | 61.5 | 283 | 56.5 | UL54 | 1539 | 69.3 | 225 | 60.0 |
| UL18 | 957 | 65.5 | 354 | 68.1 | UL55 | 561 | 62.4 | 166 | 50.6 |
| UL19 | 4125 | 68.5 | 293 | 68.6 | UL56 | 705 | 66.2 | 3766 | 68.3 |
| UL20 | 669 | 61.3 | 587 | 60.3 | RL2 | 1604 | 77.2 | 1003 | 68.1 |
| UL21 | 1622 | 66.0 | 172 | 60.5 | RL1 | 747 | 82.9 | 1375 | 77.3 |
| UL22 | 2517 | 66.7 | 291 | 60.8 | RS1 | 3897 | 81.4 | 1537 | 74.6 |
| UL23 | 1304 | 63.5 | Overlap | | US1 | 1297 | 64.9 | 94 | 61.7 |
| UL24 | 1008 | 64.6 | 70 | 51.4 | US2 | 876 | 64.3 | 295 | 65.1 |
| UL25 | 1743 | 68.0 | 255 | 60.0 | US3 | 1446 | 63.6 | 78 | 61.5 |
| UL26 | 1908 | 71.4 | 343 | 47.8 | US4 | 717 | 63.7 | 272 | 59.6 |
| UL27 | 3023 | 66.5 | 9 | 77.7 | US5 | 279 | 65.9 | 411 | 58.6 |
| UL28 | 2358 | 69.7 | 305 | 66.6 | US6 | 1185 | 64.3 | 183 | 60.6 |
| UL29 | 3591 | 67.3 | 755 | 63.3 | US7 | 1173 | 65.6 | 287 | 56.1 |
| UL30 | 3708 | 65.8 | Overlap | | US8 | 1653 | 66.5 | 419 | 69.2 |
| UL31 | 921 | 65.6 | Overlap | | US9 | 273 | 63.0 | 573 | 57.1 |
| UL32 | 1791 | 68.0 | Overlap | | US10 | 939 | 67.3 | Overlap | |
| UL33 | 393 | 67.9 | 81 | 70.4 | US11 | 486 | 67.3 | 66 | 68.2 |
| UL34 | 828 | 67.1 | 107 | 67.3 | US12 | 267 | 65.5 | 1549 | 75.1 |
| UL35 | 339 | 65.8 | 146 | 48.6 | RS1 | 3897 | 81.4 | 1261 | 78.5 |
Intergenic regions with reduced GC content (46.3%-61.7%) were highlighted in the original table. All other intergenic regions belong to the genome-like group (62.1%-81.5%).
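As an illustration of how the %GC values and the two intergenic groups in this table can be reproduced, the following is a minimal sketch rather than the authors' procedure. The function names and the cutoff of 61.9% are assumptions: the source only reports the two ranges (46.3%-61.7% and 62.1%-81.5%), so any value between 61.7 and 62.1 separates them. The sketch also assumes the sequence contains only A, C, G and T.

```python
def gc_percent(seq: str) -> float:
    """Return the G+C percentage of a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence is empty")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


# Assumed cutoff, chosen inside the gap between the two reported ranges
# (reduced: 46.3-61.7 %, genome-like: 62.1-81.5 %). Not stated in the source.
REDUCED_GC_CUTOFF = 61.9


def classify_intergenic(gc: float) -> str:
    """Label an intergenic region by its %GC using the assumed cutoff."""
    return "reduced" if gc < REDUCED_GC_CUTOFF else "genome-like"


if __name__ == "__main__":
    # %GC values taken from the table: UL36 and UL37 rightward intergenic regions.
    for name, gc in (("UL36-UL37", 60.6), ("UL37-UL38", 68.3)):
        print(name, classify_intergenic(gc))
```

With the table values, the region to the right of UL36 (60.6%) falls in the reduced group and the region to the right of UL37 (68.3%) in the genome-like group.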
Mean GC Content of Selected HSV-1 Gene Groups
| Gene group | Mean %GC ± SD | P value |
|---|---|---|
| Essential | 67.3 ± 3.9 (n=34) | 0.55 |
| Core | 66.7 ± 2.8 (n=43) | 0.39 |
| Beta kinetic class | 65.5 ± 2.8 (n=13) | 0.87 |
| All UL | 66.6 ± 3.1 (n=58) | 0.91 |
Source: Essential, Non-essential, Beta (early) kinetic class, Gamma (late) kinetic class: [7]; Core, Non-core: [3]; UL, US: GenBank NC_001806.
P values apply to the hypothesis that the GC contents of the two gene groups being compared are different.
Intergenic Regions with Genome-Like GC Content
| Intergenic Region | Local GC Min | CACACA | TTAAAA |
|---|---|---|---|
| RL1-RL2 | No | No | Yes (2) |
| RL2 (ex 2)-UL1 | Yes | Yes (2) | No |
| UL4-UL5 | No | No | No |
| UL8-UL9 | No | No | No |
| UL12-UL13 | No | No | No |
| UL14-UL15 (ex 1) | No | No | No |
| UL16-UL17 | No | No | No |
| UL17-UL15 (ex 2) | No | No | No |
| UL18-UL19 | No | No | No |
| UL19-UL20 | No | Yes | No |
| UL28-UL29 | No | Yes | No |
| UL29-UL30 | Yes | No | No |
| UL33-UL34 | No | No | No |
| UL34-UL35 | No | Yes | No |
| UL37-UL38 | No | No | No |
| UL39-UL40 | No | No | No |
| UL41-UL42 | Yes | No | No |
| UL43-UL44 | Yes | Yes | No |
| UL46-UL47 | No | No | No |
| UL47-UL48 | Yes | Yes | No |
| UL48-UL49 | No | No | No |
| UL51-UL52 | No | No | No |
| UL53-UL54 | No | No | No |
| UL56-RL2 | Yes | No | No |
| RS1-US1 | Yes | No | No |
| US2-US3 | No | No | Yes |
| US8-US9 | No | Yes | Yes |
| US11-US12 | No | No | No |
| US12-RS1 | Yes | No | No |
| Total | 8 | 8 | 4 |
| Total per 10,000 bp | 4.1 | 4.1 | 2.0 |
Intergenic regions with genome-like GC content are those with GC contents in the range of 62.1%-81.5%.
Local GC minima are those identified by visual inspection of the GC content trace as defined in a sliding 120 bp window as shown in Fig. ().
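The sliding-window GC trace mentioned in the note above can be sketched as follows. This is an illustrative implementation only: the 120 bp window comes from the source, while the function name, the one-base step and the output format are assumptions. The source identifies local minima by visual inspection of the trace, so no automatic minimum detection is attempted here.

```python
from typing import List, Tuple


def sliding_gc(seq: str, window: int = 120) -> List[Tuple[int, float]]:
    """Return (window start, %GC) pairs for every full window, stepping by 1 bp."""
    if window <= 0:
        raise ValueError("window must be positive")
    seq = seq.upper()
    if len(seq) < window:
        return []
    is_gc = [1 if base in "GC" else 0 for base in seq]
    count = sum(is_gc[:window])           # G+C count in the first window
    trace = [(0, 100.0 * count / window)]
    for start in range(1, len(seq) - window + 1):
        # Slide by one base: add the base entering, drop the base leaving.
        count += is_gc[start + window - 1] - is_gc[start - 1]
        trace.append((start, 100.0 * count / window))
    return trace


if __name__ == "__main__":
    # Toy sequence (hypothetical): a GC-only block followed by an AT-only block.
    toy = "G" * 120 + "A" * 120
    trace = sliding_gc(toy)
    print(trace[0], trace[60], trace[-1])
```

Plotting the second element of each pair against the first gives a trace on which dips can be inspected by eye.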
Intergenic Regions with Reduced GC Content
| Intergenic Region | Local GC Min | CACACA | TTAAAA |
|---|---|---|---|
| UL2-UL3 | Yes | No | No |
| UL3-UL4 | Yes | No | No |
| UL7-UL8 | No | Yes | No |
| UL10-UL11 | Yes | Yes | Yes (2) |
| UL15 (ex 1)-UL16 | No | Yes (2) | No |
| UL15 (ex 2)-UL18 | Yes | Yes | No |
| UL20-UL21 | Yes | No | No |
| UL21-UL22 | No | No | No |
| UL22-UL23 | Yes | No | No |
| UL24-UL25 | Yes | No | No |
| UL25-UL26 | No | No | No |
| UL26-UL27 | Yes | Yes (3) | Yes |
| UL35-UL36 | Yes | No | Yes |
| UL36-UL37 | No | No | No |
| UL38-UL39 | Yes | Yes | No |
| UL40-UL41 | No | No | No |
| UL42-UL43 | Yes | Yes (2) | No |
| UL44-UL45 | Yes | No | Yes |
| UL45-UL46 | Yes | Yes | No |
| UL50-UL51 | Yes | Yes | No |
| UL54-UL55 | No | No | No |
| UL55-UL56 | Yes | Yes | No |
| US1-US2 | Yes | No | No |
| US3-US4 | No | No | No |
| US4-US5 | No | Yes | No |
| US5-US6 | Yes | Yes | Yes |
| US6-US7 | No | No | No |
| US7-US8 | Yes | No | No |
| US9-US10 | Yes | No | No |
| Total | 19 | 16 | 6 |
| Total per 10,000 bp | 28.4 | 23.9 | 9.0 |
Intergenic regions with reduced GC content are those with GC contents in the range of 46.3%-61.7%.
Local GC minima are those identified by visual inspection of the GC content trace as defined in a sliding 120 bp window as shown in Fig. ().
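The CACACA and TTAAAA columns and the "Total per 10,000 bp" rows can be illustrated with a short sketch. The source does not state how occurrences were counted, so two assumptions are made here and should be treated as such: motifs are counted on both strands (by also searching for the reverse complement), and overlapping occurrences are counted separately. All function names are hypothetical.

```python
import re

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence containing only A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, including overlapping ones."""
    # A zero-width lookahead matches at every start position of the motif.
    return len(re.findall("(?=" + re.escape(motif) + ")", seq))


def count_motif_both_strands(seq: str, motif: str) -> int:
    """Count motif on the given strand plus the complementary strand (assumption)."""
    seq, motif = seq.upper(), motif.upper()
    return count_overlapping(seq, motif) + count_overlapping(seq, reverse_complement(motif))


def per_10k(count: int, total_bp: int) -> float:
    """Normalize a count to occurrences per 10,000 bp."""
    if total_bp <= 0:
        raise ValueError("total_bp must be positive")
    return 10000.0 * count / total_bp


if __name__ == "__main__":
    # Hypothetical numbers for illustration only: 3 hits in 1,500 bp.
    print(per_10k(3, 1500))
```

The normalization divides the total count by the combined length of the regions in the group, which is what makes the genome-like and reduced-GC groups comparable despite their different total lengths.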
CA Repeat and TTAAAA Retrotransposition Insertion Sites in the HSV-1 Genome
| Sequences | Expected | Observed |
|---|---|---|
| CACACA (total) | 49 | 79 |
| CACACA (in genes) | 39 | 55 |
| CACACA (intergenic regions) | 10 | 24 |
| TTAAAA (total) | 5 | 28 |
| TTAAAA (in genes) | 4 | 18 |
| TTAAAA (intergenic regions) | 1 | 10 |
Statistically expected sequence numbers were calculated for both strands of the 152,261 bp HSV-1 genome, assuming 68% GC content (A 16%, T 16%, G 34%, C 34%). Calculated values were rounded to the nearest whole number. The proportions of gene and intergenic regions in the HSV-1 genome were taken as 79% and 21%, respectively [1, 2].
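The expected values in the table follow from the stated assumptions and can be checked with a short calculation. The sketch below is one plausible reading of the note, not the authors' code: it treats each base as independent with the given frequencies, multiplies the per-position probability of the hexamer by the genome length and by two strands, and then splits the total by the 79%/21% gene/intergenic proportions. Using the full genome length as the number of start positions (rather than length minus 5) is a simplifying assumption that does not change the rounded results.

```python
GENOME_LENGTH = 152_261          # bp, from the table note
BASE_FREQ = {"A": 0.16, "T": 0.16, "G": 0.34, "C": 0.34}
GENE_FRACTION = 0.79             # proportion of the genome in genes
INTERGENIC_FRACTION = 0.21       # proportion in intergenic regions


def expected_count(motif: str, length: int = GENOME_LENGTH, strands: int = 2) -> float:
    """Expected occurrences of motif under independent base frequencies."""
    probability = 1.0
    for base in motif.upper():
        probability *= BASE_FREQ[base]
    return strands * length * probability


for motif in ("CACACA", "TTAAAA"):
    total = expected_count(motif)
    print(
        motif,
        round(total),
        round(total * GENE_FRACTION),
        round(total * INTERGENIC_FRACTION),
    )
# CACACA 49 39 10
# TTAAAA 5 4 1
```

These rounded values match the Expected column above. For CACACA, for example, the per-position probability is (0.34 × 0.16)³ ≈ 1.61 × 10⁻⁴, which over 2 × 152,261 positions gives about 49.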