Hendrik Marks, Xin-Ying Ren, Hans Sandbrink, Mariëlle C W van Hulten, Just M Vlak.
Abstract
BACKGROUND: White Spot Syndrome Virus (WSSV), a member of the virus family Nimaviridae, is a large dsDNA virus infecting shrimp and other crustacean species. Although limited information is available on the mode of transcription, previous data suggest that WSSV gene expression occurs in a coordinated and cascaded fashion. To search in silico for conserved promoter motifs, (i) the abundance of all 4- to 8-nucleotide motifs in the upstream sequences of WSSV genes relative to the complete genome was determined, and (ii) a MEME search was performed in the upstream sequences of either early or late WSSV genes, as assigned by microarray analysis. Both methods were validated by alignments of empirically determined 5' ends of various WSSV mRNAs.
Year: 2006 PMID: 16784526 PMCID: PMC1550435 DOI: 10.1186/1471-2105-7-309
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Frequency of 4- or 5-nucleotide motifs in the 5' upstream regions of the ORFs compared with the complete genome for the viruses analyzed. Only the 15 motifs with the highest relative enrichment are shown for each virus. For AcMNPV, HHV1 and WSSV, sequences that are part of the consensus TATA box (TATA(a/t)A) are underlined, while for Vaccinia virus and ASFV, sequences consisting only of A and T residues are in italics. * indicates P ≤ 0.05.
| taag (c) | 393 | (29) | 90 | (114) | 4.0* | taag (c) | 393 | (29) | 137 | (87) | 3.0* |
| 1314 | (66) | 172 | (149) | 2.3* | 1314 | (66) | 255 | (111) | 1.7* | ||
| 1973 | (101) | 222 | (198) | 1.9* | 1973 | (101) | 363 | (162) | 1.6* | ||
| 1616 | (81) | 170 | (147) | 1.8* | 1616 | (81) | 268 | (116) | 1.4* | ||
| agta | 671 | (49) | 70 | (89) | 1.8* | agta | 671 | (49) | 109 | (69) | 1.4* |
| aagg | 473 | (51) | 41 | (76) | 1.5 | gata (d) | 867 | (63) | 137 | (87) | 1.4 |
| gata (d) | 867 | (63) | 74 | (94) | 1.5 | aata | 2230 | (115) | 346 | (154) | 1.3 |
| cact (e) | 612 | (64) | 52 | (95) | 1.5 | cagt (f) | 698 | (73) | 106 | (96) | 1.3 |
| aata | 2230 | (115) | 186 | (166) | 1.4 | ctta | 393 | (28) | 59 | (37) | 1.3 |
| atta | 1957 | (98) | 163 | (141) | 1.4 | tcac (e) | 669 | (70) | 99 | (90) | 1.3 |
| gtat | 949 | (68) | 79 | (98) | 1.4 | gcta | 541 | (57) | 77 | (70) | 1.2 |
| cagt (f) | 698 | (73) | 58 | (105) | 1.4 | cccc | 190 | (42) | 27 | (52) | 1.2 |
| 2716 | (140) | 222 | (198) | 1.4 | tacc | 444 | (47) | 63 | (57) | 1.2 | |
| aggg | 233 | (35) | 19 | (50) | 1.4 | tagt | 737 | (53) | 104 | (64) | 1.2 |
| tagt | 737 | (53) | 58 | (72) | 1.4 | cact (e) | 612 | (64) | 86 | (78) | 1.2 |
| ttag | 221 | (53) | 17 | (168) | 3.2* | 372 | (192) | 42 | (445) | 2.3* | |
| ctag | 182 | (20) | 13 | (60) | 2.9* | 306 | (159) | 32 | (342) | 2.2* | |
| ctct | 658 | (76) | 38 | (180) | 2.4* | ctag | 182 | (20) | 19 | (44) | 2.1* |
| tagc | 365 | (41) | 21 | (97) | 2.4* | 454 | (234) | 47 | (498) | 2.1* | |
| tcta | 199 | (49) | 11 | (111) | 2.3* | catt (g) | 339 | (83) | 35 | (176) | 2.1* |
| tttt | 807 | (426) | 44 | (955) | 2.2* | taag | 254 | (60) | 26 | (127) | 2.1* |
| tagg | 342 | (38) | 17 | (77) | 2.0* | ttaa | 356 | (185) | 35 | (374) | 2.0* |
| cata | 369 | (90) | 18 | (180) | 2.0* | ctct | 658 | (76) | 64 | (152) | 2.0* |
| ctta | 254 | (62) | 12 | (121) | 1.9* | ttag | 221 | (53) | 21 | (104) | 2.0* |
| ccta | 342 | (39) | 16 | (75) | 1.9 | cata | 369 | (90) | 34 | (170) | 1.9* |
| tctc | 892 | (103) | 41 | (194) | 1.9 | ctta | 254 | (62) | 23 | (116) | 1.9* |
| ctgt | 936 | (106) | 40 | (186) | 1.8 | tttt | 807 | (426) | 72 | (782) | 1.8* |
| ctac | 492 | (56) | 21 | (99) | 1.8 | cctt | 795 | (92) | 68 | (161) | 1.8 |
| ttcc | 990 | (114) | 42 | (199) | 1.7 | aaat | 324 | (167) | 27 | (286) | 1.7 |
| cact | 433 | (50) | 18 | (85) | 1.7 | ctat | 242 | (59) | 20 | (101) | 1.7 |
| 4421 | (94) | 472 | (140) | 1.5* | gcac | 391 | (66) | 77 | (91) | 1.4* | |
| 5439 | (115) | 564 | (167) | 1.5* | 4528 | (96) | 822 | (122) | 1.3* | ||
| 4528 | (96) | 446 | (133) | 1.4* | 4421 | (94) | 796 | (118) | 1.3* | ||
| 4454 | (94) | 432 | (128) | 1.4* | tact | 2010 | (85) | 353 | (105) | 1.2* | |
| cgcg | 318 | (107) | 30 | (141) | 1.3* | acta | 2179 | (92) | 373 | (111) | 1.2* |
| gggg | 171 | (57) | 16 | (75) | 1.3* | 4454 | (94) | 755 | (112) | 1.2 | |
| acac | 1112 | (94) | 103 | (122) | 1.3* | 5439 | (115) | 911 | (135) | 1.2 | |
| 5203 | (110) | 479 | (142) | 1.3* | tgca | 832 | (70) | 138 | (82) | 1.2 | |
| 3720 | (79) | 337 | (100) | 1.3 | 5203 | (110) | 857 | (127) | 1.2 | ||
| tact | 2010 | (85) | 179 | (106) | 1.3 | ctac | 1264 | (107) | 208 | (123) | 1.2 |
| acta | 2179 | (92) | 190 | (113) | 1.2 | gccg | 384 | (129) | 63 | (148) | 1.2 |
| 4748 | (101) | 411 | (122) | 1.2 | gcga | 500 | (84) | 82 | (97) | 1.2 | |
| tgaa | 1917 | (81) | 165 | (98) | 1.2 | cacg | 586 | (98) | 96 | (113) | 1.2 |
| ctaa | 1830 | (77) | 157 | (93) | 1.2 | accc | 403 | (68) | 66 | (78) | 1.2 |
| gtaa | 1921 | (81) | 164 | (97) | 1.2 | cata | 2186 | (92) | 358 | (106) | 1.2 |
| 2686 | (91) | 261 | (199) | 2.2* | 2686 | (91) | 405 | (154) | 1.7* | ||
| 3146 | (107) | 278 | (214) | 2.0* | 3146 | (107) | 432 | (166) | 1.5* | ||
| 2634 | (89) | 231 | (176) | 2.0* | 2634 | (89) | 357 | (136) | 1.5* | ||
| 2697 | (91) | 227 | (173) | 1.9* | 3895 | (133) | 516 | (198) | 1.5* | ||
| 3895 | (133) | 325 | (250) | 1.9* | 3376 | (114) | 447 | (170) | 1.5* | ||
| 3419 | (117) | 278 | (214) | 1.8* | 3419 | (117) | 445 | (171) | 1.5* | ||
| 3376 | (114) | 274 | (209) | 1.8* | 3022 | (102) | 392 | (149) | 1.5* | ||
| 3022 | (102) | 238 | (182) | 1.8* | 2697 | (91) | 349 | (133) | 1.5* | ||
| 3890 | (131) | 306 | (231) | 1.8* | 3146 | (106) | 406 | (154) | 1.5* | ||
| 3146 | (106) | 247 | (187) | 1.8* | 3890 | (131) | 495 | (187) | 1.4* | ||
| 3895 | (131) | 296 | (224) | 1.7* | ctaa | 1172 | (63) | 147 | (88) | 1.4* | |
| 2697 | (91) | 203 | (155) | 1.7* | 2697 | (91) | 337 | (129) | 1.4* | ||
| 3419 | (115) | 249 | (188) | 1.6* | 3419 | (115) | 426 | (161) | 1.4* | ||
| 6731 | (232) | 480 | (372) | 1.6* | 3895 | (131) | 482 | (182) | 1.4* | ||
| 3890 | (133) | 272 | (209) | 1.6* | 6731 | (232) | 828 | (321) | 1.4* | ||
| 2630 | (60) | 164 | (128) | 2.2* | 2630 | (60) | 290 | (117) | 2.0* | ||
| 3431 | (74) | 201 | (150) | 2.0* | 3431 | (74) | 327 | (126) | 1.7* | ||
| 3538 | (77) | 195 | (146) | 1.9* | accc | 1333 | (88) | 127 | (148) | 1.7* | |
| aacc | 1774 | (79) | 92 | (142) | 1.8* | 3538 | (77) | 333 | (128) | 1.7* | |
| aaaa | 6450 | (134) | 330 | (236) | 1.8* | aaaa | 6450 | (134) | 570 | (210) | 1.6* |
| accc | 1333 | (88) | 68 | (154) | 1.8* | ccgg | 374 | (36) | 33 | (56) | 1.6* |
| accg | 699 | (46) | 33 | (75) | 1.6* | cccc | 1348 | (130) | 118 | (202) | 1.6* |
| tacc | 1424 | (67) | 64 | (103) | 1.5 | cacg | 995 | (65) | 86 | (100) | 1.5* |
| caac | 2792 | (125) | 123 | (190) | 1.5 | aacc | 1774 | (79) | 150 | (119) | 1.5* |
| aata | 4104 | (89) | 179 | (134) | 1.5 | accg | 699 | (46) | 56 | (65) | 1.4 |
| tttt | 6450 | (160) | 281 | (240) | 1.5 | taac (j) | 1888 | (60) | 151 | (85) | 1.4 |
| taac (j) | 1888 | (60) | 82 | (90) | 1.5 | gtaa | 1810 | (58) | 144 | (81) | 1.4 |
| cccg | 583 | (56) | 25 | (83) | 1.5 | taag | 1402 | (45) | 109 | (61) | 1.4 |
| ccgg | 374 | (36) | 16 | (53) | 1.5 | gggt | 1333 | (91) | 103 | (124) | 1.4 |
| ccgt | 940 | (64) | 40 | (94) | 1.5 | tttt | 6450 | (160) | 491 | (216) | 1.3 |
| ccggg | 66 | (31) | 7 | (116) | 3.8* | 760 | (57) | 113 | (151) | 2.6* | |
| 760 | (57) | 73 | (195) | 3.4* | ccccc | 292 | (137) | 41 | (342) | 2.5* | |
| cccgg | 66 | (31) | 6 | (99) | 3.2* | 716 | (54) | 95 | (127) | 2.4* | |
| 1202 | (87) | 103 | (263) | 3.0* | 1202 | (87) | 147 | (188) | 2.2* | ||
| taaaa | 1378 | (99) | 102 | (261) | 2.6* | ccggg | 66 | (31) | 8 | (66) | 2.1* |
| aaccg | 203 | (44) | 15 | (116) | 2.6* | taaaa | 1378 | (99) | 167 | (213) | 2.1* |
| gtata | 609 | (67) | 45 | (176) | 2.6* | gtata | 609 | (67) | 73 | (143) | 2.1* |
| 716 | (54) | 51 | (136) | 2.5* | aaccg | 203 | (44) | 23 | (89) | 2.0* | |
| taccc | 331 | (75) | 22 | (178) | 2.4* | aaaaa | 2054 | (142) | 231 | (283) | 2.0* |
| acccg | 138 | (44) | 9 | (102) | 2.3* | tatat | 716 | (56) | 79 | (110) | 2.0* |
| aaaaa | 2054 | (142) | 133 | (325) | 2.3* | acccc | 320 | (103) | 35 | (199) | 1.9* |
| aacca | 644 | (96) | 41 | (216) | 2.3* | accgg | 120 | (38) | 13 | (73) | 1.9* |
| ccgta | 208 | (47) | 13 | (105) | 2.2* | cccgg | 66 | (31) | 7 | (58) | 1.9* |
| gacct | 257 | (58) | 16 | (129) | 2.2* | ctcac (k) | 284 | (65) | 30 | (121) | 1.9* |
| cgtcg | 178 | (59) | 11 | (130) | 2.2* | taccc | 331 | (75) | 34 | (137) | 1.8* |
(a) Both strands, excluding hrs (present for AcMNPV and WSSV).
(b) Expected occurrence is the occurrence of a 4-mer or 5-mer based on a random distribution of nucleotides in the complete genome.
(c) Part of the AcMNPV late initiator sequence (a/g/t)TAAG.
(d) Part of the AcMNPV upstream activating element with sequence (a/t)GATA(a/t).
(e) Part of the AcMNPV downstream activating element with sequence (a/t)CACNG.
(f) Sequence of the AcMNPV early initiator CAGT.
(g) Part of the CCATT box.
(h) Part of the Vaccinia virus late initiator sequence TAAAT and/or the intermediate initiator TAAA(a/t).
(i) Sequence of the ASFV late initiator TATA.
(j) Part of the WSSV putative late TIS motif ATNAC.
(k) Part of the WSSV putative early initiator (a/c)TCANT.
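The counting scheme behind this table (k-mers counted on both strands, an expected occurrence derived from the nucleotide composition of the complete genome, and a relative enrichment of upstream regions over the genome) can be illustrated with a minimal Python sketch. All function names here are hypothetical and not taken from the paper. The sketch assumes that the values in parentheses are observed counts expressed relative to the expected count, and that relative enrichment is the upstream value divided by the genome value; this reading is an assumption, although it agrees with the first row, where 114/29 ≈ 3.9 against a reported 4.0. It does not exclude hrs and does not reproduce the significance test behind the asterisks.

```python
from collections import Counter

COMPLEMENT = str.maketrans("acgt", "tgca")


def reverse_complement(seq):
    """Return the reverse complement of a lowercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def count_kmers(seq, k):
    """Count overlapping k-mers on both strands, skipping ambiguous bases."""
    seq = seq.lower()
    counts = Counter()
    for strand in (seq, reverse_complement(seq)):
        for i in range(len(strand) - k + 1):
            kmer = strand[i:i + k]
            if set(kmer) <= set("acgt"):
                counts[kmer] += 1
    return counts


def base_frequencies(seq):
    """Nucleotide frequencies over both strands of the sequence."""
    seq = seq.lower()
    both = seq + reverse_complement(seq)
    total = sum(both.count(b) for b in "acgt")
    return {b: both.count(b) / total for b in "acgt"}


def expected_count(kmer, base_freq, n_positions):
    """Expected count if nucleotides were drawn independently at random."""
    p = 1.0
    for base in kmer:
        p *= base_freq[base]
    return p * n_positions


def relative_enrichment(genome, upstream_seqs, k):
    """Ratio of (observed/expected) upstream to (observed/expected) genome-wide.

    Assumption: expected counts for both sets use genome-wide base frequencies.
    """
    freq = base_frequencies(genome)
    genome_counts = count_kmers(genome, k)
    upstream_counts = Counter()
    for seq in upstream_seqs:
        upstream_counts.update(count_kmers(seq, k))
    # Number of k-mer start positions, counted on both strands.
    n_genome = 2 * max(len(genome) - k + 1, 0)
    n_upstream = 2 * sum(max(len(s) - k + 1, 0) for s in upstream_seqs)
    result = {}
    for kmer, observed_genome in genome_counts.items():
        exp_genome = expected_count(kmer, freq, n_genome)
        exp_upstream = expected_count(kmer, freq, n_upstream)
        if exp_upstream == 0:
            continue  # no upstream positions to evaluate
        upstream_ratio = upstream_counts[kmer] / exp_upstream
        genome_ratio = observed_genome / exp_genome
        result[kmer] = upstream_ratio / genome_ratio
    return result
```

Calling `relative_enrichment(genome, upstream_seqs, 4)` with a genome string and a list of upstream-region strings returns a dictionary mapping each 4-mer found in the genome to its enrichment; sorting that dictionary by value would give a ranking comparable in spirit to the table rows.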
Consensus sequences (4–8 nt) in upstream regions of WSSV genes identified with MEME. Only the three best MEME hits are shown. For all WSSV genes, the number of sequences in which the consensus sequence occurred is indicated.
| g | g | g | |
| g | agaagagg | gaggaaga |
Figure 1. Alignment of 5' flanking sequences of WSSV early genes. The genes are named after WSSV-TH ORF numbers and the function of their protein product. The transcription initiation site of each gene is encircled. Sequences are aligned by their consensus TATA box, as well as by maximizing the identities around the transcriptional start site. Below, the consensus sequence of the alignment and the Drosophila RNA polymerase II core promoter are shown. Similar sequences of the consensus TIS motif and the initiator of the Drosophila RNA pol II core promoter are underlined. Abbreviations used and references: pk: protein kinase [52]; DNA-pol: DNA polymerase [47]; tds: thymidylate synthase [48]; dutp-ase: dUTPase [42]; lat-rel: latency related gene [33]; rr1 and rr2: the large and small subunit of ribonucleotide reductase, respectively [53]; endonuc: endonuclease [54].
Figure 2. Sequences upstream of two major structural protein genes. The TISs are indicated by arrows above the sequences. The 5' termini of the different clones sequenced for each gene are underlined. The number beneath the underlining shows the number of similar clones. The start codons of both genes are shaded black (A). Sequence downstream of ORF30 showing the polyadenylation site, indicated by an arrow below the sequence. The 3' terminus of the different clones sequenced is underlined. The number before the arrow represents the number of similar clones sequenced. The stop codon is shaded dark grey and two overlapping poly(A) signals (AATAAA) light grey (B).
Figure 3. Alignment of 5' flanking sequences of WSSV late genes. The names of the structural protein genes and their WSSV-TH ORF numbers are indicated. The transcription initiation site of each gene is encircled. For vp19 a minor transcription initiation site is also encircled. The TATA box for vp15 is underlined. The A/T-rich region is boxed. Sequences are aligned by maximizing the identities around the transcriptional start site. References: vp28, vp26, vp24, vp19 and vp15 [31]; vp664 [39]; vp75 and vp73 (this study).
Figure 4. Alignment of 3' flanking sequences of WSSV genes. The stop codon, polyadenylation signal, start of polyadenylation and the T-rich region are indicated. Sequences are aligned by stop codon and by polyadenylation signal. Abbreviations used and references: ie1: immediate-early 1 [17]; vp466 [12]; vp53a, vp11, vp136b [13]; collag: collagen-like ORF (this study). For abbreviations and references of other genes see Figs. 1 & 3.
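The TATA-box anchoring described for Figure 1 can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `align_on_tata` and the toy input format are hypothetical, the pattern encodes the consensus TATA(a/t)A quoted in the table caption, and the sketch covers only the anchoring step, not the maximization of identities around the transcriptional start site.

```python
import re

# Consensus TATA box TATA(a/t)A, matched case-insensitively.
TATA_BOX = re.compile(r"tata[at]a", re.IGNORECASE)


def align_on_tata(seqs):
    """Left-pad sequences so their first TATA(a/t)A match shares a column.

    `seqs` maps a gene name to its 5' flanking sequence. Sequences without
    a match are left out of the result.
    """
    starts = {}
    for name, seq in seqs.items():
        match = TATA_BOX.search(seq)
        if match:
            starts[name] = match.start()
    if not starts:
        return {}
    offset = max(starts.values())
    # Pad with '-' so every TATA box begins at column `offset`.
    return {
        name: "-" * (offset - start) + seqs[name]
        for name, start in starts.items()
    }
```

With made-up toy input (not WSSV data) such as `{"g1": "ggtataaacc", "g2": "ctatatag"}`, the second sequence is shifted right by one column so that both TATA boxes start at the same position.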