Eulyn Pagaling, Richard D Haigh, William D Grant, Don A Cowan, Brian E Jones, Yanhe Ma, Antonio Ventosa, Shaun Heaphy.
Abstract
BACKGROUND: We are profoundly ignorant about the diversity of viruses that infect the domain Archaea. Fewer than 100 have been identified and described, and very few of these have had their genomic sequences determined. Here we report the genomic sequence of a previously undescribed archaeal virus.

RESULTS: Haloarchaeal strains with 16S rRNA gene sequences 98% identical to Halorubrum saccharovorum were isolated from a hypersaline lake in Inner Mongolia. Two lytic viruses infecting these strains were isolated from the lake water. The BJ1 virus is described in this paper. It has an icosahedral head and tail morphology and most likely a linear double-stranded DNA genome exhibiting terminal redundancy. Its genome sequence has 42,271 base pairs with a GC content of approximately 65 mol%. The genome of BJ1 is predicted to encode 70 ORFs, including one for a tRNA. Fifty of the seventy ORFs had no identity to database entries; twenty showed sequence identity matches to archaeal viruses and to haloarchaea. ORFs possibly coding for an origin of replication complex, integrase, helicase and structural capsid proteins were identified. Evidence for viral integration was obtained.

CONCLUSION: The virus described here has a very low sequence identity to any previously described virus. Fifty of the seventy ORFs could not be annotated in any way based on amino acid identities with sequences already present in the databases. Determining functions for ORFs such as these is probably easier using a simple virus as a model system.
Year: 2007 PMID: 17996081 PMCID: PMC2194725 DOI: 10.1186/1471-2164-8-410
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. Unrooted phylogenetic tree showing the relationship of the environmental archaeal strain BJ1B11, the host for the virus BJ1, to other closely related environmental strains isolated by us and to Halorubrum species. The scale bar represents the number of inferred nucleotide substitutions per site. Values at nodes indicate percentage occurrence (>50%) in 500 bootstrapped trees.
Figure 2. Electron micrograph images of BJ1; the scale bar is 500 nm (top panel) and 200 nm (bottom panel). A schematic diagram of BJ1 annotated with discernible features and the size of these features is also shown. The standard deviation (SD) of measurements from twenty-six different particles was determined.
Figure 3. Panel a, 0.8% TAE agarose gel showing virus BJ1 genome sensitivity to nucleases. Lanes 1 and 4, undigested controls; lane 2, DNase treated; lane 3, RNase treated; lane 5, exonuclease III treated. Panel b, 1% agarose 0.5× TBE pulsed-field gel; lanes 1 and 4, size markers (kbp); lanes 2 and 3, BJ1 virus genome. Panel c, BamHI enzyme digest of virus BJ1 genomic DNA; DNA size markers are shown on the left (kbp). The image has been overexposed to show the smaller bands.
Figure 4. Top panel: diagram of the BJ1 genome drawn in a circular form. The major features are shown, including the predicted ORFs (blue arrows in the forward direction, green arrows in the reverse). The tRNA gene is in red. ORFs mentioned in the text are numbered. The outer scale bar is in base pairs. The inner curved arrows indicate entirely hypothetical operons. The bottom panel shows the cumulative GC skew.
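The cumulative GC skew mentioned for the bottom panel of Figure 4 can be illustrated with a minimal Python sketch. It assumes the commonly used definition of skew, (G - C)/(G + C), computed in non-overlapping windows and summed along the sequence. The function name `cumulative_gc_skew`, the default window size and the toy sequence are assumptions made for illustration; the source does not state the method or parameters used to draw the figure.

```python
def cumulative_gc_skew(seq, window=1000):
    """Return the running sum of per-window GC skew, (G - C) / (G + C).

    Assumption: non-overlapping windows; a window with no G or C
    contributes a skew of 0.
    """
    seq = seq.upper()
    running_total = 0.0
    cumulative = []
    for start in range(0, len(seq), window):
        chunk = seq[start:start + window]
        g = chunk.count("G")
        c = chunk.count("C")
        skew = (g - c) / (g + c) if (g + c) else 0.0
        running_total += skew
        cumulative.append(running_total)
    return cumulative


# Toy sequence (hypothetical, not BJ1 data): a G-rich window then a C-rich one.
print(cumulative_gc_skew("GGGGCCCC", window=4))  # [1.0, 0.0]
```

On a real genome, a turning point in the cumulative curve is often read as a candidate origin or terminus of replication, although that interpretation is a general convention rather than a claim made in the caption.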
Predicted ORFs in virus BJ1
| Start | Stop | aa | Mr | pI | RBS/distance | |
| 130 | 990 | 286 | 33 | 4.6 | - | |
| 1146 | 1805 | 219 | 25 | 4.9 | - | |
| 1980 | 2093 | 37 | 3.7 | 8.5 | GGAGGTG-5 | |
| 2207 | 2425 | 72 | 8.1 | 7.0 | - | |
| 2541 | 3191 | 216 | 24 | 4.7 | GAGG-10 | |
| 3178 | 3393 | 71 | 8.2 | 4.3 | - | |
| 3547 | 3993 | 148 | 16 | 4.1 | GAG-6 | |
| 3993 | 4463 | 156 | 18 | 4.3 | AGGAGGTGA-8 | |
| 4456 | 4851 | 131 | 15 | 4.2 | AGGAGGTGA-7 | |
| 4844 | 5218 | 124 | 14 | 4.7 | GGAGGT-6 | |
| 5208 | 5357 | 49 | 5.2 | 3.8 | GAGGTG-8 | |
| 5350 | 5574 | 74 | 8.3 | 4.6 | AGGAGGT-6 | |
| 5571 | 5744 | 57 | 6.1 | 10.4 | GGAGG-5 | |
| 5741 | 5986 | 81 | 8.9 | 5.8 | GGAGG-8 | |
| 5998 | 6417 | 139 | 15 | 4.3 | GAGG-7 | |
| 6637 | 7713 | 358 | 40 | 5.0 | AGGTG-9 | |
| 7919 | 8560 | 213 | 24 | 4.3 | AGGA-8 | |
| 8689 | 8949 | 86 | 9.3 | 4.9 | - | |
| 8950 | 9153 | 67 | 7.9 | 5.2 | GGTG-10 | |
| 9159 | 9446 | 95 | 11 | 4.6 | GGAG-4 | |
| 9660 | 10022 | 120 | 14 | 4.6 | GGA-7 | |
| 10022 | 10153 | 43 | 4.6 | 4.0 | GGTG-8 | |
| 10153 | 10890 | 245 | 28 | 3.9 | GGAGG-8 | |
| 10880 | 11806 | 308 | 34 | 4.3 | GGAGG-9 | |
| 11803 | 11946 | 47 | 5.2 | 4.1 | GGTGA-7 | |
| 11946 | 12671 | 241 | 27 | 4.7 | GGTGA-7 | |
| 12668 | 12760 | 30 | 3.3 | 4.5 | GGAGGTG-6 | |
| 12757 | 13092 | 111 | 12.2 | 5.8 | GAGGTGA-5 | |
| 13092 | 13262 | 56 | 6.2 | 3.8 | GGAGG-8 | |
| 13255 | 14700 | 481 | 52 | 6.2 | AGGAGG-6 | |
| 13270 | 14487 | 405 | 46 | 5.0 | - | |
| 14701 | 14826 | 41 | 4.3 | 4.0 | GGAGGTGA-9 | |
| 14819 | 15307 | 162 | 18 | 4.6 | GAGGTGA-7 | |
| 15310 | 15531 | 73 | 8.3 | 11.6 | AGGAGGTG-9 | |
| 15489 | 17603 | 704 | 78 | 4.7 | (GAAAA) | |
| 17606 | 18058 | 150 | 17 | 4.4 | GGAGG-9 | |
| 18055 | 18519 | 154 | 18 | 4.3 | (GGGGG) | |
| 18512 | 18817 | 101 | 11 | 5.0 | GAGGTG-8 | |
| 18814 | 19074 | 86 | 9.9 | 6.1 | GAGGTG-9 | |
| 19071 | 19241 | 56 | 5.9 | 10.3 | GGAGG-8 | |
| 19129 | 19806 | 225 | 26 | 6.3 | - | |
| 19803 | 19982 | 59 | 6.4 | 4.0 | GAGGTG-6 | |
| 19973 | 20046 | - | - | - | - | |
| 20365 | 21843 | 492 | 55 | 4.9 | - | |
| 21840 | 21998 | 52 | 5.9 | 4.3 | - | |
| 22001 | 22111 | 36 | 3.9 | 4.8 | - | |
| 22108 | 22416 | 102 | 11 | 4.3 | - | |
| 22416 | 22577 | 53 | 6.1 | 4.2 | - | |
| 22574 | 23083 | 169 | 19 | 4.3 | - | |
| 23080 | 24423 | 447 | 50 | 4.9 | GAGG-8 | |
| 24427 | 26382 | 651 | 73 | 4.5 | - | |
| 26461 | 26586 | 41 | 4.4 | 4.4 | GAG-9 | |
| 26590 | 27933 | 447 | 47 | 3.9 | AGGAGG-9 | |
| 27949 | 29031 | 360 | 40 | 4.2 | GTGA-8 | |
| 29040 | 29219 | 59 | 6.4 | 3.8 | GAGGTGA-4 | |
| 29222 | 29572 | 116 | 12 | 3.9 | - | |
| 29576 | 30451 | 291 | 33 | 4.6 | GGAGG-9 | |
| 30444 | 30761 | 105 | 11 | 4.1 | - | |
| 30758 | 31210 | 150 | 17 | 4.8 | AGG-10 | |
| 31207 | 31734 | 175 | 20 | 4.5 | GGAGGT-5 | |
| 31766 | 32680 | 304 | 32 | 3.8 | GAGGTGA-7 | |
| 32680 | 33177 | 165 | 18 | 4.0 | AGGAGGTGA-8 | |
| 33281 | 34408 | 375 | 38 | 4.1 | - | |
| 34444 | 34731 | 95 | 10 | 4.8 | - | |
| 34771 | 35439 | 222 | 24 | 4.0 | - | |
| 35446 | 36633 | 395 | 42 | 4.1 | TGA-7 | |
| 36634 | 38226 | 530 | 52 | 3.7 | AGGAGGTG-10 | |
| 38229 | 40979 | 916 | 100 | 4.0 | GGAGGTG-15 | |
| 41059 | 41400 | 113 | 12 | 3.8 | GGAG-6 | |
| 41403 | 41843 | 146 | 16 | 4.6 | AGGTG-9 | |
| 41840 | 42151 | 103 | 11 | 3.9 | GGTGA-4 |
ORFs are in the forward direction unless indicated by a minus sign. v indicates a valine start. aa indicates the number of amino acids. Mr is the molecular mass × 10⁻³, rounded to the nearest 100. pI is the isoelectric point rounded to one decimal place. RBS/distance is the ribosome binding site sequence and its distance from the start codon.
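The relationship between the Start, Stop and aa columns can be checked with a short Python sketch. It assumes that coordinates are 1-based and inclusive and that the Stop coordinate includes the stop codon; this convention is not stated in the source but reproduces the tabulated values, for example the rows 130 to 990 (286 aa) and 1146 to 1805 (219 aa). The function name `orf_length_aa` is hypothetical.

```python
def orf_length_aa(start, stop):
    """Number of amino acids encoded by an ORF given its coordinates.

    Assumptions: 1-based inclusive coordinates, with the stop codon
    counted inside the interval. Reverse-strand ORFs (start > stop)
    are handled by taking the absolute difference.
    """
    length_nt = abs(stop - start) + 1
    if length_nt % 3 != 0:
        raise ValueError("ORF length is not a multiple of 3 (truncated ORF?)")
    codons = length_nt // 3
    return codons - 1  # the stop codon does not encode an amino acid


# Rows taken from the table above.
print(orf_length_aa(130, 990))    # 286
print(orf_length_aa(1146, 1805))  # 219
```

A `ValueError` here flags intervals that are not whole codons, which is expected for ORFs truncated by incomplete sequencing or by an insertion event.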
BJ1 ORFs with identifiable BlastX matches to database entries.
| ORF | Homologs (% Identity) |
| 9 | 59% similarity (E = 10⁻⁸) to ORF58 halovirus |
| 10 | 54% similarity (E = 10⁻⁵) to protein |
| 17 | similarity (E = 10⁻¹³) to protein from |
| 55 | similarity (E = 10⁻⁷) to a protein of |
| 24 | No significant match to any described protein. InterPro suggests DNA binding protein |
| 5 | similarity (E = 10⁻³) to bacterial proteins with DnaJ domain; role in DNA replication? |
| 6 | 65% similarity (E = 10⁻⁶) to protein (AAG20925) of |
| 16 | 60% similarity (E = 10⁻⁶⁷) to a |
| 20 | 54% similarity (E = 10⁻¹³) to halovirus |
| 21 | 66% similarity (E = 10⁻¹⁷) to the |
| 31 | similarity is to a |
| 35 | DNA helicase? 62% similarity (E = 10⁻¹²⁸) to |
| 39 | 68% similarity (E = 0.05) to ArsR-like transcriptional regulator (CAJ51299) from |
| 43 | 56% similarity (E = 10⁻³⁷) to halovirus HF1 protein (AAO61337.1), which may be a YonJ-like small subunit of the DNA polymerase (COG1311) |
| 48 | 54% similarity to |
| 49 | 43% similarity (E = 0.01) to |
| 50 | 54% similarity (E = 10⁻⁷⁷) to the putative portal protein (NP_665924) of |
| 52 | 49% similarity (E = 10⁻¹³) to the capsid protein gpD (AAM88683) of halovirus |
| 53 | 48% similarity (E = 10⁻²⁹) to hp32 (CAA56442) of |
| 51 | 51% similarity (E = 10⁻¹⁵) to |
Predicted ORFs in the sequence inserted into ORF 32 and their highest BlastX matches. Nucleotide numbering is from the 5' end of the insertion sequence; nucleotide 8685 corresponds to nucleotide 14790 in the BJ1 genomic sequence. The sequence at the site of insertion was tgctcggtcgtcaa/CGACGCCGACGACGGCGA (lower case, variant; upper case, BJ1 ORF 32). ORFs are in the forward direction with respect to the virus genome unless indicated by a - sign. * indicates an ORF truncated because of incomplete sequencing (V10) or the insertion event itself (V1 and ORF 32). aa indicates the number of amino acids.
| ORF | Start | Stop | Size (aa) | Homologs (% Identity) |
| V10* | 2 | 277 | * | 67% – ornithine cyclodeaminase |
| V9- | 749 | 351 | 132 | 36% – hypothetical protein VNG6157H |
| V8- | 1910 | 843 | 355 | 70% – cell division protein pelota |
| V7- | 3051 | 1936 | 371 | 28% – hypothetical protein NP4342A |
| V6 | 3346 | 3753 | 135 | 38% – hypothetical protein rrnAC2062 |
| V5 | 3912 | 4685 | 257 | 38% – Alpha/beta hydrolase fold protein |
| V4 | 4747 | 5058 | 103 | 75% – hypothetical protein HQ2797A |
| V3- | 7408 | 5900 | 502 | 73% – RtcB-like protein 1 |
| V2- | 7934 | 7503 | 143 | 61% – hypothetical protein NP3986A |
| V1* | 8326 | 8684 | 119* | 64% – 3-hydroxy-3-methylglutaryl-coenzyme A reductase (HMG-CoA reductase) |
| 32* | 8685 | 9059 | * | 100% Phage BJ1 hypothetical protein |