Christopher A Desjardins, Dawn E Gundersen-Rindal, Jessica B Hostetler, Luke J Tallon, Roger W Fuester, Michael C Schatz, Monica J Pedroni, Douglas W Fadrosh, Brian J Haas, Bradley S Toms, Dan Chen, Vishvanath Nene.
Abstract
BACKGROUND: Bracoviruses (BVs), a group of double-stranded DNA viruses with segmented genomes, are mutualistic endosymbionts of parasitoid wasps. Virus particles are replication deficient and are produced only by female wasps from proviral sequences integrated into the wasp genome. Virus particles are injected along with eggs into caterpillar hosts, where viral gene expression facilitates parasitoid survival and therefore perpetuation of proviral DNA. Here we describe a 223 kbp region of Glyptapanteles indiensis genomic DNA which contains a part of the G. indiensis bracovirus (GiBV) proviral genome.
Year: 2007 PMID: 17594494 PMCID: PMC1919376 DOI: 10.1186/1471-2180-7-61
Source DB: PubMed Journal: BMC Microbiol ISSN: 1471-2180 Impact factor: 3.605
Table 1. Proviral genome segment composition of 60 GiBV BAC clones.
| Set | Proviral genome segments in set | BACs positive | BACs tested |
|---|---|---|---|
| 1 | 7 | 7 | 20 |
| 2 | 4 | 5 | 30 |
| 3 | 2 | 1 | 60 |
| 4 | 1 | 3 | 30 |
| 5 | 1 | 1 | 60 |
| 6 | 1 | 3 | 60 |
| 7 | 1 | 1 | 60 |
Non-overlapping sets of proviral genome segments found in BAC clones, arbitrarily designated sets 1–7, are shown in column 1. The second column gives the number of proviral genome segments identified in each set, the third the number of BACs that tested positive for that set, and the fourth the number of BACs that were tested for that set. Some segment sets were tested on fewer than 60 BAC clones: once multiple clones had been identified for a set of proviral genome segments, the primer pairs representing that set were removed from subsequent PCR experiments to reduce the number of PCRs needed to identify the entire proviral genome.
Figure 1. Structural organization of GiBV proviral locus 1. Proviral genome segments are labeled 1p-8p, with the square and pointed ends representing the 5' and 3' ends, respectively, relative to the putative excision motif. Inter-segmental regions are labeled isg1-isg7, and sequence regions outside the proviral genome segment sequences are labeled I-IV. The flanking tandem repeat regions (solid black squares) are labeled L1R1 and L1R2, and their structure is shown in the open boxes as black boxes in parentheses followed by the copy number of the repeat as a subscript. The two BAC sequences were joined in isg4 (*), allowing the entirety of each proviral segment sequence to originate from a single BAC clone. Colored boxes represent genes: grey boxes are non-packaged genes, light green boxes are hypothetical proteins without gene family assignment, and the remaining colors represent different gene families.
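Table 2. Features of the regions of GiBV proviral locus 1
| Region | Coordinates | Size (bp) | % G+C (c/n-c) | % coding | Genes |
|---|---|---|---|---|---|
| I | 1 – 23133 | 23133 | 31 (47/27) | 22 | 4 |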
| L1R1 | 23134 – 29250 | 6117 | 38 | 0 | 0 |
| II | 29251 – 34177 | 4927 | 35 (42/32) | 36 | 1 |
| 1p | 34178 – 54542 | 20365 | 37 (38/36) | 38 | 14 |
| isg1 | 54543 – 54769 | 227 | 30 | 0 | 0 |
| 2p | 54770 – 78277 | 23508 | 36 (44/34) | 25 | 8 |
| isg2 | 78278 – 78394 | 117 | 29 | 0 | 0 |
| 3p | 78395 – 94733 | 16339 | 37 (41/35) | 35 | 6 |
| isg3 | 94734 – 94903 | 170 | 26 | 0 | 0 |
| 4p | 94904 – 108614 | 13711 | 36 (41/31) | 42 | 4 |
| isg4 | 108615 – 110126 | 1512 | 27 | 0 | 0 |
| 5p | 110127 – 135963 | 25837 | 37 (41/34) | 41 | 11 |
| isg5 | 135964 – 136085 | 122 | 28 | 0 | 0 |
| 6p | 136086 – 155462 | 19377 | 37 (37/37) | 33 | 9 |
| isg6 | 155463 – 156602 | 1140 | 29 | 0 | 0 |
| 7p | 156603 – 179005 | 22403 | 36 (41/32) | 35 | 7 |
| isg7 | 179006 – 187374 | 8369 | 25 | 0 | 0 |
| 8p | 187375 – 197431 | 10057 | 38 (42/34) | 47 | 3 |
| III | 197432 – 204112 | 6681 | 33 (43/28) | 33 | 2 |
| L1R2 | 204113 – 211240 | 7128 | 37 | 0 | 0 |
| IV | 211241 – 222657 | 11417 | 30 (43/27) | 22 | 2 |
Coordinates are with respect to the sequence of the entire locus. The % G+C column is divided into coding (c) and non-coding (n-c) for regions predicted to encode genes.
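The sizes in the table above match 1-based, inclusive coordinates, and each region starts one base after the previous one ends. A minimal sketch of that check, using a few rows copied from the table (the function names are illustrative, not from the paper):

```python
def region_size(start, end):
    """Length in bp of a region given 1-based, inclusive coordinates."""
    return end - start + 1

# Rows copied from the table above: (region, start, end, reported size in bp)
rows = [
    ("I", 1, 23133, 23133),
    ("L1R1", 23134, 29250, 6117),
    ("II", 29251, 34177, 4927),
    ("1p", 34178, 54542, 20365),
    ("isg1", 54543, 54769, 227),
]

def check_rows(rows):
    """True if every reported size matches its coordinates and regions abut exactly."""
    sizes_ok = all(region_size(s, e) == size for _, s, e, size in rows)
    contiguous = all(
        rows[i + 1][1] == rows[i][2] + 1 for i in range(len(rows) - 1)
    )
    return sizes_ok and contiguous

print(check_rows(rows))
```

The same check can be extended to the remaining rows to confirm that the regions tile the locus without gaps or overlaps.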
Figure 2. Neighbor-joining clustering of the regions of proviral locus 1 based on relative dinucleotide frequencies. All proviral genome segments (1p-8p) group together, as do the regions outside the flanking repeats (I and IV). The scale represents the normalized Euclidean distance between regions. Regions < 500 bp (isg1–3, 5) and the flanking repeats (L1R1 and L1R2) were excluded from the analysis, as they have skewed dinucleotide frequencies.
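As a rough illustration of the distance underlying such a clustering, the sketch below computes a 16-element dinucleotide frequency vector per sequence and the Euclidean distance between two vectors. The exact definition of "relative" frequency and the normalization used for the figure are not given here, so simple proportions of overlapping dinucleotides are assumed. The neighbor-joining step itself is not shown, and the input sequences are toy strings.

```python
from itertools import product
from math import sqrt

# All 16 dinucleotides in a fixed order: AA, AC, AG, ..., TT
DINUCS = ["".join(p) for p in product("ACGT", repeat=2)]

def dinucleotide_freqs(seq):
    """Proportion of each overlapping dinucleotide; pairs with non-ACGT bases are skipped."""
    seq = seq.upper()
    counts = dict.fromkeys(DINUCS, 0)
    for i in range(len(seq) - 1):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    total = sum(counts.values())
    return [counts[d] / total if total else 0.0 for d in DINUCS]

def euclidean(a, b):
    """Euclidean distance between two equal-length frequency vectors."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Toy sequences, not GiBV data
d = euclidean(dinucleotide_freqs("AAAA"), dinucleotide_freqs("CCCC"))
print(d)
```

A pairwise distance matrix built this way could then be passed to any neighbor-joining implementation.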
Figure 3. Nucleotide conservation extended 30 bp in both directions around the GCT excision site. A) 5' motif of proviral genome segments in GiBV proviral locus 1, in which sequence to the left of the motif represents inter-segmental sequences and sequence to the right of the motif represents proviral genome segment sequences. B) 3' motif of proviral genome segments in GiBV proviral locus 1, in which the positions of inter-segmental and proviral genome segment sequences are reversed with respect to A). C) Extended motif from the 8 viral genome segments in proviral locus 1. D) Extended motif from all 30 CcBV viral genome segments. E) Extended motif from 13 of 15 MdBV viral genome segments.
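One simple way to summarize conservation around a fixed site is to cut equal-length windows centered on the motif and report, for each column, the most common base and the fraction of sequences that share it. The sketch below does this under assumed conventions (0-based site index, a 3-base motif, hypothetical function names) and uses toy sequences rather than GiBV data. It is not the method used to draw the figure.

```python
from collections import Counter

def extract_window(seq, site_start, motif_len=3, flank=30):
    """Return the motif plus `flank` bases on each side.

    site_start is the 0-based index of the first motif base (assumed convention).
    """
    lo, hi = site_start - flank, site_start + motif_len + flank
    if lo < 0 or hi > len(seq):
        raise ValueError("window runs off the end of the sequence")
    return seq[lo:hi]

def conservation(windows):
    """For each column, return (most common base, fraction of windows sharing it)."""
    if len({len(w) for w in windows}) != 1:
        raise ValueError("windows must be non-empty and of equal length")
    profile = []
    for column in zip(*windows):
        base, count = Counter(column).most_common(1)[0]
        profile.append((base, count / len(windows)))
    return profile

# Toy sequences; flank shortened to 2 so the example stays readable
toy = ["AAGCTTT", "CAGCTAG", "TTGCTCA"]
windows = [extract_window(s, s.find("GCT"), flank=2) for s in toy]
profile = conservation(windows)
print(profile[2:5])  # the three motif columns
```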
Table 3. Annotation of proviral locus 1
| Gene identifier | Region | Size (aa) | | Sigs | Annotation | Family | dN/dS |
|---|---|---|---|---|---|---|---|
| GIP_L1_00010 | I | 500 | 4 | | FL(2)D protein | | |
| GIP_L1_00020 | I | 369 | 2 | | Trans-2-enoyl-CoA reductase | | |
| GIP_L1_00030 | I | 240 | 2 | | oxidored-nitro domain-like protein | | |
| GIP_L1_00040 | I | 562 | 3 | | hypothetical protein | | |
| GIP_L1_00050 | II | 599 | 4 | s | 5' nucleotidase | | |
| GIP_L1_00060 | 1p | 165 | 1 | s, t | hypothetical protein | 3 | * |
| GIP_L1_00070 | 1p | 98 | 1 | | lectin-like protein | | * |
| GIP_L1_00080 | 1p | 210 | 1 | s, t | conserved hypothetical protein | 3 | 0.29 |
| GIP_L1_00090 | 1p | 266 | 1 | s, t | conserved hypothetical protein | | 0.51 |
| GIP_L1_00100 | 1p | 304 | 1 | s | CrV1-like protein | 5 | 0.77 |
| GIP_L1_00110 | 1p | 161 | 1 | s | Lectin C-type domain | | 0.54 |
| GIP_L1_00120 | 1p | 138 | 1 | | conserved hypothetical protein | 3 | 0.81 |
| GIP_L1_00130 | 1p | 133 | 0 | s | Cystatin domain | | 0.38 |
| GIP_L1_00140 | 1p | 341 | 1 | s | CrV1-like protein | 5 | 0.51 |
| GIP_L1_00150 | 1p | 195 | 1 | s | hypothetical protein | 5 | 1.04 |
| GIP_L1_00160 | 1p | 104 | 1 | | hypothetical protein | | * |
| GIP_L1_00170 | 1p | 219 | 1 | s, g | conserved hypothetical protein | 7 | * |
| GIP_L1_00180 | 1p | 78 | 0 | s | hypothetical protein | | * |
| GIP_L1_00190 | 1p | 198 | 1 | | hypothetical protein | 10 | * |
| GIP_L1_00200 | 2p | 143 | 1 | s | conserved hypothetical protein | 1 | u |
| GIP_L1_00210 | 2p | 494 | 2 | s | P494 protein | 8 | * |
| GIP_L1_00220 | 2p | 97 | 1 | s | hypothetical protein | 9 | * |
| GIP_L1_00230 | 2p | 147 | 1 | s | conserved hypothetical protein | 1 | * |
| GIP_L1_00240 | 2p | 582 | 2 | s | P494 protein | 8 | * |
| GIP_L1_00250 | 2p | 88 | 1 | | hypothetical protein | 9 | * |
| GIP_L1_00260 | 2p | 147 | 1 | s | conserved hypothetical protein | 1 | * |
| GIP_L1_00270 | 2p | 253 | 1 | s | conserved hypothetical protein | | * |
| GIP_L1_00280 | 3p | 320 | 1 | s | conserved hypothetical protein | | * |
| GIP_L1_00290 | 3p | 354 | 1 | s | conserved hypothetical protein | 12 | 0.09 |
| GIP_L1_00300 | 3p | 340 | 1 | s, g | P325 protein | 1 | 0.56 |
| GIP_L1_00310 | 3p | 226 | 1 | s | conserved hypothetical protein | 7 | 0.56 |
| GIP_L1_00320 | 3p | 241 | 1 | s, g | hypothetical protein | | 0.29 |
| GIP_L1_00330 | 3p | 444 | 1 | s | hypothetical protein | 10 | 2.12 |
| GIP_L1_00340 | 4p | 337 | 1 | s, g | P325 protein | 1 | 0.37 |
| GIP_L1_00350 | 4p | 106 | 1 | s | conserved hypothetical protein | 2 | * |
| GIP_L1_00360 | 4p | 597 | 2 | s | Ribonuclease T2 domain | 11 | 1.96 |
| GIP_L1_00370 | 4p | 898 | 2 | | conserved hypothetical protein | 4 | 0.64 |
| GIP_L1_00380 | 5p | 166 | 1 | s, t | hypothetical protein | 3 | * |
| GIP_L1_00390 | 5p | 171 | 1 | s | hypothetical protein | | 0.4 |
| GIP_L1_00400 | 5p | 430 | 1 | s, g | conserved hypothetical protein | | 0.55 |
| GIP_L1_00410 | 5p | 247 | 1 | s | conserved hypothetical protein | | 0.06 |
| GIP_L1_00420 | 5p | 215 | 1 | s | conserved hypothetical protein | 7 | 0.31 |
| GIP_L1_00430 | 5p | 108 | 1 | s, t | hypothetical protein | | * |
| GIP_L1_00440 | 5p | 767 | 1 | s | lipoprotein-like protein | 14 | 0.5 |
| GIP_L1_00450 | 5p | 581 | 0 | s | conserved hypothetical protein | 14 | 0.53 |
| GIP_L1_00460 | 5p | 348 | 1 | s | conserved hypothetical protein | 12 | 0.55 |
| GIP_L1_00470 | 5p | 304 | 1 | s | P325 protein | 1 | 2.21 |
| GIP_L1_00480 | 5p | 170 | 1 | s | conserved hypothetical protein | | * |
| GIP_L1_00490 | 6p | 279 | 1 | g | P325-like protein | 1 | 0.35 |
| GIP_L1_00500 | 6p | 109 | 1 | s | conserved hypothetical protein | 2 | 0.18 |
| GIP_L1_00510 | 6p | 140 | 1 | s | conserved hypothetical protein | 2 | * |
| GIP_L1_00520 | 6p | 100 | 1 | s | conserved hypothetical protein | 2 | 0.57 |
| GIP_L1_00530 | 6p | 101 | 1 | s | conserved hypothetical protein | 2 | 0.21 |
| GIP_L1_00540 | 6p | 106 | 1 | s | conserved hypothetical protein | 2 | * |
| GIP_L1_00550 | 6p | 293 | 1 | | Ribonuclease T2 domain | 11 | 0.51 |
| GIP_L1_00560 | 6p | 118 | 1 | | hypothetical protein | | * |
| GIP_L1_00570 | 6p | 896 | 2 | | conserved hypothetical protein | 4 | 0.54 |
| GIP_L1_00580 | 7p | 1066 | 2 | | conserved hypothetical protein | 4 | 0.57 |
| GIP_L1_00590 | 7p | 478 | 2 | s | conserved hypothetical protein | 6 | 0.75 |
| GIP_L1_00600 | 7p | 119 | 1 | s | conserved hypothetical protein | 13 | * |
| GIP_L1_00610 | 7p | 109 | 1 | s | conserved hypothetical protein | 6 | 0.59 |
| GIP_L1_00620 | 7p | 218 | 1 | | conserved hypothetical protein | | 0.74 |
| GIP_L1_00630 | 7p | 496 | 1 | s | conserved hypothetical protein | 13 | 0.58 |
| GIP_L1_00640 | 7p | 127 | 2 | s | conserved hypothetical protein | 6 | 0.57 |
| GIP_L1_00650 | 8p | 253 | 1 | s, t | EP1-like protein | | 6.01 |
| GIP_L1_00660 | 8p | 177 | 1 | s, g | conserved hypothetical protein | | 0.92 |
| GIP_L1_00670 | 8p | 1132 | 1 | s | dentin-like protein | | 0.72 |
| GIP_L1_00680 | III | 599 | 1 | s | hypothetical protein | | |
| GIP_L1_00690 | III | 130 | 1 | | hypothetical protein | | |
| GIP_L1_00700 | IV | 480 | 6 | | N-myristoyltransferase | | |
| GIP_L1_00710 | IV | 326 | 3 | | Hyaluronidase | | |
Gene identifier indicates the GenBank locus tag for each predicted gene. Region is the location of genes according to the delineations in Table 2. Sizes of the genes are given in amino acids. Signatures (Sigs) include "s" signal peptide, "t" trans-membrane domain, and "g" potential glycosylphosphatidylinositol anchor. Family indicates the gene family to which the predicted gene belongs, if any. dN/dS ratios are given when applicable; an "*" represents insufficient data to calculate a ratio, while a "u" represents a mathematically undefined ratio.
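Table 4. Single Nucleotide Polymorphisms (SNPs) in the viral genome segment sequences
| Viral genome segment | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | Total |
|---|---|---|---|---|---|---|---|---|---|
| SNPs | 351 | 107 | 270 | 166 | 316 | 354 | 421 | 174 | 2159 |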
| per Kbp | 17.53 | 4.55 | 16.52 | 12.18 | 12.27 | 18.32 | 18.79 | 17.40 | |
| Non-Coding | 232 | 91 | 149 | 74 | 195 | 239 | 269 | 102 | 1351 |
| Coding | 119 | 16 | 121 | 92 | 121 | 115 | 152 | 72 | 808 |
| Synonymous | 36 | 3 | 37 | 27 | 44 | 42 | 46 | 15 | 250 |
| Non-synonymous | 83 | 13 | 84 | 65 | 77 | 73 | 106 | 57 | 558 |
| Coverage | 10.1 | 11.5 | 9.8 | 9.8 | 16.3 | 10.9 | 10.3 | 5.2 | |
| R² | 0.25* | < 0.01 | 0.14 | < 0.01 | < 0.01 | 0.05 | 0.02 | 0.08 | |
*p < 0.05
Coverage indicates the average sequence coverage across each viral genome segment in the whole-genome shotgun data, and R² represents the correlation between the number of SNPs and sequence coverage. Only viral genome segment 1 showed a significant (p < 0.05) correlation between SNP density and coverage.
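For readers who want to reproduce this kind of check, the squared Pearson correlation between two paired series can be computed as sketched below. How SNP counts and coverage were paired within a segment (for example, per window along the sequence) is not stated here, so the windowed inputs are purely hypothetical and the significance test is not shown.

```python
def r_squared(x, y):
    """Squared Pearson correlation between two equal-length numeric series."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two series of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant series")
    return (sxy * sxy) / (sxx * syy)

# Hypothetical per-window values, not data from the paper
snps_per_window = [3, 5, 4, 8, 6]
coverage_per_window = [9.0, 10.5, 10.0, 12.5, 11.0]
print(r_squared(snps_per_window, coverage_per_window))
```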
Figure 4. Histogram of dN/dS ratios of 39 genes in the viral genome segments.
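The 39 ratios can be read from the numeric dN/dS entries of the annotation table above (entries marked "*" or "u" are skipped). The sketch below bins them with an assumed bin width of 0.5; the bin width used for the published figure is not given here.

```python
from collections import Counter

# Numeric dN/dS values copied from the annotation table, grouped by segment
DN_DS = [
    0.29, 0.51, 0.77, 0.54, 0.81, 0.38, 0.51, 1.04,  # 1p
    0.09, 0.56, 0.56, 0.29, 2.12,                    # 3p
    0.37, 1.96, 0.64,                                # 4p
    0.40, 0.55, 0.06, 0.31, 0.50, 0.53, 0.55, 2.21,  # 5p
    0.35, 0.18, 0.57, 0.21, 0.51, 0.54,              # 6p
    0.57, 0.75, 0.59, 0.74, 0.58, 0.57,              # 7p
    6.01, 0.92, 0.72,                                # 8p
]

BIN_WIDTH = 0.5  # assumed; not taken from the paper

def histogram(values, width):
    """Map each bin index k, covering [k*width, (k+1)*width), to its count."""
    return Counter(int(v // width) for v in values)

hist = histogram(DN_DS, BIN_WIDTH)
for k in sorted(hist):
    print(f"[{k * BIN_WIDTH:.1f}, {(k + 1) * BIN_WIDTH:.1f}): {hist[k]}")
```

With this binning, 34 of the 39 ratios fall below 1 and five exceed it.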