Scott R Cornman, Michael C Schatz, Spencer J Johnston, Yan-Ping Chen, Jeff Pettis, Greg Hunt, Lanie Bourgeois, Chris Elsik, Denis Anderson, Christina M Grozinger, Jay D Evans.
Abstract
BACKGROUND: The ectoparasitic mite Varroa destructor has emerged as the primary pest of domestic honey bees (Apis mellifera). Here we present an initial survey of the V. destructor genome carried out to advance our understanding of Varroa biology and to identify new avenues for mite control. This sequence survey provides immediate resources for molecular and population-genetic analyses of Varroa-Apis interactions and defines the challenges ahead for a comprehensive Varroa genome project.
Year: 2010 PMID: 20973996 PMCID: PMC3091747 DOI: 10.1186/1471-2164-11-602
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. 2C genome size estimate for V. destructor based on flow cytometry, normalized to the Drosophila virilis genome (A). The V. destructor fluorescence peak (B) corresponds to a genome size of 0.577 picograms or 565 ± 3 Mbp.
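The mass-to-length conversion in the caption can be checked with a short sketch. It assumes the widely used factor of 978 Mbp per picogram of DNA; the caption does not say which factor the authors applied, so `MBP_PER_PG` and both function names are illustrative only.

```python
# Assumed conversion factor: 1 pg of double-stranded DNA ~ 978 Mbp.
# The source caption does not state the factor it used.
MBP_PER_PG = 978.0


def picograms_to_mbp(mass_pg: float) -> float:
    """Convert a DNA mass in picograms to a length in megabase pairs."""
    return mass_pg * MBP_PER_PG


def size_from_peaks(sample_peak: float, standard_peak: float,
                    standard_size_pg: float) -> float:
    """Scale a reference genome size by the ratio of fluorescence peaks.

    This is the usual flow-cytometry normalization: the sample genome
    size is proportional to its peak position relative to the standard.
    """
    return standard_size_pg * sample_peak / standard_peak


if __name__ == "__main__":
    # 0.577 pg is the value reported in the caption.
    print(f"{picograms_to_mbp(0.577):.1f} Mbp")
```

Under this assumed factor, 0.577 pg converts to roughly 564 Mbp, which agrees with the reported 565 ± 3 Mbp.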
Statistics of the Varroa destructor genome sequence survey
| Statistic | Initial assembly | Filtered assembly |
|---|---|---|
| Number of contigs | 271,543 | 184,094 |
| Sum of contig length (Mbp) | 318 | 294 |
| Maximum contig length (bp) | 18,703 | 16,332 |
| Mean contig length (bp) | 1,170.4 | 1,597.5 |
| N50 contig length (bp) | 2,107 | 2,262 |
| Contigs ≥ 1,000 bp | 107,195 | 105,621 |
| Contigs ≥ 5,000 bp | 5,407 | 5,374 |
| Contigs ≥ 10,000 bp | 120 | 118 |
| Mean coverage | 4.3 | 5.0 |
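For readers unfamiliar with the N50 statistic in the table, the sketch below shows one common way to compute it along with the other summary values. The contig lengths are a toy list, not data from the survey, and the function names are hypothetical.

```python
from typing import Dict, Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50: the length L such that contigs of length >= L
    together contain at least half of the total assembled bases."""
    if not lengths:
        raise ValueError("need at least one contig length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise AssertionError("unreachable for non-negative lengths")


def assembly_stats(lengths: Sequence[int]) -> Dict[str, float]:
    """Summary statistics analogous to the rows of the table."""
    return {
        "num_contigs": len(lengths),
        "sum_bp": sum(lengths),
        "max_bp": max(lengths),
        "mean_bp": sum(lengths) / len(lengths),
        "n50_bp": n50(lengths),
        "contigs_ge_1000": sum(1 for n in lengths if n >= 1000),
    }


if __name__ == "__main__":
    toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]  # toy values, not survey data
    print(assembly_stats(toy_lengths))
```

In the toy list the two longest contigs (8 + 8 = 16) already cover half of the 32 total bases, so the N50 is 8.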
Figure 2. Scatterplot of contig coverage versus length for the unfiltered assembly. Fold sequence coverage of assembled contigs is distributed relatively narrowly around the mean, with few outliers. Contigs with coverage within the range 2-10× (shaded green) were included for further analysis as putative nuclear genomic contigs of V. destructor (see text for details). Short, high-coverage contigs are predominantly mitochondrial or ribosomal in nature. A cluster of longer contigs with approximately 30-50× coverage appears to derive from a baculovirus (see text); contigs in this coverage range are shaded in yellow.
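The coverage ranges named in the caption can be expressed as a simple binning rule. This is a minimal sketch: the label strings, the function name, and the choice of inclusive bounds are assumptions, since the caption gives only the ranges themselves.

```python
# Coverage windows taken from the caption; inclusive bounds are an assumption.
NUCLEAR_RANGE = (2.0, 10.0)   # shaded green in the figure
VIRAL_RANGE = (30.0, 50.0)    # shaded yellow in the figure


def classify_by_coverage(coverage: float) -> str:
    """Assign a contig to a coverage bin (labels are hypothetical)."""
    if NUCLEAR_RANGE[0] <= coverage <= NUCLEAR_RANGE[1]:
        return "putative_nuclear"
    if VIRAL_RANGE[0] <= coverage <= VIRAL_RANGE[1]:
        return "putative_viral"
    return "other"


if __name__ == "__main__":
    for cov in (1.5, 4.3, 40.0, 200.0):
        print(cov, classify_by_coverage(cov))
```

Coverage alone does not establish origin; in the survey it is combined with sequence similarity and G+C content.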
Figure 3. Scatterplot of contig G+C content versus coverage. Scatterplot of contig G+C content versus fold sequence coverage shows a clear mode of G+C content in the range of 32-56%. The long tail of low G+C contigs includes low-complexity sequences such as AT repeats, as well as mitochondrial and ribosomal contigs. A secondary mode of high G+C contigs is also apparent; these contigs include many BLAST matches to the bacterial order Actinomycetales (see text).
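G+C content, the quantity on one axis of this figure, is the fraction of unambiguous bases that are G or C. The sketch below is one straightforward way to compute it; excluding ambiguous bases such as N from the denominator is an assumption, not something the caption specifies.

```python
def gc_content(seq: str) -> float:
    """Fraction of A/C/G/T bases that are G or C.

    Ambiguous characters (for example N) are ignored, which is an
    assumption of this sketch rather than a documented convention
    of the survey.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    if gc + at == 0:
        raise ValueError("sequence contains no unambiguous bases")
    return gc / (gc + at)


if __name__ == "__main__":
    print(gc_content("ATGC"))      # balanced toy sequence
    print(gc_content("ATATATAT"))  # AT repeat, as in the low-G+C tail
```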
Figure 4. Evidence that high G+C contigs are bacterial. Contigs with a BLASTX match to the bacterial order Actinomycetales (at an expectation of 10⁻⁸ or less), plotted as a function of G+C content and contig length. Points are color-coded according to the taxonomy of the best GenBank match overall. There is a clean separation between contigs with lower G+C that are more similar, by percent identity and expectation of BLASTX sequence alignments, to arthropod sequences and contigs with higher G+C that are more similar to actinomycete sequences.
Frequency of infection of individual mites by a novel actinomycete bacterium identified in the V. destructor sequence survey
| Class | Present | Absent |
|---|---|---|
| Male | 4 | 8 |
| Female | 11 | 7 |
| Nymph | 2 | 16 |
| Egg | 0 | 5 |
Presence or absence of the bacterium is based on a PCR assay (see text for details).
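The counts in the table translate directly into per-class detection frequencies. The sketch below only restates the table arithmetic; the variable and function names are illustrative, and no statistical test from the paper is implied.

```python
# (present, absent) counts copied from the table above.
COUNTS = {
    "Male": (4, 8),
    "Female": (11, 7),
    "Nymph": (2, 16),
    "Egg": (0, 5),
}


def prevalence(present: int, absent: int) -> float:
    """Fraction of assayed individuals in which the bacterium was detected."""
    return present / (present + absent)


per_class = {name: prevalence(p, a) for name, (p, a) in COUNTS.items()}
total_present = sum(p for p, _ in COUNTS.values())
total_tested = sum(p + a for p, a in COUNTS.values())

if __name__ == "__main__":
    for name, frac in per_class.items():
        print(f"{name}: {frac:.1%}")
    print(f"Overall: {total_present}/{total_tested}")
```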
Evidence for a novel virus related to the Baculoviridae in the sequenced sample of Varroa destructor
| ORF | Pfam domain description | Expectation | Reported in Baculoviridae? |
|---|---|---|---|
| VDK00007920-4466_1 | Ribonucleotide reductase, barrel domain | 5.50E-231 | Yes |
| VDK00121146-847_1 | Ribonucleotide reductase, small chain | 1.30E-125 | Yes |
| VDK00001240-6963_1 | Thymidylate synthase | 3.60E-114 | Yes |
| VDK00008686-4345_1 | Kinesin motor domain | 3.90E-060 | No* |
| VDK00103915-1040_1 | Reverse transcriptase | 7.80E-035 | Yes |
| VDK00064516-1660_1 | BRO family, N-terminal domain | 1.60E-017 | Yes |
| VDK00001041-7192_1 | Chitin binding domain | 1.50E-010 | Yes |
| VDK00192648-381_1 | Pacifastin inhibitor (LCMII) | 3.60E-008 | No |
| VDK00158309-530_2 | Baculovirus hypothetical protein | 1.10E-006 | Yes |
| VDK00025611-2890_1 | Matrixin (matrix metalloprotease) | 3.80E-006 | Yes |
| VDK00139482-672_1 | Protein of unknown function (DUF666) | 3.90E-006 | Yes |
| VDK00179897-428_1 | Phosphatidylinositol-specific phospholipase | 5.60E-006 | No* |
| VDK00008449-4382_2 | Protein of unknown function (DUF686) | 9.90E-006 | Yes |
| VDK00099267-1099_1 | Zinc knuckle (retroviral gag protein) | 1.20E-004 | Yes |
| VDK00107278-999_1 | Baculovirus BRO family, N-terminal domain | 1.80E-004 | Yes |
| VDK00158309-530_3 | Gamma-glutamyltranspeptidase | 2.60E-004 | No* |
| VDK00071395-1528_1 | Collagen triple helix repeat | 8.20E-004 | No* |
| VDK00038309-2357_1 | Collagen triple helix repeat | 9.20E-004 | No* |
| VDK00202991-345_3 | Amelogenin (cell adhesion protein) | 2.20E-003 | No |
| VDK00042225-2226_1 | Alpha/beta hydrolase fold | 2.40E-003 | No* |
| VDK00021332-3129_1 | Phage integrase family | 3.00E-003 | No* |
| VDK00104983-1027_1 | Collagen triple helix repeat | 4.20E-003 | No* |
| VDK00202991-345_2 | Collagen triple helix repeat | 4.20E-003 | No* |
| VDK00043167-2196_3 | Protein of unknown function (DUF686) | 9.10E-003 | Yes |
| VDK00073176-1494_3 | Gamma-glutamyltranspeptidase | 9.20E-003 | No* |
| VDK00048355-2046_1 | Matrixin (matrix metalloprotease) | 0.01 | Yes |
Pfam matches are shown for methionine-initiated ORFs of 90 amino acids or more that were encoded by contigs at least 1 kb in length and with 30×-60× coverage. Collectively the matches are consistent with a viral origin and include several domains characteristic of Baculoviridae. Only matches with an expectation of 0.01 or less are shown. *Domain reported from at least one virus in the Pfam [36] and InterPro [66] databases. **Present in one baculovirus accession (Q65353 of UniProt [67]) due to a retrotransposon insertion.
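The ORF selection rule in the footnote (methionine-initiated, at least 90 amino acids) can be illustrated with a small six-frame scan. This is a sketch under stated assumptions, not the authors' pipeline: it uses the standard stop codons, ignores ORFs that run off the end of a contig, and the `Orf` type and `find_orfs` name are hypothetical.

```python
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code (assumed)
COMPLEMENT = str.maketrans("ACGT", "TGCA")


class Orf(NamedTuple):
    strand: str     # "+" or "-"
    start: int      # 0-based, forward-strand coordinate, inclusive
    end: int        # forward-strand coordinate, exclusive (stop included)
    aa_length: int  # residues from the initiating Met, stop excluded


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq: str, min_aa: int = 90) -> List[Orf]:
    """Scan all six frames for ATG...stop ORFs of at least min_aa residues."""
    seq = seq.upper()
    n = len(seq)
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            orf_start = None
            for i in range(frame, n - 2, 3):
                codon = s[i:i + 3]
                if orf_start is None:
                    if codon == "ATG":
                        orf_start = i  # first Met after the previous stop
                elif codon in STOP_CODONS:
                    aa_len = (i - orf_start) // 3
                    if aa_len >= min_aa:
                        begin, end = orf_start, i + 3
                        if strand == "-":
                            # Map back to forward-strand coordinates.
                            begin, end = n - end, n - begin
                        orfs.append(Orf(strand, begin, end, aa_len))
                    orf_start = None
    return orfs


if __name__ == "__main__":
    toy = "ATG" + "GCT" * 89 + "TAA"  # toy 90-residue ORF, not survey data
    print(find_orfs(toy))
```

Predicted peptides from such a scan would then be searched against Pfam; that step is outside the scope of this sketch.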
The Varroa destructor glycolysis/gluconeogenesis pathway is well represented in the genome sequence survey
| Glycolysis/gluconeogenesis enzyme | KEGG ID | Annotated in | Annotated in | Closest contig match in the | Strand | Start | Stop | E-value |
|---|---|---|---|---|---|---|---|---|
| 6-phosphofructokinase | K00850 | X | x | VDK00166959 | + | 274 | 486 | 6.00E-027 |
| acetyl-CoA synthetase | K01895 | X | x | VDK00052872 | - | 313 | 1554 | 3.00E-085 |
| aldehyde dehydrogenase | K00128 | X | x | VDK00013090 | + | 2385 | 3224 | 1.00E-138 |
| dihydrolipoamide dehydrogenase | K00382 | X | x | VDK00011534 | - | 343 | 917 | 1.00E-020 |
| enolase | K01689 | X | x | VDK00029529 | - | 752 | 2211 | 7.00E-143 |
| fructose-1,6-bisphosphatase | K03841 | X | x | VDK00132162 | + | 158 | 355 | 1.00E-023 |
| fructose-bisphosphate aldolase | K01623 | X | x | VDK00012888 | + | 801 | 2121 | 1.00E-073 |
| glucose-6-phosphate isomerase | K01810 | X | x | VDK00034893 | + | 2 | 145 | 6.00E-018 |
| glyceraldehyde 3-phosphate dehydrogenase | K00134 | X | x | VDK00020468 | - | 142 | 2477 | 2.00E-077 |
| hexokinase | K00844 | X | x | VDK00023511 | - | 313 | 1522 | 2.00E-094 |
| phosphoenolpyruvate carboxykinase | K01596 | X | x | VDK00063370 | - | 1166 | 1684 | 1.00E-038 |
| phosphoglucomutase | K01835 | X | x | VDK00033184 | - | 236 | 478 | 3.00E-024 |
| phosphoglycerate kinase | K00927 | X | x | VDK00052433 | + | 351 | 1229 | 4.00E-032 |
| phosphoglycerate mutase | K01834 | X | x | VDK00033074 | - | 257 | 2550 | 5.00E-031 |
| pyruvate dehydrogenase E1 component, subunit alpha | K00161 | X | x | VDK00079694 | - | 2 | 1382 | 8.00E-043 |
| pyruvate dehydrogenase E1 component, subunit beta | K00162 | X | x | VDK00041927 | - | 527 | 2110 | 2.00E-020 |
| pyruvate dehydrogenase E2 component | K00627 | X | x | VDK00062508 | - | 165 | 416 | 4.00E-033 |
| pyruvate kinase | K00873 | X | x | VDK00096436 | - | 676 | 1131 | 4.00E-067 |
| S-(hydroxymethyl)glutathione dehydrogenase | K00121 | X | x | VDK00003663 | - | 1731 | 2700 | 4.00E-125 |
| glucose-6-phosphate 1-epimerase | K01792 | X | x | | | | | |
| triosephosphate isomerase | K01803 | X | x | | | | | |
| aldose 1-epimerase | K01785 | X | | | | | | |
| L-lactate dehydrogenase | K00016 | X | | | | | | |
Putative pathway components listed above were identified by BLASTX, using KEGG-annotated components [40] from Anopheles gambiae and Ixodes scapularis as search queries.
Protein-encoding transposable elements in V. destructor
| Class | Family | Number identified |
|---|---|---|
| DNA | TcMar-Mariner | 6511 |
| DNA | TcMar-Tc1 | 357 |
| DNA | hAT | 58 |
| DNA | TcMar-Fot1 | 27 |
| DNA | MuDR | 12 |
| Helitron | Helitron | 338 |
| LINE | R1 | 981 |
| LINE | L2 | 93 |
| LINE | CR1 | 83 |
| LINE | BovB | 47 |
| LINE | Jockey | 28 |
| LINE | L1 | 17 |
| LTR | Gypsy | 914 |
| LTR | Pao | 102 |
| LTR | Copia | 74 |
| LTR | Gypsy-Cigr | 31 |
Transposable elements identified with the protein-level repeat masking mode of RepeatMasker [44]. Class abbreviations are DNA, type II DNA transposon; LINE, long interspersed nuclear element; LTR, long-terminal repeat retrotransposon. Family designations are those of RepBase [68].
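Summing the table by class makes the relative abundance of each transposable-element class easier to see. The sketch below only aggregates the counts listed above; the variable names are illustrative.

```python
from collections import defaultdict

# (class, family, number identified) copied from the table above.
TE_COUNTS = [
    ("DNA", "TcMar-Mariner", 6511), ("DNA", "TcMar-Tc1", 357),
    ("DNA", "hAT", 58), ("DNA", "TcMar-Fot1", 27), ("DNA", "MuDR", 12),
    ("Helitron", "Helitron", 338),
    ("LINE", "R1", 981), ("LINE", "L2", 93), ("LINE", "CR1", 83),
    ("LINE", "BovB", 47), ("LINE", "Jockey", 28), ("LINE", "L1", 17),
    ("LTR", "Gypsy", 914), ("LTR", "Pao", 102), ("LTR", "Copia", 74),
    ("LTR", "Gypsy-Cigr", 31),
]

class_totals = defaultdict(int)
for te_class, _family, count in TE_COUNTS:
    class_totals[te_class] += count

grand_total = sum(class_totals.values())

if __name__ == "__main__":
    for te_class, total in class_totals.items():
        print(f"{te_class}: {total} ({total / grand_total:.1%})")
    print(f"Total: {grand_total}")
```

By this tally TcMar-Mariner elements account for about two thirds of all protein-encoding elements identified.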
Figure 5. A distance-measure dendrogram of five arthropod taxa based on aligned blocks of conserved peptides. The unrooted tree was constructed with Phylip [64] using neighbor joining and the JTT exchange matrix as described in the text. The tree was drawn with TreeView [65].
Distribution of contigs that were designated bacterial by BLAST analysis, sorted by phylum
| Phylum | Number of contigs |
|---|---|
| Actinomycete | 1035 |
| α-proteobacteria | 10 |
| Aquificae | 1 |
| Bacteroides | 2 |
| β-proteobacteria | 13 |
| Cyanobacteria | 3 |
| δ-proteobacteria | 3 |
| Euryarchaeota | 1 |
| Firmicutes | 4 |
| γ-proteobacteria | 12 |