Jozef I Nissimov, Charlotte A Worthy, Paul Rooks, Johnathan A Napier, Susan A Kimmance, Matthew R Henn, Hiroyuki Ogata, Michael J Allen.
Abstract
The Coccolithoviridae is a recently discovered group of viruses that infect the marine coccolithophorid Emiliania huxleyi. Emiliania huxleyi virus 84 (EhV-84) has a 160-180 nm diameter icosahedral structure and a genome of approximately 400 kbp. Here we describe the structural and genomic features of this virus, together with a near-complete draft genome sequence (~99%) and its annotation. This is the fourth genome sequence of a member of the coccolithovirus family.
Keywords: algae; coccolithovirus; marine; phycodnavirus; virus
Year: 2011 PMID: 22180805 PMCID: PMC3236045 DOI: 10.4056/sigs.1884581
Source DB: PubMed Journal: Stand Genomic Sci ISSN: 1944-3277
Figure 1. Multiple sequence alignment of the DNA pol (DNA polymerase) gene of ten coccolithoviruses (EhVs) and five other algal viruses. The evolutionary history was inferred using the Neighbor-Joining method [9]. The bootstrap consensus tree inferred from 1000 replicates is taken to represent the evolutionary history of the taxa analyzed [10]. The percentage of replicate trees in which the associated taxa clustered together in the bootstrap test (1000 replicates) is shown next to the branches when greater than 50% [10]. The evolutionary distances were computed using the Maximum Composite Likelihood method [11] and are in units of the number of base substitutions per site. All positions containing gaps and missing data were eliminated from the dataset (complete deletion option). Phylogenetic analyses were conducted in MEGA4 [12].
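The complete deletion step mentioned in the caption can be illustrated with a minimal Python sketch. It assumes a toy alignment with hypothetical sequence names and treats `-`, `?` and `N` as gap or missing-data symbols; it is not the MEGA4 implementation and does not reproduce the distance or tree-building steps.

```python
# Symbols treated as gaps or missing data (an assumption for this sketch).
GAP_OR_MISSING = set("-?N")


def complete_deletion(alignment):
    """Drop every alignment column in which any sequence has a gap or missing symbol."""
    names = list(alignment)
    seqs = [alignment[name] for name in names]
    if len({len(seq) for seq in seqs}) != 1:
        raise ValueError("all aligned sequences must have the same length")
    # zip(*seqs) iterates over alignment columns.
    kept = [col for col in zip(*seqs) if not GAP_OR_MISSING.intersection(col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


# Hypothetical toy alignment, not data from the paper.
toy = {
    "virus_A": "ATG-CGTA",
    "virus_B": "ATGACGTA",
    "virus_C": "ATGAC?TA",
}
cleaned = complete_deletion(toy)
print(cleaned)
```

Columns 4 and 6 of the toy alignment each contain a gap or missing symbol, so only the six fully resolved columns are kept for every sequence.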
Figure 2. Transmission electron micrograph of an EhV-84 virion.
Classification and general features of Emiliania huxleyi virus 84 according to the MIGS recommendations [15].
| MIGS ID | Property | Term | Evidence code |
|---|---|---|---|
| | Current | Domain: | TAS [ |
| | Virion shape | Icosahedral | IDA |
| MIGS-6 | Habitat | Oceanic, Coastal | TAS [ |
| MIGS-15 | Biotic relationship | Obligate intracellular parasite of | TAS [ |
| MIGS-14 | Pathogenicity | Lytic virus of | TAS [ |
| MIGS-4 | Geographic location | English Channel, UK | TAS [ |
| MIGS-5 | Sample collection | July 26, 1999 | TAS [ |
| MIGS-4.1 | Latitude | 50.15 N | TAS [ |
| MIGS-4.3 | Depth | 15 m | TAS [ |
Evidence codes - IDA: Inferred from Direct Assay; TAS: Traceable Author Statement (i.e., a direct report exists in the literature); NAS: Non-traceable Author Statement (i.e., not directly observed for the living, isolated sample, but based on a generally accepted property for the species, or anecdotal evidence). These evidence codes are from the Gene Ontology project [17].
Genome sequencing project information
| MIGS ID | Property | Term |
|---|---|---|
| MIGS-31 | Finishing quality | Finished (>99%) |
| | Number of contigs | 9 |
| | Average contig size | 43,980 |
| | Largest contig size | 97,445 |
| | Assembly size (using large contigs) | 395,820 |
| | Assembly coverage ("peak Depth") | 36.16 |
| | Total number of reads used | 28,526 |
| MIGS-29 | Sequencing platforms | 454 |
| MIGS-30 | Assemblers | Newbler Version 2.3 PostRelease-11.19.2009 |
| MIGS-32 | Gene calling method | Broad Institute Automated Phage Annotation Protocol [ |
| | GenBank ID | JF974290 |
| | GOLD ID | N/A |
| | Project relevance | Gordon & Betty Moore Foundation's Marine Microbiology Initiative. Emiliania huxleyi virus 84- G3248. |
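As a quick consistency check on the table above, the average contig size follows from the assembly size and the number of contigs. The short Python sketch below uses only values from the table; the variable names are illustrative.

```python
# Values taken from the sequencing project table; names are illustrative.
assembly_size_bp = 395_820
n_contigs = 9
largest_contig_bp = 97_445

average_contig_size = assembly_size_bp / n_contigs
largest_contig_share = round(100 * largest_contig_bp / assembly_size_bp, 2)

print(average_contig_size)   # matches the tabulated 43,980
print(largest_contig_share)  # percentage of the assembly held in the largest contig
```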
De novo assembly metrics for EhV-84
| Total reads | Contigs | Largest contig (bp) | Total contig length* (bp) | | |
|---|---|---|---|---|---|
| 28526 | 9 | 97445 | 395820 | 36.9 ± 3.6 | 94.9 |
*Total contig length does not include bp for gaps of unknown size
Genome statistics of EhV-84
| Attribute | Value | % of totalᵃ |
|---|---|---|
| Size (bp) | 396,620 | 100.00% |
| G+C content (bp) | 158,983 | 40.17% |
| Coding region (bp) | 334,463 | 84.33% |
| Total genesᵇ | 486 | 100.00% |
| RNA genes | 4 | 0.82% |
| Protein-coding genes | 482 | 99.18% |
| Protein-coding genes with function prediction | 85 | 17.49% |
| Genes in paralog clusters | 15 | 3.09% |
| Genes with signal peptides | 142 | 29.22% |
ᵃ The total is based on either the size of the genome in base pairs or the total number of protein-coding genes in the annotated genome, where applicable.
ᵇ Includes 18 pseudogenes.
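The percentage column can be recomputed from the tabulated values. The Python sketch below is a worked check under two assumptions: gene-based rows are divided by the 486 total genes, and base-pair rows by the 396,620 bp genome size. The G+C fraction is also computed against the 395,820 bp assembly size from the sequencing table, since that denominator is the one that reproduces the tabulated 40.17%; this is an observation from the arithmetic, not a statement by the authors.

```python
def pct(part, total):
    """Percentage of `part` in `total`, rounded to two decimals."""
    return round(100 * part / total, 2)


GENOME_SIZE_BP = 396_620    # genome statistics table
ASSEMBLY_SIZE_BP = 395_820  # sequencing project table
TOTAL_GENES = 486

gene_counts = {
    "RNA genes": 4,
    "Protein-coding genes": 482,
    "With function prediction": 85,
    "In paralog clusters": 15,
    "With signal peptides": 142,
}
gene_pct = {name: pct(n, TOTAL_GENES) for name, n in gene_counts.items()}

coding_pct = pct(334_463, GENOME_SIZE_BP)
gc_pct_genome = pct(158_983, GENOME_SIZE_BP)
gc_pct_assembly = pct(158_983, ASSEMBLY_SIZE_BP)

for name, value in gene_pct.items():
    print(f"{name}: {value}%")
print(f"Coding region: {coding_pct}%")
print(f"G+C vs genome size: {gc_pct_genome}%, vs assembly size: {gc_pct_assembly}%")
```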
Figure 3. Graphical circular map of the 396,620 bp EhV-84 genome. The outside scale is numbered clockwise in bp. Circles 1 and 2 (from outside in) denote CDSs (forward and reverse strands, respectively). Circle 3 represents the nine contigs that were used to construct the draft genome using the EhV-86 genome as the reference, and circle 4 is the G+C content.
Number of genes associated with COG functional categories.
| Code | Value | % of total | Description |
|---|---|---|---|
| G | 1 | 1.41 | Carbohydrate transport and metabolism |
| D | 3 | 4.23 | Cell cycle control, cell division, chromosome partitioning |
| M | 2 | 2.82 | Cell wall/membrane/envelope biogenesis |
| H | 2 | 2.82 | Coenzyme transport and metabolism |
| S | 6 | 8.45 | Function unknown |
| R | 7 | 9.86 | General function prediction only |
| P | 1 | 1.41 | Inorganic ion transport and metabolism |
| U | 2 | 2.82 | Intracellular trafficking, secretion, and vesicular transport |
| I | 5 | 7.04 | Lipid transport and metabolism |
| F | 6 | 8.45 | Nucleotide transport and metabolism |
| O | 8 | 11.27 | Posttranslational modification, protein turnover, chaperones |
| L | 11 | 15.49 | Replication, recombination and repair |
| A | 4 | 5.63 | RNA processing and modification |
| T | 1 | 1.41 | Signal transduction mechanisms |
| K | 12 | 16.90 | Transcription |
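The percentages in the COG table are consistent with each count being divided by the sum of all category counts in the table. A minimal Python sketch of that calculation, using only the tabulated counts (the variable names are illustrative):

```python
# Gene counts per COG category code, copied from the table above.
cog_counts = {
    "G": 1, "D": 3, "M": 2, "H": 2, "S": 6, "R": 7, "P": 1, "U": 2,
    "I": 5, "F": 6, "O": 8, "L": 11, "A": 4, "T": 1, "K": 12,
}

total_assignments = sum(cog_counts.values())
cog_pct = {
    code: round(100 * n / total_assignments, 2) for code, n in cog_counts.items()
}

print(total_assignments)
# List categories from most to least represented.
for code, value in sorted(cog_pct.items(), key=lambda item: -item[1]):
    print(code, value)
```

Transcription (K) and replication, recombination and repair (L) come out as the two largest categories, as in the table.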
CDSs with functional predictions identified in both EhV-84 and EhV-86 genomes [1]†.
| EhV-84 CDS | EhV-86 CDS and predicted function | |
|---|---|---|
| ENVG 00127 | ehv014 Longevity-assurance (LAG1) family protein | 100 |
| ENVG 00131 | ehv018 flap endonuclease-1 | 100 |
| ENVG 00133 | ehv020 putative proliferating cell nuclear antigen | 99.61 |
| ENVG 00134 | ehv021 putative serine protease | 100 |
| ENVG 00135 | ehv022 phosphoglycerate mutase family protein | 99.07 |
| ENVG 00136 | ehv023 putative deoxycytidylate (dCMP) deaminase | 98.27 |
| ENVG 00139 | ehv026 ribonucleoside-diphosphate reductase small subunit | 99.38 |
| ENVG 00142 | ehv028 putative lipase | 100 |
| ENVG 00144 | ehv030 putative DNA polymerase delta catalytic subunit | 100 |
| ENVG 00145¹ | ehv031 putative sterol desaturase | 100 |
| ENVG 00149 | ehv035 putative membrane protein | 100 |
| ENVG 00156 | ehv041 putative endonuclease | 58.33 |
| ENVG 00165¹ | ehv050 serine myristoyl transferase | 100 |
| ENVG 00176 | ehv060 putative lectin protein | 100 |
| ENVG 00177² | ehv061 putative fatty acid desaturase | 100 |
| ENVG 00178 | ehv062 putative membrane protein | 100 |
| ENVG 00180 | ehv064 DNA-dependent RNA polymerase II largest subunit | 100 |
| ENVG 00181 | ehv064 DNA-dependent RNA polymerase II largest subunit beta | 100 |
| ENVG 00194¹ | ehv077 putative transmembrane fatty acid elongation protein | 100 |
| ENVG 00196¹ | ehv079 putative lipid phosphate phosphatase | 100 |
| ENVG 00202 | ehv085 major capsid protein | 99.81 |
| ENVG 00205 | ehv088 putative membrane protein | 99.02 |
| ENVG 00382 | ehv101 putative hydrolase | 100 |
| ENVG 00380 | ehv103 putative vesicle-associated membrane protein | 100 |
| ENVG 00379 | ehv104 putative DNA helicase | 99.81 |
| ENVG 00378 | ehv105 transcription factor S-II (TFIIS) family protein | 100 |
| ENVG 00375 | ehv108 putative DNA-directed RNA polymerase subunit | 100 |
| ENVG 00374 | ehv109 OTU-like cysteine protease | 100 |
| ENVG 00373 | ehv110 putative RING finger protein | 100 |
| ENVG 00370 | ehv113 bifunctional dihydrofolate reductase-thymidylate synthase | 99.79 |
| ENVG 00367 | ehv116 putative membrane protein | 100 |
| ENVG 00366 | ehv117 putative phosphate permease/sodium-phosphate symporter | 100 |
| ENVG 00356 | ehv128 ERV1/ALR family protein | 98.22 |
| ENVG 00353 | ehv131 putative membrane protein | 95.08 |
| ENVG 00351 | ehv133 putative ATP-dependent protease proteolytic subunit | 97.90 |
| ENVG 00348 | ehv136 putative nucleic acid-binding protein | 98.58 |
| ENVG 00293 | ehv137 putative membrane protein | 24.90 |
| ENVG 00429 | ehv151 putative serine protease | 96.94 |
| ENVG 00399 | ehv166 putative RING finger protein | 97.93 |
| ENVG 00400 | ehv167 putative DNA-directed RNA polymerase subunit | 100 |
| ENVG 00413 | ehv179 Major Facilitator Superfamily protein/transporter | 99.63 |
| ENVG 00423 | ehv187 putative membrane protein | 72.00 |
| ENVG 00445 | ehv192 putative membrane protein | 94.06 |
| ENVG 00478 | ehv207 putative membrane protein | 100 |
| ENVG 00287 | ehv230 putative endonuclease V | 99.22 |
| ENVG 00307 | ehv246 putative lectin protein | 96.46 |
| ENVG 00232 | ehv315 putative membrane protein | 100 |
| ENVG 00264 | ehv349 putative protease | 100 |
| ENVG 00273 | ehv358 putative thioredoxin | 98.73 |
| ENVG 00276 | ehv361 putative serine protease | 97.14 |
| ENVG 00278 | ehv363 putative esterase | 97.72 |
| ENVG 00002 | ehv364 putative membrane protein | 100 |
| ENVG 00295 | ehv364 putative membrane protein | 34.55 |
| ENVG 00035 | ehv397 putative deoxyuridine 5'-triphosphate nucleotidohydrolase | 100 |
| ENVG 00037 | ehv399 putative DNA-directed RNA polymerase subunit | 100 |
| ENVG 00039 | ehv401 putative ribonuclease HII | 99.52 |
| ENVG 00054² | ehv415 putative delta 9 acyl-lipid fatty acid desaturase | 100 |
| ENVG 00055 | ehv416 putative membrane protein | 100 |
| ENVG 00070 | ehv428 putative ribonucleoside-diphosphate reductase protein | 98.79 |
| ENVG 00074 | ehv431 putative thymidylate kinase | 99.69 |
| ENVG 00077 | ehv434 putative DNA-directed RNA polymerase II subunit B | 99.74 |
| ENVG 00083 | ehv440 putative proliferating cell nuclear antigen | 100 |
| ENVG 00087 | ehv444 putative DNA topoisomerase | 99.64 |
| ENVG 00091 | ehv447 putative serine protease | 100 |
| ENVG 00095 | ehv451 putative protein kinase | 100 |
| ENVG 00097 | ehv453 putative mRNA capping enzyme | 99.47 |
| ENVG 00100 | ehv455 putative sialidase | 100 |
| ENVG 00104 | ehv459 putative nucleic acid independent nucleoside triphosphatase | 100 |
| ENVG 00111 | ehv465 putative thioredoxin protein | 100 |
† Including their sequence homologs (coding sequences) in EhV-84, based on TBLASTN (translated nucleotide database).
¹ Genes involved in sphingolipid biosynthesis.
² Genes encoding desaturases.