Fábio P Dornas, Felipe L Assis, Sarah Aherfi, Thalita Arantes, Jônatas S Abrahão, Philippe Colson, Bernard La Scola.
Abstract
In 2003, Acanthamoeba polyphaga mimivirus (APMV) was discovered parasitizing Acanthamoeba. It exhibited remarkable features, notably unusual genomic characteristics, and founded the viral family Mimiviridae. Subsequently, a second family of giant amoebal viruses, Marseilleviridae, was described; its prototype member, Marseillevirus, was discovered in 2009. Currently, the genomes of seven different members of this family have been fully sequenced. Previous phylogenetic analysis suggested the existence of three Marseilleviridae lineages: A, B and C. Here, we describe a new member of this family, Brazilian Marseillevirus (BrMV), which was isolated from a Brazilian sample and whose genome was fully sequenced and analyzed. Surprisingly, data from phylogenetic analyses and comparative genomics, including mean amino acid identity between BrMV and other Marseilleviridae members and analyses of the core genome and pan-genome of marseilleviruses, indicated that this virus can be assigned to a new Marseilleviridae lineage. Although the BrMV genome is one of the smallest among Marseilleviridae members, it harbors the second largest gene content in this family. In addition, the BrMV genome encodes 29 ORFans. We report the isolation and genome analyses of the BrMV strain and propose its classification as the prototype virus of a new lineage D within the family Marseilleviridae.
Keywords: Brazilian marseillevirus; Marseilleviridae; Marseillevirus; genomic analyses; giant virus; lineage D
Year: 2016 PMID: 26978387 PMCID: PMC4810266 DOI: 10.3390/v8030076
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Analysis of Brazilian Marseillevirus ORFans by Position-Specific Iterated Basic Local Alignment Search Tool (PSI-BLAST). ORFs with no significant similarity found are not presented.
| ORFan ID | Protein Identification | Organism (1st and 2nd Best Hits) | Iteration | Max Score | Total Score | Query Cover | E Value | Ident | Accession Number |
|---|---|---|---|---|---|---|---|---|---|
| ORF_L46 | Methyltransferase | | 3 | 166 | 166 | 93% | 4 × 10^-45 | 12% | WP_025395836.1 |
| | Methyltransferase | | | 165 | 165 | 93% | 7 × 10^-45 | 12% | WP_015242269.1 |
| ORF_R48 | Transglycosylase | | 2 | 256 | 256 | 93% | 4 × 10^-78 | 24% | WP_030603268.1 |
| | Transglycosylase | | | 229 | 229 | 93% | 9 × 10^-68 | 22% | WP_046261419.1 |
| ORF_R84 | hypothetical protein | | 3 | 58.1 | 58.1 | 20% | 8 × 10^-5 | 18% | WP_025666489.1 |
| | hypothetical protein | | | 83.5 | 83.5 | 20% | 2 × 10^-15 | 34% | WP_021691485.1 |
| ORF_R86 | cysteine protease ATG4B | | 2 | 145 | 145 | 83% | 10^-38 | 23% | KFP75383.1 |
| | cysteine protease ATG4B | | | 135 | 135 | 83% | 10^-34 | 23% | KFV59860.1 |
| ORF_L94 | cytidine and deoxycytidylate deaminase | | 3 | 102 | 102 | 77% | 2 × 10^-25 | 24% | AGE49630.1 |
| | cytidine and deoxycytidylate deaminase | | | 101 | 101 | 69% | 7 × 10^-25 | 23% | AGE55798.1 |
| ORF_R115 | DNA mismatch repair protein MutL | | 3 | 170 | 170 | 95% | 2 × 10^-46 | 18% | WP_012693648.1 |
| | DNA mismatch repair protein MutL | | | 145 | 145 | 80% | 2 × 10^-39 | 22% | WP_034401941.1 |
| ORF_R123 | protein-L-isoaspartate O-methyltransferase | | 2 | 124 | 124 | 64% | 8 × 10^-31 | 29% | WP_049761110.1 |
| | protein-L-isoaspartate O-methyltransferase | | | 124 | 124 | 64% | 10^-30 | 29% | WP_043591788.1 |
| ORF_R124 | ABC transporter substrate-binding protein | | 2 | 170 | 170 | 83% | 10^-48 | 29% | WP_031569037.1 |
| | ABC transporter substrate-binding protein | | | 127 | 127 | 87% | 2 × 10^-32 | 23% | WP_027693958.1 |
| ORF_L133 | conserved signaling intermediate in Toll pathway | | 2 | 142 | 142 | 87% | 4 × 10^-37 | 27% | XP_008831822.1 |
| | conserved signaling intermediate in Toll pathway | | | 141 | 141 | 87% | 7 × 10^-37 | 27% | XP_008831824.1 |
| ORF_R218 | diguanylate phosphodiesterase | | 3 | 147 | 147 | 70% | 3 × 10^-38 | 18% | WP_008217346.1 |
| | diguanylate phosphodiesterase | | | 145 | 145 | 70% | 8 × 10^-38 | 19% | WP_048663292.1 |
| ORF_R239 | rho GTPase-activating protein 1 | | 2 | 107 | 107 | 41% | 2 × 10^-24 | 32% | XP_001490021.2 |
| | rho GTPase-activating protein 1 isoform X2 | | | 107 | 107 | 41% | 2 × 10^-24 | 32% | XP_005598135.1 |
| ORF_L254 | leucine-rich repeat-containing protein 9-like | | 2 | 130 | 130 | 64% | 2 × 10^-31 | 28% | XP_006632383.1 |
| | Peroxidase | | | 122 | 122 | 64% | 10^-30 | 29% | WP_009394707.1 |
| ORF_L292 | coiled-coil and C2 domain-containing protein 1A isoform X5 | | 3 | 112 | 112 | 62% | 10^-25 | 29% | XP_009191945.1 |
| | coiled-coil and C2 domain-containing protein 1A isoform X8 | | | 112 | 112 | 62% | 10^-25 | 29% | XP_011949500.1 |
| ORF_L300 | ATP-dependent helicase | | 3 | 120 | 120 | 96% | 3 × 10^-28 | 20% | WP_002822412.1 |
| | ATP-dependent helicase/nuclease subunit A | | | 117 | 117 | 94% | 7 × 10^-28 | 14% | GAO99721.1 |
| ORF_R303 | glycoside hydrolase family 9 | | 4 | 104 | 104 | 93% | 4 × 10^-23 | 26% | WP_023062725.1 |
| | glycosyl hydrolase | | | 103 | 103 | 93% | 6 × 10^-23 | 24% | WP_020457778.1 |
| ORF_R304 | aggrecan core protein | | 3 | 118 | 696 | 100% | 10^-27 | 24% | XP_007906559.1 |
| | aggrecan core protein | | | 103 | 926 | 100% | 2 × 10^-22 | 30% | XP_008638374.1 |
| ORF_L309 | peptide synthetase | | 6 | 424 | 424 | 97% | 2 × 10^-136 | 13% | WP_041780594.1 |
| ORF_L324 | N-acetylneuraminic acid mutarotase | | 3 | 212 | 212 | 78% | 6 × 10^-63 | 15% | WP_038216942.1 |
| | N-acetylneuraminic acid mutarotase | | | 211 | 211 | 78% | 2 × 10^-62 | 15% | WP_039481213.1 |
| ORF_L337 | cytochrome C | | 3 | 178 | 178 | 99% | 2 × 10^-50 | 17% | WP_006101072.1 |
| | cytochrome C | | | 177 | 177 | 99% | 5 × 10^-50 | 20% | WP_017318476.1 |
| ORF_R351 | extracellular dioxygenase | | 3 | 151 | 151 | 61% | 4 × 10^-39 | 25% | CCX09620.1 |
| | Intradiol ring-cleavage dioxygenase | | | 143 | 143 | 67% | 3 × 10^-36 | 17% | KGO45757.1 |
| ORF_L367 | regulator of telomere elongation helicase 1 | | 3 | 170 | 170 | 94% | 10^-45 | 17% | XP_013000054.1 |
| | regulator of telomere elongation helicase 1 | | | 169 | 169 | 94% | 2 × 10^-45 | 18% | KGM00023.1 |
| ORF_L375 | putative protein binding surface, polypeptide binding | | 4 | 137 | 137 | 90% | 2 × 10^-36 | 25% | CCA16909.1 |
| ORF_R485 | ephrin type-B receptor 4 | | 3 | 115 | 115 | 58% | 7 × 10^-27 | 21% | XP_012352012.1 |
| | ephrin type-B receptor 4 | | | 115 | 115 | 59% | 9 × 10^-27 | 21% | XP_012614574.1 |
Figure 1. Genome alignment of BrMV and other marseillevirus strains. The figure shows genome architecture and synteny. The schematic genome alignment diagram was obtained using the MAUVE software package [37].
Figure 2. Distribution of orthologous gene clusters among marseillevirus lineages (A) and the proportion of orthologous gene clusters shared between and within lineages (B). For these analyses, we used the genome content of viruses from lineage A (Marseillevirus, Cannes8 virus, Melbournevirus and Senegalvirus), lineage B (Lausannevirus), lineage C (Tunisvirus and Insectomime virus), and lineage D (BrMV). The letters at the top of each ellipse refer to marseillevirus lineages A–D.
Figure 3. Average amino acid identity (AAI). Estimates were obtained using both best hits (one-way AAI) and reciprocal best hits (two-way AAI) between two datasets of proteins from the BrMV isolate and representative strains from marseillevirus groups A–C. Plots (A–C) show the amino acid comparison between BrMV and marseilleviruses from groups A–C; plots (D–F) compare marseilleviruses from different lineages.
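The difference between one-way and two-way AAI can be illustrated with a minimal sketch. It assumes pairwise protein identities are already available as a dictionary keyed by (query, subject), and it picks the best hit by identity alone; real AAI tools typically rank hits by alignment score and apply coverage and identity thresholds. All names and toy values below are hypothetical and are not taken from the paper.

```python
from statistics import mean

def best_hits(identities):
    """Return {query: (subject, identity)} keeping the highest-identity hit per query."""
    best = {}
    for (query, subject), ident in identities.items():
        if query not in best or ident > best[query][1]:
            best[query] = (subject, ident)
    return best

def one_way_aai(a_vs_b):
    """Mean identity over the best hit of every protein of A against B."""
    return mean(ident for _, ident in best_hits(a_vs_b).values())

def two_way_aai(a_vs_b, b_vs_a):
    """Mean identity over reciprocal best hits only (A->B and B->A agree)."""
    fwd, rev = best_hits(a_vs_b), best_hits(b_vs_a)
    pairs = [
        (ident + rev[subject][1]) / 2
        for query, (subject, ident) in fwd.items()
        if subject in rev and rev[subject][0] == query
    ]
    return mean(pairs)

# Hypothetical toy identities (percent) between proteomes A and B.
a_vs_b = {("a1", "b1"): 80.0, ("a1", "b2"): 40.0, ("a2", "b2"): 60.0, ("a3", "b2"): 50.0}
b_vs_a = {("b1", "a1"): 80.0, ("b2", "a2"): 60.0, ("b2", "a3"): 30.0}

print(one_way_aai(a_vs_b))          # uses a1->b1, a2->b2, a3->b2
print(two_way_aai(a_vs_b, b_vs_a))  # a3 is dropped: b2's best hit is a2
```

In this toy case the two-way estimate is higher than the one-way estimate because the non-reciprocal hit (a3 to b2) is excluded.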
Figure 4. Pan-genome (red line) and core genome (green line) size of the Marseilleviridae family. Numbers in grey boxes refer to the number of genes encoded by each virus strain. Numbers at line nodes represent the cumulative number of COGs after the inclusion of a new genome. Numbers in the red and green circles show the variation in COGs after the inclusion of sequences from a different lineage. Colors on the graph identify viruses from the same lineage, alongside the proposed new lineage D consisting of BrMV.
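The cumulative counts plotted in such a figure can be sketched as a running union (pan-genome) and running intersection (core genome) over the genomes in the order they are added. This is a minimal illustration under the assumption that each genome has already been reduced to a set of orthologous cluster identifiers; the genome names and cluster IDs are hypothetical.

```python
def accumulate_pan_core(genomes):
    """Return [(name, pan_size, core_size)] after each genome is added in order.

    genomes: ordered list of (name, set_of_cluster_ids).
    """
    pan, core = set(), None
    curve = []
    for name, clusters in genomes:
        pan |= clusters                                   # pan-genome: union so far
        core = set(clusters) if core is None else core & clusters  # core: intersection so far
        curve.append((name, len(pan), len(core)))
    return curve

# Hypothetical toy input: three genomes described by cluster IDs.
toy_genomes = [
    ("virus_1", {"c1", "c2", "c3"}),
    ("virus_2", {"c1", "c2", "c4"}),
    ("virus_3", {"c1", "c5"}),
]

for name, pan_size, core_size in accumulate_pan_core(toy_genomes):
    print(name, pan_size, core_size)
```

The pan-genome size can only grow and the core genome size can only shrink as genomes are added, so a large jump in either curve when a genome is included suggests that it differs substantially from those added before it.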
Figure 5. Hierarchical clustering tree based on phyletic patterns. The phylogeny is based on the presence-absence matrix of 5443 NCVOGs (clusters of orthologous genes shared by nucleocytoplasmic large DNA viruses). The Pearson correlation was used as the distance metric, and the scale bar represents the branch time.
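A correlation-based distance between phyletic patterns can be sketched as follows. The caption does not state how the Pearson correlation was converted into a distance, so this sketch assumes the common choice d = 1 - r; the profiles are hypothetical presence-absence vectors, not data from the paper.

```python
from math import sqrt

def pearson_distance(x, y):
    """Return 1 - Pearson r for two equal-length presence-absence vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        # A profile that is all 0s or all 1s has no variance, so r is undefined.
        raise ValueError("Pearson correlation is undefined for a constant profile")
    return 1.0 - cov / (sx * sy)

# Hypothetical profiles: 1 = cluster present, 0 = absent.
profiles = {
    "virus_a": [1, 1, 0, 0, 1, 0],
    "virus_b": [1, 1, 0, 0, 1, 0],
    "virus_c": [0, 0, 1, 1, 0, 1],
    "virus_d": [1, 1, 0, 0, 0, 1],
}

names = sorted(profiles)
distances = {
    (p, q): pearson_distance(profiles[p], profiles[q])
    for i, p in enumerate(names)
    for q in names[i + 1:]
}
for pair, d in distances.items():
    print(pair, round(d, 3))
```

A pairwise distance table of this kind is the usual input to a hierarchical clustering routine, which then merges the closest profiles first to build the tree.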
Figure 6. Phylogenetic reconstruction based on a concatenated alignment of five core genes: DNA polymerase B, major capsid protein, VV-A18 helicase, D6/D11 helicase and D5 helicase. The amino acid sequences were aligned using Muscle and the tree was built using FastTree. Pithovirus sibericum was used as an outgroup. Branches delineating the different lineages of the family Marseilleviridae are colored (dark blue for lineage A, purple for lineage B, blue for lineage C and red for lineage D).
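Concatenating per-gene alignments into a single supermatrix can be sketched as below. This is an illustrative assumption about the general technique, not the authors' pipeline: each gene alignment is a dictionary of already-aligned sequences, and a taxon missing from a gene is padded with gap characters. Taxon names and sequences are hypothetical.

```python
def concatenate_alignments(alignments, taxa):
    """Join per-gene alignments column-wise; pad missing taxa with gaps."""
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("sequences within one alignment must have equal length")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon absent from this gene contributes a gap-only block.
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(blocks) for taxon, blocks in parts.items()}

# Hypothetical toy alignments for two genes and three taxa.
gene_1 = {"v1": "MK-L", "v2": "MKAL"}
gene_2 = {"v1": "GG", "v2": "GA", "v3": "-A"}

supermatrix = concatenate_alignments([gene_1, gene_2], ["v1", "v2", "v3"])
for taxon, seq in supermatrix.items():
    print(taxon, seq)
```

Every row of the result has the same length, which is what tree-building programs expect from a multiple sequence alignment.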
Figure 7. Supertree based on five phylogenetic trees. The five phylogenetic trees were built using FastTree from the amino acid sequences of DNA polymerase B, the major capsid protein, the VV-A18 helicase, the D6/D11 helicase and the D5 helicase of the marseilleviruses and Pithovirus sibericum. The tree is rooted with Pithovirus sibericum as an outgroup. Branches delineating the different lineages of the family Marseilleviridae are colored (dark blue for lineage A, purple for lineage B, blue for lineage C and red for lineage D).