Giovanni Franzo1, Joaquim Segales2,3, Claudia Maria Tucciarone1, Mattia Cecchinato1, Michele Drigo1.
Abstract
Members of the genus Circovirus are host-specific viruses that depend entirely on the host cell machinery for their replication. Consequently, some mimicry of host genome features is expected, both to maximize exploitation of the cellular replicative system and to minimize recognition by the innate immune system. In the present study, the analysis of several genome composition and codon bias parameters of circoviruses infecting avian and mammalian species revealed distinct patterns between the two groups. Remarkably, mammalian circoviruses deviated more from the values expected from mutational patterns alone, at both the dinucleotide and codon levels. Accordingly, a stronger selective pressure was estimated to shape the genome of mammalian circoviruses than that of avian circoviruses, particularly in the Cap-encoding gene. These differences could be attributed to different physiological and immunological features of the two host classes and suggest a trade-off between optimizing capsid protein translation and minimizing recognition of the genome and transcript molecules. Interestingly, the recently identified Porcine circovirus 3 (PCV-3) had an intermediate pattern in terms of genome composition and codon bias. In particular, its Rep gene appeared closely related to those of other mammalian circoviruses (especially bat circoviruses), while its Cap gene more closely resembled those of avian circoviruses. This evidence, coupled with the strong selective forces apparently shaping the PCV-3 Cap gene composition, suggests a potential recombinant origin of this virus, followed or preceded by a host jump.
Year: 2018 PMID: 29958294 PMCID: PMC6025852 DOI: 10.1371/journal.pone.0199950
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Fig 1. Circovirus genome composition parameters.
Density plots of the different genome composition parameters, colour-coded according to host class (i.e. Aves: 705 Cap and 933 Rep; Mammalia: 3705 Cap and 1601 Rep). PCV-3 (111 Cap and 40 Rep) is shown in blue. Both the Rep (top) and Cap (bottom) genes were analysed.
Over- and under-represented codons.
| Group | Type | ||
|---|---|---|---|
| Under-represented | |||
| Over-represented | |||
| Under-represented | |||
| Over-represented | |||
| Under-represented | |||
| Over-represented | aac,aga,agc,att,cac,cgt,cta,ctc,gaa, gac,gga,gtt,tcc |
Summary of over- and under-represented codons in the genes encoding the Rep and Cap proteins of avian and mammalian circoviruses and PCV-3.
Fig 2. Nc and Ncp plot.
Scatter plot showing the relationship of Nc and Ncp with GC3 content for the Rep and Cap genes. Avian and mammalian circoviruses and PCV-3 are colour-coded. The line representing the expected Nc values, which would result if GC composition were the only factor influencing codon usage bias, is superimposed.
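The expected-Nc line in a plot of this kind is commonly drawn from Wright's (1990) approximation, Nc = 2 + s + 29 / (s² + (1 − s)²), where s is the GC3 fraction. The following is a minimal sketch assuming that formula; the record does not state the exact form used by the authors, and the function names `expected_nc` and `gc3_content` are illustrative only.

```python
def expected_nc(gc3: float) -> float:
    """Expected effective number of codons (Nc) if GC3 alone shaped codon usage.

    Assumes Wright's (1990) approximation; gc3 is a fraction in [0, 1].
    """
    if not 0.0 <= gc3 <= 1.0:
        raise ValueError("gc3 must be a fraction between 0 and 1")
    return 2.0 + gc3 + 29.0 / (gc3 ** 2 + (1.0 - gc3) ** 2)


def gc3_content(cds: str) -> float:
    """Fraction of G or C at third codon positions of an in-frame coding sequence."""
    # Slicing from index 2 in steps of 3 visits the third base of each complete codon.
    thirds = cds.upper()[2::3]
    if not thirds:
        raise ValueError("sequence must contain at least one complete codon")
    return sum(base in "GC" for base in thirds) / len(thirds)


if __name__ == "__main__":
    # Tabulate the expected curve at a few GC3 values.
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"GC3={s:.2f}  expected Nc={expected_nc(s):.2f}")
```

Genes that fall well below this curve have a codon usage bias stronger than GC3 composition alone would predict, which is the deviation the plot is meant to expose.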
Fig 3. PCA based on RSCU and rho values.
Scatter plot of the first two components of the PCA performed on RSCU and rho values calculated for mammalian and avian circoviruses. For ease of interpretation, PCV-3 and Chiroptera circoviruses are highlighted in different colours. The PCA loadings are represented as arrows. The 95% confidence ellipses around clusters are also shown. Both the Rep (top) and Cap (bottom) genes were analysed.
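Relative synonymous codon usage (RSCU) is the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below computes it for one coding sequence, assuming the standard genetic code; the function name `rscu` and the choice to skip stop codons, single-codon amino acids, and absent amino acids are assumptions of this example, not details taken from the study.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds: str) -> dict[str, float]:
    """Return RSCU values for the codons of an in-frame coding sequence."""
    cds = cds.upper().replace("U", "T")
    codons = [cds[i:i + 3] for i in range(0, len(cds) - 2, 3)]
    # Codons containing ambiguous bases are ignored.
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families by amino acid.
    families: dict[str, list[str]] = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values: dict[str, float] = {}
    for aa, family in families.items():
        if aa == "*" or len(family) == 1:
            continue  # stops, Met and Trp carry no synonymous-usage signal
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        for codon in family:
            values[codon] = counts[codon] * len(family) / total
    return values
```

Values above 1 indicate codons used more often than expected under equal usage, and values below 1 indicate codons used less often; within each amino acid family the RSCU values sum to the number of synonymous codons.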
Fig 4. Diagnostic performance of predictive methods.
Distribution of the diagnostic performance metrics of random forest (RF) and linear discriminant analysis (LDA), evaluated by cross-validation on the Cap and Rep datasets.
Performance of the predictive methods.
| Dataset | Method | Rep class prediction | Rep probability | Cap class prediction | Cap probability |
|---|---|---|---|---|---|
| | LDA | Mammalia | 0.99 (0.95–0.99) | Aves | 1 (0.99–1) |
| | RF | Mammalia | 0.62 (0.60–0.64) | Aves | 1 (0.99–1) |
| | LDA | Mammalia | 0.99 (0.99–1) | Aves | 1 (0.99–1) |
| | RF | Mammalia | 0.62 (0.59–0.65) | Aves | 0.68 (0.61–0.73) |
Results of PCV-3 host-class prediction performed using different datasets (i.e. rho statistic and RSCU) and predictive methods (i.e. LDA and RF). The most likely class and the estimated probability are reported for different method-dataset combinations. The probability range has been obtained by estimating the host-class using all the available PCV-3 genomes.
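The rho statistic used as one of the datasets is assumed here to be the usual dinucleotide odds ratio, i.e. the observed frequency of a dinucleotide divided by the product of the frequencies of its two bases. The sketch below illustrates that definition for a single sequence; the function name `dinucleotide_rho` and the strict ACGT-only input check are assumptions of this example, and the study may have computed the statistic differently.

```python
from collections import Counter


def dinucleotide_rho(seq: str) -> dict[str, float]:
    """Return observed/expected odds ratios for overlapping dinucleotides."""
    seq = seq.upper().replace("U", "T")
    if len(seq) < 2 or set(seq) - set("ACGT"):
        raise ValueError("sequence must have at least 2 bases, all in ACGT")

    mono = Counter(seq)
    # Overlapping dinucleotides: positions (0,1), (1,2), (2,3), ...
    di = Counter(seq[i:i + 2] for i in range(len(seq) - 1))
    n_mono, n_di = len(seq), len(seq) - 1

    rho: dict[str, float] = {}
    for x in "ACGT":
        for y in "ACGT":
            expected = (mono[x] / n_mono) * (mono[y] / n_mono)
            if expected == 0:
                continue  # undefined when either base is absent
            rho[x + y] = (di[x + y] / n_di) / expected
    return rho
```

A value near 1 means the dinucleotide occurs about as often as base composition alone predicts; values well below or above 1 indicate under- or over-representation, respectively.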