Eli Goz, Zohar Zafrir, Tamir Tuller.
Abstract
Motivation: Understanding how viruses co-evolve with their hosts and adapt various genomic-level strategies in order to ensure their fitness may have essential implications in unveiling the secrets of viral evolution, and in developing new vaccines and therapeutic approaches. Here, based on a novel genomic analysis of 2625 different viruses and 439 corresponding host organisms, we provide evidence of universal evolutionary selection for high-dimensional 'silent' patterns of information hidden in the redundancy of the viral genetic code.
Year: 2018 PMID: 29718236 PMCID: PMC7109696 DOI: 10.1093/bioinformatics/bty351
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Fig. 1. Selection for long host-repetitive patterns of silent functional information in viral coding regions. A summary of the analyzed hosts and viruses that undergo significant enrichment for mutually long sub-sequences. Each vertical bar corresponds to viruses infecting a specific host organism (in bacteria, a specific genus) and is partitioned into class-specific segments; every segment corresponds to the percentage of viruses belonging to its class (y-axis) and is assigned a specific color. Further, each segment is composed of two stacked parts: the lower part, with a full-color interior, represents the portion (out of all host-specific viruses) of AHRS-significant viruses (P < 0.05 w.r.t. both randomization models); the upper part, with a black interior (but with borders of the corresponding color), represents the rest of the viruses (P ≥ 0.05 w.r.t. at least one randomization model). The numbers (e.g. x/y) shown under each bar indicate the number of viruses (x) that show significant enrichment out of the total number of viruses checked (y); thus, for each class-specific segment, the sum of its two parts (significant and not significant) represents the total portion of viruses of this class within all viruses related to the organism described by the bar, and the sum of all segments is equal to 1. Horizontal bars visualize the total percentage of AHRS-significant viruses in each host domain. We can see that coding regions in 47, 36, 39, 27, 25 and 90% of viruses from different classes that infect one or several vertebrate, metazoan, plant, fungal, protist and bacterial organisms (respectively) undergo an evolutionary pressure to maintain long genomic substrings that also tend to repeat in the coding regions of at least one related host.
Fig. 2. Selection for complex host-repetitive silent functional patterns depends on the protein's function. The upper panel (in blue) represents the number of coding sequences within each functional group. The bars in the middle panel (green, yellow and red, respectively) represent the percentage of significant (AHRS P < 0.05 w.r.t. both randomization models, green), semi-significant (AHRS P < 0.05 w.r.t. only one randomization model, yellow) and non-significant (AHRS P > 0.05 w.r.t. both randomization models, red) coding sequences. Black lines represent the mean length of significant (solid line) and non-significant (dotted line) coding sequences in each group. We can see that structural proteins are encoded by the highest portion of AHRS-significant coding sequences. On the other hand, surface proteins have the smallest number of AHRS-significant coding sequences. The enzymes and other proteins show an intermediate level of selection for long host-repetitive patterns. Each green bar (at the bottom) is divided into three parts, corresponding to the local AHRS analysis in the 5′, middle, and 3′ segments of a coding sequence. In each part, the percentage of AHRS-significant genes with the highest local AHRS found in this part is indicated. We can see that for each gene group, most of the sequences have the highest local AHRS in the middle part; the percentage of genes with the highest local AHRS in the 3′ part was found to be the smallest.
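The three-way labelling used in Fig. 2 (significant, semi-significant, non-significant) can be sketched as below. This is a minimal illustration and not the authors' code: the function name `classify_ahrs`, its parameter names, and the choice to treat P = 0.05 exactly as not significant are all assumptions made for the example.

```python
def classify_ahrs(p_model_a: float, p_model_b: float, alpha: float = 0.05) -> str:
    """Label a coding sequence from its AHRS p-values under two randomization models.

    Hypothetical helper: the names and the handling of p == alpha are assumptions.
    """
    # Count how many of the two randomization models give p below the threshold.
    n_significant = sum(p < alpha for p in (p_model_a, p_model_b))
    # 0 models -> non-significant, 1 -> semi-significant, 2 -> significant.
    labels = ("non-significant", "semi-significant", "significant")
    return labels[n_significant]


if __name__ == "__main__":
    # Illustrative p-values only; these are not taken from the study.
    for pair in [(0.01, 0.02), (0.01, 0.20), (0.50, 0.20)]:
        print(pair, classify_ahrs(*pair))
```

Under this rule, a sequence counts as AHRS-significant only when both randomization models agree, which matches the "P < 0.05 w.r.t. both randomization models" criterion stated in the captions.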