Zhen He1,2, Lang Qin1, Xiaowei Xu1, Shiwen Ding1.
Abstract
In recent decades, many emerging or re-emerging RNA viruses have been found in plants thanks to the development of deep-sequencing technology and big data analysis. These findings have largely changed our understanding of the origin, evolution and host range of plant RNA viruses. There is evidence that their genetic composition originates from viruses, and that host populations play a key role in the evolution and host adaptability of plant RNA viruses. In this mini-review, we describe the current understanding of the evolution of plant RNA viruses in view of compositional biases and explore how they adapt to their hosts. Adenine-rich (A-rich) coding sequences, low CpG and UpA dinucleotide frequencies and low codon usage bias are found in the vast majority of plant RNA viruses. The codon usage pattern of plant RNA viruses is influenced by both natural selection and mutation pressure, with natural selection, mostly from hosts, being the dominant factor. Codon adaptation analyses suggest that plant RNA viruses probably evolved a dynamic balance between codon adaptation and deoptimization to maintain efficient replication cycles in multiple hosts with various codon usage patterns. Future work should combine computational and experimental analyses of the nucleotide composition and codon usage of plant RNA viruses.
Keywords: Codon usage; Compositional biases; Host adaptation; Plant RNA viruses
Year: 2022 PMID: 35685354 PMCID: PMC9160401 DOI: 10.1016/j.csbj.2022.05.021
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 6.155
Fig. 1. Schematic overview of the emergence and host adaptation of plant RNA viruses.
Fig. 2. Schematic overview of the compositional biases of plant RNA viruses under ideal and natural conditions.
Fig. 3. Nucleotide composition of recently reported plant RNA viruses. Sources: Biswas et al. (2019), Chakraborty et al. (2015), Cheng et al. (2012), Gómez et al. (2020), He et al. (2019, 2020, 2021, 2022), Prádena et al. (2020), Patil et al. (2017), Yang et al. (2022), Huang et al. (2015).
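The compositional measures discussed in the abstract (base composition, and CpG and UpA frequencies relative to expectation) can be illustrated with a minimal Python sketch. The function names and the toy sequence are hypothetical and are not taken from the cited studies. The odds ratio used here, f(XY) / (f(X) f(Y)), is a commonly used observed/expected measure; it is assumed for illustration rather than confirmed as the statistic behind Fig. 3.

```python
from collections import Counter


def _normalize(seq):
    """Upper-case, convert T to U, and drop anything that is not A, C, G or U.

    Assumption: ambiguous bases are simply removed, so the bases flanking a
    removed position become adjacent. Real pipelines may handle this differently.
    """
    return "".join(b for b in seq.upper().replace("T", "U") if b in "ACGU")


def base_composition(seq):
    """Return the fraction of A, C, G and U in a nucleotide sequence."""
    seq = _normalize(seq)
    if not seq:
        raise ValueError("sequence contains no A/C/G/U bases")
    counts = Counter(seq)
    return {b: counts[b] / len(seq) for b in "ACGU"}


def dinucleotide_odds_ratio(seq, dinuc):
    """Observed/expected ratio f(XY) / (f(X) * f(Y)) for one dinucleotide.

    Values well below 1 indicate under-representation (as reported for
    CpG and UpA); returns NaN if either base is absent from the sequence.
    """
    seq = _normalize(seq)
    dinuc = _normalize(dinuc)
    freqs = base_composition(seq)
    # Overlapping dinucleotides: positions (0,1), (1,2), ...
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    observed = pairs.count(dinuc) / len(pairs)
    expected = freqs[dinuc[0]] * freqs[dinuc[1]]
    return observed / expected if expected > 0 else float("nan")


if __name__ == "__main__":
    toy = "AUGAAAGCUAAACGAUUAAAGAAAUAA"  # hypothetical sequence, not real data
    print(base_composition(toy))
    print("CpG:", dinucleotide_odds_ratio(toy, "CG"))
    print("UpA:", dinucleotide_odds_ratio(toy, "UA"))
```

For genome-scale work, the dedicated tools listed in the table below would normally be preferred; the sketch only makes the definitions concrete.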
Software for nucleotide composition and codon adaptation analyses.
| Name | Description and advantages | Uses | Availability | URL | Reference |
|---|---|---|---|---|---|
| Software for nucleotide composition and codon analyses | |||||
| BioEditor | BioEditor is an application that enables scientists and educators to prepare and present structure annotations containing formatted text, graphics, sequence data, and interactive molecular views. | BioEditor can be used to analyse codon and base composition. | Local installation | ||
| chips | Nc provides an intuitive and meaningful measure of the degree of codon bias in genes. Low values indicate strong codon bias and high values indicate low bias (probably noncoding regions). | Chips computes Frank Wright's Nc statistic for nucleotide sequences. | Local installation | ||
| CodonW | CodonW is a software package for codon usage analysis, designed to simplify multivariate analysis (MVA) of codon usage. The MVA method it employs is correspondence analysis (COA), the most popular MVA method for codon usage analysis. Other analyses include studies of optimal codons, codon and dinucleotide bias, and base composition. | CodonW can generate COAs for codon usage, relative synonymous codon usage (RSCU), or amino acid usage. | Local installation | ||
| cusp | Cusp computes a codon usage table for one or more nucleotide coding sequences and writes the table to a file. | Creates a codon usage table from a nucleotide sequence. | Local installation | ||
| DnaSP | DnaSP is a software package for the analysis of DNA polymorphism data. | The present version allows for analysis of the evolutionary pattern of preferred and unpreferred codons. | Local installation | ||
| EncPrime | A program to calculate the summary statistic Nc' of codon usage bias. | Calculates the ENC metric. | Local installation | ||
| SMS (Sequence Manipulation Suite) | The program compares the frequencies of codons encoding the same amino acid (synonymous codons). | SMS can be used to assess whether sequences show a preference for particular synonymous codons. | Web | ||
| MEGA 11 | Molecular Evolutionary Genetics Analysis (MEGA) software has matured to contain a large collection of methods and tools of computational molecular evolution. | MEGA now contains methods for analyses of codons, RSCU and base composition. | Local installation | ||
| Software for codon pair analysis | |||||
| ANACONDA | The Anaconda software package provides a set of statistical, bioinformatics and data visualization tools for gene primary structure analysis. | It can be used for analysis of genomic codon preference and codon pair preference | Local installation | ||
| CoCoPUTs | CoCoPUTs provides codon and codon pair usage tables derived from all available GenBank and RefSeq data. When searching for a species, RefSeq takes precedence, so that if a RefSeq assembly is available, data are automatically extracted from that source. When searching for a species without RefSeq assemblies, use the taxonomic ID of the organism for best results. | The codon usage table is a measure of codon usage bias, i.e. the relative frequency with which different codons are used in the genes of a given species. Likewise, the codon pair usage table shows counts for each codon pair in the CDSs of a given species and is a measure of codon pair usage bias. | Web | ||
| CPS (codon pair score) | Measures codon pair bias, defined analogously to the RSCU. | It can be used to determine the level of similarity in codon pair preferences between viruses and their host. | R package | ||
| CPO (codon pair optimization) | A software tool to provide codon pair optimization for synthetic gene design. | CPO provides a simple and efficient means for customizing codon optimization based on the codon pair bias of Pichia pastoris. | R package | ||
| Software for codon adaptation analysis | |||||
| CAIcal | It includes a complete set of CAI-related utilities. The server provides functions such as the calculation and graphical representation of the CAI along single sequences or along protein multiple sequence alignments translated into DNA. CAIcal also includes automatic calculation of the CAI and its expected value. | The CAIcal server provides a complete set of tools to assess codon usage adaptation and aid in genome annotation. | Web | ||
| CBI (codon bias index) | Optimal codon usage is measured using the ratio between the number of optimal codons in the gene and the total number of codons in the gene. It uses the expected usage as a scaling factor. | It can calculate the presence of components with high CUB in a particular gene. | Local installation | ||
| COOL | COOL was designed as an adaptable web-based interface that provides a wide range of functions. Users can completely customize the synthetic gene design process through a step-by-step job submission process, which allows them to specify their optimal parameter settings. | COOL supports a simple and flexible interface for customizing various codon optimization parameters such as the codon adaptation index, single codon usage, and codon pairing. | Web | ||
| coRdon | Codon usage (CU) bias can be used to predict the relative expression levels of genes by comparing the CU bias of a gene with that of a set of genes known to be highly expressed. This method can be used effectively to predict highly expressed genes in a single genome, and it is particularly useful at the level of the whole metagenome. By analysing CU bias across the metagenome, genes with high predicted expression in the whole microbial community can be identified and the functions enriched in the community, that is, its “functional fingerprint”, can be determined. | It supports the calculation of different CU bias statistics, CU-based gene expression predictions, gene set enrichment analysis of annotated sequences, and several methods for displaying CU and enrichment analysis results. | R package | ||
| COUSIN | Calculates codon usage for user-supplied | COUSIN allows for easy and complete analysis of codon usage preferences (CUPrefs), including seven other indices, and provides functions such as statistical analysis, clustering and CUPrefs optimization of gene expression. | Web or install | ||
| HEG-DB | Database of the CAI index of HEGs for 200 genomes | Calculates the CAI. | Web | ||
| JCat (Java Codon Adaptation Tool) | Further options for JCat codon adaptation include the avoidance of unwanted restriction enzyme cleavage sites and Rho-independent transcription terminators. Unlike existing tools, JCat does not require highly expressed genes to be defined manually, which makes it a fast and simple method. | Adapts the codon usage of a target gene to most sequenced prokaryotes and selected eukaryotic gene expression hosts to improve heterologous protein production. | Web | ||
| OPTIMIZER | OPTIMIZER allows for three optimization methods and uses several valuable new reference sets. It can be used to optimize the expression levels of genes, assess the fitness of foreign genes inserted into the genome, or design new genes from protein sequences. | Optimizes the codon usage of a DNA sequence to increase its expression level. | Web | ||
| stAI (species-specific tRNA adaptation index) | The tRNA adaptation index (tAI) is a widely used measure of the efficiency with which the intracellular tRNA pool recognizes coding sequences. The index includes weights representing the wobble interactions between codons and tRNA molecules. The software presents a new method to adjust tAI weights to any target model organism without the need for gene expression measurements. The method is based on optimizing the correlation between tAI and codon usage bias measures. | The calculator includes optimized tAI weights for 100 species from the three domains of life, as well as a stand-alone software package to optimize weights for new organisms. | Web | ||
| Synthetic Gene Designer | Given a gene of interest and the target genome in which it is to be expressed, Synthetic Gene Designer guides the user through three main stages of gene design. | Synthetic Gene Designer offers enhanced functionality compared to existing software options; for example, it enables users to use nonstandard genetic codes, user-defined codon usage patterns, and an expanded set of codon optimization methods. | Web | ||
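
Two of the indices referred to in the table, relative synonymous codon usage (RSCU) and the codon adaptation index (CAI), can be made concrete with a short Python sketch. This is a sketch under stated assumptions, not the implementation used by CodonW, CAIcal or any other listed tool: it assumes the standard genetic code, excludes stop codons and single-codon amino acids (Met, Trp), and replaces zero counts in the reference set with 0.5 when computing CAI weights (a common convention, assumed here). All function names and the toy sequences are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product
from math import exp, log

# Standard genetic code in TCAG order (assumption: no alternative codes).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = defaultdict(list)
for _codon, _aa in CODON_TABLE.items():
    SYNONYMS[_aa].append(_codon)


def codons(seq):
    """Split an in-frame coding sequence (DNA or RNA) into codons."""
    seq = seq.upper().replace("U", "T")
    return [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]


def codon_counts(seqs):
    """Count codons over a list of coding sequences, skipping ambiguous ones."""
    counts = Counter()
    for s in seqs:
        counts.update(c for c in codons(s) if c in CODON_TABLE)
    return counts


def rscu(seqs):
    """RSCU = observed count / mean count over the synonymous family."""
    counts = codon_counts(seqs)
    out = {}
    for aa, family in SYNONYMS.items():
        total = sum(counts[c] for c in family)
        if aa == "*" or len(family) == 1 or total == 0:
            continue
        for c in family:
            out[c] = counts[c] * len(family) / total
    return out


def relative_adaptiveness(reference_seqs):
    """CAI weights w = count / max count within each synonymous family.

    Zero counts are replaced by 0.5 (assumed convention) so log(w) is defined.
    Families absent from the reference therefore get w = 1 for every codon.
    """
    counts = codon_counts(reference_seqs)
    w = {}
    for aa, family in SYNONYMS.items():
        if aa == "*" or len(family) == 1:
            continue
        adjusted = {c: counts[c] or 0.5 for c in family}
        best = max(adjusted.values())
        for c in family:
            w[c] = adjusted[c] / best
    return w


def cai(seq, w):
    """Geometric mean of the weights of the codons in seq."""
    logs = [log(w[c]) for c in codons(seq) if c in w]
    return exp(sum(logs) / len(logs)) if logs else float("nan")


if __name__ == "__main__":
    host_reference = ["GCTGCTGCTGCCAAAAAG"]  # hypothetical host gene set
    weights = relative_adaptiveness(host_reference)
    print(rscu(host_reference)["GCT"])
    print(cai("GCUGCCAAA", weights))  # hypothetical viral coding sequence
```

In a virus-host setting, the weights would be derived from a host reference set and the CAI then computed for each viral coding sequence, so that higher values indicate codon usage closer to that of the host.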
Fig. 4. Schematic overview of the regulatory role of codon usage bias (CUB) in plant RNA viruses and its evolutionary implications.