Xavier Hernandez-Alias, Hannah Benisty, Martin H Schaefer, Luis Serrano.
Abstract
Viruses need to hijack the translational machinery of the host cell for a productive infection to happen. However, given the dynamic landscape of tRNA pools among tissues, it is unclear whether different viruses infecting different tissues have adapted their codon usage toward their tropism. Here, we collect the coding sequences of 502 human-infecting viruses and determine that tropism explains changes in codon usage. Using the tRNA abundances across 23 human tissues from The Cancer Genome Atlas (TCGA), we build an in silico model of translational efficiency that validates the correspondence of the viral codon usage with the translational machinery of their tropism. For instance, we detect that severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is specifically adapted to the upper respiratory tract and alveoli. Furthermore, this correspondence is specifically defined in early viral proteins. The observed tissue-specific translational efficiency could be useful for the development of antiviral therapies and vaccines.
Keywords: SARS-CoV-2; codon usage; tRNA; tissue; translation; tropism
Year: 2021 PMID: 33730572 PMCID: PMC7962955 DOI: 10.1016/j.celrep.2021.108872
Source DB: PubMed Journal: Cell Rep Impact factor: 9.423
Figure 1. Tropism corresponds with differences in RCU of human-infecting viruses
(A) A total of 502 viruses were distributed among 35 families and covered all 7 Baltimore groups. Of those viruses, 228 were classified in 6 general tropisms based on ViralZone annotations (Hulo et al., 2011).
(B) Three internal clustering indexes were computed to assess the validity of each viral classification in terms of relative codon usage (RCU). Good clustering yields low WB indexes and high Silhouette and Dunn values (as shown in the color code).
(C) Linear discriminant analysis of the RCU of the 228 tropism-defined viruses. In parentheses is the percentage of variance explained by each of the components. See also Figure S1.
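The RCU underlying Figure 1 can be illustrated with a minimal Python sketch. It assumes that RCU is the fraction each codon contributes to its own amino acid within a coding sequence; the function name `relative_codon_usage` and the toy sequence are hypothetical and are not taken from the authors' code.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered by base T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def relative_codon_usage(cds):
    """Return {codon: share of that codon among its synonymous codons}.

    Assumption: RCU = count(codon) / count(all codons of the same amino acid).
    Codons containing ambiguous bases are ignored.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # drop a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Total count per amino acid (stop codons are grouped under "*").
    aa_totals = Counter()
    for codon, n in counts.items():
        aa_totals[CODON_TABLE[codon]] += n

    return {c: n / aa_totals[CODON_TABLE[c]] for c, n in counts.items()}


# Toy coding sequence: Met, Ala (GCT), Ala (GCT), Ala (GCC), stop.
rcu = relative_codon_usage("ATGGCTGCTGCCTAA")
```

In this toy example, GCT accounts for two of the three alanine codons and GCC for the remaining one. A per-virus RCU vector of this kind is the sort of input on which clustering indexes or a discriminant analysis could then be computed.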
Figure 2. Viruses are adapted to the tRNA-based translational efficiencies of their target tissues
(A) Receiver operating characteristic (ROC) and precision-recall (PR) curves of a random forest classifier, in which the average supply-to-demand adaptation (SDA) of viral proteins to each of the 23 TCGA tissues is used to predict their corresponding viral tropism of NCBI viruses (see STAR Methods). The area under the curves (AUCs) ± SD summarize the performance of the model.
(B) Relative feature weights of each of the 23 TCGA tissues for each of the 6 tropisms, which measure the contribution of each tissue in the decision trees. The dendrograms show a hierarchical clustering among tissues (left) and among tropisms (top). The cyan lines show the trace of weights along each tropism. Refer to Table S2 for full TCGA cancer type names. See also Figures S2–S5.
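The area under the ROC curve reported in Figure 2A has a simple rank-based interpretation: it equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. The sketch below computes it this way in plain Python; the function name `roc_auc` and the toy scores are hypothetical, and the authors' own pipeline (which cites SciKit Learn) is not reproduced here.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of scores.

    scores: classifier scores, higher means "more likely positive".
    labels: 1 for positive examples, 0 for negative examples.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy one-vs-rest example: does a virus belong to a given tropism?
auc = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])
```

The pairwise loop is quadratic in the number of examples, which is fine for a few hundred viruses; a sort-based formulation would be preferable for large data sets.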
Figure 3. Early viral proteins are better adapted than late counterparts
(A) Average SDA of replication (Xr) and structural (Xs) proteins of a total of 104 annotated tropism-specific viruses matched to 461 samples of their tissues of infection (Table S4). Boxes extend from the first to the third quartile, with the center values indicating the median. The whiskers define a confidence interval of median ± 1.58 × IQR/√n, where IQR is the interquartile range. Statistical significance was determined by a paired (structural against replication proteins of each virus), 2-tailed Wilcoxon rank-sum test.
(B) Top 10 positive and negative virus orthologous groups upon gene set enrichment analysis of the SDA of all proteins of tropism-specific viruses (Table S4). Protein groups are colored according to their annotated early/replication or late/structural function (Knipe and Howley, 2013).
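The confidence interval around the median described in Figure 3A, median ± 1.58 × IQR/√n, can be worked through with a short Python sketch. The quartile convention (the `inclusive` method of `statistics.quantiles`) is an assumption, since the caption does not state which one was used, and the function name `median_interval` is hypothetical.

```python
import math
import statistics


def median_interval(values):
    """Return (lower, median, upper) with bounds median +/- 1.58 * IQR / sqrt(n).

    Assumption: quartiles follow the "inclusive" interpolation method.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    median = statistics.median(values)
    half_width = 1.58 * (q3 - q1) / math.sqrt(n)
    return median - half_width, median, median + half_width


# Toy data: quartiles are 2 and 4, so IQR = 2 and n = 5.
lower, median, upper = median_interval([1, 2, 3, 4, 5])
```

For the toy data the half-width is 1.58 × 2/√5, so the interval narrows as the number of observations grows even when the spread of the data stays the same.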
Figure 4. Translational adaptation of viral proteins upon infection
(A) Relative tRNA adaptation index (see STAR Methods; Table S5) of viral proteins upon effective viral infections in different cell lines. Proteins are allocated to different time expression classes based on current viral knowledge (Knipe and Howley, 2013; Table S5). Center values within the violin plot represent the median. Only significant differences are shown and are denoted as follows: ∗p ≤ 0.05, ∗∗p ≤ 0.01, ∗∗∗p ≤ 0.001, and ∗∗∗∗p ≤ 0.0001. Statistical differences are based on a false discovery rate (FDR)-corrected 2-tailed Wilcoxon rank-sum test, with paired comparisons between time points (written in color) and unpaired comparisons between expression classes.
(B) Abundances of viral proteins (see STAR Methods; Table S5) upon effective viral infections at different time points in different cell lines. Solid lines represent the median of the expression class, surrounded by an uncertainty interval between the 0.4 and 0.6 quantiles.
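The FDR correction and star notation of Figure 4A can be sketched as follows. The caption does not name the FDR procedure, so the Benjamini-Hochberg adjustment used here is an assumption; the function names and the toy p values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted p values, returned in the input order.

    Assumption: the caption's "FDR-corrected" refers to this procedure.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def significance_stars(p):
    """Map a p value to the star notation given in the Figure 4A caption."""
    for threshold, stars in ((0.0001, "****"), (0.001, "***"),
                             (0.01, "**"), (0.05, "*")):
        if p <= threshold:
            return stars
    return ""  # not significant, so not shown


# Toy p values from four hypothetical comparisons.
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
labels = [significance_stars(p) for p in adjusted]
```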
| REAGENT or RESOURCE | SOURCE | IDENTIFIER |
|---|---|---|
| Antarctic phosphatase | New England BioLabs | Cat#M0289 |
| T4 Polynucleotide Kinase | New England BioLabs | Cat#M0201 |
| ProtoScript II Reverse Transcriptase | New England BioLabs | Cat#M0368 |
| miRNeasy Mini kit | QIAGEN | Cat#217004 |
| 15% TBE–urea gels | NOVEX, Invitrogen | Cat#EC6885BOX |
| RNeasy MinElute Cleanup Kit | QIAGEN | Cat#74204 |
| QIAquick PCR Purification Kit | QIAGEN | Cat#28106 |
| Supply-to-Demand Adaptation weights (SDAw) from TCGA samples | | Synapse: syn20640275 |
| SARS-CoV-2 reference genome | | NCBI RefSeq: NC_045512.2 |
| Bat coronavirus RaTG13 genome | | GenBank: MN996532.1 |
| Small RNA-seq of HFF infected by HCMV | | GEO: |
| Small RNA-seq of KMB-17 infected by HSV1 | | GEO: |
| Small RNA-seq of SUP-T1 infected by HIV1 | | GEO: |
| Hydro-tRNaseq of HEK293, HCT116, HeLa, MDA-MB-231, and BJ/hTERT | | GEO: |
| Hydro-tRNaseq of HACAT and HepG2 | This study | ArrayExpress: |
| HACAT | CRG Collection (Center for Genomic Regulation) | RRID: CVCL_0038 |
| HepG2 | IMIM Collection (Institut Hospital del Mar d’Investigacions Mèdiques) | RRID: CVCL_0027 |
| GSEA [v4.0.3] | | |
| SciKit Learn [v0.20.1] | | |
| Codon Usage tool | | |
| ViennaRNA toolkit [v2.4.14] | | |
| KnotInFrame | | |
| BBMap [v38.22] | Bushnell B. | |
| FastQC [v0.11.4] | Andrews S. | |
| SAMtools [v1.3.1] | | |
| tRNAscan-SE [v2.0] | | |
| BEDtools [v2.27.1] | | |
| Segemehl [v0.3.1] | | |
| Picard [v2.18.17] | Broad Institute | |
| GATK [v3.8] | | |
| Code for tRNA mapping and quantification of Hydro-tRNaseq data | This paper | |
| Code for all computational analyses of this report | This paper | |