Hasan Arsın1,2, Andrius Jasilionis3, Håkon Dahle2,4, Ruth-Anne Sandaa1, Runar Stokke1,2, Eva Nordberg Karlsson3, Ida Helene Steen1,2.
Abstract
Marine viral sequence space is immense and presents a promising resource for the discovery of new enzymes interesting for research and biotechnology. However, bottlenecks in the functional annotation of viral genes and soluble heterologous production of proteins hinder access to downstream characterization, subsequently impeding the discovery process. While commonly utilized for the heterologous expression of prokaryotic genes, codon adjustment approaches have not been fully explored for viral genes. Herein, the sequence-based identification of a putative prophage is reported from within the genome of Hypnocyclicus thermotrophus, a Gram-negative, moderately thermophilic bacterium isolated from the Seven Sisters hydrothermal vent field. A prophage-associated gene cluster, consisting of 46 protein coding genes, was identified and given the proposed name Hypnocyclicus thermotrophus phage H1 (HTH1). HTH1 was taxonomically assigned to the viral family Siphoviridae, by lowest common ancestor analysis of its genome and phylogeny analyses based on proteins predicted as holin and DNA polymerase. The gene neighbourhood around the HTH1 lytic cassette was found most similar to viruses infecting Gram-positive bacteria. In the HTH1 lytic cassette, an N-acetylmuramoyl-L-alanine amidase (Amidase_2) with a peptidoglycan binding motif (LysM) was identified. A total of nine genes coding for enzymes putatively related to lysis, nucleic acid modification and of unknown function were subjected to heterologous expression in Escherichia coli. Codon optimization and codon harmonization approaches were applied in parallel to compare their effects on produced proteins. Comparison of protein yields and thermostability demonstrated that codon optimization yielded higher levels of soluble protein, but codon harmonization led to proteins with higher thermostability, implying a higher folding quality. 
Altogether, our study suggests that both codon optimization and codon harmonization are valuable approaches for successful heterologous expression of viral genes in E. coli, but codon harmonization may be preferable in obtaining recombinant viral proteins of higher folding quality.
Keywords: Escherichia coli; Hypnocyclicus thermotrophus; codon harmonization; codon optimization; heterologous expression; hydrothermal vent; lytic cassette; prophage
Year: 2021 PMID: 34201869 PMCID: PMC8310279 DOI: 10.3390/v13071215
Source DB: PubMed Journal: Viruses ISSN: 1999-4915 Impact factor: 5.048
Figure 1. Phylogeny analysis of the prophage based on the alignment of a 106-amino-acid-long region of holin proteins from 94 phages, using maximum likelihood with 1000 bootstrap replicates. The tree is centre-rooted, and the scale bar represents the average number of amino acid substitutions per site. Numbers next to collapsed clades represent the number of leaves covered by each illustration. The HTH1 holin is highlighted in red.
Figure 2. Gene neighbourhood map of HTH1 and the comparable regions of three closely related phage gene clusters, aligned around the holin in their respective lytic cassettes. Displayed genes are drawn to scale, as shown on the top right. Respective organism or sample names, related accession numbers (in parentheses) and genome regions displayed (in bp ranges) are provided above each graphic. Genes chosen for expression of proteins from HTH1 are also labelled with their identifier numbers. Double dashes (//) indicate the presence of genes further upstream or downstream of the gene regions displayed in this figure.
Figure 3. Illustration depicting sequence features of chosen candidate proteins predicted by HMMER [82]. Black lines show non-annotated amino acid sequences, grey boxes show predicted Pfam domains, purple lines mark transmembrane domains and numbers flanking each feature show their respective amino acid residue number ranges. The blue box shows the HTP4410 analogue found in Streptococcus phage Javan630 (SJ630) and Erysipelothrix phage phi1605 (EP1605).
Codon usage parameters and soluble production yield estimation of target HTH1 proteins. CAI—codon adaptation index, CHI—codon harmonization index, CO—codon-optimized, CH—codon-harmonized, ND—target protein not detected in total soluble protein fraction.
| Identifier | Proposed Protein Function | CAI for Expression Host, CO Gene Variant | CAI for Expression Host, CH Gene Variant | CHI for Expression Host, CO Gene Variant | CHI for Expression Host, CH Gene Variant | Native Gene Sequence CAI for Native Host | Soluble Produced Protein Yield * (mg/L), Expressed from CO Gene Variant | Soluble Produced Protein Yield * (mg/L), Expressed from CH Gene Variant |
|---|---|---|---|---|---|---|---|---|
| HTP4435 | Endopeptidase tail | 0.89 | 0.64 | 0.61 | 0.48 | 0.50 | 18.7 ± 1.3 | 13.8 ± 1.5 |
| HTP4425 | Hypothetical protein | 0.87 | 0.67 | 0.59 | 0.48 | 0.51 | ND | ND |
| HTP4420 | Glycosyl hydrolase 18 | 0.89 | 0.66 | 0.60 | 0.45 | 0.51 | ND | ND |
| HTP4415 | Holin, toxin secretion/phage lysis | 0.87 | 0.58 | 0.60 | 0.47 | 0.40 | ND | ND |
| HTP4410 | N-acetylmuramoyl-L-alanine amidase | 0.88 | 0.64 | 0.61 | 0.47 | 0.46 | 40.7 ± 5 | 30.8 ± 3.4 |
| HTP4400 | rRNA biogenesis protein rrp5, putative | 0.84 | 0.64 | 0.58 | 0.48 | 0.48 | 135.80 ± 2.49 | 27.9 ± 3.2 |
| HTP4385 | DNA Polymerase | 0.86 | 0.62 | 0.60 | 0.46 | 0.47 | ND | ND |
| HTP4360 | hypothetical protein | 0.84 | 0.74 | 0.55 | 0.43 | 0.56 | 151.7 ± 10.2 | ND |
| HTP4350 | DUF262 / DNase | 0.86 | 0.62 | 0.59 | 0.45 | 0.44 | 32.3 ± 1.2 | 15.6 ± 2.7 |
* Values represent mean ± standard error of three independent expressions.
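For readers unfamiliar with the CAI column, the following is a minimal sketch of the codon adaptation index as it is commonly defined: each codon is weighted by its frequency relative to the most used synonymous codon in a reference gene set, and the CAI of a gene is the geometric mean of those weights. The software and reference set behind the tabulated values are not stated here, so the codon table, reference counts, and function names below are all hypothetical and serve only to illustrate the arithmetic.

```python
import math

# Hypothetical toy table: amino acid -> synonymous codons (a small subset,
# not the full genetic code).
SYNONYMOUS_CODONS = {
    "K": ["AAA", "AAG"],
    "E": ["GAA", "GAG"],
    "L": ["CTG", "TTA"],
}


def relative_adaptiveness(reference_counts):
    """Weight each codon by its count relative to the most used synonym."""
    weights = {}
    for codons in SYNONYMOUS_CODONS.values():
        top = max(reference_counts.get(codon, 0) for codon in codons)
        if top == 0:
            continue  # amino acid absent from the reference set
        for codon in codons:
            weights[codon] = reference_counts.get(codon, 0) / top
    return weights


def codon_adaptation_index(sequence, weights):
    """Geometric mean of codon weights over an in-frame coding sequence."""
    usable = len(sequence) - len(sequence) % 3
    codons = [sequence[i:i + 3] for i in range(0, usable, 3)]
    # Simplification (an assumption of this sketch): codons without a
    # positive weight are skipped rather than assigned a small pseudo-weight.
    logs = [math.log(weights[c]) for c in codons if weights.get(c, 0) > 0]
    if not logs:
        raise ValueError("no scorable codons in sequence")
    return math.exp(sum(logs) / len(logs))


# Made-up reference counts standing in for a host's highly expressed genes.
REFERENCE_COUNTS = {
    "AAA": 80, "AAG": 20,
    "GAA": 60, "GAG": 30,
    "CTG": 100, "TTA": 10,
}
WEIGHTS = relative_adaptiveness(REFERENCE_COUNTS)

preferred = codon_adaptation_index("AAAGAACTG", WEIGHTS)  # only top codons
rare = codon_adaptation_index("AAGGAGCTG", WEIGHTS)       # two rarer codons
```

In this toy setting a sequence built only from the most used codons scores 1.0, and swapping in rarer synonyms lowers the score. This mirrors the direction of the table, where codon-optimized variants have higher CAI for the expression host than codon-harmonized variants.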
Figure 4. Relative abundance of target HTH1 proteins in the total soluble protein fraction after expression from codon-optimized (CO) and codon-harmonized (CH) gene variants. ND: target protein not detected in the total soluble protein fraction. Values represent the mean relative abundance, as a percentage of total protein in the total soluble protein fraction, ± standard error of three independent expressions.
Figure 5. SDS-PAGE image of purified proteins produced from codon-harmonized (CH) and codon-optimized (CO) genes. The HTP prefix and the numbers above the lanes correspond to the identifiers of the genes tested. M indicates the protein marker (Bio-Rad Precision Plus Dual Color). Numbers next to each protein marker lane show the respective molecular weight labels in kDa.
Purification yield of target HTH1 proteins. Protein concentrations were measured spectrophotometrically, and the total amount of target recombinant protein in clarified lysate was estimated by combining densitometry results with total soluble protein quantification. CO: codon-optimized; CH: codon-harmonized; ND: target protein not detected in total soluble protein fraction.
| Identifier | Proposed Protein Function | Protein Purification Yield * (%), Target Protein Expressed from CO Gene Variant | Protein Purification Yield * (%), Target Protein Expressed from CH Gene Variant |
|---|---|---|---|
| HTP4410 | N-acetylmuramoyl-L-alanine amidase | 85.6 ± 1.4 | 85.9 ± 1.9 |
| HTP4400 | rRNA biogenesis protein rrp5, putative | 38.6 ± 7.2 | 58 ± 3.7 |
| HTP4360 | hypothetical protein | 75 ± 3.8 | ND |
| HTP4350 | DUF262/DNase | 83.5 ± 5.2 | 92.1 ± 2.1 |
* Values represent mean ± standard error of three independent purifications.
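The purification yield calculation described in the caption can be sketched as follows. This is one plausible reading, not the authors' documented procedure: the target protein in the clarified lysate is taken as total soluble protein multiplied by the densitometry-derived target fraction, and the yield is the purified amount as a percentage of that. All function names and numbers are hypothetical.

```python
import math
import statistics


def target_in_lysate_mg(total_soluble_mg, target_fraction):
    """Estimated target protein in clarified lysate (assumed relationship)."""
    return total_soluble_mg * target_fraction


def purification_yield_percent(purified_mg, total_soluble_mg, target_fraction):
    """Purified target as a percentage of the target present in the lysate."""
    available = target_in_lysate_mg(total_soluble_mg, target_fraction)
    if available <= 0:
        raise ValueError("no target protein detected in lysate")
    return 100.0 * purified_mg / available


def mean_and_standard_error(values):
    """Mean and standard error of the mean for replicate measurements."""
    mean = statistics.mean(values)
    # Sample standard deviation divided by sqrt(n).
    standard_error = statistics.stdev(values) / math.sqrt(len(values))
    return mean, standard_error


# Hypothetical example: 50 mg total soluble protein, 20% of it target
# protein by densitometry, 8 mg recovered after purification.
single_yield = purification_yield_percent(8.0, 50.0, 0.20)

# Hypothetical yields (%) from three independent purifications.
mean_yield, se_yield = mean_and_standard_error([80.0, 85.0, 90.0])
```

The last helper shows how a "mean ± standard error of three independent purifications" entry would be assembled from replicate yields.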
Thermal unfolding estimation with differential scanning fluorimetry of stably soluble target HTH1 proteins. CO—codon-optimized, CH—codon-harmonized.
| Target Protein | Proposed Protein Function | Melting Temperature (Tm, °C), Target Protein Expressed from CO Gene Variant | Melting Temperature (Tm, °C), Target Protein Expressed from CH Gene Variant |
|---|---|---|---|
| HTP4410 | N-acetylmuramoyl-L-alanine amidase | 66.23 ± 0.07 * | 73.03 ± 0.10 |
| HTP4400 | rRNA biogenesis protein rrp5, putative | 51.57 ± 0.34 | 55.70 ± 0.22 |
| HTP4350 | DUF262 / DNase | 61.57 ± 1.47 | 65.24 ± 0.43 |
* Values represent mean ± standard error of three independent differential scanning fluorimetry assays.
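A melting temperature is often read from a differential scanning fluorimetry melt curve as the temperature at which the fluorescence rises most steeply, that is, the maximum of the first derivative dF/dT. The sketch below illustrates that general approach on a synthetic curve. It is an assumption that this resembles the analysis behind the tabulated values, since the fitting method is not described here, and the curve model, parameters, and function names are hypothetical.

```python
import math


def boltzmann_curve(temperature, tm, slope=1.5, low=0.0, high=1.0):
    """Synthetic sigmoidal unfolding signal, used only to generate test data."""
    return low + (high - low) / (1.0 + math.exp((tm - temperature) / slope))


def estimate_tm(temperatures, fluorescence):
    """Return the temperature with the largest central-difference dF/dT."""
    if len(temperatures) != len(fluorescence) or len(temperatures) < 3:
        raise ValueError("need at least three paired readings")
    best_temperature = None
    best_derivative = float("-inf")
    for i in range(1, len(temperatures) - 1):
        derivative = (fluorescence[i + 1] - fluorescence[i - 1]) / (
            temperatures[i + 1] - temperatures[i - 1]
        )
        if derivative > best_derivative:
            best_derivative = derivative
            best_temperature = temperatures[i]
    return best_temperature


# Synthetic melt curve from 25 to 95 degrees C in 0.5 degree steps,
# with a hypothetical true Tm of 66.0 degrees C.
temperatures = [25.0 + 0.5 * i for i in range(141)]
signal = [boltzmann_curve(t, tm=66.0) for t in temperatures]
estimated_tm = estimate_tm(temperatures, signal)
```

Real melt curves are noisy, so smoothing or fitting a sigmoid before taking the derivative is usually needed. The resolution of this estimate is limited by the temperature step.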