Sheng Zhao, Qin Zhang, Xiaolin Liu, Xuemin Wang, Huilin Zhang, Yan Wu, Fei Jiang.
Abstract
Human bocavirus (HBoV) is a novel virus that can cause respiratory tract disease in infants or children. In this study, codon usage bias and base composition variation were investigated in the 11 available complete HBoV genome sequences. Although codon usage bias varies significantly among HBoV genes, the overall bias in HBoV is slight, as indicated mainly by the base composition at the third codon position and the effective number of codons (ENC) value. The results of correspondence analysis (COA) and Spearman's rank correlation analysis reveal that the G+C compositional constraint is the main factor determining codon usage bias in HBoV, and that gene function also contributes to codon usage in this virus. Moreover, the hydrophobicity of each protein and the gene length also influence codon usage in these viruses, although they are less important than mutational bias and gene function. Finally, the relative synonymous codon usage (RSCU) of 44 genes from the 11 HBoV isolates was analyzed using a hierarchical cluster method. The result suggests that genes with the same function but from different isolates are classified into the same lineage, independent of geographical location. These conclusions not only offer insight into the codon usage patterns and gene classification of HBoV, but may also help increase the efficiency of gene delivery/expression systems.
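The abstract does not spell out how RSCU is computed. As a minimal sketch, assuming the conventional definition (the observed count of a codon divided by the mean count of all synonymous codons for the same amino acid) and the standard genetic code, it could look like the following. The function name `rscu` and the toy sequence are hypothetical and are not taken from the study.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: the
# standard code is appropriate; '*' marks stop codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return RSCU per codon for amino acids with more than one codon.

    RSCU = observed codon count / mean count over the synonymous family.
    Amino acids absent from the sequence are skipped.
    """
    cds = cds.upper().replace("U", "T")
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, excluding stop codons.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for members in families.values():
        total = sum(counts[c] for c in members)
        if total == 0 or len(members) == 1:
            continue
        for c in members:
            values[c] = counts[c] * len(members) / total
    return values


# Hypothetical toy sequence: Ala (GCT, GCT, GCC, GCA) and Lys (AAA, AAG).
toy_rscu = rscu("GCTGCTGCCGCAAAAAAG")
print(toy_rscu)
```

A value above 1 marks a codon used more often than expected under equal synonymous usage, and a value below 1 marks one used less often. A vector of such values per gene is the kind of input a hierarchical clustering step would take.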
Year: 2008 PMID: 18378386 PMCID: PMC7116908 DOI: 10.1016/j.biosystems.2008.01.006
Source DB: PubMed Journal: Biosystems ISSN: 0303-2647 Impact factor: 1.973
Eleven complete genome sequences of HBoV under study
| Isolate name | Country of isolation | Accession |
|---|---|---|
| HBoV WLL-3 | China | |
| HBoV WLL-2 | China | |
| HBoV BJ3722 | China | |
| HBoV CU74 | Thailand | |
| HBoV CU49 | Thailand | |
| HBoV CU6 | Thailand | |
| HBoV BJ3064 | China | |
| HBoV WLL-1 | China | |
| HBoV CRD2 | USA | |
| HBoV st2 | Sweden | |
| HBoV st1 | Sweden | |
Identified genes (length > 300 bp) in the 11 HBoV genomes under study
| SN | Gene | GC3S | GC | ENC | F1 | F2 | SN | Gene | GC3S | GC | ENC | F1 | F2 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 1 | WLL-3-NS1 | 0.31 | 0.42 | 46.63 | −0.03 | −0.41 | 23 | CU6-VP1 | 0.30 | 0.42 | 42.07 | 0.26 | 0.11 |
| 2 | WLL-3-NP1 | 0.40 | 0.44 | 47.96 | −0.50 | 0.16 | 24 | CU6-VP2 | 0.31 | 0.43 | 41.09 | 0.29 | 0.14 |
| 3 | WLL-3-VP1 | 0.30 | 0.42 | 42.12 | 0.24 | 0.10 | 25 | BJ3064-NS1 | 0.31 | 0.42 | 46.63 | −0.03 | −0.41 |
| 4 | WLL-3-VP2 | 0.31 | 0.43 | 41.27 | 0.27 | 0.13 | 26 | BJ3064-NP1 | 0.40 | 0.44 | 47.41 | −0.54 | 0.18 |
| 5 | WLL-2-NS1 | 0.31 | 0.42 | 46.69 | −0.03 | −0.41 | 27 | BJ3064-VP1 | 0.29 | 0.42 | 41.97 | 0.25 | 0.11 |
| 6 | WLL-2-NP1 | 0.40 | 0.44 | 47.96 | −0.50 | 0.16 | 28 | BJ3064-VP2 | 0.31 | 0.43 | 41.10 | 0.29 | 0.14 |
| 7 | WLL-2-VP1 | 0.30 | 0.42 | 42.14 | 0.24 | 0.10 | 29 | WLL-1-NS1 | 0.31 | 0.42 | 46.84 | −0.03 | −0.41 |
| 8 | WLL-2-VP2 | 0.31 | 0.43 | 41.31 | 0.27 | 0.12 | 30 | WLL-1-NP1 | 0.39 | 0.44 | 47.91 | −0.49 | 0.16 |
| 9 | BJ3722-NS1 | 0.30 | 0.41 | 46.57 | −0.03 | −0.41 | 31 | WLL-1-VP1 | 0.29 | 0.42 | 41.89 | 0.25 | 0.12 |
| 10 | BJ3722-NP1 | 0.39 | 0.44 | 47.91 | −0.49 | 0.16 | 32 | WLL-1-VP2 | 0.31 | 0.43 | 40.87 | 0.28 | 0.15 |
| 11 | BJ3722-VP1 | 0.29 | 0.42 | 41.97 | 0.25 | 0.11 | 33 | CRD2-NS1 | 0.30 | 0.41 | 46.41 | −0.02 | −0.41 |
| 12 | BJ3722-VP2 | 0.31 | 0.43 | 41.10 | 0.29 | 0.14 | 34 | CRD2-NP1 | 0.39 | 0.44 | 47.91 | −0.49 | 0.16 |
| 13 | CU74-NS1 | 0.30 | 0.41 | 46.57 | −0.03 | −0.40 | 35 | CRD2-VP1 | 0.30 | 0.42 | 41.95 | 0.25 | 0.12 |
| 14 | CU74-NP1 | 0.39 | 0.43 | 47.76 | −0.49 | 0.16 | 36 | CRD2-VP2 | 0.31 | 0.43 | 41.00 | 0.29 | 0.14 |
| 15 | CU74-VP1 | 0.30 | 0.42 | 42.27 | 0.25 | 0.12 | 37 | st2-NS1 | 0.31 | 0.42 | 46.63 | −0.03 | −0.41 |
| 16 | CU74-VP2 | 0.31 | 0.43 | 41.41 | 0.28 | 0.14 | 38 | st2-NP1 | 0.39 | 0.43 | 47.76 | −0.49 | 0.16 |
| 17 | CU49-NS1 | 0.30 | 0.41 | 46.40 | −0.03 | −0.41 | 39 | st2-VP1 | 0.30 | 0.42 | 41.87 | 0.25 | 0.11 |
| 18 | CU49-NP1 | 0.39 | 0.44 | 47.91 | −0.49 | 0.16 | 40 | st2-VP2 | 0.31 | 0.43 | 40.93 | 0.29 | 0.14 |
| 19 | CU49-VP1 | 0.30 | 0.42 | 42.67 | 0.23 | 0.11 | 41 | st1-NS1 | 0.30 | 0.41 | 46.40 | −0.03 | −0.42 |
| 20 | CU49-VP2 | 0.31 | 0.43 | 41.95 | 0.27 | 0.14 | 42 | st1-NP1 | 0.40 | 0.44 | 48.42 | −0.50 | 0.16 |
| 21 | CU6-NS1 | 0.30 | 0.41 | 46.46 | −0.03 | −0.41 | 43 | st1-VP1 | 0.29 | 0.42 | 42.16 | 0.23 | 0.12 |
| 22 | CU6-NP1 | 0.40 | 0.44 | 48.04 | −0.49 | 0.16 | 44 | st1-VP2 | 0.31 | 0.43 | 41.37 | 0.26 | 0.14 |
SN: sequence number; ENC: effective number of codons.
GC3S: the frequency of G + C at the third synonymously variable coding position.
GC: the frequency of G + C of this gene.
F1: the first axis value of each gene in COA.
F2: the second axis value of each gene in COA.
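The ENC and GC3S columns can be compared against the ENC expected when GC3S alone shapes codon usage. The sketch below assumes the standard expectation from Wright (1990), ENC = 2 + s + 29 / (s² + (1 − s)²) with s = GC3S; the record does not state which formula the authors used, so treat this as an illustration. The four observed values are the WLL-3 rows of the table.

```python
def expected_enc(gc3s):
    """Expected ENC if GC3S were the only constraint (Wright's formula, assumed)."""
    return 2.0 + gc3s + 29.0 / (gc3s ** 2 + (1.0 - gc3s) ** 2)


# (GC3S, observed ENC) for the four WLL-3 genes, copied from the table above.
observed = {
    "WLL-3-NS1": (0.31, 46.63),
    "WLL-3-NP1": (0.40, 47.96),
    "WLL-3-VP1": (0.30, 42.12),
    "WLL-3-VP2": (0.31, 41.27),
}

gaps = {}
for gene, (gc3s, enc) in observed.items():
    exp = expected_enc(gc3s)
    gaps[gene] = exp - enc  # positive gap: observed ENC lies below the curve
    print(f"{gene}: observed {enc:.2f}, expected {exp:.2f}, gap {gaps[gene]:.2f}")
```

Under this assumed formula, all four WLL-3 genes have an observed ENC below the expected value.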
Fig. 1. Correlation between GC content at first and second codon positions (GC12) with that at synonymous third codon positions (GC3S).
Fig. 2. Distribution of the codon usage index, ENC, and GC content at synonymous third codon positions (GC3S). The curve indicates the expected codon usage if GC compositional constraints alone account for codon usage bias.
Fig. 3. Correlation between the first-axis (A) and second-axis (B) values in COA and the GC3S value of each gene.
Fig. 4. Average ENC value and corresponding S.D. for each gene-type group among the 11 HBoV isolates under study.
Fig. 5. Dendrogram of the clustering result for the 44 HBoV genes under study, based on a hierarchical cluster method.
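The per-gene-type averages summarized in Fig. 4 can be recomputed from the ENC column of the gene table. The sketch below groups the 44 ENC values by gene type and reports the mean and S.D. It assumes the sample standard deviation (n − 1 denominator), since the record does not say which form was used.

```python
from statistics import mean, stdev

# ENC values per gene type, copied from the gene table, in isolate order:
# WLL-3, WLL-2, BJ3722, CU74, CU49, CU6, BJ3064, WLL-1, CRD2, st2, st1.
ENC_BY_GENE = {
    "NS1": [46.63, 46.69, 46.57, 46.57, 46.40, 46.46, 46.63, 46.84, 46.41, 46.63, 46.40],
    "NP1": [47.96, 47.96, 47.91, 47.76, 47.91, 48.04, 47.41, 47.91, 47.91, 47.76, 48.42],
    "VP1": [42.12, 42.14, 41.97, 42.27, 42.67, 42.07, 41.97, 41.89, 41.95, 41.87, 42.16],
    "VP2": [41.27, 41.31, 41.10, 41.41, 41.95, 41.09, 41.10, 40.87, 41.00, 40.93, 41.37],
}

# Mean and sample S.D. (assumption) for each gene type.
summary = {gene: (mean(vals), stdev(vals)) for gene, vals in ENC_BY_GENE.items()}

for gene, (m, sd) in summary.items():
    print(f"{gene}: mean ENC = {m:.2f}, S.D. = {sd:.2f}")
```

In this tabulation NS1 and NP1 average a higher ENC than VP1 and VP2, and the spread within each gene type is small compared with the differences between types.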
Distances between main lineages
| Lineage no. | I | II | III | IV |
|---|---|---|---|---|
| I | | 5.85 | 5.73 | 6.11 |
| II | 5.85 | | 4.97 | 5.41 |
| III | 5.73 | 4.97 | | 1.11 |
| IV | 6.11 | 5.41 | 1.11 | |
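To see which lineages group together first, the distance table can be run through a small agglomerative clustering. The sketch below assumes average linkage (the mean of all pairwise distances between two clusters). The record only says "hierarchical cluster method", so the linkage rule and the function names are illustrative choices, not the authors' procedure.

```python
from itertools import combinations

# Pairwise distances between main lineages, copied from the table above.
DIST = {
    frozenset(("I", "II")): 5.85,
    frozenset(("I", "III")): 5.73,
    frozenset(("I", "IV")): 6.11,
    frozenset(("II", "III")): 4.97,
    frozenset(("II", "IV")): 5.41,
    frozenset(("III", "IV")): 1.11,
}


def average_linkage(a, b):
    """Mean distance over all cross-cluster pairs of lineages (assumed rule)."""
    pairs = [(x, y) for x in a for y in b]
    return sum(DIST[frozenset(p)] for p in pairs) / len(pairs)


def agglomerate(labels):
    """Repeatedly merge the two closest clusters; return the merge history."""
    clusters = [(label,) for label in labels]
    merges = []
    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda ab: average_linkage(*ab))
        merges.append((a, b, average_linkage(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


merges = agglomerate(["I", "II", "III", "IV"])
for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d:.2f}")
```

Under this linkage rule, lineages III and IV merge first (distance 1.11), lineage II joins them next, and lineage I joins last.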