James Lara, F López-Labrador, Fernando González-Candelas, Marina Berenguer, Yury E Khudyakov.
Abstract
BACKGROUND: Chronic infection with hepatitis C virus (HCV) is a risk factor for liver diseases such as fibrosis, cirrhosis and hepatocellular carcinoma. HCV genetic heterogeneity was hypothesized to be associated with severity of liver disease. However, no reliable viral markers predicting disease severity have been identified. Here, we report the utility of sequences from 3 HCV 1b genomic regions, Core, NS3 and NS5b, to identify viral genetic markers associated with fast and slow rates of fibrosis progression (RFP) among patients with and without liver transplantation (n = 42).
Year: 2014 PMID: 25081062 PMCID: PMC4120150 DOI: 10.1186/1471-2105-15-S8-S5
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Table 1. RFP-relevant HCV sites.
| Method | Sites§ | Score |
|---|---|---|
| | 8480X5, 4496X2, 4484X1, 450X3, 4496X4, 8606X5, 8480X1, 8606X1, 8435X5, 8378X4, 4496X1, 4496X3 | |
| | 4496X4, 8480X5, 450X3, 8606X4, 4484X4, 8435X5, 8606X3, 4496X5, 8378X2 | |
§Genomic positions assigned based on reference sequence Con1.
ᵃA total of 3,757 feature subsets were evaluated by the CFS search method. The scoring metric is based on the merit heuristic. Merit scores of feature subsets of the individual regions were also computed: core (0.19), NS3 (0.28) and NS5b (0.20). CFS sites whose physicochemical properties were selected for the RFP-relevant projections are shown in bold.
ᵇFeatures are listed in the placement order of base vectors in the LP graphs shown in Fig. 1. Site-specific physicochemical properties are denoted X1 (hydrophobicity), X2 (polarity), X3 (dipole moment), X4 (surface area) and X5 (stacking area). See Methods for details on the scoring metric.
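The merit heuristic named in the CFS footnote is usually given, in correlation-based feature selection, as merit = k·r_cf / sqrt(k + k(k − 1)·r_ff), where k is the subset size, r_cf the mean feature-class correlation and r_ff the mean feature-feature correlation. The sketch below assumes that standard definition; the record does not spell out the formula, and the function name and the example correlations are hypothetical.

```python
import math


def cfs_merit(k, mean_feature_class_corr, mean_feature_feature_corr):
    """Merit of a k-feature subset under the standard CFS heuristic (assumed).

    k: number of features in the subset
    mean_feature_class_corr: average correlation between features and the class
    mean_feature_feature_corr: average pairwise correlation among the features
    """
    if k < 1:
        raise ValueError("subset must contain at least one feature")
    # Redundant (inter-correlated) features inflate the denominator
    # and therefore lower the merit of the subset.
    denominator = math.sqrt(k + k * (k - 1) * mean_feature_feature_corr)
    return k * mean_feature_class_corr / denominator


if __name__ == "__main__":
    # Hypothetical correlations, for illustration only.
    print(cfs_merit(4, 0.5, 0.0))  # uncorrelated features
    print(cfs_merit(4, 0.5, 1.0))  # fully redundant features
```

Under this definition a subset gains merit when its features correlate with the class and loses merit when they correlate with each other, which is why a combined subset can outscore the per-region subsets.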
Figure 1. 2D linear projection (LP) graphs. Base vectors of the projection represent RFP-relevant features. Sites are identified by their positions in the HCV genome, and physicochemical properties are shown in parentheses as X1-X5. See Table 1 for details. LP graphs of HCV 1b isolates (n = 42) sampled from TOH and IC patients, based on A) 12-feature and B) 9-feature projections, are shown. To the right of each LP graph is the same graph with data points jittered [41] to highlight the membership and size of clusters.
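As a rough illustration of how a 2D linear projection places a strain, each feature can be given a base vector in the plane, and the strain's position is then the sum of its feature values weighted along those vectors. The sketch below assumes base vectors spaced evenly on the unit circle in feature order and feature values already scaled to [0, 1]; neither assumption is confirmed by the record, and the function names are hypothetical.

```python
import math


def base_vectors(n_features):
    """Evenly spaced unit vectors, one per feature (assumed placement)."""
    return [
        (math.cos(2 * math.pi * i / n_features),
         math.sin(2 * math.pi * i / n_features))
        for i in range(n_features)
    ]


def project(sample, vectors):
    """Project one sample onto the plane as a weighted sum of base vectors."""
    if len(sample) != len(vectors):
        raise ValueError("sample and base vectors must have the same length")
    x = sum(value * vx for value, (vx, _) in zip(sample, vectors))
    y = sum(value * vy for value, (_, vy) in zip(sample, vectors))
    return x, y


if __name__ == "__main__":
    vectors = base_vectors(9)          # e.g. a 9-feature projection
    sample = [0.2, 0.9, 0.1, 0.5, 0.0, 0.7, 0.3, 0.4, 0.6]  # made-up values
    print(project(sample, vectors))
```

Strains with similar feature profiles land close together, so clusters in the plane can be read as groups of strains sharing physicochemical properties at the selected sites.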
Figure 2. RFP-specificity of LP models. A) 2D graph of the LP model shown in Fig. 1. HCV strains are distributed into 4 clusters, from left to right: cluster 1 (fast RFP-IC, n = 1), cluster 2 (fast RFP-IC, n = 2; fast RFP-TOH, n = 4), cluster 3 (slow RFP-IC, n = 13; slow RFP-TOH, n = 12; fast RFP-IC, n = 1; fast RFP-TOH, n = 3) and cluster 4 (fast RFP-IC, n = 3; fast RFP-TOH, n = 3). The graph below shows the mapping of computed probability potentials in the LP model defining three RFP-class spaces of HCV strains (fast-RFP in blue, slow-RFP in red), where the color density of an area is directly proportional to the probability of association with the respective RFP class. Plots in which the x-axis represents predicted probabilities and the y-axis denotes the observed RFP class of HCV strains show classification performance in validation tests of the B) LP-IC and C) LP-TOH models.
Table 2. Performance evaluations of models.
| Model‡ | Dataset | CV test§ | CA (%) | Sensitivity (%) | Specificity (%) | F-measure (%) |
|---|---|---|---|---|---|---|
| LP | Full (n = 42) | 70/30-CV | 93.20 | 82.00 | 100 | 90.11 |
| LP-TOH | TOH (n = 22) | 70/30-CV | 90.00 | 76.67 | 100 | 86.79 |
| LP-IC | IC (n = 20) | 70/30-CV | 95.00 | 85.00 | 100 | 92.31 |
| LP-TOH | IC (n = 20) | | 90.00 | 71.43 | 100 | 83.33 |
| LP-IC | TOH (n = 22) | | 86.36 | 70.00 | 100 | 82.35 |
| BNC-TOH | IC (n = 20) | | 85.00 | 71.43 | 92.31 | 82.90 |
| BNC-IC | TOH (n = 22) | | 86.40 | 70.00 | 100 | 85.62 |
| | IC (n = 20) | | 45.00 | na | na | na |
| | TOH (n = 22) | | 45.54 | na | na | na |
The LP model is based on the selected projection comprising 9 HCV features. Classification accuracy for RandBNCs was averaged over 5 repetitions.
Values were averaged over 10 repetitions.
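The performance columns can be related to one another through a confusion matrix. The sketch below assumes fast RFP is the positive class and uses the usual definitions of classification accuracy (CA), sensitivity, specificity and F-measure (harmonic mean of precision and sensitivity). The counts in the example are not reported in the record; they are inferred because 5 of 7 fast-RFP and 13 of 13 slow-RFP strains classified correctly reproduce the row for LP-TOH tested on IC (n = 20). The function name is hypothetical.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return CA, sensitivity, specificity and F-measure as percentages.

    tp/fn: fast-RFP strains classified correctly/incorrectly (assumed positive class)
    tn/fp: slow-RFP strains classified correctly/incorrectly
    """
    total = tp + fn + tn + fp
    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    # Guard against a classifier that never predicts the positive class.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    if precision + sensitivity:
        f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    else:
        f_measure = 0.0
    return {
        "CA": round(100 * accuracy, 2),
        "sensitivity": round(100 * sensitivity, 2),
        "specificity": round(100 * specificity, 2),
        "F-measure": round(100 * f_measure, 2),
    }


if __name__ == "__main__":
    # Inferred counts for LP-TOH tested on IC (assumption, see lead-in).
    print(classification_metrics(tp=5, fn=2, tn=13, fp=0))
```

This formula does not reproduce the F-measure of every row (BNC-TOH, for example), so the authors' exact F-measure definition may differ from the one assumed here.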
Figure 3. BNC models of RFP-relevant HCV sites. Nodes in the graph represent 25 nt sites (Table 1) and arcs between them represent relationships. Node numbers in each BNC denote genomic positions with Con1 as the reference, and node colour represents the genomic region. The node representing RFP is coloured in red. Models learned from HCV sequence profiles sampled from A) non-transplanted patients and B) transplanted patients are shown.