Sutanu Bhattacharya, Debswapna Bhattacharya.
Abstract
The development of improved threading algorithms for remote homology modeling is a critical step forward in template-based protein structure prediction. We recently demonstrated the utility of contact information for boosting protein threading by developing a new contact-assisted threading method. However, the nature and extent to which the quality of a predicted contact map affects the performance of contact-assisted threading remain elusive. Here, we systematically analyze this interdependence by applying our newly developed contact-assisted threading method to a large-scale benchmark dataset, using predicted contact maps from four complementary methods: direct coupling analysis (mfDCA), sparse inverse covariance estimation (PSICOV), a classical neural network-based meta approach (MetaPSICOV), and a state-of-the-art ultra-deep learning model (RaptorX). Experimental results demonstrate that contact-assisted threading using high-quality contacts with a Matthews Correlation Coefficient (MCC) ≥ 0.5 improves threading performance in nearly 30% of cases, while low-quality contacts with MCC < 0.35 degrade performance in 50% of cases. This holds true even on the CASP13 dataset, where threading using high-quality contacts (MCC ≥ 0.5) significantly improves performance in 22 of 29 instances. Collectively, our study uncovers the association between the quality of predicted contacts and their possible utility in boosting threading performance for improving low-homology protein modeling.
Year: 2020 PMID: 32076047 PMCID: PMC7031282 DOI: 10.1038/s41598-020-59834-2
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Evaluation of predicted contact maps^a on the PSICOV150 dataset^b, sorted in non-increasing order of MCC (best performance and best performer are listed in bold).
| Contact Source | Precision | Coverage | Mean FP Error | Spread | MCC |
|---|---|---|---|---|---|
| RaptorX | 72.08 | | | | |
| MetaPSICOV | 71.61 | 34.20 | 0.73 | 5.63 | 0.47 |
| PSICOV | 72.83 | 8.78 | 1.08 | 8.32 | 0.24 |
| mfDCA | 75.22 | 3.20 | 1.03 | 20.05 | 0.14 |
^a Excluding residue pairs with contact probability < 0.5.
^b Excluding two targets (1tqhA and 1hdoA) for which RaptorX could not predict contact maps.
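The precision, coverage, and MCC columns can be illustrated with a minimal sketch. It assumes the predicted map is a square matrix of contact probabilities and the native map a square matrix of booleans, and it treats coverage as recall (true positives over native contacts). The function name and the `min_sep` parameter are hypothetical and are not taken from the paper; mean FP error and spread are not covered.

```python
import math


def contact_map_scores(pred_prob, native, threshold=0.5, min_sep=1):
    """Score a predicted contact map against native contacts.

    pred_prob: square list-of-lists of contact probabilities.
    native:    square list-of-lists of booleans (True = native contact).
    threshold: pairs with probability below this count as non-contacts.
    min_sep:   hypothetical minimum sequence separation |i - j| to evaluate.
    Returns precision and coverage as fractions (not percent) plus the MCC.
    """
    n = len(native)
    tp = fp = fn = tn = 0
    # Only the upper triangle is visited, so each residue pair counts once.
    for i in range(n):
        for j in range(i + min_sep, n):
            predicted = pred_prob[i][j] >= threshold
            actual = bool(native[i][j])
            if predicted and actual:
                tp += 1
            elif predicted and not actual:
                fp += 1
            elif actual:
                fn += 1
            else:
                tn += 1
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    coverage = tp / (tp + fn) if (tp + fn) else 0.0
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # MCC is undefined when any marginal is zero; 0.0 is returned in that case.
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return {"precision": precision, "coverage": coverage, "mcc": mcc}
```

Because true negatives dominate in a contact map, a map with very few but correct predictions can reach high precision while its MCC stays low, which is consistent with the mfDCA and PSICOV rows above.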
Figure 1. Representative examples of contact maps predicted by four complementary methods for targets 1aapA and 1dsxA. The upper triangles represent true (native) contacts of the target and the lower triangles represent predicted contacts with a contact probability of at least 0.5. Numbers in parentheses indicate precision (%) and MCC, respectively. For target 1aapA, (A) native contacts versus mfDCA contacts with a precision of 100% and an MCC of 0.09, (B) native contacts versus PSICOV contacts with a precision of 100% and an MCC of 0.22, (C) native contacts versus MetaPSICOV contacts with a precision of 84.62% and an MCC of 0.55, (D) native contacts versus RaptorX contacts with a precision of 83.78% and an MCC of 0.65. For target 1dsxA, (E) native contacts versus mfDCA contacts with a precision of 46.15% and an MCC of 0.13, (F) native contacts versus PSICOV contacts with a precision of 42.86% and an MCC of 0.09, (G) native contacts versus MetaPSICOV contacts with a precision of 86.21% and an MCC of 0.39, (H) native contacts versus RaptorX contacts with a precision of 78.12% and an MCC of 0.64.
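The layout described in this caption (native contacts above the diagonal, thresholded predictions below) can be sketched as a single composite matrix. This is only an illustration under the assumption that both maps are square list-of-lists and that the predicted map is symmetric; `composite_contact_map` is a hypothetical name, not part of the authors' software.

```python
def composite_contact_map(native, pred_prob, threshold=0.5):
    """Build a 0/1 grid: native contacts in the upper triangle (row < column),
    predicted contacts with probability >= threshold in the lower triangle."""
    n = len(native)
    grid = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i < j:
                # Upper triangle: true (native) contacts.
                grid[i][j] = 1 if native[i][j] else 0
            elif i > j:
                # Lower triangle: predictions kept at or above the threshold.
                grid[i][j] = 1 if pred_prob[i][j] >= threshold else 0
    return grid
```

The resulting grid can be passed to any heat-map routine; the diagonal is left at zero.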
Performance comparison on PSICOV150 targets^a based on top-ranked models, sorted in non-decreasing order of performance (best performance and best performer are listed in bold), with the shaded row representing the performance of the pure threading method.
| Methods | Average TM-score (p-value*) | % time TM-score > 0.5^b |
|---|---|---|
| rrmfDCA-assisted threading^c | 0.58 (1.5e-11) | 69.6 |
| rrPSICOV-assisted threading^d | 0.59 (1.3e-09) | 71.6 |
| pure threading^e | 0.63 (0.0001) | 75.7 |
| rrMetaPSICOV-assisted threading^f | 0.64 (0.0007) | 77.7 |
^a Excluding two targets (1tqhA and 1hdoA) for which RaptorX could not predict contact maps.
^b Percentage of targets for which the respective method predicts the correct fold (TM-score > 0.5).
^c Contact-assisted threading method using mfDCA contacts.
^d Contact-assisted threading method using PSICOV contacts.
^e Pure threading method (without contacts).
^f Contact-assisted threading method using MetaPSICOV contacts.
^g Contact-assisted threading method using RaptorX contacts.
* One-sample t-test p-value of the TM-score difference compared to rrRaptorX-assisted threading.
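The table's columns can be sketched as follows, assuming per-target TM-scores are available as two equally ordered lists (one for the method, one for the reference method). The function name is hypothetical. The sketch stops at the t statistic and its degrees of freedom, because converting these to a p-value requires the Student t distribution, which the Python standard library does not provide; `scipy.stats.ttest_1samp` applied to the differences with a population mean of 0 is one option.

```python
import math
import statistics


def summarize_tm_scores(scores, reference_scores, fold_cutoff=0.5):
    """Summarize per-target TM-scores of one method against a reference.

    Returns (mean TM-score, percent of targets with TM-score > fold_cutoff,
    one-sample t statistic of the per-target differences against 0,
    degrees of freedom).
    """
    if len(scores) != len(reference_scores) or len(scores) < 2:
        raise ValueError("need two equally long lists with at least 2 targets")
    mean_tm = statistics.mean(scores)
    pct_correct = 100.0 * sum(s > fold_cutoff for s in scores) / len(scores)
    # Per-target difference: reference minus method.
    diffs = [r - s for s, r in zip(scores, reference_scores)]
    sd = statistics.stdev(diffs)  # sample standard deviation (n - 1)
    if sd == 0:
        t_stat = float("nan")  # statistic is undefined for constant differences
    else:
        t_stat = statistics.mean(diffs) / (sd / math.sqrt(len(diffs)))
    return mean_tm, pct_correct, t_stat, len(diffs) - 1
```

Testing the differences against zero in this way treats the comparison as paired by target, which matches the footnote's description of a one-sample test on the TM-score difference.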
Figure 2. A head-to-head comparison of different contact-assisted threading methods and the baseline contact-free pure threading method on the PSICOV150 dataset. (A) mfDCA-assisted threading method (referred to as rrmfDCA) versus baseline threading method (referred to as Pure threading), (B) PSICOV-assisted threading method (referred to as rrPSICOV) versus baseline threading method, (C) MetaPSICOV-assisted threading method (referred to as rrMetaPSICOV) versus baseline threading method, (D) RaptorX-assisted threading method (referred to as rrRaptorX) versus baseline threading method. Each point in each scatter plot represents the joint TM-score of the top-ranked model predicted by the baseline pure threading method and the contact-assisted threading method. (E) TM-score distribution of top-ranked models predicted by the pure threading method versus the mfDCA-assisted threading method (referred to as rrmfDCA-assisted threading), (F) TM-score distribution of top-ranked models predicted by the pure threading method versus the PSICOV-assisted threading method (referred to as rrPSICOV-assisted threading), (G) TM-score distribution of top-ranked models predicted by the pure threading method versus the MetaPSICOV-assisted threading method (referred to as rrMetaPSICOV-assisted threading), (H) TM-score distribution of top-ranked models predicted by the pure threading method versus the RaptorX-assisted threading method (referred to as rrRaptorX-assisted threading). Templates with sequence similarity >30% to the query sequence are excluded.
Figure 3. A head-to-head comparison of different contact-assisted threading methods and the baseline RaptorX-assisted threading method on the PSICOV150 dataset. (A) mfDCA-assisted threading method (referred to as rrmfDCA) versus baseline RaptorX-assisted threading method (referred to as rrRaptorX-assisted threading), (B) PSICOV-assisted threading method (referred to as rrPSICOV) versus baseline RaptorX-assisted threading method, (C) MetaPSICOV-assisted threading method (referred to as rrMetaPSICOV) versus baseline RaptorX-assisted threading method. Each point in each scatter plot represents the joint TM-score of the top-ranked model predicted by baseline RaptorX-assisted threading and one of the other three contact-assisted threading methods, respectively. (D) TM-score distribution of top-ranked models predicted by the RaptorX-assisted threading method versus the mfDCA-assisted threading method (referred to as rrmfDCA-assisted threading), (E) TM-score distribution of top-ranked models predicted by the RaptorX-assisted threading method versus the PSICOV-assisted threading method (referred to as rrPSICOV-assisted threading), (F) TM-score distribution of top-ranked models predicted by the RaptorX-assisted threading method versus the MetaPSICOV-assisted threading method (referred to as rrMetaPSICOV-assisted threading). Templates with sequence similarity >30% to the query sequence are excluded.
Figure 4. The relationship between the change in TM-score of contact-assisted threading methods relative to the pure threading method and the MCC (Matthews correlation coefficient) of predicted contact maps, tested on PSICOV150. The dataset includes all four contact-assisted threading methods over 148 targets, resulting in a total of 592 instances. Each point in the scatter plot represents the MCC of a predicted contact map and the change in TM-score of the top-ranked model predicted by the corresponding contact-assisted threading method compared to pure threading. The dark points indicate improvement in TM-score (positive change), whereas the grey points indicate performance deterioration (negative change) compared to pure threading. The data points are separated by contact quality (measured by MCC over residue pairs with a contact probability of at least 0.5): (i) 211 instances with high-quality contacts (MCC ≥ 0.5), (ii) 301 instances with low-quality contacts (MCC < 0.35), and (iii) a twilight zone comprising 80 instances with moderate-quality contacts (0.35 ≤ MCC < 0.5). Each bar plot represents the percentage of TM-score improvement and deterioration compared to pure threading. Templates with sequence similarity >30% to the query protein are excluded.
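The binning behind this figure can be sketched as below. The MCC cutoffs (0.5 and 0.35) come from the caption; the function names, the input format (a list of `(mcc, delta_tm)` pairs), and the treatment of a zero change as "unchanged" are assumptions made for illustration.

```python
def mcc_bin(mcc, high=0.5, low=0.35):
    """Assign a contact map to a quality bin using the caption's cutoffs."""
    if mcc >= high:
        return "high"
    if mcc < low:
        return "low"
    return "twilight"


def improvement_by_bin(instances):
    """instances: iterable of (mcc, delta_tm) pairs, where delta_tm is the
    TM-score of contact-assisted threading minus that of pure threading.
    Returns, per bin, the instance count and the percentage of instances
    that improved or deteriorated."""
    counts = {b: {"n": 0, "improved": 0, "deteriorated": 0}
              for b in ("high", "twilight", "low")}
    for mcc, delta_tm in instances:
        c = counts[mcc_bin(mcc)]
        c["n"] += 1
        if delta_tm > 0:
            c["improved"] += 1
        elif delta_tm < 0:
            c["deteriorated"] += 1
        # delta_tm == 0 is counted in n only (assumed "unchanged").
    summary = {}
    for name, c in counts.items():
        n = c["n"]
        summary[name] = {
            "n": n,
            "pct_improved": 100.0 * c["improved"] / n if n else 0.0,
            "pct_deteriorated": 100.0 * c["deteriorated"] / n if n else 0.0,
        }
    return summary
```

Applied to the 592 instances described above, the three bins would contain 211, 80, and 301 entries, respectively.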
Figure 5. Representative example of contact-assisted threading with contact maps of diverse quality on target 2mhrA. (A) Structural alignment between the top-ranked model predicted by RaptorX-assisted threading (in thick rainbow) with a TM-score of 0.59 and the native structure of the target (in thin gray), (B) structural alignment between the top-ranked model predicted by MetaPSICOV-assisted threading (in thick rainbow) with a TM-score of 0.44 and the native structure of the target (in thin gray), (C) structural alignment between the top-ranked model predicted by PSICOV-assisted threading (in thick rainbow) with a TM-score of 0.26 and the native structure of the target (in thin gray), (D) structural alignment between the top-ranked model predicted by mfDCA-assisted threading (in thick rainbow) with a TM-score of 0.19 and the native structure of the target (in thin gray). (E) Native contact map (upper triangle) versus the contact map predicted by RaptorX (lower triangle) with an MCC of 0.55. (F) Native contact map (upper triangle) versus the contact map predicted by MetaPSICOV (lower triangle) with an MCC of 0.44. (G) Native contact map (upper triangle) versus the contact map predicted by PSICOV (lower triangle) with an MCC of 0.25. (H) Native contact map (upper triangle) versus the contact map predicted by mfDCA (lower triangle) with an MCC of 0.12. For all predicted contact maps, residue pairs with contact probability <0.5 are excluded.
Performance evaluation on the CASP13 dataset^a based on the average TM-score of top-ranked models.
| Target type | TripletRes-assisted threading^b (p-value*) | RaptorX-Contact-assisted threading^c (p-value*) | pure threading^d |
|---|---|---|---|
| Full-length | 0.457 (0.001) | 0.449 (0.006) | 0.403 |
| Domain level | 0.392 (0.0002) | 0.387 (0.0008) | 0.340 |
^a CASP officially released 20 full-length targets comprising a total of 32 domains in December 2018.
^b Zhang and coworkers participated in CASP13 with TripletRes (group number G032).
^c Xu and coworkers participated in CASP13 with RaptorX-Contact (group number G498).
^d Pure threading method (without contacts).
* One-sample t-test p-value of the TM-score difference compared to pure threading.