Lars Kiemer, Ole Lund, Søren Brunak, Nikolaj Blom.
Abstract
BACKGROUND: More than a year after the first outbreak of Severe Acute Respiratory Syndrome (SARS), efficient countermeasures are still few, and many believe that a reappearance of SARS, or of a similar disease caused by a coronavirus, is not unlikely. For other virus families, such as the picornaviruses, pathology is known to be related to proteolytic cleavage of host proteins by viral proteinases. Furthermore, several studies indicate that virus proliferation can be arrested using specific proteinase inhibitors, supporting the belief that proteinases are indeed important during infection. Prompted by this, we set out to analyse and predict cleavage by the coronavirus main proteinase using computational methods.
Year: 2004 PMID: 15180906 PMCID: PMC442122 DOI: 10.1186/1471-2105-5-72
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Figure 1. Logo plot of a multiple alignment of 77 coronavirus cleavage sites. The height of the letters reflects the Shannon information at individual positions (see Methods section for detailed information).
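The per-position Shannon information that sets the letter heights in a logo plot can be sketched as follows. This is a minimal illustration, assuming the common definition of information content as log2(20) minus the column entropy for a 20-letter amino acid alphabet, with no small-sample correction; the exact formula used in the Methods section may differ. The function names and the toy peptides are hypothetical and are not taken from the 77 aligned sites.

```python
import math
from collections import Counter


def column_information(column, alphabet_size=20):
    """Information content in bits of one alignment column.

    Assumed definition: log2(alphabet_size) minus the Shannon entropy
    of the observed residue frequencies (no small-sample correction).
    """
    counts = Counter(column)
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(alphabet_size) - entropy


def alignment_information(sequences):
    """Information content for every column of equal-length aligned sequences."""
    return [column_information(col) for col in zip(*sequences)]


# Toy alignment (made-up peptides, for illustration only).
toy_sites = ["ATLQS", "VTLQA", "SKLQG", "TVLQS"]
info = alignment_information(toy_sites)
for position, bits in enumerate(info, start=1):
    print(f"column {position}: {bits:.3f} bits")
```

A fully conserved column, such as the glutamine column in the toy data, reaches the maximum of log2(20) bits, whereas a column with four different residues in four sequences loses two bits to entropy.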
Figure 2. Method performance comparison: identification of cleavage sites using consensus patterns or a neural network (NN). Green bars show the percentage of true positives, red bars the percentage of true negatives, and blue bars the Matthews correlation coefficient multiplied by 100.
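For reference, the Matthews correlation coefficient shown in the blue bars can be computed from a confusion matrix as sketched below. The formula is the standard one; the function name is hypothetical, and returning 0.0 when the denominator is zero is an assumed convention rather than something stated in the source.

```python
import math


def matthews_cc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal sum is zero (an assumed convention),
    because the coefficient is undefined in that case.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Made-up counts for illustration; these are not results from the study.
example = matthews_cc(tp=90, tn=80, fp=20, fn=10)
print(f"MCC x 100 = {100 * example:.1f}")
```

The coefficient ranges from -1 (every prediction wrong) to 1 (every prediction correct), which is why the figure multiplies it by 100 to share an axis with the percentages.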
Figure 3. Reliability analysis of data set test results. The scoring range (0 to 1) was divided into ten bins. Red bars show the fraction of negative examples in each bin, green bars show the fraction of positive examples, and blue bars show the posterior probability of a true cleavage prediction (see Methods section for detailed information).
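The binning behind such a reliability analysis can be sketched as follows. This is only an illustration under stated assumptions: the class fractions are taken relative to the total number of examples of each class, and the posterior in a bin is estimated as positives divided by all examples in that bin. The Methods section may define these quantities differently. The function name, the dictionary keys, and the toy scores are hypothetical.

```python
def reliability_bins(scores, labels, n_bins=10):
    """Bin prediction scores in [0, 1] and summarise each bin.

    Assumptions: fractions are normalised per class, and the posterior
    is positives / (positives + negatives) within the bin.
    """
    pos_counts = [0] * n_bins
    neg_counts = [0] * n_bins
    for score, is_site in zip(scores, labels):
        # A score of exactly 1.0 is placed in the last bin.
        index = min(int(score * n_bins), n_bins - 1)
        if is_site:
            pos_counts[index] += 1
        else:
            neg_counts[index] += 1

    n_pos = sum(pos_counts) or 1  # avoid division by zero for empty classes
    n_neg = sum(neg_counts) or 1
    rows = []
    for i in range(n_bins):
        in_bin = pos_counts[i] + neg_counts[i]
        rows.append({
            "bin": (i / n_bins, (i + 1) / n_bins),
            "pos_fraction": pos_counts[i] / n_pos,
            "neg_fraction": neg_counts[i] / n_neg,
            # None marks an empty bin, where no posterior can be estimated.
            "posterior": pos_counts[i] / in_bin if in_bin else None,
        })
    return rows


# Made-up scores and labels for illustration only.
toy_scores = [0.05, 0.12, 0.55, 0.58, 0.95, 0.92, 1.0]
toy_labels = [False, False, False, True, True, True, True]
bins = reliability_bins(toy_scores, toy_labels)
```

Read this way, a high posterior in the top bins indicates that a high score can be trusted as a true cleavage prediction, while the middle bins show where positive and negative examples overlap.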
Selected potential cleavage sites in human proteins from the Swiss-Prot database examined in this work. Columns give the Swiss-Prot identifier, the cellular localisation of the target protein (Cyt: cytoplasmic, Nuc: nuclear, Mem: membrane associated), the predicted position of P1 in the target protein, the cleavage site score, and the cleavage site in the sequence. Cleavage is predicted between the central glutamine residue (Q) and the following amino acid residue. Rows are sorted by prediction score.

| Swiss-Prot ID | Localisation | Position (P1) | Score | Cleavage site |
|---|---|---|---|---|
| AT6B_HUMAN | Nuc | 358 | 0.916 | EARL |
| INI2_HUMAN | Mem | 97 | 0.890 | VATL |
| PO21_HUMAN | Nuc | 62 | 0.874 | GTSL |
| IRA1_HUMAN | Cyt | 457 | 0.859 | QSTL |
| CFTR_HUMAN | Mem | 762 | 0.842 | GPTL |
| SCAD_HUMAN | Mem | 22 | 0.828 | GSHL |
| P532_HUMAN | Nuc | 308 | 0.782 | ASVP |
| RPC1_HUMAN | Cyt | 195 | 0.765 | SNFL |
| P531_HUMAN | Nuc | 196 | 0.738 | KEQL |
| T2D1_HUMAN | Nuc | 741 | 0.730 | GQLL |
| P532_HUMAN | Nuc | 197 | 0.725 | KAAL |
| RPA1_HUMAN | Cyt | 329 | 0.704 | TVNL |
| CFTR_HUMAN | Mem | 958 | 0.693 | HSVL |
| MAE1_HUMAN | Cyt | 64 | 0.661 | KVKF |
| MAE3_HUMAN | Cyt | 64 | 0.661 | KVKF |
| P531_HUMAN | Nuc | 410 | 0.660 | QKKL |
| CFTR_HUMAN | Mem | 890 | 0.654 | NTPL |
| P532_HUMAN | Nuc | 722 | 0.624 | SPNL |
| T2DT_HUMAN | Nuc | 133 | 0.619 | PSSV |
| T2D3_HUMAN | Nuc | 610 | 0.570 | SSGK |
| MAP4_HUMAN | Cyt | 1005 | 0.519 | YSHI |
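Because cleavage is predicted after a central glutamine, candidate sites in a target protein can be enumerated by scanning for Q residues and cutting out a sequence window around each one. The sketch below illustrates only that enumeration step, not the scoring. The function name and the flank of four residues on each side are assumptions chosen for illustration, and the toy sequence is made up.

```python
def glutamine_windows(sequence, flank=4):
    """Yield (p1_position, window) for each glutamine with full flanks.

    p1_position is 1-based, matching the convention of reporting the
    position of P1 in the target protein. The flank width is an
    assumption, not a value taken from the source.
    """
    for i, residue in enumerate(sequence):
        if residue != "Q":
            continue
        if i < flank or i + flank >= len(sequence):
            continue  # too close to a terminus for a complete window
        yield i + 1, sequence[i - flank : i + flank + 1]


# Made-up sequence for illustration only.
toy_protein = "MKTAVLQSGFRKAAQ"
candidates = list(glutamine_windows(toy_protein))
print(candidates)
```

Each window has the glutamine at its centre, so the predicted cleavage would fall between the central residue and the one that follows it.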
Known main proteinase cleavage sites in coronavirus polyproteins used in this study that were missed by the neural network during cross-validation. Columns give the genome accession number, the position in the viral polyprotein, the virus, and the cleavage site in the sequence. Cleavage occurs between the central glutamine residue (Q) and the following amino acid residue.

| Accession | Position | Virus | Cleavage site |
|---|---|---|---|
| NC_001451 | 3928 | AIBV | KSSV |
| NC_001846 | 3923 | MHV | VSQI |
| NC_001846 | 5984 | MHV | NPRL |
| NC_002306 | 5527 | TGV | KIGL |
| NC_003045 | 5900 | BCoV | ETRV |
| NC_003436 | 3299 | PEDV | GVNL |
| NC_003436 | 6141 | PEDV | SNNL |
| NC_004718 | 3546 | SARS | GVTF |
| NC_004718 | 4369 | SARS | EPLM |
| NC_004718 | 5902 | SARS | VATL |
Negative examples predicted to be cleaved by the neural network during cross-validation. Columns give the genome accession number, the position in the viral polyprotein, the virus, and the predicted cleavage site in the sequence. Cleavage is predicted between the central glutamine residue (Q) and the following amino acid residue.

| Accession | Position | Virus | Cleavage site |
|---|---|---|---|
| NC_001846 | 3607 | MHV | HSGF |
| NC_001846 | 6613 | MHV | YTDL |
| NC_002306 | 1457 | TGV | ETSL |
| NC_002306 | 5747 | TGV | YSSS |
| NC_002306 | 698 | TGV | ETNI |
| NC_002306 | 85 | TGV | SVML |
| NC_002645 | 1169 | HCoV-229E | IRQL |
| NC_002645 | 2659 | HCoV-229E | YSSI |
| NC_002645 | 322 | HCoV-229E | VIAL |
| NC_003045 | 1364 | BCoV | DART |
| NC_003045 | 1498 | BCoV | RTFV |
| NC_003045 | 2713 | BCoV | SSDF |
| NC_003045 | 311 | BCoV | VMRL |
| NC_003436 | 1751 | PEDV | SAGL |