Qin Tang1,2, Yulong Song1,2, Mijuan Shi1, Yingyin Cheng1, Wanting Zhang1, Xiao-Qin Xia1.
Abstract
Many coronaviruses are capable of interspecies transmission. Some of them have caused worldwide panic as emerging human pathogens in recent years, e.g., severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory syndrome coronavirus (MERS-CoV). To assess their threat to humans, we sought to infer the potential hosts of coronaviruses using a dual-model approach based on nineteen parameters computed from their spike genes. Both the support vector machine (SVM) model and the Mahalanobis distance (MD) discriminant model achieved high accuracy in leave-one-out cross-validation of training data consisting of 730 representative coronaviruses (99.86% and 98.08%, respectively). Predictions on 47 additional coronaviruses agreed with the conclusions or speculations of other researchers. Our approach is implemented as a web server that can be accessed at http://bioinfo.ihb.ac.cn/seq2hosts.
Year: 2015 PMID: 26607834 PMCID: PMC4660426 DOI: 10.1038/srep17155
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Summary of the hosts predicted for the 730 samples by MD in leave-one-out cross-validation.
| Host species | Viruses isolated (A) | Other viruses (B = 730 − A) | Total predictions (C) | Infectivity probability (P = C/730) | Predictions in others (D = C − A) | Percentage in others (E = D/B × 100%) |
|---|---|---|---|---|---|---|
| avian | 173 | 557 | 377 | 0.5164 | 204 | 36.62% |
| bat | 74 | 656 | 494 | 0.6767 | 420 | 64.02% |
| bovine | 77 | 653 | 77 | 0.1055 | 0 | 0% |
| human | 196 | 534 | 202 | 0.2767 | 6 | 1.12% |
| murine | 28 | 702 | 28 | 0.0384 | 0 | 0% |
| porcine | 182 | 548 | 185 | 0.2534 | 3 | 0.55% |
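The derived columns of this table follow directly from the column-header formulas. As a minimal sketch, the Python snippet below recomputes B, P, D and E from the reported A and C counts; the function name `derive` and the dictionary layout are illustrative choices, not part of the original work.

```python
# Counts copied from the table: host -> (A = viruses isolated, C = total predictions)
TOTAL = 730  # number of training samples in the leave-one-out cross-validation

counts = {
    "avian": (173, 377),
    "bat": (74, 494),
    "bovine": (77, 77),
    "human": (196, 202),
    "murine": (28, 28),
    "porcine": (182, 185),
}


def derive(a, c, total=TOTAL):
    """Apply the column-header formulas to one host (hypothetical helper)."""
    b = total - a        # B = 730 - A, viruses isolated from other hosts
    p = c / total        # P = C / 730, infectivity probability
    d = c - a            # D = C - A, predictions among the other viruses
    e = 100.0 * d / b    # E = D / B * 100%, percentage in others
    return {"B": b, "P": round(p, 4), "D": d, "E": round(e, 2)}


summary = {host: derive(a, c) for host, (a, c) in counts.items()}

for host, row in summary.items():
    print(f"{host:8s} B={row['B']:3d} P={row['P']:.4f} D={row['D']:3d} E={row['E']:.2f}%")
```

The A column sums to 730, so every training sample is attributed to exactly one of the six host categories.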
The incorrect predictions of MD and SVM in leave-one-out cross-validation.
| NCBI Accession No. | Virus sources | Wrong predictions by MD | Wrong predictions by SVM |
|---|---|---|---|
| AB008940.1; AB551247.1; AF190406.1; AF201929.1; AF208066.1; FJ647223.1; FJ647224.1; FJ938068.1; JF792616.1 | murine | 9 (bat) | |
| NC_011549.1; NC_011550.1; NC_016993.1; NC_016994.1; NC_016995.1 | avian | 5 (bat) | |
| NC_016996.1 | avian | | 1 (human) |
| In total | | 14 | 1 |
| Accuracy rate | | (730 − 14)/730 = 98.08% | (730 − 1)/730 = 99.86% |
The isolate sources and predicted hosts of 47 coronaviruses.
| Serial number | Test sample AccNum | SVM prediction | MD prediction | Isolate source |
|---|---|---|---|---|
| 1 | AY572034.1 | human* | bat**, human* | palm civet |
| 2 | AY572036.1 | human* | bat**, human* | palm civet |
| 3 | AY572037.1 | human* | bat**, human* | palm civet |
| 4 | AY687355.1 | human* | bat**, human* | palm civet |
| 5 | AY687356.1 | human* | bat**, human** | palm civet |
| 6 | AY687358.1 | human* | bat**, human* | raccoon dog |
| 7 | AY687359.1 | human* | bat**, human** | palm civet |
| 8 | AY687360.1 | human* | bat**, human* | palm civet |
| 9 | AY687361.1 | human* | bat**, human* | palm civet |
| 10 | AY687362.1 | human* | bat**, human* | palm civet |
| 11 | AY687363.1 | human* | bat**, human* | palm civet |
| 12 | AY687365.1 | human* | bat**, human* | palm civet |
| 13 | AY687367.1 | human* | bat**, human* | palm civet |
| 14 | AY687368.1 | human* | bat**, human* | palm civet |
| 15 | AY687369.1 | human* | bat**, human* | palm civet |
| 16 | AY687370.1 | human* | bat**, human* | palm civet |
| 17 | AY687371.1 | human* | bat**, human* | palm civet |
| 18 | AY687372.1 | human* | bat**, human* | palm civet |
| 19 | AY627044.1 | human* | bat**, human** | palm civet |
| 20 | AY627045.1 | human* | bat**, human** | palm civet |
| 21 | AY627046.1 | human* | bat**, human** | palm civet |
| 22 | AY627047.1 | human* | bat**, human** | palm civet |
| 23 | AY627048.1 | human* | bat**, human** | palm civet |
| 24 | AY613952.1 | human* | bat**, human* | palm civet |
| 25 | AY613951.1 | human* | bat**, human* | palm civet |
| 26 | AY525636.1 | human* | bat**, human* | palm civet |
| 27 | DQ514528.1 | human* | human**, bat** | palm civet |
| 28 | DQ514529.1 | human* | human**, bat** | palm civet |
| 29 | DQ514530.1 | human* | human**, bat** | palm civet |
| 30 | DQ514531.1 | human* | human**, bat** | palm civet |
| 31 | DQ514532.1 | human* | human**, bat** | palm civet |
| 32 | KJ477102.1 | human* | human**, bat* | dromedary |
| 33 | KJ650098.1 | human* | human**, bat** | dromedary |
| 34 | KJ650295.1 | human* | human**, bat** | dromedary |
| 35 | KJ713295.1 | human* | human**, bat** | dromedary |
| 36 | KJ713296.1 | human* | human**, bat** | dromedary |
| 37 | KJ713297.1 | human* | human**, bat** | dromedary |
| 38 | KJ713298.1 | human* | human**, bat** | dromedary |
| 39 | KJ713299.1 | human* | human**, bat** | dromedary |
| 40 | KF917527.1 | human* | human**, bat** | dromedary |
| 41 | AY654624.1 | human*, bat, avian, porcine | human**, bat*, avian, porcine | porcine |
| 42 | KC881005.1 | bat | bat**, avian* | bat |
| 43 | KC881006.1 | human, bat | bat** | bat |
| 44 | KC881007.1 | human, bat | bat** | bat |
| 45 | DQ915164.2 | bovine* | bovine**, avian*, bat* | alpaca |
| 46 | FJ415324.1 | bovine*, human | bovine**, avian*, bat, human | human |
| 47 | FJ938067.1 | bovine*, human | bovine**, avian*, bat, human | human, bovine |
Predictions consist of the hosts with the minimal MD or p value, those with MD ≤ 200 (MD model) or p ≤ 0.05 (SVM model), and those whose MD or p value is no greater than the corresponding value of the isolate source, if the isolate source is among the six host categories. All predictions are listed in ascending order of MD or p value. *p ≤ 0.05 or MD ≤ 200. **p ≤ 0.01 or MD ≤ 100.
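The selection rule in this footnote can be sketched in a few lines of Python for the MD model. This is only an illustration under stated assumptions: the function name `select_hosts` and the distance values in the example are hypothetical and are not taken from the paper; only the thresholds (200 and 100) and the three inclusion criteria come from the footnote.

```python
def select_hosts(md, isolate_source=None, star=200.0, double_star=100.0):
    """Return predicted hosts in ascending order of MD, with * / ** markers.

    md: mapping of host category -> Mahalanobis distance (smaller is better).
    isolate_source: host the virus was isolated from; it only affects the
    result if it is one of the categories present in `md`.
    """
    best = min(md.values())
    source_md = md.get(isolate_source)  # None if the source is not a category

    chosen = []
    for host, value in sorted(md.items(), key=lambda kv: kv[1]):
        keep = (
            value == best                                        # minimal MD
            or value <= star                                     # MD <= 200
            or (source_md is not None and value <= source_md)    # no greater than the source's MD
        )
        if keep:
            mark = "**" if value <= double_star else "*" if value <= star else ""
            chosen.append(host + mark)
    return chosen


# Hypothetical distances, chosen only to exercise each branch of the rule.
example_md = {"bovine": 80.0, "avian": 150.0, "bat": 300.0,
              "human": 350.0, "murine": 900.0, "porcine": 1200.0}

print(select_hosts(example_md, isolate_source="human"))
print(select_hosts(example_md, isolate_source="palm civet"))
```

With an isolate source that is one of the six categories, hosts ranked at or above it are retained even when they exceed the MD ≤ 200 threshold, which is why unmarked entries can appear in the table. The same logic would apply to SVM p values with thresholds 0.05 and 0.01.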
Figure 1. Tendencies of MD and SVM models.