Duarte Gouveia, Lucia Grenga, Jean-Charles Gaillard, Fabrice Gallais, Laurent Bellanger, Olivier Pible, Jean Armengaud.
Abstract
Detection of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is a crucial tool for fighting the COVID-19 pandemic. This dataset brief presents the exploration of a shotgun proteomics dataset acquired on SARS-CoV-2 infected Vero cells. Proteins from inactivated virus samples were extracted and digested with trypsin, and the resulting peptides were identified by data-dependent acquisition tandem mass spectrometry. The 101 peptides reporting for six viral proteins were analyzed specifically in terms of their analytical characteristics, their species specificity and conservation, and their proneness to structural modifications. Based on these results, a shortlist of 14 peptides from the main structural proteins N, S, and M is proposed for targeted mass-spectrometry method development and diagnosis of the new SARS-CoV-2, and the best candidates are discussed.
Keywords: COVID-19; SARS-CoV-2; mass spectrometry; peptides; proteomics; viral protein detection
Year: 2020 PMID: 32462744 PMCID: PMC7267140 DOI: 10.1002/pmic.202000107
Source DB: PubMed Journal: Proteomics ISSN: 1615-9853 Impact factor: 5.393
Table 1. List of the 14 viral peptides shortlisted for targeted method development, with their analytical characteristics, specificity, modifications, and missed cleavages
| Protein | Peptide sequence | Precursor charge | Retention time [min] | Precursor m/z | Product m/z | Inter-species specificity | Intra-species conservation | Modifications observed in the DDA data | Modifications observed in other deposited datasets | Estimation of modification rate based on GPMDB data | Detection of missed cleavages in the DDA data |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Protein M | EITVATSR | 2 | 19.48 ± 0.55 | 438.742899 | 434.235772 (y4), 634.351865 (y6), 175.118952 (y1) | no | yes | no | no | 0% | EITVATSRTLSYYK |
| Protein M | IAGHHLGR | 2 | 7.22 ± 0.32 | 430.74611 | 482.283391 (y4), 345.22448 (y3), 676.363767 (y6) | no | yes | no | no | 0% | no |
| Protein M | VAGDSGFAAYSR | 2 | 35.43 ± 0.75 | 600.785827 | 1030.458849 (y10), 171.112804 (b2), 858.410442 (y8) | yes | yes | no | One succinylation in 128 observations | 1% | LGASQRVAGDSGFAAYSR, VAGDSGFAAYSRYR |
| Protein N | ADETQALPQR | 2 | 21.33 ± 0.65 | 564.785827 | 400.230293 (y3), 584.351471 (y5), 187.071333 (b2) | yes | no | deamidation (NQ) | Eight deamidations in 87 observations | 9% | ADETQALPQRQK, ADETQALPQRQKK, KADETQALPQRQR, KADETQALPQRQRQK, KKADETQALPQR |
| Protein N | AYNVTQAFGR | 2 | 38.7 ± 0.77 | 563.78563 | 679.352199 (y6), 892.463541 (y8), 349.150646 (b3) | yes | no | deamidation (NQ) | 11 deamidations in 139 observations | 8% | RTATKAYNVTQAFGR, TATKAYNVTQAFGR |
| Protein N | GFYAEGSR | 2 | 20.91 ± 0.56 | 443.706317 | 519.252151 (y5), 682.315479 (y6), 448.215037 (y4) | no | yes | no | Two phosphorylations in 33 observations | 6% | GFYAEGSRGGSQASSR |
| Protein N | GPEQTQGNFGDQELIR | 3 | 49.45 ± 0.77 | 596.955223 | 830.436657 (y7), 175.118952 (y1), 658.38825 (y5) | no | no | deamidation (NQ) | Eight deamidations in 98 observations | 8% | RGPEQTQGNFGDQELIR |
| Protein N | IGMEVTPSGTWLTYTGAIK | 3 | 82.63 ± 0.2 | 675.683564 | 753.414131 (y7), 866.498195 (y8), 652.366452 (y6) | yes | no | oxidation (M) | Six carbamidomethylations, three phosphorylations, 30 dioxidations, 224 oxidations in 400 observations | 66% | no |
| Protein N | NPANNAAIVLQLPQGTTLPK | 2 | 76.11 ± 0.69 | 1030.578571 | 841.477794 (y8), 1294.772913 (y12), 1195.704499 (y11) | yes | no | deamidation (NQ) | 95 deamidations, seven phosphorylations in 415 observations | 25% | DHIGTRNPANNAAIVLQLPQGTTLPK |
| Protein N | NPANNAAIVLQLPQGTTLPK | 3 | 76.12 ± 0.69 | 687.388139 | 841.477794 (y8), 766.384228 (b8), 865.452642 (b9) | | | | | | |
| Protein N | WYFYYLGTGPEAGLPYGANK | 3 | 85.3 ± 0.17 | 756.365111 | 649.330401 (y6), 819.435929 (y8), 890.473043 (y9) | yes | no | deamidation (NQ) | Two deamidations, 75 dioxidations, 75 oxidations in 443 observations | 34% | WYFYYLGTGPEAGLPYGANKDGIIWVATEGALNTPK |
| Protein S | FQTLLALHR | 3 | 47.82 ± 0.79 | 366.885464 | 496.299041 (y4), 609.383105 (y5), 425.261928 (y3) | yes | yes | no | no | 0% | no |
| Protein S | GWIFGTTLDSK | 2 | 75.84 ± 0.64 | 612.816595 | 244.108053 (b2), 868.441074 (y8), 721.37266 (y7) | yes | no | no | 15 dioxidations, 11 oxidations in 52 observations | 50% | no |
| Protein S | HTPINLVR | 2 | 22.94 ± 0.55 | 475.282526 | 711.451185 (y6), 239.113866 (b2), 175.118952 (y1) | yes | yes | deamidation (NQ) | Two deamidations in 51 observations | 4% | no |
| Protein S | LQSLQTYVTQQLIR | 2 | 78.5 ± 0.69 | 845.977961 | 1121.631334 (y9), 1449.806004 (y12), 1249.689912 (y10) | yes | yes | no | no | 0% | no |
| Protein S | LQSLQTYVTQQLIR | 3 | 78.5 ± 0.69 | 564.321066 | 758.451913 (y6), 857.520327 (y7), 242.149918 (b2) | | | | | | |
Inter-species specificity: "yes" means that an NCBInr BLAST search at 100% identity and 100% query coverage (run on 24 April 2020) returned hits only for the human virus SARS-CoV-2.
Intra-species conservation: "yes" means 100% sequence conservation among human viruses, based on Figure 2.
Modifications observed in other deposited datasets: taken from the Global Proteome Machine Database (GPMDB, https://gpmdb.thegpm.org/thegpm‐cgi/dblist_pep.pl).
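
The percentages in the "Estimation of modification rate based on GPMDB data" column appear consistent with summing the modified observations for a peptide, dividing by its total number of GPMDB observations, and rounding to the nearest percent. The Python sketch below checks that reading against four rows of the table. The formula is inferred from the tabulated numbers and is not a method stated by the authors; the function name `modification_rate` is hypothetical.

```python
def modification_rate(modified_counts, total_observations):
    """Return the percentage of observations carrying a listed modification.

    Assumption (not stated in the source): each modified observation is
    counted once, so counts for different modification types can be summed.
    """
    if total_observations <= 0:
        raise ValueError("total_observations must be positive")
    return 100.0 * sum(modified_counts) / total_observations


# (modified observations per modification type, total observations),
# copied from the table.
rows = {
    "ADETQALPQR": ([8], 87),
    "IGMEVTPSGTWLTYTGAIK": ([6, 3, 30, 224], 400),
    "NPANNAAIVLQLPQGTTLPK": ([95, 7], 415),
    "GWIFGTTLDSK": ([15, 11], 52),
}

for peptide, (counts, total) in rows.items():
    print(f"{peptide}: {modification_rate(counts, total):.0f}%")
```

For these four peptides the computed values round to 9%, 66%, 25%, and 50%, matching the table.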
Figure 1. Retention times and peak areas of the 16 precursor ions identified in the Skyline analysis.
Figure 2. Sequence variants identified for each of the 14 peptides from Table 1. The sequences of these peptides are indicated in bold. Variants were identified through multiple sequence alignments of the ten viral proteins from the virus Italy-INMI1 strain with other sequenced human SARS-CoV-2 viruses and closely related bat and pangolin viruses. Variations in the amino acid sequences are highlighted in red.
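
The precursor and product values listed in Table 1 are monoisotopic m/z values, so they can be recomputed from the peptide sequences as a cross-check. The Python sketch below sums standard monoisotopic residue masses, adds one water for an intact peptide or a y ion, and adds one proton per charge. It assumes unmodified peptides and singly charged fragment ions; the helper names (`precursor_mz`, `y_ion_mz`, `b_ion_mz`) are hypothetical and do not come from the source or from any particular library.

```python
# Standard monoisotopic residue masses (Da) of the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146372, "A": 71.03711381, "S": 87.03202844, "P": 97.05276388,
    "V": 99.06841395, "T": 101.04767847, "C": 103.00918496, "L": 113.08406398,
    "I": 113.08406398, "N": 114.04292747, "D": 115.02694307, "Q": 128.05857754,
    "K": 128.09496305, "E": 129.04259312, "M": 131.04048513, "H": 137.05891188,
    "F": 147.06841395, "R": 156.10111105, "Y": 163.06332857, "W": 186.07931298,
}
WATER = 18.0105647    # monoisotopic mass of H2O (Da)
PROTON = 1.00727647   # mass of a proton (Da)


def residue_sum(sequence):
    """Sum of residue masses for an unmodified peptide sequence."""
    return sum(RESIDUE_MASS[aa] for aa in sequence)


def precursor_mz(sequence, charge):
    """m/z of the intact peptide carrying `charge` protons."""
    return (residue_sum(sequence) + WATER + charge * PROTON) / charge


def y_ion_mz(sequence, n):
    """m/z of the singly charged y ion made of the last n residues."""
    return residue_sum(sequence[-n:]) + WATER + PROTON


def b_ion_mz(sequence, n):
    """m/z of the singly charged b ion made of the first n residues."""
    return residue_sum(sequence[:n]) + PROTON


# Compare against values listed in Table 1.
print(f"{precursor_mz('EITVATSR', 2):.4f}")    # table: 438.742899
print(f"{y_ion_mz('EITVATSR', 4):.4f}")        # table: 434.235772 (y4)
print(f"{b_ion_mz('VAGDSGFAAYSR', 2):.4f}")    # table: 171.112804 (b2)
print(f"{precursor_mz('FQTLLALHR', 3):.4f}")   # table: 366.885464
```

The same functions can be applied to the remaining rows; a peptide carrying one of the listed modifications would need the corresponding mass shift added, which this sketch does not handle.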