Abstract
Proteomic MS/MS mass spectrometry detections are usually biased towards peptides cleaved by the experimentally added digestion enzyme(s). Hence peptides resulting from spontaneous degradation and natural proteolysis usually remain undetected. Previous analyses of tryptic human proteome data (cleavage after K, R) detected non-canonical tryptic peptides translated according to tetra- and pentacodons (codons expanded by silent mono- and dinucleotides), and from transcripts that systematically (a) delete mono- or dinucleotides after trinucleotides (delRNAs) or (b) exchange nucleotides according to 23 bijective transformations. Nine symmetric and fourteen asymmetric nucleotide exchanges (X ↔ Y, e.g. A ↔ C; and X → Y → Z → X, e.g. A → C → G → A) produce swinger RNAs. Here, unbiased reanalyses of these proteomic data preferentially detect non-canonical tryptic peptides despite assuming random cleavage. Unbiased analyses could not reconstruct the experimental tryptic digestion if most detected non-canonical peptides were false positives. Detected non-tryptic non-canonical peptides map preferentially on the corresponding, previously described non-canonical transcripts, as do tryptic non-canonical peptides. Hence unbiased analyses independently confirm previous trypsin-biased analyses that showed translation of del- and swinger RNAs and of expanded codons. Accounting for natural proteolysis completes trypsin-biased mitopeptidome analyses and independently confirms non-canonical transcriptions and translations.
Keywords: Bijective transformation; Digestive enzymes; Frameshift; RNA–DNA difference; Unbiased analyses
Year: 2016 PMID: 27830053 PMCID: PMC5094600 DOI: 10.1016/j.csbj.2016.09.004
Source DB: PubMed Journal: Comput Struct Biotechnol J ISSN: 2001-0370 Impact factor: 7.271
Fig. 1. Sequence (A) and its systematic transformations and corresponding translations (B–F). B) A ↔ C systematic nucleotide exchange of the sequence in A; C) assuming systematic codon expansion by silent mononucleotides; D) assuming systematic mononucleotide deletion after each trinucleotide (translation identical to that in C); E) assuming systematic codon expansion by silent dinucleotides; F) assuming systematic dinucleotide deletion after each trinucleotide (translation identical to that in E). RNAs and peptides corresponding to these alternative transcriptions and translations have been previously described for human mitochondria [22], [23]. For swinger transformations, A ↔ C is only one among 23 possibilities: nine symmetric, of type X ↔ Y, and 14 asymmetric, of type X → Y → Z → X. Systematic deletions of mono- and dinucleotides after each trinucleotide are annotated as delRNA3–1 and delRNA3–2. Systematic deletions can start at the 5′ extremity of a sequence, which is indicated by delRNA3–1.0 and delRNA3–2.0. Deletion frames can be shifted by 0–2 and 0–3 nucleotides for delRNA3–1 and delRNA3–2, respectively, which can be indicated by corresponding indices.
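The transformations described in this caption can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the analyses: the function names (`swinger_exchanges`, `is_symmetric`, `swing`, `delete_after_trinucleotides`) and the toy sequence are hypothetical, the DNA alphabet ACGT is assumed, and only the unshifted deletion frame (the ".0" case) is handled. Counting exchanges that do not restore the sequence when applied twice as "asymmetric" is an assumption that reproduces the 9 + 14 split.

```python
from itertools import permutations

NUCLEOTIDES = "ACGT"

def swinger_exchanges():
    """All bijective nucleotide maps except the identity (24 - 1 = 23)."""
    maps = [dict(zip(NUCLEOTIDES, p)) for p in permutations(NUCLEOTIDES)]
    return [m for m in maps if any(src != dst for src, dst in m.items())]

def is_symmetric(exchange):
    """True when applying the exchange twice restores every nucleotide."""
    return all(exchange[exchange[n]] == n for n in NUCLEOTIDES)

def swing(seq, exchange):
    """Apply a systematic nucleotide exchange to a sequence."""
    return "".join(exchange.get(base, base) for base in seq)

def delete_after_trinucleotides(seq, n_deleted):
    """Keep three nucleotides, drop the next n_deleted, and repeat.

    n_deleted=1 mimics delRNA3-1 and n_deleted=2 mimics delRNA3-2,
    both starting at the 5' extremity (hypothetical helper).
    """
    period = 3 + n_deleted
    return "".join(seq[i:i + 3] for i in range(0, len(seq), period))

exchanges = swinger_exchanges()
symmetric = [m for m in exchanges if is_symmetric(m)]
a_c = {"A": "C", "C": "A", "G": "G", "T": "T"}  # the A <-> C exchange

print(len(exchanges), len(symmetric), len(exchanges) - len(symmetric))  # 23 9 14
print(swing("ACGTAC", a_c))                          # CAGTCA
print(delete_after_trinucleotides("ACGTACGTAC", 1))  # ACGACGAC
print(delete_after_trinucleotides("ACGTACGTAC", 2))  # ACGCGT
```

Reading only the first three nucleotides of each tetra- or pentacodon yields the same string as the corresponding deletion, which is why panels C/D and E/F give identical translations.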
Abundances of residues at carboxyl extremities of non-canonical peptides detected by unbiased analyses. Analyses assume random cleavage of the tryptic human mitoproteome. Peptides are translated from the del- and swinger-transformed human mitogenome, for codons expanded by 0–2 silent nucleotides. Column 1 indicates the residue. Columns 2, 6, 10 and 14 are numbers of detected peptides with the residue indicated in column 1, for each analysis assuming a different transcription/translation (del-, swinger-, tetra- and pentacodon); columns 3, 7, 11 and 15 indicate the total number of that residue in the corresponding translations of the mitogenome; columns 4, 8, 12 and 16 indicate the bias of detecting peptides with that residue at the carboxyl terminus, considering the total frequency of the residue in the corresponding translation of the mitogenome; columns 5, 9, 13 and 17 indicate numbers of peptides mapping on the corresponding detected non-canonical RNAs. The last two rows compare results when merging tryptic vs other peptides; there, numbers of non-canonical peptides mapping on non-canonical RNAs are followed by the numbers expected assuming random mapping.
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AA | Del | | | | Swinger | | | | Swinger | | | | Swinger | | | |
| | Tri | Genome | Bias | RNA | Tri | Genome | Bias | RNA | Tetra | Genome | Bias | RNA | Penta | Genome | Bias | RNA |
| A | 10 | 8328 | 1.49 | 0 | 2 | 46,838 | 0.30 | 0 | 6 | 47,048 | 0.75 | 2 | 9 | 46,178 | 1.33 | 0 |
| C | 4 | 5680 | 0.87 | 0 | 0 | 22,697 | 0.00 | | 5 | 22,836 | 1.29 | 0 | 0 | 22,108 | 0.00 | 0 |
| D | 1 | 4662 | 0.27 | 0 | 3 | 21,545 | 0.97 | 0 | 0 | 21,752 | 0 | 1 | 3 | 20,795 | 0.98 | 0 |
| E | 6 | 6281 | 1.18 | 0 | 4 | 24,954 | 1.12 | 0 | 7 | 25,290 | 1.63 | 0 | 4 | 24,200 | 1.13 | 0 |
| F | 5 | 7868 | 0.79 | 0 | 4 | 30,878 | 0.91 | 0 | 4 | 30,982 | 0.76 | 0 | 4 | 29,626 | 0.92 | 1 |
| G | 12 | 12,857 | 1.16 | 0 | 9 | 57,120 | 1.10 | 1 | 12 | 57,648 | 1.22 | 0 | 9 | 55,452 | 1.11 | 0 |
| H | 0 | 6578 | 0.00 | | 4 | 22,775 | 1.23 | 0 | 7 | 22,836 | 1.80 | 1 | 5 | 21,803 | 1.56 | 0 |
| IL | 26 | 27,890 | 1.15 | 3 | 11 | 97,382 | 0.79 | 0 | 15 | 97,914 | 0.90 | 1 | 6 | 93,464 | 0.44 | 0 |
| K | 16 | 8142 | 2.43 | 2 | 10 | 30,603 | 2.29 | 0 | 14 | 30,982 | 2.65 | 0 | 14 | 29,608 | 3.22 | 2 |
| M | 6 | 7581 | 0.98 | 0 | 2 | 22,553 | 0.62 | 0 | 4 | 22,836 | 1.03 | 0 | 4 | 21,660 | 1.26 | 0 |
| N | 5 | 7723 | 0.80 | 0 | 7 | 26,449 | 1.85 | 0 | 6 | 26,660 | 1.32 | 0 | 5 | 25,427 | 1.34 | 1 |
| P | 9 | 12,857 | 0.87 | 3 | 6 | 57,489 | 0.73 | 0 | 7 | 57,648 | 0.71 | 0 | 7 | 55,452 | 0.86 | 0 |
| Q | 7 | 5647 | 1.53 | 2 | 6 | 24,126 | 1.74 | 0 | 13 | 24,206 | 3.15 | 0 | 4 | 23,422 | 1.16 | 0 |
| R | 9 | 5876 | 1.90 | 0 | 11 | 46,786 | 1.64 | 0 | 18 | 47,048 | 2.25 | 0 | 10 | 45,626 | 1.49 | 0 |
| S | 4 | 16,856 | 0.29 | 1 | 5 | 68,457 | 0.51 | 0 | 2 | 68,800 | 0.17 | 3 | 5 | 65,983 | 0.52 | 0 |
| T | 3 | 11,954 | 0.31 | 0 | 9 | 46,822 | 1.34 | 0 | 1 | 47,047 | 0.13 | 0 | 4 | 44,813 | 0.61 | 0 |
| V | 11 | 11,954 | 1.14 | 2 | 10 | 46,591 | 1.50 | 1 | 4 | 47,047 | 0.50 | 4 | 7 | 44,813 | 1.06 | 1 |
| W | 9 | 6838 | 1.63 | 1 | 3 | 23,960 | 0.88 | 0 | 4 | 24,206 | 0.97 | 0 | 1 | 23,118 | 0.30 | 0 |
| Y | 5 | 7630 | 0.81 | 0 | 0 | 23,238 | 0.00 | 0 | 2 | 22,836 | 0.51 | 0 | 4 | 21,364 | 1.28 | 0 |
| Tot | 148 | 183,202 | | 14/6.31 | 106 | 741,263 | | 2/3.02 | 127 | 745,622 | | 12/3.38 | 105 | 714,912 | | 5/2.94 |
| Tryps | 25 | 14,018 | 2.21 | 2/1.07 | 21 | 77,389 | 1.90 | 0/0.68 | 32 | 78,030 | 2.41 | 0/0.19 | 24 | 75,234 | 2.24 | 2/0.70 |
| Others | 123 | 169,184 | 0.90 | 12/5.24 | 85 | 663,874 | 0.90 | 2/2.34 | 95 | 667,592 | 0.84 | 12/3.19 | 81 | 639,678 | 0.85 | 3/2.24 |
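The "Bias" columns appear to be the ratio between a residue's detection rate at carboxyl extremities and the overall detection rate for that analysis. The sketch below illustrates that reading using values from the Del columns; the function name `terminal_bias` and its argument names are hypothetical, and the formula is inferred from the tabulated numbers rather than taken from a stated method.

```python
def terminal_bias(n_residue, genome_residue, n_total, genome_total):
    """Detection rate for one residue divided by the overall detection rate.

    n_residue      -- detected peptides ending with the residue
    genome_residue -- occurrences of the residue in the translated mitogenome
    n_total        -- all detected peptides for this analysis ('Tot' row)
    genome_total   -- all residues in the translated mitogenome ('Tot' row)
    """
    return (n_residue / genome_residue) / (n_total / genome_total)

# Del analysis, 'Tot' row: 148 peptides, 183,202 residues.
DEL_TOTAL, DEL_GENOME = 148, 183202

# K alone, then K and R merged (the 'Tryps' row), then all other residues.
bias_k = terminal_bias(16, 8142, DEL_TOTAL, DEL_GENOME)
bias_tryptic = terminal_bias(16 + 9, 8142 + 5876, DEL_TOTAL, DEL_GENOME)
bias_others = terminal_bias(123, 169184, DEL_TOTAL, DEL_GENOME)

print(round(bias_k, 2), round(bias_tryptic, 2), round(bias_others, 2))
```

Values above 1 indicate that peptides ending with that residue are detected more often than its frequency in the translation would predict, which is the signature expected for K and R after tryptic digestion.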
Pearson correlation coefficient r between abundances of non-canonical peptides detected by unsupervised proteomic analyses of trypsin-digested human mitochondrial proteomic MS/MS data and abundances of corresponding, previously detected non-canonical RNAs [22], [23]. Correlations are calculated separately for tryptic peptides (carboxyl extremity K or R) and other peptides. Non-canonical transcripts are del- and swinger-transformations of the human mitogenome, the latter translated along codons expanded by 0, 1 and 2 silent nucleotides. P values are one-tailed, expecting positive correlations. Fisher's method for combining P values sums the −2 × ln Pi (natural logarithm), where i runs from 1 to k. This sum follows a chi-square distribution with 2 × k degrees of freedom, where k is the number of P values combined (here k = 4). Bold indicates statistical significance at P < 0.05.
| Pearson r | Unbias Tryps | | Other | | |
|---|---|---|---|---|---|
| Transformation | r | P | r | P | All |
| Del | 0.358 | 0.172 | 0.270 | 0.241 | 0.401 |
| Swinger | | 0.016 | 0.171 | 0.217 | 0.253 |
| Swinger tetra | 0.109 | 0.310 | 0.099 | 0.327 | 0.143 |
| Swinger penta | 0.192 | 0.190 | 0.306 | 0.078 | 0.186 |
| Combined chi | | 0.026 | 13.24 | 0.104 | |
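Fisher's method as described in the caption can be checked with a short standard-library sketch. The function name `fisher_combined` is hypothetical. The four P values used in the example are taken to be those for "Other" peptides (Del, Swinger, Swinger tetra, Swinger penta); that reading of the table is an assumption. Because the degrees of freedom (2 × k) are always even, the chi-square tail probability has a closed form and no statistics library is needed.

```python
import math

def fisher_combined(p_values):
    """Combine independent P values with Fisher's method.

    Returns (statistic, combined_p), where statistic = -2 * sum(ln P_i)
    and combined_p is the upper tail of a chi-square with 2k degrees of freedom.
    """
    k = len(p_values)
    statistic = -2.0 * sum(math.log(p) for p in p_values)
    half = statistic / 2.0
    # Closed-form chi-square survival function, valid for even df = 2k:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    combined_p = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return statistic, combined_p

# Assumed one-tailed P values for 'Other' peptides.
other_p = [0.241, 0.217, 0.327, 0.078]
statistic, combined_p = fisher_combined(other_p)
print(round(statistic, 2), round(combined_p, 3))
```

Under that assumption the result agrees with the tabulated combined values for "Other" peptides (chi-square 13.24, P = 0.104).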
Bias in the identity of the amino acid (column 1) at the N-terminus of the peptide that follows each detected peptide, for unbiased analyses assuming random cleavage. Analyses search for peptides matching translations of the del- (columns 2–4) and swinger-transformed human mitogenome (columns 5–7), and translations of the swinger mitogenomes according to tetra- (columns 8–10) and pentacodons (columns 11–13). Columns 2, 5, 8 and 11 indicate numbers of detections. ‘Genome’ (columns 3, 6, 9, 12) indicates abundances of that residue in the corresponding hypothetical translations of the complete mitogenome after transformations and non-canonical translations. Biases (columns 4, 7, 10, 13) do not resemble those for carboxyl extremities of detected peptides (Table 1) and are less extreme. Overall they match random distributions around ‘1’, indicating lack of bias. This suggests that there is little or no natural proteolysis whose cleavage specificity depends on the N-terminal residue of the peptide following a detected peptide.
| 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| AA | Del | | | Swinger | | | Swinger | | | Swinger | | |
| | Tri | Genome | Bias | Tri | Genome | Bias | Tetra | Genome | Bias | Penta | Genome | Bias |
| A | 8 | 8328 | 1.19 | 6 | 46,838 | 0.79 | 10 | 47,048 | 1.09 | 3 | 46,178 | 0.42 |
| C | 4 | 5680 | 0.87 | 3 | 22,697 | 0.82 | 6 | 22,836 | 1.34 | 8 | 22,108 | 2.31 |
| D | 6 | 4662 | 1.59 | 3 | 21,545 | 0.86 | 7 | 21,752 | 1.64 | 5 | 20,795 | 1.54 |
| E | 6 | 6281 | 1.18 | 8 | 24,954 | 1.98 | 7 | 25,290 | 1.41 | 2 | 24,200 | 0.53 |
| F | 5 | 7868 | 0.79 | 2 | 30,878 | 0.40 | 5 | 30,982 | 0.82 | 2 | 29,626 | 0.43 |
| G | 13 | 12,857 | 1.25 | 8 | 57,120 | 0.87 | 8 | 57,648 | 0.71 | 4 | 55,452 | 0.46 |
| H | 6 | 6578 | 1.13 | 4 | 22,775 | 1.09 | 4 | 22,836 | 0.90 | 2 | 21,803 | 0.59 |
| IL | 20 | 27,890 | 0.89 | 20 | 97,382 | 1.27 | 25 | 97,914 | 1.30 | 18 | 93,464 | 1.23 |
| K | 5 | 8142 | 0.76 | 4 | 30,603 | 0.81 | 5 | 30,982 | 0.82 | 4 | 29,608 | 0.86 |
| M | 5 | 7581 | 0.81 | 6 | 22,553 | 1.64 | 5 | 22,836 | 1.12 | 4 | 21,660 | 1.18 |
| N | 9 | 7723 | 1.44 | 2 | 26,449 | 0.47 | 6 | 26,660 | 1.15 | 3 | 25,427 | 0.75 |
| P | 11 | 12,857 | 1.06 | 5 | 57,489 | 0.54 | 12 | 57,648 | 1.06 | 11 | 55,452 | 1.27 |
| Q | 2 | 5647 | 0.44 | 4 | 24,126 | 1.24 | 4 | 24,206 | 0.84 | 4 | 23,422 | 1.09 |
| R | 3 | 5876 | 0.61 | 8 | 46,786 | 1.06 | 3 | 47,048 | 0.33 | 13 | 45,626 | 1.82 |
| S | 12 | 16,856 | 0.88 | 8 | 68,457 | 0.72 | 19 | 68,800 | 1.41 | 11 | 65,983 | 1.06 |
| T | 11 | 11,954 | 1.14 | 12 | 46,822 | 1.58 | 9 | 47,047 | 0.98 | 8 | 44,813 | 1.14 |
| V | 9 | 11,954 | 0.93 | 13 | 46,591 | 1.72 | 5 | 47,047 | 0.54 | 6 | 44,813 | 0.96 |
| W | 7 | 6838 | 1.27 | 2 | 23,960 | 0.52 | 2 | 24,206 | 0.42 | 2 | 23,118 | 0.55 |
| Y | 6 | 7630 | 0.97 | 2 | 23,238 | 0.53 | 4 | 22,836 | 0.90 | 2 | 21,364 | 0.60 |
| Tot | 148 | 183,202 | | 120 | 741,263 | | 146 | 745,622 | | 112 | 714,912 | |
Observed (column 4) and expected (column 5) numbers of detected non-canonical peptides compatible with translation according to both the nuclear and the mitochondrial vertebrate genetic codes. Predictions account for peptide length (mean length and standard deviation in columns 2 and 3), considering that translation of 60/64 (0.9375) codons is identical between these genetic codes. Results indicate strong biases against detection of peptides compatible with both genetic codes, showing that detected populations of peptides are specifically translated according to the mitochondrial vertebrate genetic code. This systematic bias rules out a cytosolic origin for the detected non-canonical peptides.
| 1 | 2 | 3 | 4 | 5 |
|---|---|---|---|---|
| Transformation | AAs | Sd | Obs | Exp |
| Del | 18.28 | 5.96 | 23 | 48.75 |
| Swinger tri | 21.23 | 9.75 | 27 | 36.26 |
| Swinger tetra | 17.37 | 7.38 | 19 | 52.45 |
| Swinger penta | 17.37 | 7.79 | 23 | 40.40 |
Distributions of amino acids inserted at stops in detected non-canonical peptides (columns Del, Swinger tri, Swinger tetra and Swinger penta), compared to the distribution of amino acids in canonical proteins encoded by the human mitogenome (Mito). Bias is the ratio between the frequency of the amino acid across all non-canonical peptides (column All) and its frequency in canonical proteins. P values are calculated using a chi-square test. Statistically significant results at P < 0.05 are underlined, and in bold when these are positive biases indicating greater than expected insertions at stop codons.
| AA | Mito | Del | Swinger tri | Swinger tetra | Swinger penta | All peptides | Bias | P |
|---|---|---|---|---|---|---|---|---|
| A | 225 | 1 | 10 | 7 | 1 | 19 | 0.65 | 0.062 |
| C | 22 | 3 | 3 | 3 | 0 | 9 | 3.15 | |
| D | 66 | 2 | 5 | 10 | 0 | 17 | 1.98 | |
| E | 88 | 5 | 9 | 4 | 2 | 20 | 1.75 | |
| F | 216 | 3 | 2 | 4 | 1 | 10 | 0.36 | |
| G | 212 | 16 | 9 | 12 | 1 | 38 | 1.38 | 0.058 |
| H | 97 | 0 | 4 | 4 | 0 | 8 | 0.64 | 0.208 |
| I,L | 963 | 19 | 6 | 19 | 46 | 90 | 0.72 | |
| K | 95 | 18 | 22 | 22 | 11 | 73 | 5.92 | |
| M | 208 | 8 | 9 | 6 | 5 | 28 | 1.04 | 0.853 |
| N | 164 | 8 | 7 | 5 | 3 | 23 | 1.08 | 0.723 |
| P | 219 | 5 | 3 | 9 | 5 | 22 | 0.77 | 0.237 |
| Q | 90 | 18 | 7 | 8 | 10 | 43 | 3.68 | |
| R | 63 | 4 | 2 | 2 | 3 | 11 | 1.35 | 0.359 |
| S | 274 | 9 | 5 | 10 | 10 | 34 | 0.96 | 0.796 |
| T | 351 | 10 | 8 | 6 | 5 | 29 | 0.64 | |
| V | 167 | 5 | 5 | 5 | 2 | 17 | 0.78 | 0.328 |
| W | 104 | 2 | 2 | 3 | 3 | 10 | 0.74 | 0.356 |
| Y | 135 | 2 | 1 | 7 | 4 | 14 | 0.80 | 0.414 |