O V Galzitskaya, M Yu Lobanov.
Abstract
How can good traits for phylogenetic reconstruction be found? Here, we present a new phyloproteomic criterion: the occurrence of simple motifs, which can be imprints of evolutionary history. We studied the occurrences of 11780 six-residue-long motifs consisting of two randomly located amino acids in 97 eukaryotic and 25 bacterial proteomes. For all eukaryotic proteomes, with the exception of the Amoebozoa, Stramenopiles, and Diplomonadida kingdoms, proteins containing motifs from the first group (one of the two amino acids occurs once at a terminal position) made up about 20%; for motifs from the second group (one of the two amino acids occurs once within the pattern) and the third group (the two amino acids occur randomly), the shares were 30% and 50%, respectively. For bacterial proteomes, the corresponding shares were 10%, 27%, and 63%. We calculated matrices of correlation coefficients between the numbers of proteins in which a motif from the set of 11780 motifs appears at least once, for 9 kingdoms of Eukaryota and 5 phyla of Bacteria. Among the correlation coefficients for eukaryotic proteomes, the correlation between the animal and fungi kingdoms (0.62) is higher than that between fungi and plants (0.54). Our study supports the view that animals and fungi are sibling kingdoms. Comparing the frequencies of six-residue-long motifs in different proteomes yields phylogenetic relationships based on the similarities between these frequencies: the Diplomonadida kingdom is closer to Bacteria than to Eukaryota; Stramenopiles and Amoebozoa are closer to each other than to other kingdoms of Eukaryota.
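The motif set and its three groups can be reproduced by direct enumeration. The sketch below is an illustration under the definitions given in the abstract, not the authors' code: the function names `enumerate_motifs` and `classify` are hypothetical, and the standard 20-letter amino acid alphabet is assumed.

```python
from itertools import combinations, product

# Assumption: the standard 20 amino acids, one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def enumerate_motifs(length=6):
    """Yield every motif of the given length built from exactly two amino acids."""
    for a, b in combinations(AMINO_ACIDS, 2):
        for letters in product((a, b), repeat=length):
            # Skip the two homopolymers (all a, all b).
            if a in letters and b in letters:
                yield "".join(letters)


def classify(motif):
    """Return the group (1, 2, or 3) of a two-letter, six-residue motif.

    Group 1: one amino acid occurs once, at a terminal position.
    Group 2: one amino acid occurs once, inside the pattern.
    Group 3: every other arrangement.
    """
    for residue in set(motif):
        if motif.count(residue) == 1:
            position = motif.index(residue)
            return 1 if position in (0, len(motif) - 1) else 2
    return 3


motifs = list(enumerate_motifs())
group_sizes = {1: 0, 2: 0, 3: 0}
for motif in motifs:
    group_sizes[classify(motif)] += 1

print(len(motifs), group_sizes)
```

Under these assumptions the enumeration yields 190 amino acid pairs times 62 arrangements, which agrees with the 11780 motifs stated in the abstract.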
Year: 2015 PMID: 26114101 PMCID: PMC4465679 DOI: 10.1155/2015/208346
Source DB: PubMed Journal: Biomed Res Int Impact factor: 3.411
Averaged correlation coefficients (in percentage terms) between numbers of proteins where a simple motif, six residues long, from the whole set of 11780 motifs appears at least once in 9 kingdoms of Eukaryota and 5 phyla of Bacteria.
| Metazoa (17) | Viridiplantae (5) | Stramenopiles (1) | Choanoflagellida (1) | Euglenozoa (4) | Alveolata (6) | Amoebozoa (2) | Diplomonadida (3) | Fungi (58) | Acidobacteria (1) | Actinobacteria (14) | Proteobacteria (8) | Bacteroidetes (2) | Chloroflexi (1) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 65* | 49 | 46 | 29 | 47 | 22 | 30 | 35 | 62* | 27 | 25 | 27 | 26 | 24 | Metazoa (17) |
| 49 | 61* | 60* | 28 | 47 | 14 | 13 | 27 | 54* | 41 | 43 | 42 | 21 | 22 | Viridiplantae (5) |
| 46 | 60* | — | 20 | 52* | 8 | 8 | 16 | 41 | 49 | 48 | 44 | 23 | 17 | Stramenopiles (1) |
| 29 | 28 | 20 | — | 30 | 7 | 13 | 23 | 32 | 20 | 21 | 22 | 12 | 15 | Choanoflagellida (1) |
| 47 | 47 | 52* | 30 | 47 | 10 | 17 | 27 | 45 | 35 | 39 | 35 | 24 | 22 | Euglenozoa (4) |
| 22 | 14 | 8 | 7 | 10 | 69* | 37 | 8 | 25 | −1 | −1 | 0 | 13 | 3 | Alveolata (6) |
| 30 | 13 | 8 | 13 | 17 | 37 | 90** | 13 | 35 | −1 | −1 | −1 | 5 | 3 | Amoebozoa (2) |
| 35 | 27 | 16 | 23 | 27 | 8 | 13 | 68* | 38 | 21 | 22 | 22 | 25 | 27 | Diplomonadida (3) |
| 62* | 54* | 41 | 32 | 45 | 25 | 35 | 38 | 71* | 28 | 26 | 28 | 21 | 21 | Fungi (58) |
| 27 | 41 | 49 | 20 | 35 | −1 | −1 | 21 | 28 | — | 70* | 67* | 32 | 39 | Acidobacteria (1) |
| 25 | 43 | 48 | 21 | 39 | −1 | −1 | 22 | 26 | 70* | 87** | 74* | 29 | 38 | Actinobacteria (14) |
| 27 | 42 | 44 | 22 | 35 | 0 | −1 | 22 | 28 | 67* | 74* | 72* | 29 | 39 | Proteobacteria (8) |
| 26 | 21 | 23 | 12 | 24 | 13 | 5 | 25 | 21 | 32 | 29 | 29 | 39 | 39 | Bacteroidetes (2) |
| 24 | 22 | 17 | 15 | 22 | 3 | 3 | 27 | 21 | 39 | 38 | 39 | 39 | — | Chloroflexi (1) |
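Each entry in the table compares two vectors of per-motif protein counts, one vector per proteome. The source does not name the coefficient used; the sketch below assumes the Pearson coefficient, and the two count vectors are made-up illustration data rather than values from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length with at least 2 items")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-motif protein counts for two proteomes, same motif order.
proteome_a = [120, 80, 15, 0, 42]
proteome_b = [100, 95, 10, 3, 30]

r = pearson(proteome_a, proteome_b)
print(round(100 * r))  # coefficient in percentage terms, as in the table
```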
Averaged correlation coefficients (in percentage terms) between numbers of proteins where a simple motif, six residues long, appears at least once in 17 animal proteomes (kingdom Metazoa).
| Phylum | Proteome | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Chordata | 1 | | 95** | 95** | 89** | 80** | 70* | 49 | 57* | 44 | 60* | 68* | 38 | 50* | 68* |
| | 2 | 95** | | 97** | 90** | 82** | 75** | 56* | 64* | 52* | 66* | 71* | 44 | 57* | 70* |
| | 3 | 95** | 97** | | 90** | 83** | 72* | 50* | 61* | 46 | 61* | 71* | 43 | 53* | 70* |
| | 4 | 89** | 90** | 90** | | 86** | 71* | 48 | 61* | 46 | 59* | 74* | 42 | 55* | 73* |
| | 5 | 80** | 82** | 83** | 86** | | 72* | 48 | 68* | 49 | 60* | 80** | 48 | 60* | 77** |
| | 6 | 70* | 75** | 72* | 71* | 72* | | 45 | 54* | 43 | 56* | 61* | 50* | 52* | 59* |
| Arthropoda | 7 | 49 | 56* | 50* | 48 | 48 | 45 | | 84** | 91** | 87** | 53* | 47 | 71* | 42 |
| | 8 | 57* | 64* | 61* | 61* | 68* | 54* | 84** | | 86** | 87** | 73* | 54* | 76** | 57* |
| | 9 | 44 | 52* | 46 | 46 | 49 | 43 | 91** | 86** | | 91** | 53* | 51* | 75** | 40 |
| | 10 | 60* | 66* | 61* | 59* | 60* | 56* | 87** | 87** | 91** | | 62* | 48 | 72* | 50* |
| Nematoda | 11 | 68* | 71* | 71* | 74* | 80** | 61* | 53* | 73* | 53* | 62* | | 52* | 64* | 71* |
| | 12 | 38 | 44 | 43 | 42 | 48 | 50* | 47 | 54* | 51* | 48 | 52* | | 68* | 46 |
| | 13 | 50* | 57* | 53* | 55* | 60* | 52* | 71* | 76** | 75** | 72* | 64* | 68* | | 53* |
| Cnidaria | 14 | 68* | 70* | 70* | 73* | 77** | 59* | 42 | 57* | 40 | 50* | 71* | 46 | 53* | |
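The kingdom-level tables report averaged coefficients, which suggests that pairwise values such as those above are averaged over all proteome pairs drawn from two groups. The sketch below illustrates that idea for the six Chordata rows and their four Arthropoda entries, with the values copied from the table and the significance asterisks dropped. That the authors averaged in exactly this way is an assumption, and `block_mean` is a hypothetical helper name.

```python
# Chordata (rows) versus Arthropoda (columns), values copied from the table.
chordata_vs_arthropoda = [
    [49, 57, 44, 60],
    [56, 64, 52, 66],
    [50, 61, 46, 61],
    [48, 61, 46, 59],
    [48, 68, 49, 60],
    [45, 54, 43, 56],
]


def block_mean(block):
    """Arithmetic mean of all entries in a rectangular block of coefficients."""
    values = [value for row in block for value in row]
    if not values:
        raise ValueError("block is empty")
    return sum(values) / len(values)


mean_coefficient = block_mean(chordata_vs_arthropoda)
print(round(mean_coefficient, 2))
```

A within-group average would use only the off-diagonal entries of a square block, which is consistent with single-proteome kingdoms having no value on the diagonal of the kingdom-level tables.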
11780 motifs that frequently occur in 123 proteomes.
| All 11780 motifs | | The first group | | The second group | | The third group | |
|---|---|---|---|---|---|---|---|
| EEEEED | 6744 | EEEEED | 6744 | EDEEEE | 4248 | APAPAP | 3543 |
| QQQQQP | 6300 | QQQQQP | 6300 | STSSSS | 4166 | DDEEEE | 3464 |
| DEEEEE | 6165 | DEEEEE | 6165 | NNNNSN | 4030 | SGSGSG | 3423 |
| TSSSSS | 6135 | TSSSSS | 6135 | EEEEDE | 3995 | PAPAPA | 3392 |
| SGGGGG | 6117 | SGGGGG | 6117 | NSNNNN | 3992 | EEEEDD | 3292 |
| AAAAAG | 5863 | AAAAAG | 5863 | EEDEEE | 3959 | GSGSGS | 3240 |
| PSSSSS | 5813 | PSSSSS | 5813 | SSSSTS | 3953 | DEDEDE | 3127 |
| NNNNNS | 5811 | NNNNNS | 5811 | GGGGSG | 3934 | EDEDED | 3045 |
| QQQQQH | 5798 | QQQQQH | 5798 | AAVAAA | 3768 | RSRSRS | 2983 |
| SSSSST | 5780 | SSSSST | 5780 | AAAVAA | 3758 | DDDDEE | 2953 |
| DDDDDE | 5611 | DDDDDE | 5611 | GSGGGG | 3690 | EEEDDD | 2845 |
| SNNNNN | 5585 | SNNNNN | 5585 | SSTSSS | 3660 | DDDEEE | 2822 |
| ASSSSS | 5581 | ASSSSS | 5581 | SSSTSS | 3652 | RGRGRG | 2817 |
| SAAAAA | 5405 | SAAAAA | 5405 | EEEDEE | 3627 | EEDDDD | 2754 |
| APPPPP | 5325 | APPPPP | 5325 | AAAAVA | 3616 | AAAAGG | 2743 |
| AAAAAS | 5322 | AAAAAS | 5322 | GGGSGG | 3556 | EDEDEE | 2651 |
| AAAAAV | 5277 | AAAAAV | 5277 | SPSSSS | 3459 | DDEDED | 2570 |
| GGGGGS | 5118 | GGGGGS | 5118 | NNSNNN | 3429 | RGGRGG | 2537 |
| GGGGGA | 4862 | GGGGGA | 4862 | NNNSNN | 3418 | DEDEDD | 2489 |
| PQQQQQ | 4819 | PQQQQQ | 4819 | AVAAAA | 3391 | SSSSTT | 2448 |
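A ranking like the one above rests on a per-motif tally. The abstract counts a motif once per protein in which it appears; whether the figures in this table follow the same convention is not stated here. The sketch below shows that counting rule on a made-up toy proteome; `proteins_per_motif` and the sequences are hypothetical.

```python
from collections import Counter


def proteins_per_motif(proteome, motifs, length=6):
    """Count, for each motif, the proteins that contain it at least once."""
    wanted = set(motifs)
    counts = Counter()
    for sequence in proteome:
        # A set of windows ensures each motif is counted once per protein.
        windows = {
            sequence[i:i + length]
            for i in range(len(sequence) - length + 1)
        }
        counts.update(windows & wanted)
    return counts


# Toy data for illustration only.
toy_proteome = ["MEEEEEDKL", "AEEEEEDEEEEED", "MSGSGSGSGA", "MKLV"]
toy_motifs = ["EEEEED", "SGSGSG", "GSGSGS", "QQQQQP"]

counts = proteins_per_motif(toy_proteome, toy_motifs)
for motif, n_proteins in counts.most_common():
    print(motif, n_proteins)
```

The second toy sequence contains `EEEEED` twice, yet it contributes a single count, which is the "at least once" rule from the abstract.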
Occurrence of 11780 motifs from the three groups in 9 kingdoms of Eukaryota and for 5 phyla of Bacteria in percentage terms.
| Kingdom | First group <x> | First group error | Second group <x> | Second group error | Third group <x> | Third group error |
|---|---|---|---|---|---|---|
| Metazoa (17) | 21 | 3 | 29 | 1 | 50 | 4 |
| Viridiplantae (5) | 21 | 4 | 28 | 2 | 51 | 5 |
| Stramenopiles (1) | 28 | — | 32 | — | 41 | — |
| Choanoflagellida (1) | 18 | — | 27 | — | 55 | — |
| Euglenozoa (4) | 22 | 3 | 29 | 2 | 49 | 4 |
| Alveolata (6) | 23 | 4 | 29 | 1 | 48 | 5 |
| Amoebozoa (2) | 31 | 1 | 31 | 0 | 38 | 2 |
| Diplomonadida (3) | 11 | 1 | 24 | 1 | 65 | 2 |
| Fungi (58) | 18 | 3 | 28 | 1 | 53 | 4 |
| Bacteria (25) | 10 | 1 | 27 | 2 | 63 | 3 |
| All 11780 motifs | 6 | 0 | 13 | 0 | 81 | 0 |
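The last row of the table is consistent with the share each group holds in the motif set itself, that is, with every motif weighted equally. The sketch below checks this using the group sizes given in the captions of the correlation tables (760, 1520, and 9500 motifs); the helper name `shares` is hypothetical.

```python
# Group sizes as stated in the captions of the correlation tables.
GROUP_SIZES = {"first": 760, "second": 1520, "third": 9500}


def shares(counts_by_group):
    """Convert per-group counts into percentages of their total."""
    total = sum(counts_by_group.values())
    if total == 0:
        raise ValueError("total count is zero")
    return {group: 100 * n / total for group, n in counts_by_group.items()}


baseline = {group: round(p) for group, p in shares(GROUP_SIZES).items()}
print(baseline)
```

The same helper can be applied to per-group protein counts for a single proteome, which allows a row such as Metazoa (21, 29, 50) to be read against this equal-weight baseline.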
Figure 1: Statistics of occurrence of motifs, six residues long, consisting of two amino acids in the three groups for 3 kingdoms of Eukaryota and for 5 phyla of Bacteria in percentage terms: (a) the Metazoa kingdom (17 proteomes), (b) the Amoebozoa kingdom (2 proteomes), (c) the Diplomonadida kingdom (3 proteomes), and (d) 26 bacterial proteomes. For each kingdom, the most frequently occurring motif in each group is presented.
Averaged correlation coefficients (in percentage terms) between numbers of proteins where a simple motif, six residues long, from the first group (760 motifs) appears at least once in 9 kingdoms of Eukaryota and 5 phyla of Bacteria.
| Metazoa (17) | Viridiplantae (5) | Stramenopiles (1) | Choanoflagellida (1) | Euglenozoa (4) | Alveolata (6) | Amoebozoa (2) | Diplomonadida (3) | Fungi (58) | Acidobacteria (1) | Actinobacteria (14) | Proteobacteria (8) | Bacteroidetes (2) | Chloroflexi (1) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 64* | 49 | 45 | 61* | 50* | 13 | 26 | 37 | 65* | 34 | 29 | 35 | 29 | 32 | Metazoa (17) |
| 49 | 62* | 61* | 60* | 48 | 6 | 7 | 28 | 54* | 48 | 47 | 52* | 22 | 22 | Viridiplantae (5) |
| 45 | 61* | — | 45 | 53* | 0 | 1 | 14 | 40 | 68* | 64* | 63* | 28 | 20 | Stramenopiles (1) |
| 61* | 60* | 45 | — | 60* | 7 | 28 | 35 | 67* | 42 | 46 | 47 | 26 | 29 | Choanoflagellida (1) |
| 50* | 48 | 53* | 60* | 51* | 3 | 16 | 30 | 49 | 47 | 49 | 48 | 31 | 32 | Euglenozoa (4) |
| 13 | 6 | 0 | 7 | 3 | 74* | 29 | 4 | 16 | −6 | −6 | −5 | 8 | −1 | Alveolata (6) |
| 26 | 7 | 1 | 28 | 16 | 29 | 92** | 12 | 33 | −5 | −5 | −5 | 1 | −1 | Amoebozoa (2) |
| 37 | 28 | 14 | 35 | 30 | 4 | 12 | 67* | 40 | 17 | 16 | 19 | 29 | 31 | Diplomonadida (3) |
| 65* | 54* | 40 | 67* | 49 | 16 | 33 | 40 | 74* | 29 | 25 | 32 | 20 | 22 | Fungi (58) |
| 34 | 48 | 68* | 42 | 47 | −6 | −5 | 17 | 29 | — | 75* | 73* | 40 | 38 | Acidobacteria (1) |
| 29 | 47 | 64* | 46 | 49 | −6 | −5 | 16 | 25 | 75* | 90** | 76** | 30 | 28 | Actinobacteria (14) |
| 35 | 52* | 63* | 47 | 48 | −5 | −5 | 19 | 32 | 73* | 76** | 74* | 33 | 33 | Proteobacteria (8) |
| 29 | 22 | 28 | 26 | 31 | 8 | 1 | 29 | 20 | 40 | 30 | 33 | 52* | 57* | Bacteroidetes (2) |
| 32 | 22 | 20 | 29 | 32 | −1 | −1 | 31 | 22 | 38 | 28 | 33 | 57* | — | Chloroflexi (1) |
Averaged correlation coefficients (in percentage terms) between numbers of proteins where a simple motif, six residues long, from the second group (1520 motifs) appears at least once in 9 kingdoms of Eukaryota and 5 phyla of Bacteria.
| Metazoa (17) | Viridiplantae (5) | Stramenopiles (1) | Choanoflagellida (1) | Euglenozoa (4) | Alveolata (6) | Amoebozoa (2) | Diplomonadida (3) | Fungi (58) | Acidobacteria (1) | Actinobacteria (14) | Proteobacteria (8) | Bacteroidetes (2) | Chloroflexi (1) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 64* | 45 | 42 | 54* | 47 | 20 | 23 | 39 | 63* | 26 | 24 | 27 | 26 | 27 | Metazoa (17) |
| 45 | 64* | 62* | 58* | 50* | 10 | 5 | 28 | 52* | 51* | 52* | 52* | 22 | 24 | Viridiplantae (5) |
| 42 | 62* | — | 46 | 55* | 6 | 2 | 16 | 40 | 62* | 57* | 56* | 22 | 23 | Stramenopiles (1) |
| 54* | 58* | 46 | — | 55* | 11 | 23 | 32 | 61* | 44 | 51* | 52* | 28 | 29 | Choanoflagellida (1) |
| 47 | 50* | 55* | 55* | 50 | 10 | 13 | 32 | 48 | 41 | 46 | 45 | 27 | 26 | Euglenozoa (4) |
| 20 | 10 | 6 | 11 | 10 | 66* | 40 | 6 | 23 | −3 | −4 | −3 | 13 | −1 | Alveolata (6) |
| 23 | 5 | 2 | 23 | 13 | 40 | 90** | 10 | 30 | −4 | −4 | −4 | 0 | −3 | Amoebozoa (2) |
| 39 | 28 | 16 | 32 | 32 | 6 | 10 | 72* | 42 | 17 | 19 | 20 | 25 | 31 | Diplomonadida (3) |
| 63* | 52* | 40 | 61* | 48 | 23 | 30 | 42 | 71* | 28 | 26 | 29 | 19 | 19 | Fungi (58) |
| 26 | 51* | 62* | 44 | 41 | −3 | −4 | 17 | 28 | — | 70* | 69* | 33 | 40 | Acidobacteria (1) |
| 24 | 52* | 57* | 51* | 46 | −4 | −4 | 19 | 26 | 70* | 92** | 80** | 30 | 35 | Actinobacteria (14) |
| 27 | 52* | 56* | 52* | 45 | −3 | −4 | 20 | 29 | 69* | 80** | 77** | 32 | 38 | Proteobacteria (8) |
| 26 | 22 | 22 | 28 | 27 | 13 | 0 | 25 | 19 | 33 | 30 | 32 | 47 | 56* | Bacteroidetes (2) |
| 27 | 24 | 23 | 29 | 26 | −1 | −3 | 31 | 19 | 40 | 35 | 38 | 56* | — | Chloroflexi (1) |
Averaged correlation coefficients (in percentage terms) between numbers of proteins where a simple motif, six residues long, from the third group (9500 motifs) appears at least once in 9 kingdoms of Eukaryota and 5 phyla of Bacteria.
| Metazoa (17) | Viridiplantae (5) | Stramenopiles (1) | Choanoflagellida (1) | Euglenozoa (4) | Alveolata (6) | Amoebozoa (2) | Diplomonadida (3) | Fungi (58) | Acidobacteria (1) | Actinobacteria (14) | Proteobacteria (8) | Bacteroidetes (2) | Chloroflexi (1) | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 59* | 44 | 44 | 23 | 37 | 25 | 31 | 31 | 56* | 25 | 26 | 25 | 21 | 22 | Metazoa (17) |
| 44 | 58* | 54* | 22 | 41 | 17 | 15 | 26 | 53* | 36 | 40 | 38 | 17 | 22 | Viridiplantae (5) |
| 44 | 54* | — | 15 | 47 | 11 | 11 | 16 | 41 | 43 | 45 | 39 | 18 | 16 | Stramenopiles (1) |
| 23 | 22 | 15 | — | 24 | 6 | 10 | 21 | 25 | 15 | 16 | 16 | 8 | 12 | Choanoflagellida (1) |
| 37 | 41 | 47 | 24 | 40 | 10 | 14 | 23 | 38 | 30 | 34 | 29 | 17 | 19 | Euglenozoa (4) |
| 25 | 17 | 11 | 6 | 10 | 61* | 38 | 8 | 28 | 0 | −1 | 0 | 13 | 5 | Alveolata (6) |
| 31 | 15 | 11 | 10 | 14 | 38 | 88** | 13 | 36 | 0 | 1 | 0 | 8 | 6 | Amoebozoa (2) |
| 31 | 26 | 16 | 21 | 23 | 8 | 13 | 67* | 33 | 22 | 24 | 24 | 21 | 24 | Diplomonadida (3) |
| 56* | 53* | 41 | 25 | 38 | 28 | 36 | 33 | 68* | 26 | 27 | 27 | 18 | 21 | Fungi (58) |
| 25 | 36 | 43 | 15 | 30 | 0 | 0 | 22 | 26 | — | 69* | 65* | 28 | 38 | Acidobacteria (1) |
| 26 | 40 | 45 | 16 | 34 | −1 | 1 | 24 | 27 | 69* | 84** | 73* | 28 | 40 | Actinobacteria (14) |
| 25 | 38 | 39 | 16 | 29 | 0 | 0 | 24 | 27 | 65* | 73* | 71* | 26 | 40 | Proteobacteria (8) |
| 21 | 17 | 18 | 8 | 17 | 13 | 8 | 21 | 18 | 28 | 28 | 26 | 30 | 30 | Bacteroidetes (2) |
| 22 | 22 | 16 | 12 | 19 | 5 | 6 | 24 | 21 | 38 | 40 | 40 | 30 | — | Chloroflexi (1) |