Daniel Restrepo-Montoya, Robert Brueggeman, Phillip E McClean, Juan M Osorno.
Abstract
BACKGROUND: In plants, the plasma membrane is enclosed by the cell wall and anchors RLK and RLP proteins, which play a fundamental role in the perception of developmental and environmental cues and are crucial in plant development and immunity. These plasma membrane receptors belong to large gene/protein families that are not easily classified computationally. This detailed analysis of these plasma membrane proteins brings a new source of information to the legume genetics, physiology, and breeding research communities.
Keywords: Dicots; Legumes; Model plants; Plasma membrane receptors; Resistance genes/proteins
Year: 2020 PMID: 32620079 PMCID: PMC7333395 DOI: 10.1186/s12864-020-06844-z
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Performance evaluation
| Metric | RLK | RLP |
|---|---|---|
| Sensitivity | 0.88 | 0.85 |
| Specificity | 1 | 1 |
| Matthews correlation coefficient | 0.91 | 0.91 |
Non-redundant datasets used for the performance evaluation: RLK, n = 63; RLP, n = 27; and other R genes, n = 96. Additional file 1: Table S1 lists the experimentally validated proteins used for this evaluation, including their prediction condition (RLK, RLP, or cytoplasmic resistance protein), and Additional file 2: Table S2 provides a performance evaluation summary.
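As a reminder of how the three reported metrics are defined, the following minimal sketch computes sensitivity, specificity, and the Matthews correlation coefficient from a binary confusion matrix. The counts in the example are hypothetical and are not taken from the paper; the function names are illustrative only.

```python
import math


def sensitivity(tp: int, fn: int) -> float:
    """True positive rate: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """True negative rate: TN / (TN + FP)."""
    return tn / (tn + fp)


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient; returns 0.0 if the denominator is zero."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical confusion-matrix counts (assumption, NOT the paper's data).
tp, fn, tn, fp = 8, 2, 10, 0

print(f"sensitivity = {sensitivity(tp, fn):.2f}")
print(f"specificity = {specificity(tn, fp):.2f}")
print(f"MCC         = {mcc(tp, tn, fp, fn):.2f}")
```

A specificity of 1 means no negative-set protein was predicted as a receptor, so any loss in the Matthews correlation coefficient comes only from missed true receptors (false negatives).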
Fig. 1 Computational strategy followed to identify RLK and RLP
Summary of total number of RLK and RLP identified across legumes/non-legumes
| Species | Total proteins reported | Signal peptide (P/A) | Signal peptide: number of proteins | Signal peptide: % | Transmembrane helices: number of proteins | Transmembrane helices: % | RLK^a | % RLK | RLP^a | % RLP |
|---|---|---|---|---|---|---|---|---|---|---|
| C. cajan | 48,331 | P | 2679 | 5.5 | 1031 | 2.1 | 197 | | 62 | |
| | | A | 45,652 | 94.4 | 5760 | 11.9 | 253 | | 80 | |
| | | total | | | | | 450 | 0.9 | 142 | 0.3 |
| G. max | 88,647 | P | 8125 | 9.1 | 3934 | 4.4 | 1182 | | 282 | |
| | | A | 80,522 | 90.8 | 15,459 | 17.4 | 682 | | 186 | |
| | | total | | | | | 1864 | 2.1 | 468 | 0.5 |
| M. truncatula | 62,319 | P | 6251 | 10.0 | 2961 | 4.7 | 647 | | 196 | |
| | | A | 56,068 | 89.9 | 10,383 | 16.6 | 413 | | 167 | |
| | | total | | | | | 1060 | 1.7 | 363 | 0.6 |
| P. vulgaris | 36,995 | P | 4120 | 11.1 | 1895 | 5.1 | 571 | | 138 | |
| | | A | 32,875 | 88.8 | 6349 | 17.1 | 271 | | 79 | |
| | | total | | | | | 842 | 2.3 | 217 | 0.6 |
| V. angularis | 37,769 | P | 3570 | 9.4 | 1681 | 4.4 | 557 | | 124 | |
| | | A | 34,199 | 90.5 | 6364 | 16.8 | 278 | | 91 | |
| | | total | | | | | 835 | 2.2 | 215 | 0.6 |
| V. radiata | 35,143 | P | 3450 | 9.8 | 1584 | 4.5 | 505 | | 142 | |
| | | A | 31,693 | 90.1 | 5934 | 16.8 | 265 | | 99 | |
| | | total | | | | | 770 | 2.2 | 241 | 0.7 |
| V. unguiculata | 42,287 | P | 4698 | 11.1 | 2105 | 4.9 | 660 | | 190 | |
| | | A | 37,589 | 88.9 | 7962 | 18.8 | 332 | | 104 | |
| | | total | | | | | 992 | 2.3 | 294 | 0.7 |
| V. vinifera | 26,346 | P | 2043 | 7.7 | 842 | 3.2 | 269 | | 99 | |
| | | A | 24,303 | 92.2 | 4980 | 18.9 | 174 | | 73 | |
| | | total | | | | | 443 | 1.7 | 172 | 0.6 |
| A. thaliana | 35,386 | P | 4088 | 11.5 | 1935 | 5.4 | 408 | | 121 | |
| | | A | 31,298 | 88.4 | 5784 | 16.3 | 147 | | 51 | |
| | | total | | | | | 555 | 1.6 | 172 | 0.5 |
| S. lycopersicum | 34,725 | P | 3258 | 9.3 | 1480 | 4.2 | 316 | | 107 | |
| | | A | 31,467 | 90.6 | 5727 | 16.4 | 160 | | 54 | |
| | | total | | | | | 476 | 1.4 | 161 | 0.5 |
For each species, the results are separated by the presence (“P”) or absence (“A”) of a signal peptide and follow the logic flow presented in Fig. 1. ^a Non-redundant data reported. For the RLK-nonRD, the results per species are: A. thaliana: 48 proteins (8.6%), C. cajan: 61 proteins (13.6%), G. max: 223 proteins (11.9%), M. truncatula: 194 proteins (18.3%), P. vulgaris: 124 proteins (14.7%), S. lycopersicum: 83 proteins (17.4%), V. angularis: 122 proteins (14.6%), V. radiata: 113 proteins (14.7%), V. unguiculata: 158 proteins (15.9%), and V. vinifera: 59 proteins (13.3%). RLK-nonRD IDs are reported in Additional file 3: Table S3. The kinome (the total set of proteins with a kinase domain in a genome) was calculated per species; the results, given as number of proteins (p.) and the percentage of the kinome classified as RLK, are CC: 1268 p. (35.5% RLK), GM: 4497 p. (41.4% RLK), MT: 2281 p. (46.6% RLK), PV: 1888 p. (44.7% RLK), VA: 1898 p. (44% RLK), VR: 1772 p. (43.5% RLK), VU: 2090 p. (47.5% RLK), VV: 1064 p. (41.7% RLK), AT: 1431 p. (38.9% RLK), and SL: 1194 p. (39.9% RLK).
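The percentages in this table and its note appear to be simple ratios. The sketch below reproduces the C. cajan values, assuming that % RLK is taken over the total proteins reported, the RLK-nonRD percentage over the RLK total, and the kinome percentage as RLK total over kinome size. These denominators are an inference from the numbers, not a statement from the authors, and other species may differ in the last digit because of rounding.

```python
def pct(part: int, whole: int) -> float:
    """Percentage of `part` in `whole`, rounded to one decimal place."""
    return round(100.0 * part / whole, 1)


# Values for C. cajan as reported in the table and its note.
cc_total_proteins = 48331  # total proteins reported
cc_rlk = 450               # non-redundant RLK (P + A)
cc_rlk_nonrd = 61          # RLK-nonRD proteins
cc_kinome = 1268           # proteins with a kinase domain

rlk_of_proteome = pct(cc_rlk, cc_total_proteins)  # assumed denominator: all proteins
nonrd_of_rlk = pct(cc_rlk_nonrd, cc_rlk)          # assumed denominator: RLK total
rlk_of_kinome = pct(cc_rlk, cc_kinome)            # assumed denominator: kinome size

print(rlk_of_proteome, nonrd_of_rlk, rlk_of_kinome)
```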
Receptor-like kinases identified by extracellular domains across the species
| Domain class | Domain combinations | CC | GM | MT | PV | VA | VR | VU | VV | AT | SL |
|---|---|---|---|---|---|---|---|---|---|---|---|
| LRR | lrr | 134 | 579 | 324 | 239 | 254 | 249 | 301 | 136 | 180 | 198 |
| G-lectin: combination of ectodomains | s-locus | 2 | 1 | 1 | 1 | 1 | 1 | 3 | 2 | 0 | 0 |
| | b-lectin | 7 | 20 | 25 | 12 | 12 | 14 | 15 | 7 | 2 | 7 |
| | b-lectin/pan | 2 | 9 | 7 | 12 | 2 | 5 | 17 | 2 | 1 | 5 |
| | s-locus/pan | 5 | 10 | 4 | 5 | 1 | 0 | 7 | 18 | 2 | 0 |
| | b-lectin/s-locus | 11 | 24 | 14 | 15 | 14 | 18 | 15 | 7 | 2 | 10 |
| | b-lectin/s-locus/pan | 31 | 146 | 131 | 41 | 53 | 44 | 96 | 12 | 33 | 42 |
| L-Lectin | l-lectin | 24 | 66 | 46 | 38 | 35 | 36 | 42 | 20 | 44 | 22 |
| C-lectin | c-lectin | 1 | 4 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
| Lectin | lysM | 7 | 27 | 16 | 14 | 11 | 12 | 13 | 5 | 5 | 8 |
| Lectin (Feronia) | malectin | 29 | 99 | 54 | 82 | 58 | 50 | 60 | 29 | 36 | 22 |
| Thaumatin (Osmotin) | pr5k | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 |
| WAK | wak | 11 | 66 | 33 | 41 | 45 | 39 | 46 | 14 | 27 | 17 |
| | egf | 1 | 4 | 0 | 1 | 0 | 0 | 1 | 2 | 0 | 2 |
| | wak/egf | 5 | 10 | 16 | 6 | 3 | 7 | 8 | 7 | 4 | 7 |
| DUF26 recently renamed | stress_antifung | 28 | 173 | 66 | 70 | 57 | 58 | 90 | 22 | 45 | 15 |
| Classically related to G-lectin | pan | 5 | 10 | 1 | 2 | 2 | 0 | 1 | 10 | 0 | 0 |
| Combination of different domain ectodomains identified | lrr/malectin | 12 | 63 | 66 | 32 | 30 | 19 | 28 | 26 | 47 | 7 |
| | pan/wak | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| | s-locus/wak | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 1 | 0 | 0 |
| | b-lectin/pr5k | 0 | 1 | 3 | 0 | 1 | 0 | 1 | 0 | 0 | 0 |
| | b-lectin/s-locus/wak | 2 | 0 | 1 | 2 | 3 | 0 | 3 | 0 | 0 | 0 |
| | b-lectin/s-locus/pan/wak | 0 | 8 | 2 | 2 | 2 | 2 | 5 | 1 | 0 | 1 |
| | pan/s-locus/wak | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| RLK - pkinase | rlk - non-target ectodomain | 4 | 11 | 6 | 5 | 3 | 5 | 6 | 18 | 8 | 5 |
| Combination of ectodomains Identified RLCK with/without ectodomains | rlk - not ectodomains | 30 | 180 | 74 | 87 | 93 | 72 | 86 | 80 | 25 | 28 |
| | rlck extra domain | 8 | 7 | 6 | 6 | 10 | 3 | 5 | 5 | 5 | 9 |
| | rlck only pkinase | 91 | 346 | 163 | 128 | 144 | 133 | 142 | 16 | 86 | 70 |
For each species, the results were merged across the presence (“P”) and absence (“A”) of a signal peptide. All possible domain combinations were explored and are reported in the “Domain combinations” column (proteins reported are non-redundant). A. thaliana: AT, C. cajan: CC, G. max: GM, M. truncatula: MT, P. vulgaris: PV, S. lycopersicum: SL, V. angularis: VA, V. radiata: VR, V. unguiculata: VU, and V. vinifera: VV (Table A4). RLCK: only a kinase domain was identified. All proteins reported in this table have at least one transmembrane helix. Extra: proteins with or without a signal peptide that have at least one transmembrane helix, a Pkinase domain, and other extracellular/intracellular domains different from LRR, L/C/G-Lectin, LysM, Pr5k-Thaumatin, WAK, Malectin, EGF, or Stress-Antifung. Only these target domains were considered for the combination identification analysis, although other domains, reported in Table A7 as “non-target” domains, could be present.
Receptor-like proteins identified by extracellular domains across the species
| Domain details | Domain combinations | CC | GM | MT | PV | VA | VR | VU | VV | AT | SL |
|---|---|---|---|---|---|---|---|---|---|---|---|
| LRR | lrr | 69 | 247 | 225 | 107 | 104 | 138 | 171 | 78 | 71 | 67 |
| G-lectin: combination of ectodomains identified | s-locus | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 |
| | b-lectin | 1 | 5 | 2 | 5 | 3 | 2 | 5 | 8 | 4 | 1 |
| | s-locus/pan | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 |
| | b-lectin/s-locus | 0 | 1 | 1 | 0 | 0 | 0 | 0 | 2 | 0 | 1 |
| | b-lectin/s-locus/pan | 2 | 3 | 3 | 3 | 1 | 2 | 5 | 4 | 2 | 1 |
| L-lectin | l-lectin | 31 | 88 | 54 | 35 | 39 | 33 | 34 | 27 | 34 | 36 |
| Lectin | lysM | 3 | 7 | 8 | 5 | 3 | 5 | 4 | 3 | 2 | 4 |
| Lectin (Feronia) | malectin | 5 | 12 | 7 | 3 | 5 | 3 | 5 | 8 | 7 | 3 |
| Thaumatin (Osmotin) | pr5k | 7 | 28 | 19 | 13 | 16 | 17 | 20 | 7 | 15 | 16 |
| WAK | wak | 5 | 14 | 16 | 8 | 9 | 12 | 11 | 10 | 8 | 12 |
| | egf | 4 | 17 | 5 | 6 | 8 | 5 | 8 | 3 | 5 | 3 |
| | wak/egf | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 2 | 0 | 0 |
| DUF26 recently renamed | stress_antifung | 12 | 34 | 15 | 22 | 22 | 19 | 24 | 9 | 23 | 14 |
| Classically related to G-lectin | pan | 1 | 3 | 3 | 1 | 1 | 1 | 3 | 3 | 0 | 0 |
| Combination of target ectodomains | lrr/malectin | 1 | 9 | 5 | 9 | 4 | 4 | 4 | 5 | 1 | 3 |
For each species, the results were distinguished by the presence (“P”) or absence (“A”) of a signal peptide. All possible domain combinations were explored and are reported in the “Domain combinations” column. Proteins reported are non-redundant. A. thaliana: AT, C. cajan: CC, G. max: GM, M. truncatula: MT, P. vulgaris: PV, S. lycopersicum: SL, V. angularis: VA, V. radiata: VR, V. unguiculata: VU, and V. vinifera: VV. All proteins reported in this table have at least one transmembrane helix. Other domains, reported in Table A8 as “non-target” domains, could be present.
Fig. 2 Summary of the extracellular domains identified in RLK/RLP. The figure summarizes the domains and the domain combinations identified. A. Classical RLK/RLP protein structure. B. Ectodomains identified that are also reported by the scientific community (Tables 1 and 2). C. Ectodomain combinations identified in RLK/RLP. In B and C, only the ectodomains are represented; in the RLK cases, all proteins must have an intracellular Pkinase.
Summary of domains present on the RLK proteins predicted
| Clan or domain | Pfam domain name ID | Species | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| CC | GM | MT | PV | VA | VR | VU | VV | AT | SL | ||
| Pkinase | Ins_P5_2-kin | x | x | ||||||||
| RIO1 | x | x | |||||||||
| Pkinase | x | x | x | x | x | x | x | x | x | x | |
| PI3_PI4_kinase | x | x | x | ||||||||
| Pkinase_Tyr | x | x | x | x | x | x | x | x | x | x | |
| Choline_kinase | x | x | x | x | x | ||||||
| ABC1 | x | x | x | x | x | x | x | x | x | x | |
| Pkinase_C | x | x | |||||||||
| PIP5K | x | x | |||||||||
| WaaY | x | ||||||||||
| APH | x | x | x | x | |||||||
| LRR | LRRNT_2 | x | x | x | x | x | x | x | x | x | x |
| LRR_8 | x | x | x | x | x | x | x | x | x | x | |
| LRR_1 | x | x | x | x | x | x | x | x | x | x | |
| LRR_4 | x | x | x | x | x | x | x | x | x | x | |
| LRR_6 | x | x | x | x | x | x | x | x | x | x | |
| LRR_2 | x | x | x | ||||||||
| LRR_5 | x | x | x | x | x | ||||||
| L-Lectin | Lectin_legB | x | x | x | x | x | x | x | x | x | x |
| C-Lectin | Lectin_C | x | x | x | x | x | x | x | x | x | x |
| G-Lectin | B_lectin | x | x | x | x | x | x | x | x | x | x |
| S_locus_glycop | x | x | x | x | x | x | x | x | x | x | |
| PAN | PAN_2 | x | x | x | x | x | x | x | x | x | x |
| PAN_1 | x | x | x | ||||||||
| LysM | LysM | x | x | x | x | x | x | x | x | x | x |
| PR5K | Thaumatin | x | x | x | x | x | |||||
| WAK | WAK_assoc | x | x | x | x | x | x | x | x | x | x |
| WAK | x | x | x | x | x | x | |||||
| GUB_WAK_bind | x | x | x | x | x | x | x | x | x | x | |
| Malectin | Malectin_like | x | x | x | x | x | x | x | x | x | x |
| Malectin | x | x | x | x | x | x | x | x | x | x | |
| EGF | EGF_CA | x | x | x | x | x | x | x | x | x | x |
| EGF | x | x | |||||||||
| EGF_3 | x | x | x | x | x | x | |||||
| Stress-antifung (DUF26) | Stress-antifung | x | x | x | x | x | x | x | x | x | x |
Present: X
Summary of domains present on the RLP proteins predicted
| Clan or Domain | Domain name | Species | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|
| CC | GM | MT | PV | VA | VR | VU | VV | AT | SL | ||
| LRR | LRR_8 | x | x | x | x | x | x | x | x | x | x |
| LRR_1 | x | x | x | x | x | x | x | x | x | x | |
| LRRNT_2 | x | x | x | x | x | x | x | x | x | x | |
| LRR_2 | x | x | |||||||||
| LRR_4 | x | x | x | x | x | x | x | x | x | x | |
| LRR_6 | x | x | x | x | x | x | x | x | x | x | |
| LRR_9 | x | ||||||||||
| LRR_5 | x | ||||||||||
| L-Lectin | Gal-bind_lectin | x | x | x | x | x | x | x | x | x | x |
| Glyco_hydro_32C | x | x | x | x | x | x | x | x | x | ||
| XET_C | x | x | x | x | x | x | x | x | x | x | |
| Lectin_legB | x | x | x | x | x | x | x | x | x | x | |
| Glyco_hydro_16 | x | x | x | x | x | x | x | x | x | x | |
| Calreticulin | x | x | x | x | x | x | x | x | x | x | |
| SPRY | x | x | x | x | x | x | x | ||||
| Alginate_lyase2 | x | x | |||||||||
| G-Lectin | B_lectin | x | x | x | x | x | x | x | x | x | x |
| S_locus_glycop | x | x | x | x | x | x | x | x | x | x | |
| PAN | PAN_2 | x | x | x | x | x | x | x | x | x | |
| PAN_1 | x | x | x | x | x | x | x | ||||
| PAN_4 | x | x | x | x | x | x | x | ||||
| LysM | LysM | x | x | x | x | x | x | x | x | x | x |
| Thaumatin (PR5K) | Thaumatin | x | x | x | x | x | x | x | x | x | x |
| WAK | WAK_assoc | x | x | x | x | x | x | x | x | x | x |
| WAK | x | x | |||||||||
| GUB_WAK_bind | x | x | x | x | x | x | x | x | x | x | |
| Malectin | Malectin_like | x | x | x | x | x | x | x | x | x | x |
| Malectin | x | x | x | x | x | ||||||
| EGF | EGF_alliinase | x | x | x | x | x | x | ||||
| cEGF | x | x | x | x | x | x | x | x | x | x | |
| EGF_CA | x | x | x | ||||||||
| EGF_2 | x | x | x | x | x | x | x | ||||
| Stress-antifung | Stress-antifung | x | x | x | x | x | x | x | x | x | x |
X: Present
Summary of domains identified in the validation dataset
| Clan or Domain | Domain or Family | RLK | RLP |
|---|---|---|---|
| PKinase | Pkinase_Tyr | Xa | |
| Pkinase | X | ||
| LRR | LRR_8 | X | X |
| LRRNT_2 | X | X | |
| LRR_1 | X | X | |
| LRR_4 | X | X | |
| LRR_6 | X | X | |
| L-Lectin | Lectin_legB | X | |
| G-Lectin | B_lectin | X | |
| PAN_2 | X | ||
| S_locus_glycop | X | ||
| LysM | LysM | X | X |
| PR5K | Thaumatin | X | |
| WAK | GUB_WAK_bind | X | |
| WAK | X | ||
| Malectin | Malectin_like | X | |
| EGF | EGF_CA | X | |
| Stress-Antifung | Stress-antifung | X | |
| DUF3403 | DUF3403 | X | |
| CL0384 | GDPD | X |
^a X: present. Source: Table A1. The experimentally validated proteins used for this evaluation were RLK (n = 63) and RLP (n = 27).
Summary of genomes
| Species | Database | File name | N. of genes | N. of proteins | N. of chr |
|---|---|---|---|---|---|
| VR | NCBI | GCF_000741045.1_Vradiata_ver6 | 34,911 | 35,143 | 11 |
| CC | NCBI | GCA_000340665.1_C.cajan_V1.0 | 23,374 | 48,331 | 11 |
| VA | NCBI | annotation release 100 | 22,276 | 37,769 | 11 |
| GM | Phytozome | gmax_275_wm82.a2.v1 | 55,589 | 88,647 | 20 |
| MT | Phytozome | Mtruncatula_285_Mt4.0v1 | 48,338 | 62,319 | 8 |
| PV | Phytozome | Pvulgaris_442_v2.1 | 27,012 | 36,995 | 11 |
| VU | Phytozome | Vunguiculata_469_v1.1 | 28,881 | 42,287 | 11 |
| AT | Phytozome | Athaliana_167_TAIR10 | 27,206 | 35,386 | 5 |
| SL | Phytozome | Slycopersicum_390_ITAG2.4 | 33,838 | 34,725 | 12 |
| VV | Phytozome | Vvinifera_145_Genoscope.12X | 23,647 | 26,346 | 19 |
Target domains for the classification of RLK/RLP
| Functional family^a | Clan or Domain | Number of domains reported in Pfam 31 |
|---|---|---|
| LRR | CL0022 | 11 |
| | LRRNT_2 | 1 |
| Pkinase | CL0016 | 35 |
| | Pkinase_C | 1 |
| L-Lectin | CL0004 | 43 |
| C-Lectin | Lectin_C | 1 |
| G-Lectin | B_lectin | 1 |
| | S_locus_glycop | 1 |
| LysM | LysM | 3 |
| PR5K | Thaumatin | 1 |
| TNFR | TNFR | 6 |
| PAN | CL0168 | 6 |
| WAK | WAK | 1 |
| | GUB_WAK | 1 |
| | WAK_assoc | 1 |
| Malectin | CL0468 | 2 |
| EGF | CL0001 | 18 |
| Stress-antifungal | Stress-antifungal | 1 |
| NB-ARC | NB-ARC | 217 |
^a Source: Pfam 31.0 [85]. The domains reported in Table 9 are not exclusively present in RLK and RLP. NB-ARC domains belong to R genes, which encode cytoplasmic proteins, and were used to exclude false-positive proteins.
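The table notes describe the classification criteria only in prose: at least one transmembrane helix, a Pkinase domain for RLK, target ectodomains for RLP, and NB-ARC used to exclude cytoplasmic R proteins. The following is a simplified sketch of such a rule set, not the authors' pipeline. The `Protein` class, the domain labels, and the treatment of kinase-only proteins as "RLCK" are assumptions made for illustration.

```python
from dataclasses import dataclass, field

# Target ectodomain labels, following the "Domain combinations" columns (illustrative).
TARGET_ECTODOMAINS = {
    "lrr", "l-lectin", "c-lectin", "b-lectin", "s-locus", "pan",
    "lysm", "pr5k", "wak", "malectin", "egf", "stress_antifung",
}


@dataclass
class Protein:
    """Hypothetical per-protein annotation summary."""
    name: str
    tm_helices: int                            # predicted transmembrane helices
    domains: set = field(default_factory=set)  # lower-case domain labels


def classify(p: Protein) -> str:
    """Assign RLK, RLCK, RLP, or other using simplified rules (an assumption)."""
    if p.tm_helices < 1:
        return "other"   # all reported receptors have at least one TM helix
    if "nb-arc" in p.domains:
        return "other"   # cytoplasmic R protein, treated as a false positive
    ecto = p.domains & TARGET_ECTODOMAINS
    if "pkinase" in p.domains:
        return "RLK" if ecto else "RLCK"  # kinase without a target ectodomain
    return "RLP" if ecto else "other"


examples = [
    Protein("p1", 1, {"pkinase", "lrr"}),
    Protein("p2", 1, {"lrr"}),
    Protein("p3", 1, {"pkinase"}),
    Protein("p4", 1, {"lrr", "nb-arc"}),
    Protein("p5", 0, {"pkinase", "lrr"}),
]
for p in examples:
    print(p.name, classify(p))
```

Signal peptide status is deliberately left out of the rule, since the tables count receptors both with ("P") and without ("A") a predicted signal peptide.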