Alexandr Pozharskiy, Valeriya Kostyukova, Gulnaz Nizamdinova, Ruslan Kalendar, Dilyara Gritsenko.
Abstract
MLO proteins are a family of transmembrane proteins in land plants that play an important role in plant immunity and host-pathogen interactions, as well as in a wide range of developmental processes. Understanding the evolutionary history of MLO proteins is important for understanding plant physiology and health. In the present work, we conducted a phylogenetic analysis of a large set of MLO protein sequences from publicly available databases, with particular emphasis on MLOs from the tomato plant and related species. As a result, 4886 protein sequences were identified and used to construct a phylogenetic tree. In comparison to previous findings, we identified nine phylogenetic clades, resolved the internal structure of clades I and II into additional clades and showed the presence of monocotyledon species in all MLO clades. We identified a set of 19 protein motifs that allowed for the identification of particular clades. Sixteen SlMLO proteins from tomato were located in the phylogenetic tree and identified in relation to homologous sequences from other Solanaceae species. The obtained results could be useful for further work on the use of MLO proteins in the study of mildew resistance in Solanaceae and other plant families.
Keywords: Mildew locus o; Solanaceae; phylogeny; seven transmembrane proteins
Year: 2022 PMID: 35736740 PMCID: PMC9229925 DOI: 10.3390/plants11121588
Source DB: PubMed Journal: Plants (Basel) ISSN: 2223-7747
Figure 1. The distribution of sequence lengths in the MLO protein dataset and the results of data filtering.
Table 1. A representation of the high-level taxonomic groups in the MLO dataset. Columns c1.1.1 to c2.2.3 give the number of sequences per clade.
| Group | Number of Sequences | Number of Genera | Number of Species | c1.1.1 | c1.1.2 | c1.2.1 | c1.2.2 | c2.1.1 | c2.1.2 | c2.2.1 | c2.2.2 | c2.2.3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Algae | 19 | 13 | 14 | - | - | - | - | - | - | - | - | - |
| Embryophytes | 49 | 3 | 4 | 15 | - | - | - | - | - | - | - | - |
| Gymnosperms | 5 | 2 | 2 | - | - | 4 | - | - | 1 | - | - | - |
| Angiosperms | 99 | 4 | 4 | 5 | 18 | 9 | 16 | 3 | 14 | 4 | 8 | 22 |
| Monocotyledons | 966 | 27 | 54 | 83 | 68 | 171 | 328 | 87 | 160 | 9 | 24 | 26 |
| Dicotyledons | 76 | 5 | 5 | 10 | 10 | 8 | 17 | 1 | 14 | 5 | 7 | 4 |
| Rosids | 2888 | 69 | 129 | 261 | 296 | 216 | 429 | 130 | 423 | 195 | 221 | 708 |
| Asterids | 784 | 28 | 42 | 67 | 113 | 49 | 135 | 34 | 151 | 50 | 29 | 153 |
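The counts in this table can be cross-checked against figures quoted elsewhere in the text. The following is a minimal sketch that sums the per-group sequence counts and the c1.1.1 column, treating "-" as zero; the variable names are illustrative and not part of the original work.

```python
# Sequence counts per taxonomic group, copied from the table above.
SEQUENCES = {
    "Algae": 19, "Embryophytes": 49, "Gymnosperms": 5, "Angiosperms": 99,
    "Monocotyledons": 966, "Dicotyledons": 76, "Rosids": 2888, "Asterids": 784,
}

# c1.1.1 column of the same table; "-" entries are treated as 0 (assumption).
C1_1_1 = {
    "Algae": 0, "Embryophytes": 15, "Gymnosperms": 0, "Angiosperms": 5,
    "Monocotyledons": 83, "Dicotyledons": 10, "Rosids": 261, "Asterids": 67,
}

total_sequences = sum(SEQUENCES.values())
total_c1_1_1 = sum(C1_1_1.values())

print(total_sequences)  # size of the whole dataset
print(total_c1_1_1)     # size of clade c1.1.1
```

The first sum reproduces the 4886 sequences stated in the abstract, and the second matches the clade size of 441 given for c1.1.1 in the table of NCBI homologue identifiers.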
Figure 2. The results of the phylogenetic analysis and MEME motif search of the 4886 MLO proteins: (a) the neighbour-joining tree of MLO proteins; (b) the occurrence of the selected motifs across the MLO proteins, with colours indicating the motif matching scores; (c) a principal component analysis of the frequency of the motif occurrences in the defined phylogenetic clades. The motif numbering follows the MEME results in the Supplementary Material (File S2).
Figure 3. A comparison of the phylogenetic trees obtained using four different methods: (a) a superposition of the four aligned trees; (b) the consistency of the global topologies of the four trees.
Table 2. The distribution of the MLO homologue identifiers provided by the NCBI database within the identified phylogenetic clades.
| Clade (Size) | MLO1 | MLO2 | MLO3 | MLO4 | MLO5 | MLO6 | MLO7 | MLO8 | MLO9 | MLO10 | MLO11 | MLO12 | MLO13 | MLO14 | MLO15 | MLO17 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| c1.1.1 (441) | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 175 | 0 | 0 | 59 | 0 | 0 |
| c1.1.2 (505) | 0 | 0 | 0 | 282 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| c1.2.1 (457) | 5 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 201 | 0 | 1 | 0 |
| c1.2.2 (925) | 426 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 4 | 0 | 34 | 0 |
| c2.1.1 (255) | 40 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| c2.1.2 (763) | 0 | 0 | 0 | 0 | 25 | 0 | 15 | 78 | 140 | 117 | 0 | 0 | 0 | 0 | 0 | 0 |
| c2.2.1 (263) | 0 | 1 | 144 | 0 | 0 | 1 | 0 | 0 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 0 |
| c2.2.2 (289) | 0 | 12 | 0 | 0 | 0 | 31 | 0 | 0 | 8 | 0 | 0 | 113 | 0 | 0 | 0 | 0 |
| c2.2.3 (913) | 0 | 65 | 1 | 0 | 0 | 292 | 0 | 0 | 0 | 0 | 0 | 101 | 1 | 0 | 0 | 1 |
Table 3. Protein motifs with a specific distribution in the identified phylogenetic clades.
| Motif No. | Consensus Sequence | Regular Expression | MEME ID | Number of Occurrences | E-Value | Specific Clades |
|---|---|---|---|---|---|---|
| 10 | YQFSNDPERFRFTR | Y[QE]F[SA][NH]DP[ES]RFR[FL][TA][RH] | MEME-10 | 2339 | 7.4 × 10⁻⁸⁵¹ | c2 |
| 11 | ERHAVVKGAPVVQPSD | [ED][RK][HG]A[VA][VI][KQE]GX[PL][VL]VQP[SG]D | MEME-11 | 3330 | 6.9 × 10⁻¹⁰³⁰ | c2 and c1.2.2 |
| 20 | ETSFGRRHLSFW | [EQD]T[ST]F[GV]RRHL[SN]FW | MEME-20 | 2329 | 3.3 × 10⁻⁵⁹⁷ | c2 |
| 24 | SSRPTTPSHGMSPVHLLHNY | SSRP[TA]TP[ST]HG[MS]SP[VI]HLL[HR]N[YH] | MEME-24 | 1005 | 3.7 × 10⁻⁴³³ | c2.1.1 and c2.2.3 |
| 25 | FIKHHFSGPWKRSAILGWLL | F[IV][KR]H[HR][FTA]S[GH]P[WG][KS][RK][SN][AR][IV]L[GSI]W[LMV][LH] | MEME-25 | 1978 | 8.6 × 10⁻³⁰⁴ | c1 |
| 27 | EEEHRRRLLWYERRFLAGGS | E[EG]EH[RH]R[RK]LL[WS][YF]E[RH]RFL[AS][GA][GAD]S | MEME-27 | 456 | 1 × 10⁻²¹³ | c2.1.2 |
| 29 | LENAGITGPFSGTKLKPRDD | LE[NIS]A[GE]ITG[PY]F[ST]G[TA][KQ][LV][KR]PRD[DE] | MEME-29 | 548 | 3.6 × 10⁻¹⁶⁵ | c1.1 |
| 30 | DDSTIHTETSTVMSLEEDDH | DDST[IV][HR]T[ED]TSTV[MC]S[LI]E[ED]DDH | MEME-30 | 362 | 8 × 10⁻¹³⁴ | c1.1.1 |
| 33 | MDRHDSLTEITRELTMRRQS | MDRHDSL[TS]EI[TA]RE[LK]T[ML]RRQ[ST] | MEME-33 | 400 | 9.5 × 10⁻¹¹³ | c1.1.1 |
| 34 | PTLHRFKTTGHSTRSSYYDD | [PH]TLHRFKTTGHSTRSSYY[DE][DE] | MEME-34 | 305 | 4.5 × 10⁻¹⁰⁶ | c2.1.2 * |
| 42 | SSLFSSRFYJCSEEDY | SSLF[ST]S[RK]FY[IL]CSEEDY | MEME-42 | 324 | 3 × 10⁻⁴⁷ | c1.1.2 |
| 43 | ANETSSRVGTPLLRPSASIS | ANETSSR[VA]GTPLLRP[SC]AS[IV]S | MEME-43 | 198 | 1.7 × 10⁻⁴⁸ | c1.1.1 |
| 44 | HTTRSVCSLESTIDERDEI | H[TA][TA]RS[VT]CSL[ED][ST]TID[ED][RE][DR][ED][IE] | MEME-44 | 243 | 1.7 × 10⁻³⁵ | c1.1.2 * |
| 45 | KNYDPEZVLKPKVTHVQQHD | [KE][NE]YD[PT]E[QE]VLK[PKT]K[VF]THV[QH][QDE]H[DA] | MEME-46 | 304 | 8.3 × 10⁻³³ | c1.2.2 |
| 48 | SLWGIKERSCFMKNH | SLW[GE][IFL]K[EQ]RSCFMKNH | MEME-51 | 226 | 6.9 × 10⁻²⁸ | c1.1.2 |
| 51 | AARRKRRLGIFT | AARR[KR]RR[LH]G[IM][FY]T | MEME-56 | 276 | 2 × 10⁻²⁴ | c1.1.1 |
| 52 | KKKGGKGGKSPTRTLGGSPS | KKK[GK]GKGGKSPTRTLGGS[PS]S | MEME-57 | 283 | 1.6 × 10⁻²³ | c2.1.2 * |
| 53 | QEASDLEADPLSPTSS | Q[ED]ASDLEA[DE]PL[ST]PT[SP][ST] | MEME-60 | 268 | 8.2 × 10⁻¹⁷ | c2.1.2 * |
| 54 | ETDAGTYTEIELQPPSTVTS | ETDAGT[YG][TN]E[IV]ELQPPST[VI]T[ST] | MEME-62 | 228 | 7.7 × 10⁻¹⁰ | c1.1.1 |
* Partial presence of the motif in the clade.
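The regular expressions in the motif table can be used directly to scan a protein sequence for clade-specific motifs. The sketch below does this with Python's standard `re` module for four of the listed motifs. It rests on several assumptions: the helper names `MOTIFS`, `compile_motif` and `find_motifs` are hypothetical, a bare `X` in a pattern is interpreted as "any residue", and exact regular-expression matching is only a rough stand-in for the score-based motif matching reported by MEME. The toy sequence is assembled from the consensus sequences in the table and is not a real protein.

```python
import re

# Motif number -> (regular expression, specific clades), copied from the table above.
MOTIFS = {
    10: ("Y[QE]F[SA][NH]DP[ES]RFR[FL][TA][RH]", "c2"),
    11: ("[ED][RK][HG]A[VA][VI][KQE]GX[PL][VL]VQP[SG]D", "c2 and c1.2.2"),
    25: ("F[IV][KR]H[HR][FTA]S[GH]P[WG][KS][RK][SN][AR][IV]L[GSI]W[LMV][LH]", "c1"),
    30: ("DDST[IV][HR]T[ED]TSTV[MC]S[LI]E[ED]DDH", "c1.1.1"),
}


def compile_motif(pattern):
    """Compile a table pattern; a bare X is assumed to mean any residue."""
    # In the patterns used here X never occurs inside brackets,
    # so a plain replacement is sufficient.
    return re.compile(pattern.replace("X", "."))


def find_motifs(sequence):
    """Return {motif number: [0-based start positions]} for motifs found."""
    hits = {}
    for number, (pattern, _clades) in MOTIFS.items():
        starts = [m.start() for m in compile_motif(pattern).finditer(sequence)]
        if starts:
            hits[number] = starts
    return hits


# Toy sequence: two consensus sequences from the table joined by filler residues.
toy = "MA" + "YQFSNDPERFRFTR" + "GG" + "ERHAVVKGAPVVQPSD"
hits = find_motifs(toy)
for number, starts in hits.items():
    print(number, MOTIFS[number][1], starts)
```

In this toy case, motifs 10 and 11 are found, both of which the table associates with clade c2, while the c1-specific motifs 25 and 30 are absent. For real data, a position-weight-matrix scan with a score threshold would be closer to what the motif matching scores in Figure 2b represent.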
Figure 4. Multiple sequence alignment of the 15 MLO proteins from Solanum lycopersicum L. with the clade-specific motifs. The seven transmembrane domains are shown as boxes, according to Kusch et al. (2016). The red lines indicate the extracellular (above) or intracellular (below) localisation of the domain. The motifs shown in the alignment are numbered in accordance with Table 3 and the corresponding sequences are listed below. The fully conserved positions are shown in blue. The SlMLO homologue names are listed on the left, according to Zheng et al. (2016) (Z.) and Kusch et al. (2016) (K.).
Table 4. The occurrences of the MLO sequences from species of the Solanaceae family in the nine phylogenetic clades.
| Species | Total | c1.1.1 | c1.1.2 | c1.2.1 | c1.2.2 | c2.1.1 | c2.1.2 | c2.2.1 | c2.2.2 | c2.2.3 |
|---|---|---|---|---|---|---|---|---|---|---|
|  | 103 | 5 | 18 | 7 | 19 | 0 | 22 | 5 | 2 | 25 |
|  | 6 | 1 | 0 | 0 | 2 | 0 | 3 | 0 | 0 | 0 |
|  | 10 | 0 | 2 | 1 | 2 | 0 | 3 | 1 | 0 | 1 |
|  | 49 | 3 | 8 | 3 | 7 | 0 | 10 | 2 | 1 | 15 |
|  | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 1 |
|  | 12 | 0 | 2 | 1 | 2 | 0 | 2 | 1 | 0 | 4 |
|  | 25 | 1 | 6 | 2 | 6 | 0 | 4 | 1 | 1 | 4 |
|  | 50 | 4 | 11 | 1 | 12 | 0 | 9 | 0 | 5 | 8 |
|  | 30 | 2 | 7 | 1 | 6 | 0 | 5 | 0 | 3 | 6 |
|  | 10 | 1 | 2 | 0 | 3 | 0 | 2 | 0 | 1 | 1 |
|  | 10 | 1 | 2 | 0 | 3 | 0 | 2 | 0 | 1 | 1 |
|  | 65 | 3 | 12 | 5 | 13 | 0 | 11 | 4 | 0 | 17 |
|  | 16 | 1 | 2 | 1 | 3 | 0 | 4 | 2 | 0 | 3 |
|  | 6 | 0 | 2 | 1 | 0 | 0 | 1 | 0 | 0 | 2 |
|  | 33 | 2 | 4 | 2 | 9 | 0 | 4 | 2 | 0 | 10 |
|  | 10 | 0 | 4 | 1 | 1 | 0 | 2 | 0 | 0 | 2 |
|  | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 219 | 12 | 41 | 13 | 44 | 0 | 42 | 9 | 7 | 50 |
Table 5. The tomato MLO proteins.
| Accession * | Source | Protein Name in NCBI | No. of Isoforms | Genomic Feature ID ** | Clade | Zheng 2016 | Kusch 2016 | Clade (K.) |
|---|---|---|---|---|---|---|---|---|
| XP_004231310.1 | NCBI | MLO13 | 3 | Solyc01g102520.4.1 | c1.2.1 | SlMLO11 | SlMLO1 | II |
| XP_004242449.1 | NCBI | MLO1 | 3 | Solyc06g082820.4.1 | c1.2.2 | SlMLO9 | SlMLO7 | II |
| A0A3Q7HJ89_SOLLC | UniProt |  | 4 | Solyc08g015870.3.1 | c1.2.2 | SlMLO2 | SlMLO12 | II |
| A0A1C9A1H9_SOLLC | UniProt |  | 1 | Solyc02g077570.3.1 | c2.2.2 | SlMLO15 | SlMLO2 | VII |
| XP_004245231.1 | NCBI | MLO9 | 2 | Solyc08g067760.4.1 | c2.1.2 | SlMLO12 | SlMLO11 | III |
| XP_010314898.1 | NCBI | MLO9 | 2 | Solyc02g038806.1.1 *** | c2.1.2 | SlMLO4 | SlMLO17 | III |
| XP_004232584.1 | NCBI | MLO8 | 4 | Solyc02g082430.4.1 | c2.1.2 | SlMLO6 | SlMLO3 | III |
| XP_004240662.1 | NCBI | MLO3 | 2 | Solyc06g010010.3.1 | c2.2.1 | SlMLO16 | SlMLO8 | VI |
| XP_025884137.1 | NCBI | MLO6 | 2 | Solyc11g069220.2.1 | c2.2.3 | SlMLO8 | SlMLO16 | V |
| XP_004240581.1 | NCBI | MLO6 | 4 | Solyc06g010030.4.1 | c2.2.3 | SlMLO3 | SlMLO9 | V |
| A0A3Q7G0J2_SOLLC | UniProt |  | 3 | Solyc04g049090.3.1 | c2.2.3 | SlMLO1 | SlMLO6 | V |
| XP_004235223.1 | NCBI | MLO2 | 6 | Solyc03g095650.3.1 | c2.2.3 | SlMLO5 | SlMLO5 | V |
| XP_004244217.1 | NCBI | MLO11 | 3 | Solyc07g063260.4.1 | c1.1.1 | SlMLO14 | SlMLO10 | I |
| XP_004248847.1 | NCBI | MLO4 | 4 | Solyc10g044510.2.1 | c1.1.2 | SlMLO13 | SlMLO15 | I |
| XP_019067935.1 | NCBI | MLO4 | 4 | Solyc02g083720.4.1 | c1.1.2 | SlMLO10 | SlMLO4 | I |
* The most complete isoform found (see extended table in the Supplementary Material for all present isoform accessions); ** identified by BLAST using the whole tomato genome assembly SL4.0 and annotation ITAG4.0 as a reference database (https://solgenomics.net/organism/Solanum_lycopersicum/genome; accessed on 9 July 2021); *** Solyc00g007200 according to Zheng et al. (2016).