Niall J Haslam, Denis C Shields.
Abstract
BACKGROUND: Short linear protein motifs are attracting increasing attention as functionally independent sites, typically 3-10 amino acids in length, that are enriched in disordered regions of proteins. Multiple methods have recently been proposed to discover over-represented motifs within a set of proteins based on simple regular expressions. Here, we extend these approaches to profile-based methods, which provide a richer motif representation.
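To illustrate the difference between the two representations mentioned in the abstract, the sketch below scans a sequence with a regular expression and then builds a simple position frequency profile from a few aligned instances. The peptide instances, the toy sequence, and the helper names `find_motif` and `frequency_profile` are made up for illustration, and the profile is a plain frequency table, not the scoring model used by MEME or any other tool.

```python
import re
from collections import Counter

# Regular expression for the ELM TRG_ER_KDEL_1 as listed in Table 1.
KDEL_REGEX = re.compile(r"[KRH][DENQ]EL")


def find_motif(sequence, pattern):
    """Return (start, matched_text) for every non-overlapping match."""
    return [(m.start(), m.group()) for m in pattern.finditer(sequence)]


def frequency_profile(instances):
    """Build a per-position residue frequency table from equal-length peptides."""
    length = len(instances[0])
    if any(len(p) != length for p in instances):
        raise ValueError("all instances must have the same length")
    profile = []
    for column in zip(*instances):
        counts = Counter(column)
        profile.append({aa: n / len(instances) for aa, n in counts.items()})
    return profile


# Hypothetical aligned instances (illustrative only, not taken from the paper).
instances = ["KDEL", "HDEL", "KDEL", "RDEL"]
profile = frequency_profile(instances)

# Hypothetical sequence ending in a KDEL-like motif.
hits = find_motif("MSTAGLLVAAKDEL", KDEL_REGEX)

print(hits)
print(profile[0])
```

In this toy case the regular expression class `[KRH]` treats all three residues at the first position as equivalent, whereas the profile records that K occurs twice as often as H or R among the instances.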
Year: 2012 PMID: 22607209 PMCID: PMC3534220 DOI: 10.1186/1471-2105-13-104
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Performance using experimentally validated ELMs (dataset from [14]), searching for protein short linear motifs in disordered regions of proteins
| TRG_ER_KDEL_1 | 12 | [KRH][DENQ]EL | K.{0,2}DEL$ (1) | DEL (35) | DEL (1) |
| LIG_Dynein_DLC8_1 | 4 | [KR].TQT | S.K.TQT (1) | K[AESV]TQ[TE][PD] (1) | [KV][SAE]TQT (1) |
| LIG_PCNA | 13 | Q.[ILM].[FHM][FHM] | [IL].S[FH]F (1) | Q.[SRT][IL][DM]SFF (1) | [LI].SFF (3) |
| MOD_SUMO | 29 | [VILAFP]K.[EDNGP] | [FIV]K.E (1) | [IV]K[QE]E[PE] (1) | [IV]KEE (1) |
| LIG_SH3_2 | 9 | P.P.[KR] | P.P.R.{0,1}P (1) | - | - |
| LIG_CYCLIN_1 | 22 | [RK].L.{0,1}[FYLIVMP] | RR.{0,1}L.{0,1}F (1) | [GE]L[St]R[ED]L.[KE][HLR]L (5) | K[KR][KR] (1) |
| LIG_CtBP | 26 | P.[DEN]L[VAST] | P[ILM]DL (1) | PLDLS (1) | PLDLS (1) |
| LIG_AP_GAE_1 | 8 | [DE][DES].[F].[DE][LVIMFD] | D.F.F.S.P (1) | DDEF[GS][DE]FQ (1) | [GA]DF (1) |
| LIG_14-3-3_3 | 6 | [RHK][STALV].[ST].[PEDSIF] | S.P.S.T.P (3) | R[TS]NSA (65) | - |
| LIG_RB | 25 | [LI].C.[DE] | L.C.E (6) | LVCFE (1) | - |
| LIG_Clathr_ClatBox_1 | 15 | L[ILM].[ILMF][DE] | L.{1,2}DL.{0,2}D (12) | [DE][ST][NSD]l[LI][DE][LF] (9) | [LG]L[DG]LD[SG] (1) |
| LIG_14-3-3_1 | 4 | R[FSWY].S.P | RS.S.P (3) | RS[IPRT]S[ALMT]P (29) | S[AI]S[ALE]P (1) |
| LIG_RGD | 15 | RGD | R.D.V (7) | RGD (6) | RGD (3) |
| LIG_HP1_1 | 6 | P.V.[LM] | - | - | - |
| LIG_NRBOX | 9 | L.LL | - | [KN]H[AKP]LLS[RN]LL[RQ] (21) | L[KRS][QY]LL (1) |
| MOD_N-GLC_2 | 5 | N.C | - | [FHIMY][NS][EANS][CE][VENS][CEHRV][VAF][MKLV][EAGS][NE] (42) | - |
| TRG_lysEnd_APsAcLL | 10 | [DER]...L[LVI] | - | - | - |
All proteins contain at least one experimentally determined motif instance. For each method, the returned regular expression that matches the annotated ELM regular expression is shown along with its rank (in brackets). No result is indicated by a dash. Niall Haslam 24 October 2012 09:18.
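One simple way to judge whether a returned pattern agrees with an annotated ELM is to test both against the same motif instances. The sketch below does this with Python's `re` module. The helper name `both_match` and the example peptides are hypothetical, and this is not necessarily the comparison procedure the authors used.

```python
import re


def both_match(elm_regex, found_regex, peptide):
    """True if the ELM pattern and the discovered pattern both hit the peptide."""
    return (re.search(elm_regex, peptide) is not None
            and re.search(found_regex, peptide) is not None)


# Patterns taken from the TRG_ER_KDEL_1 row of the table above.
elm = r"[KRH][DENQ]EL"
found = r"K.{0,2}DEL$"

# Hypothetical peptides used only to exercise the two patterns.
c_terminal_kdel = both_match(elm, found, "GSEKDEL")  # KDEL at the C-terminus
c_terminal_hdel = both_match(elm, found, "GSEHDEL")  # HDEL: ELM hits, found pattern needs K
internal_kdel = both_match(elm, found, "KDELGSE")    # KDEL not at the end: '$' anchor fails

print(c_terminal_kdel, c_terminal_hdel, internal_kdel)
```

The second and third peptides show how a discovered pattern can be narrower than the annotated ELM: here the returned expression requires a lysine and a C-terminal position, while the ELM expression as listed does neither.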
Summary of performance using experimentally validated ELMs (see Table 1)
| Number of First hits | 8 | 6 | 9 |
| Number in Top 10 | 12 | 9 | 11 |
| Total | 17 | 17 | 17 |
| Percentage First Hit | 47% | 35% | 53% |
| Percentage Top 10 | 71% | 53% | 65% |
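The summary counts can be recomputed from the "pattern (rank)" cells of the results table. The sketch below does so for the first results column of Table 1 (the column is not labelled in this record, so no method name is assumed). The parsing regular expression `CELL` and the function name `parse_rank` are illustrative choices.

```python
import re

# Cells copied from the first results column of Table 1 (17 rows).
cells = [
    "K.{0,2}DEL$ (1)", "S.K.TQT (1)", "[IL].S[FH]F (1)", "[FIV]K.E (1)",
    "P.P.R.{0,1}P (1)", "RR.{0,1}L.{0,1}F (1)", "P[ILM]DL (1)",
    "D.F.F.S.P (1)", "S.P.S.T.P (3)", "L.C.E (6)", "L.{1,2}DL.{0,2}D (12)",
    "RS.S.P (3)", "R.D.V (7)", "-", "-", "-", "-",
]

# A cell is a pattern followed by its rank in round brackets.
CELL = re.compile(r"^(?P<pattern>.+?)\s*\((?P<rank>\d+)\)$")


def parse_rank(cell):
    """Return the integer rank from a table cell, or None for a dash."""
    cell = cell.strip()
    if cell == "-":
        return None
    match = CELL.match(cell)
    if match is None:
        raise ValueError(f"unrecognised cell: {cell!r}")
    return int(match.group("rank"))


ranks = [parse_rank(c) for c in cells]
total = len(ranks)
first_hits = sum(1 for r in ranks if r == 1)
top_ten = sum(1 for r in ranks if r is not None and r <= 10)

pct_first = round(100 * first_hits / total)
pct_top_ten = round(100 * top_ten / total)

print(first_hits, top_ten, total, pct_first, pct_top_ten)
```

For this column the tally gives 8 first hits and 12 results in the top 10 out of 17 ELMs, which round to the 47% and 71% reported in the first column of the summary.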
Performance for MEME searching for short linear motifs in a realistic motif discovery scenario
| 150 | GRB2 | 164 (146) | Grb2 | LIG_SH3 | P.P | - | - | - |
| | | 164 (103) | | LIG_SH2_GRB2 | Y.N | Y.N[LMV] (3) | NK[NEK]KNRY[KV][DN]I (2) | KNRY[KPV][ND]ILP (1) |
| 215 | YWHAH | 47 (13) | 14-3-3 Eta | LIG_14-3-3_1 | R[SFYW].S.P | [KR]S.S.P (1) | - | PKIHRSASEP (17) |
| 350 | CLTC | 35 (15) | Clathrin, heavy polypeptide | LIG_Clathr_ClatBox_1 | L[ILM].[ILMF][DE] | LLDL (4) | - | LLDL[EDM][DS][FA]QP (18) |
| 453 | CCNA2 | 25 (23) | Cyclin, A2 | LIG_CYCLIN_1 | [RK].L.{0,1}[FYLIVMP] | - | A[CK]R[RN]LFG (7) | SA[CK]R[NR]LFG (8) |
| 607 | FNTA | 10 (2) | Farnesyltransferase alpha subunit | MOD_ASX_betaOH_EGF | C[^DENQ][LIVM].$ | - | C[DT]IS (30) | C[DT]IS (17) |
| 627 | ITGA5 | 15 (7) | Integrin alpha 5 | LIG_RGD | RGD | - | KGDRGDA (25) | RG[DQ] (34) |
| 1456 | PCNA | 65 (13) | Proliferating cell nuclear antigen | LIG_PCNA | Q.[ILM].[FHM][FHM] | Q.[IL].FF (1) | TL[YES]SFF (3) | TL[YES]SFF (2) |
| 1574 | RB1 | 110 (28) | Retinoblastoma 1 | LIG_RB | [LI].C.[DE] | - | - | - |
| 3288 | PPARG | 22 (15) | Peroxisome proliferator AR | LIG_NRBOX | L.LL | L.RLL (1) | HKILHRLLQ (4) | [LT][VS]HKLVQ[AL][IL] (1) |
| 3334 | DYNLL1 | 52 (5) | Dynein light chain 1 | LIG_Dynein_DLC8_1 | [KR].TQT | - | [MV]S[CY][DS]K[ES]TQTP (95) | KSTQT (10) |
| 3786 | NEDD4 | 28 (17) | NEDD4 | LIG_WW_1 | PP.Y | PP.Y (7) | PPAY (83) | PPPYSSI (2) |
| 3833 | TRAF6 | 22 (19) | TRAF6 | LIG_TRAF6 | .P.E.[FYWHDE]. | - | - | - |
| 4015 | CTBP1 | 26 (14) | C-terminal binding protein | LIG_CtBP | P.[DEN]L[VAST] | D.P[IL]D (6) | - | P[LI]DLS (1) |
| 4946 | CCNA1 | 20 (20) | Cyclin A1 | LIG_CYCLIN_1 | [RK].L.{0,1}[FYLIVMP] | - | A[CK]R[RN]LFG (3) | A[CK]R[RN]LFG (1) |
| 5462 | GIPC1 | 26 (7) | GIPC1 | LIG_PDZ_1 | .[ST].[VIL]$ | S.V$ (1) | - | - |
| 5639 | YWHAG | 206 (28) | 14-3-3 gamma | LIG_14-3-3_1 | R[SFYW].S.P | R.RS.S.S (1) | SRSRSRS[KR]SR (51) | [SK]SRSRS[RK]SR (30) |
| 8968 | EPS15 | 24 (10) | Eps15 | LIG_EH | NPF | TNPF (1) | TNPF[LS] (3) | TNPF (1) |
| 9045 | UBE2I | 87 (78) | Ubiquitin conjugating enzyme E2I | MOD_SUMO | [VILAFP]K.[EDNGP] | VK.E (2) | M[KM]VKDEY (18) | VEIVYE (7) |
| 9347 | GGA2 | 19 (2) | GGA2 | LIG_AP_GAE_1 | [DE][DES].F.[DE][LVIMFD] | DDF.F.A (1) | D[DL]FG[GDE]F (6) | D[DL]FG[GDE]F (7) |
| 9424 | YAP1 | 15 (8) | YES associated protein | LIG_WW_1 | PP.Y | - | [PL][PD]PPY (50) | H[CT][TY][LP]PPPY (6) |
The HPRD dataset used, its name, and the number of proteins in that dataset are shown, along with the ELM known to mediate some interactions with the protein hub. For SLiMFinder and for MEME with and without RLC masking and evolutionary weighting, the highest-ranked returned regular expression that matches the annotated ELM is shown with its rank (in brackets). No result is indicated with a dash. Niall Haslam 24 October 2012 09:19.
Summary of the results in a realistic motif discovery scenario (see Table 3)
| Number returned | 12 | 14 | 17 |
| Number of first hits | 7 | 0 | 5 |
| Total | 21 | 21 | 21 |
| Number in Top Ten | 12 | 6 | 12 |
| Percentage Returned | 57% | 67% | 81% |
| Percentage Top 10 | 57% | 29% | 57% |
| Percentage Top Hit | 33% | 0% | 24% |
Sample of Sequence Logos from the MEME output from Table 1
| Lig Dynein | [KV][SAE]TQT | |
| Lig PCNA | [LI].SFF | |
| Lig Cyclin 1 | K[KR][KR] | |
| Lig CtBP | PLDLS | |
| Lig KDEL | DEL | |