Ioanna Kalvari, Stelios Tsompanis, Nitha C Mulakkal, Richard Osgood, Terje Johansen, Ioannis P Nezis, Vasilis J Promponas.
Abstract
Macroautophagy was initially considered to be a nonselective process for bulk breakdown of cytosolic material. However, recent evidence points toward a selective mode of autophagy mediated by the so-called selective autophagy receptors (SARs). SARs act by recognizing and sorting diverse cargo substrates (e.g., proteins, organelles, pathogens) to the autophagic machinery. Known SARs are characterized by a short linear sequence motif (LIR-, LRS-, or AIM-motif) responsible for the interaction between SARs and proteins of the Atg8 family. Interestingly, many LIR-containing proteins (LIRCPs) are also involved in autophagosome formation and maturation, and a few of them in regulating signaling pathways. Despite recent research efforts to experimentally identify LIRCPs, only a few dozen of this class of (often unrelated) proteins have been characterized so far using tedious cell biological, biochemical, and crystallographic approaches. The availability of an ever-increasing number of complete eukaryotic genomes poses a grand challenge for characterizing novel LIRCPs throughout the eukaryotes. Along these lines, we developed iLIR, a freely available web resource, which provides in silico tools for assisting the identification of novel LIRCPs. Given an amino acid sequence as input, iLIR searches for instances of short sequences compliant with a refined sensitive regular expression pattern of the extended LIR motif (xLIR-motif) and retrieves characterized protein domains from the SMART database for the query. Additionally, iLIR scores xLIRs against a custom position-specific scoring matrix (PSSM) and identifies potentially disordered subsequences with protein interaction potential overlapping with detected xLIR-motifs. Here we demonstrate that proteins satisfying these criteria make good LIRCP candidates for further experimental verification. Domain architecture is displayed in an informative graphic, and detailed results are also available in tabular form. We anticipate that iLIR will assist with elucidating the full complement of LIRCPs in eukaryotes.
Keywords: Atg8-family interacting proteins; LC3 interacting region-motif; macroautophagy; prediction; selective autophagy; web server
Year: 2014 PMID: 24589857 PMCID: PMC5119064 DOI: 10.4161/auto.28260
Source DB: PubMed Journal: Autophagy ISSN: 1554-8627 Impact factor: 16.016
Table 1. Sequences used in this study
| Protein | UniProt accession | Motif | Position | Verified | cLIR | xLIR | Anchor | PSSM score | Organism |
|---|---|---|---|---|---|---|---|---|---|
| ATG13_HUMAN | O75143 | EGFQTV | 166–171 | No | No | Yes | No | 11 (1.5e-01) | Human |
| | | DDFVMI | 442–447 | Yes | Yes | Yes | Yes | 20 (8.4e-03) | Human |
| Atg1_YEAST | P53104 | REYVVV | 427–432 | Yes | No | Yes | Yes | 14 (5.7e-02) | Yeast |
| Atg32_YEAST | P40458 | GSWQAI | 84–89 | Yes | No | Yes | Yes | 17 (2.2e-02) | Yeast |
| | | KEYQSL | 235–240 | No | No | Yes | No | 12 (1.1e-01) | Yeast |
| | | LGYILL | 524–529 | No | No | Yes | No | 10 (2.0e-01) | Yeast |
| ATG4B_HUMAN** [MM] | Q9Y4P1 | LTYDTL | 6–11 | Yes | No | Yes | No | 12 (1.1e-01) | Human |
| | | PMFELV | 347–352 | No | No | Yes | No | 10 (2.0e-01) | Human |
| | | EDFEIL | 386–391 | No | Yes | Yes | No | 17 (2.2e-02) | Human |
| Atg19_YEAST | P35193 | LTWEEL | 410–415 | Yes | No | Yes | No | 18 (1.6e-02) | Yeast |
| Atg3_YEAST | P40344 | GDWEDL | 268–273 | Yes | No | Yes | No | 22 (4.4e-03) | Yeast |
| BNI3L_HUMAN | O60238 | SSWVEL | 34–39 | Yes | No | Yes | Yes | 20 (8.4e-03) | Human |
| | | AEFLKV | 183–188 | No | No | Yes | No | 10 (2.0e-01) | Human |
| CALR_HUMAN | P27797 | GGYVKL | 107–112 | No | No | Yes | No | 12 (1.1e-01) | Human |
| | | DEFTHL | 166–171 | No | No | Yes | No | 14 (5.7e-02) | Human |
| | | DDWDFL | 198–203 | Yes | Yes | Yes | Yes | 26 (1.2e-03) | Human |
| CBL_HUMAN | P22681 | DTYQHL | 90–95 | No | No | Yes | No | 14 (5.7e-02) | Human |
| | | LTYDEV | 272–277 | No | No | Yes | No | 11 (1.5e-01) | Human |
| | | FGWLSL | 800–805 | Yes | No | Yes | Yes | 18 (1.6e-02) | Human |
| | | REFVSI | 893–898 | No | No | Yes | Yes* | 13 (7.9e-02) | Human |
| FUND1_HUMAN | Q8IVP5 | DSYEVL | 16–21 | Yes | Yes | Yes | No | 16 (3.0e-02) | Human |
| | | GGFLLL | 81–86 | No | No | Yes | No | 10 (2.0e-01) | Human |
| OPTN_HUMAN | Q96CV9 | DSFVEI | 176–181 | Yes | Yes | Yes | Yes | 15 (4.2e-02) | Human |
| Q8MQJ7_DROME | Q8MQJ7 | ADYLSV | 96–101 | No | No | Yes | No | 14 (5.7e-02) | |
| | | DDFVLV | 389–394 | Yes | Yes | Yes | Yes | 17 (2.2e-02) | |
| Q9SB64_ARATH | Q9SB64 | RVWVLI | 479–484 | No | No | Yes | No | 15 (4.2e-02) | |
| | | SEWDPI | 659–664 | Yes | No | Yes | No | 20 (8.4e-03) | |
| RBCC1_HUMAN | Q8TDY2 | FDFETI | 700–705 | Yes | No | Yes | Yes | 17 (2.2e-02) | Human |
| SQSTM_HUMAN** [LL] | Q13501 | DDWTHL | 336–341 | Yes | No | Yes | Yes | 24 (2.3e-03) | Human |
| STBD1_HUMAN** [LN] | O95210 | EEWEMV | 201–206 | Yes | Yes | Yes | No | 21 (6.1e-03) | Human |
| T53I1_HUMAN | Q96A56 | DEWILV | 29–34 | Yes | Yes | Yes | Yes | 20 (8.4e-03) | Human |
| TBC25_HUMAN | Q3MII6 | EVYLSL | 95–100 | No | No | Yes | No | 8 (3.9e-01) | Human |
| | | EDWDII | 134–139 | Yes | Yes | Yes | No | 24 (2.3e-03) | Human |
| TBCD5_HUMAN | Q92609 | KEWEEL | 57–62 | Yes | No | Yes | No | 20 (8.4e-03) | Human |
| | | DDFILI | 713–718 | No | Yes | Yes | Yes* | 17 (2.2e-02) | Human |
| | | SGFTIV | 785–790 | Yes | No | Yes | Yes | 11 (1.5e-01) | Human |
| T53I2_HUMAN | Q8IXH6 | DGWLII | 33–38 | Yes | No | Yes | Yes | 21 (6.1e-03) | Human |
| ULK1_HUMAN | O75385 | DDFVMV | 355–360 | Yes | Yes | Yes | Yes | 19 (1.2e-02) | Human |
| ULK2_HUMAN | Q8IYT8 | DDFVLV | 351–356 | Yes | Yes | Yes | Yes | 17 (2.2e-02) | Human |
| CLH1_HUMAN | Q00610 | PDWIFL | 512–517 | Yes | No | Yes | No | 22 (4.4e-03) | Human |
| | | GMFTEL | 1315–1320 | No | No | Yes | No | 11 (1.5e-01) | Human |
| | | EDYQAL | 1475–1480 | No | No | Yes | No | 16 (3.0e-02) | Human |
| DVL2_HUMAN | O14641 | RMWLKI | 442–447 | Yes | No | Yes | No | 18 (1.6e-02) | Human |
| FYCO1_HUMAN** [MM] | Q9BQS8 | ADYQAL | 644–649 | No | No | Yes | Yes* | 15 (4.2e-02) | Human |
| | | AVFDII | 1278–1283 | Yes | No | Yes | Yes | 8 (3.9e-01) | Human |
| NBR1_HUMAN | Q14596 | LSFELL | 561–566 | No | No | Yes | Yes* | 10 (2.0e-01) | Human |
| | | EDYIII | 730–735 | Yes | Yes | Yes | Yes | 17 (2.2e-02) | Human |
| BNIP3_HUMAN | Q12983 | GSWVEL | 16–21 | Yes | No | Yes | Yes | 19 (1.2e-02) | Human |
| | | AEFLKV | 159–164 | No | No | Yes | No | 10 (2.0e-01) | Human |
| MK15_HUMAN | Q8TD08 | RVYQMI | 338–343 | Yes | No | Yes | Yes | 10 (2.0e-01) | Human |
| CACO2_HUMAN | Q13137 | FMWVTL | 72–77 | No | No | Yes | No | 20 (8.4e-03) | Human |
| | | DILVV | 132–136 | Yes | No | No | No | N/A | Human |
| C0H519_PLAF7 | C0H519 | NDWLLP | 103–108 | Yes | No | No | No | 12 (1.2e-02) | |
| ATG34_YEAST | Q12292 | KVYEKL | 194–199 | No | No | Yes | No | 8 (3.9e-01) | Yeast |
| | | FTWEEI | 407–412 | Yes | No | Yes | No | 20 (8.4e-03) | Yeast |
| TAXB1_HUMAN | Q86VP1 | DMLVV | 139–143 | Yes | No | No | No | N/A | Human |
| | | ADFDIV | 514–519 | No | No | Yes | Yes | 15 (4.2e-02) | Human |
| CTNB1_HUMAN | P35222 | SHWPLI | 502–507 | Yes | No | No | No | 11 (1.5e-01) | Human |
| STK4_HUMAN [MM] | Q13043 | EVFDVL | 28–33 | No | No | Yes | No | 9 (2.8e-01) | Human |
| | | GDYEFL | 431–436 | No | No | Yes | Yes | 17 (2.2e-02) | Human |
| STK3_HUMAN [LM] | Q13188 | EVFDVL | 25–30 | No | No | Yes | No | 9 (2.8e-01) | Human |
| | | GDFDFL | 435–440 | No | No | Yes | Yes | 16 (3.0e-02) | Human |
| RASF5_HUMAN [MN] | Q8WWW0 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| NEDD4_HUMAN [LL] | P46934 | SEYIKL | 410–415 | No | No | Yes | No | 13 (7.9e-02) | Human |
| | | PGWVVL | 589–594 | No | No | Yes | Yes | 19 (1.2e-02) | Human |
| | | ESFEEL | 1296–1301 | No | Yes | Yes | No | 13 (7.9e-02) | Human |
| A16L1_HUMAN [MM] | Q676U5 | DEYDAL | 164–169 | No | Yes | Yes | Yes | 16 (3.0e-02) | Human |
| TFCP2_HUMAN [LN] | Q12800 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| SF3A1_HUMAN [LN] | Q15459 | PEFEFI | 148–153 | No | No | Yes | No | 13 (7.9e-02) | Human |
| FNBP1_HUMAN [MN] | Q96RU3 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| TBC15_HUMAN [LL] | Q8TC07 | AEWDMV | 96–101 | No | No | Yes | No | 20 (8.4e-03) | Human |
| | | PGFEVI | 295–300 | No | No | Yes | No | 12 (1.1e-01) | Human |
| | | FSFLDI | 540–545 | No | No | Yes | No | 11 (1.5e-01) | Human |
| ANFY1_HUMAN [MN] | Q9P2R3 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| TCPR2_HUMAN [LM] | O15040 | GDYIAV | 45–50 | No | No | Yes | No | 14 (5.7e-02) | Human |
| | | AVFQLV | 102–107 | No | No | Yes | No | 5 (1.0e+00) | Human |
| | | AVFVAL | 894–899 | No | No | Yes | No | 7 (5.3e-01) | Human |
| | | DEWEVI | 1406–1411 | No | Yes | Yes | No | 23 (3.2e-03) | Human |
| ECHA_HUMAN [LM] | P40939 | AVFEDL | 447–452 | No | No | Yes | No | 7 (5.3e-01) | Human |
| NIPS2_HUMAN [MM] | O75323 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| ATG5_HUMAN [MM] | Q9H1Y0 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| ATG7_HUMAN [MM] | O95352 | SSFQSV | 258–263 | No | No | Yes | No | 10 (2.0e-01) | Human |
| KPCI_HUMAN [LM] | P41743 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| EPN4_HUMAN [LM] | Q14677 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| ATG3_HUMAN [LL] | Q9NT62 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| DYXC1_HUMAN [LL] | Q8WXU2 | AVFLSL | 16–21 | No | No | Yes | No | 6 (7.4e-01) | Human |
| | | AMWETL | 81–86 | No | No | Yes | No | 19 (1.2e-02) | Human |
| NEK9_HUMAN [LL] | Q8TD19 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| UBA5_HUMAN [MM] | Q9GZZ9 | SDYEKI | 66–71 | No | No | Yes | No | 17 (2.2e-02) | Human |
| | | FDYDKV | 103–108 | No | No | Yes | No | 16 (3.0e-02) | Human |
| TBD2B_HUMAN [LM] | Q9UPU7 | EEWELL | 252–257 | No | Yes | Yes | Yes | 20 (8.4e-03) | Human |
| KBTB6_HUMAN [LL] | Q86V97 | ESFEVL | 120–125 | No | Yes | Yes | No | 13 (7.9e-02) | Human |
| IPO5_HUMAN [LN] | O00410 | ETYENI | 31–36 | No | Yes | No | No | 11 (1.5e-01) | Human |
| | | DGWEFV | 655–660 | No | No | Yes | No | 21 (6.1e-03) | Human |
| | | LSWLPL | 997–1002 | No | No | Yes | No | 16 (3.0e-02) | Human |
| NCOA7_HUMAN [LM] | Q8NI08 | AEYDKL | 185–190 | No | No | Yes | No | 13 (7.9e-02) | Human |
| | | GEWEDL | 308–313 | No | No | Yes | No | 19 (1.2e-02) | Human |
| | | DDFVDL | 414–419 | No | Yes | Yes | Yes | 18 (1.6e-02) | Human |
| | | KSWEII | 745–750 | No | No | Yes | No | 19 (1.2e-02) | Human |
| KAP0_HUMAN [MM] | P10644 | EEFVEV | 310–315 | No | Yes | Yes | No | 13 (7.9e-02) | Human |
| GYS1_HUMAN [NN] | P13807 | - | - | N/A | N/A | N/A | N/A | N/A | Human |
| KBTB7_HUMAN [LL] | Q8WVZ9 | ESFEVL | 120–125 | No | Yes | Yes | No | 13 (7.9e-02) | Human |
| ATG2A_HUMAN [LM] | Q2TAZ0 | PEYTEI | 534–539 | No | No | Yes | No | 13 (7.9e-02) | Human |
| | | EVYESI | 828–833 | No | No | Yes | No | 9 (2.8e-01) | Human |
| | | LEFLDV | 1090–1095 | No | No | Yes | No | 9 (2.8e-01) | Human |
| FAN_HUMAN [ML] | Q92636 | ESFEDL | 600–605 | No | Yes | Yes | No | 12 (1.1e-01) | Human |
| | | LVWDLL | 869–874 | No | No | Yes | No | 13 (7.9e-02) | Human |
The top rows contain sequence entries obtained from Alemu and colleagues, which were used to construct the xLIR-motif and to validate both the cLIR- and xLIR-motifs. “Verified” signifies experimentally verified functional LIR-motifs. “Anchor” refers to a prediction of the ANCHOR software that overlaps a given LIR-motif by more than 3 residues. The middle and bottom row blocks contain the additional sequences from the works of Birgisdottir and colleagues and Behrends and colleagues, respectively. Entries marked with (*) correspond to possibly spurious matches of the xLIR-motif that are simultaneously predicted as anchors, while entries marked with (**) were also present in the data set reported by Behrends and colleagues. The atypical verified LIRs of CALCOCO2/NDP52 (CACO2_HUMAN) and TAX1BP1 (TAXB1_HUMAN) are pentapeptides, so a gapless PSSM match is not possible and the respective PSSM scores are marked as “N/A”. The 2 characters in square brackets correspond to one of the 3 possible observations of LIR-dependent interactions against GABARAP and MAP1LC3B, respectively, as reported by Behrends and colleagues: N, no binding against the wild-type Atg8 homolog; M, binding maintained; L, binding lost for the mutant forms.
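The cLIR and xLIR columns of Table 1 can be reproduced with a plain regular-expression scan. The sketch below assumes the two patterns as they are commonly quoted for iLIR; neither pattern is spelled out in this text, so both should be checked against the full paper before use. The function names and the toy sequence are hypothetical, and the overlap helper simply encodes the "more than 3 residues" rule from the caption.

```python
import re

# Assumed patterns (not given in this text); verify against the iLIR paper.
CLIR = r"[DE][DEST][WFY][DELIV].[ILV]"
XLIR = r"[ADEFGLPRSK][DEGMSTV][WFY][DEILQTV][ADEFHIKLMPSTV][ILV]"


def find_motifs(sequence, pattern):
    """Return (start, end, hexapeptide) per match, 1-based and inclusive.

    A zero-width lookahead is used so overlapping matches are also reported.
    """
    regex = re.compile(f"(?=({pattern}))")
    return [(m.start() + 1, m.start() + 6, m.group(1))
            for m in regex.finditer(sequence.upper())]


def overlaps_anchor(motif, anchor, min_overlap=4):
    """True if two 1-based inclusive intervals share >= min_overlap residues.

    min_overlap=4 mirrors the "> 3 residues" criterion in the Table 1 caption.
    """
    shared = min(motif[1], anchor[1]) - max(motif[0], anchor[0]) + 1
    return shared >= min_overlap


# Hypothetical toy sequence embedding two hexapeptides listed in Table 1.
toy = "MASG" + "DDWTHL" + "SSKG" + "ETYENI" + "AA"
xlir_hits = find_motifs(toy, XLIR)
clir_hits = find_motifs(toy, CLIR)
print("xLIR:", xlir_hits)
print("cLIR:", clir_hits)
```

Under these assumed patterns, DDWTHL (SQSTM_HUMAN) matches the xLIR pattern only and ETYENI (IPO5_HUMAN) matches the cLIR pattern only, which agrees with the corresponding rows of Table 1.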
Table 2. Amino acid residue background distribution
| Residue type | % Abundance | Residue type | % Abundance |
|---|---|---|---|
| Ala | 8.25 | Leu | 9.66 |
| Arg | 5.53 | Lys | 5.84 |
| Asn | 4.06 | Met | 2.42 |
| Asp | 5.45 | Phe | 3.86 |
| Cys | 1.37 | Pro | 4.70 |
| Gln | 3.93 | Ser | 6.56 |
| Glu | 6.75 | Thr | 5.34 |
| Gly | 7.07 | Trp | 1.08 |
| His | 2.27 | Tyr | 2.92 |
| Ile | 5.96 | Val | 6.87 |
Data regarding the 20 common amino acid residues, calculated from UniProtKB/Swiss-Prot release 2013_04, April 2013; available from the ProtScale tool http://web.expasy.org/protscale.
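Background abundances like those in Table 2 usually serve as the denominator of a log-odds score when a PSSM is built. This text does not give the exact scoring formula used by iLIR, so the sketch below only illustrates the generic log-odds idea; the pseudocount scheme, the base-2 logarithm, and the function name are assumptions.

```python
import math

# Table 2 background abundances (%), keyed by one-letter residue code.
BACKGROUND_PCT = {
    "A": 8.25, "R": 5.53, "N": 4.06, "D": 5.45, "C": 1.37,
    "Q": 3.93, "E": 6.75, "G": 7.07, "H": 2.27, "I": 5.96,
    "L": 9.66, "K": 5.84, "M": 2.42, "F": 3.86, "P": 4.70,
    "S": 6.56, "T": 5.34, "W": 1.08, "Y": 2.92, "V": 6.87,
}

# The listed percentages sum to slightly less than 100, so renormalise.
_total = sum(BACKGROUND_PCT.values())
BACKGROUND = {aa: pct / _total for aa, pct in BACKGROUND_PCT.items()}


def log_odds(observed_count, column_total, residue, pseudocount=1.0):
    """Log2 ratio of a pseudocount-smoothed column frequency to the background.

    The pseudocount is distributed according to the background frequencies
    (an assumed smoothing scheme, not taken from the paper).
    """
    smoothed = observed_count + pseudocount * BACKGROUND[residue]
    freq = smoothed / (column_total + pseudocount)
    return math.log2(freq / BACKGROUND[residue])


# A residue seen far more often than its background abundance scores > 0.
print(round(log_odds(12, 27, "W"), 2))
```

Rare residues such as Trp receive large positive scores when they are enriched at a motif position, which is why the choice of background distribution matters.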
Table 3. Validation of xLIR- and cLIR-motif-based predictors
| | xLIR | cLIR | xLIR + A | cLIR + A | xLIR + A + P13 | xLIR + A\|P13 |
|---|---|---|---|---|---|---|
| True positives | 27 | 11 | 17 | 8 | 15 | 26 |
| True negatives | 0 | 18 | 16 | 19 | 18 | 11 |
| False positives | 20 | 2 | 4 | 1 | 2 | 9 |
| False negatives | 0 | 16 | 10 | 19 | 12 | 1 |
| ACC (%) | 57.4 | 61.7 | 70.2 | 57.4 | 70.2 | |
| Sens (%) | | 40.7 | 63.0 | 29.6 | 55.6 | 96.3 |
| Spec (%) | 0.0 | 90.0 | 80.0 | | 90.0 | 55.0 |
| BACC (%) | 50.0 | 65.4 | 71.5 | 62.3 | 72.8 | |
Different schemes are validated for the prediction of functional LIR-motifs on the set of 26 proteins with validated LIRs described by Alemu and colleagues. xLIR and cLIR are based simply on the detection of the xLIR- and cLIR-motifs, respectively, whereas xLIR+A/cLIR+A require that a functional motif should overlap with an anchor as predicted by the ANCHOR tool. The 2 rightmost columns correspond to xLIR-motifs that overlap with an anchor and have a PSSM score > 13 (xLIR + A + P13) and xLIR-motifs that either overlap with an anchor or have a PSSM score > 13 (xLIR + A|P13). ACC, accuracy (%); Sens, sensitivity (%); Spec, specificity (%); BACC, balanced accuracy (%). For each validation metric the highest recorded value is depicted in bold typeface.
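The four metrics follow directly from the count rows of Table 3. The sketch below reads those rows as true positives, true negatives, false positives, and false negatives; that reading is inferred from the column totals (27 verified and 20 unverified motifs per scheme) and is not stated in the caption. It can also be used to recompute cells that are missing from the table as extracted here.

```python
def metrics(tp, tn, fp, fn):
    """Return ACC, Sens, Spec and BACC as percentages (unrounded)."""
    sens = 100.0 * tp / (tp + fn)
    spec = 100.0 * tn / (tn + fp)
    acc = 100.0 * (tp + tn) / (tp + tn + fp + fn)
    return {"ACC": acc, "Sens": sens, "Spec": spec, "BACC": (sens + spec) / 2}


# (TP, TN, FP, FN) per scheme, copied from the four count rows of Table 3.
COUNTS = {
    "xLIR": (27, 0, 20, 0),
    "cLIR": (11, 18, 2, 16),
    "xLIR + A": (17, 16, 4, 10),
    "cLIR + A": (8, 19, 1, 19),
    "xLIR + A + P13": (15, 18, 2, 12),
    "xLIR + A|P13": (26, 11, 9, 1),
}

for scheme, counts in COUNTS.items():
    row = metrics(*counts)
    print(scheme, {name: round(value, 1) for name, value in row.items()})
```

The printed BACC values may differ from the table in the last digit, because the tabulated values appear to be averaged from already rounded Sens and Spec figures.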

Figure 1. PSSM score distributions for different classes of hexapeptides. Box-plot representation of PSSM score distributions for xLIR-motifs in the 26 sequences of LIRCPs (verified and unverified; left and middle, respectively) and the remaining hexapeptides (“background”; right), obtained by sliding the PSSM along the set of 26 sequences reported by Alemu and colleagues. The differences indicated here suggest that PSSMs may be able to reliably discriminate between functional and nonfunctional xLIRs. In particular, a Wilcoxon rank sum test with continuity correction demonstrates significant differences between both verified and unverified xLIRs compared with background (P < 2.2 × 10^−16 and 1.2 × 10^−14, respectively) and between verified and unverified xLIRs (P = 6.0 × 10^−6).
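The verified versus unverified comparison can be approximated from the PSSM scores listed in Table 1. The sketch below implements a two-sided rank sum test using the normal approximation with tie and continuity corrections, in plain Python. It assumes that the Alemu set corresponds to the Table 1 rows from ATG13_HUMAN through NBR1_HUMAN, which gives 27 verified and 20 unverified xLIR scores; the function name is hypothetical, and the original analysis may have used different software settings.

```python
import math
from collections import Counter

# PSSM scores transcribed from Table 1 (rows ATG13_HUMAN to NBR1_HUMAN).
VERIFIED = [20, 14, 17, 12, 18, 22, 20, 26, 18, 16, 15, 17, 20, 17,
            24, 21, 20, 24, 20, 11, 21, 19, 17, 22, 18, 8, 17]
UNVERIFIED = [11, 12, 10, 10, 17, 10, 12, 14, 14, 11,
              13, 10, 14, 15, 8, 17, 11, 16, 15, 10]


def rank_sum_test(x, y):
    """Two-sided rank sum test (normal approximation).

    Returns (U, p), where U counts pairs with x > y and ties count as 0.5.
    """
    n1, n2 = len(x), len(y)
    n = n1 + n2
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    # Tie correction: sum of t^3 - t over groups of equal pooled values.
    ties = sum(t ** 3 - t for t in Counter(list(x) + list(y)).values())
    variance = n1 * n2 / 12.0 * ((n + 1) - ties / (n * (n - 1)))
    # Continuity correction: shrink the deviation from the mean by 0.5.
    deviation = max(abs(u - n1 * n2 / 2.0) - 0.5, 0.0)
    z = deviation / math.sqrt(variance)
    return u, math.erfc(z / math.sqrt(2.0))


u_stat, p_value = rank_sum_test(VERIFIED, UNVERIFIED)
print(f"U = {u_stat}, p = {p_value:.2e}")
```

With these transcribed scores the P value comes out on the order of 10^−6, the same order of magnitude as the value reported in the caption.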
Table 4. Validation of the PSSM method as a predictor of LIR-motifs
| PSSM score cutoff | Verified xLIRs above cutoff | Unverified xLIRs above cutoff | Background above cutoff (randomized) | ACC (%) | Sens (%) | Spec (%) | BACC (%) |
|---|---|---|---|---|---|---|---|
| 9 | 26 | 19 | 93 (85) | 57.4 | | 5.0 | 50.7 |
| 10 | 26 | 14 | 63 (63) | 68.1 | | 30.0 | 63.2 |
| 11 | 25 | 11 | 47 (49) | 72.3 | 92.6 | 45.0 | 68.8 |
| 12 | 24 | 9 | 28 (32) | 74.5 | 88.9 | 55.0 | 72.0 |
| 13 | 24 | 8 | 17 (25) | 76.6 | 88.9 | 60.0 | 74.5 |
| 14 | 23 | 5 | 13 (16) | 80.9 | 85.2 | 75.0 | 80.1 |
| 15 | 22 | 3 | 10 (14) | | 81.5 | 85.0 | 83.3 |
| 16 | 21 | 2 | 4 (11) | | 77.8 | 90.0 | |
| 17 | 16 | 0 | 2 (7) | 76.6 | 59.3 | | 79.7 |
| 18 | 13 | 0 | 0 (5) | 70.2 | 48.2 | | 74.1 |
We report the number of hexapeptides with a PSSM score above different threshold values. Peptides from the background data set scoring above the threshold would be regarded as false positives if there were no restriction to comply with the xLIR-motif. Results for the randomized versions of the 26 verified LIRCPs are displayed in parentheses next to “background” data. ACC, accuracy (%); Sens, sensitivity (%); Spec, specificity (%); BACC, balanced accuracy (%). For each validation metric the highest recorded value is depicted in bold typeface.
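The verified and unverified counts in Table 4 can be recovered by sweeping a cutoff over the PSSM scores in Table 1. The sketch below assumes that the relevant rows are ATG13_HUMAN through NBR1_HUMAN and that "above" means strictly greater than the cutoff; both assumptions are consistent with the tabulated counts but are not stated explicitly. The function name is hypothetical.

```python
# PSSM scores transcribed from Table 1 (rows ATG13_HUMAN to NBR1_HUMAN).
VERIFIED = [20, 14, 17, 12, 18, 22, 20, 26, 18, 16, 15, 17, 20, 17,
            24, 21, 20, 24, 20, 11, 21, 19, 17, 22, 18, 8, 17]
UNVERIFIED = [11, 12, 10, 10, 17, 10, 12, 14, 14, 11,
              13, 10, 14, 15, 8, 17, 11, 16, 15, 10]


def sweep(verified, unverified, cutoffs):
    """For each cutoff return (cutoff, verified above, unverified above, Sens, Spec)."""
    rows = []
    for cutoff in cutoffs:
        tp = sum(score > cutoff for score in verified)    # verified kept
        fp = sum(score > cutoff for score in unverified)  # unverified kept
        sens = 100.0 * tp / len(verified)
        spec = 100.0 * (len(unverified) - fp) / len(unverified)
        rows.append((cutoff, tp, fp, sens, spec))
    return rows


table = sweep(VERIFIED, UNVERIFIED, range(9, 19))
for cutoff, tp, fp, sens, spec in table:
    print(f"{cutoff:>2}  {tp:>2}  {fp:>2}  Sens={sens:5.1f}  Spec={spec:5.1f}")
```

Raising the cutoff trades sensitivity for specificity: no unverified xLIR scores above 17, but at that point only 16 of the 27 verified motifs are retained.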

Figure 2. iLIR server user interface. A simple user interface enables sequence data entry in FASTA format, either by copy-pasting data into the respective text area or by uploading a FASTA-formatted text file. Currently, a single sequence can be processed per run, and no user-defined parameters are necessary or supported. Pre-run examples of known LIRCP sequences help users get accustomed to the iLIR output format.

Figure 3. iLIR results page. The output page for human SQSTM1 (UniProt Accession: Q13501) is displayed. A graphical depiction of identified domains (top) is accompanied by detailed tables (bottom). Some domains/features are hidden to keep the graphic uncluttered but are present in the tables. Moving the mouse over any domain/feature on the graphic opens a pop-up tip (blue panel, top right) with further information (in this case, details regarding the name, borders, and origin database of the ubiquitin associated [UBA] domain). Notice that the tables containing anchors and SMART-derived domains may be shown or hidden according to the user’s preference.