Katharina Janek, Agathe Niewienda, Johannes Wöstemeyer, Jürgen Voigt.
Abstract
The data provide information in support of the research article "The cleavage specificity of the aspartic protease of cocoa beans involved in the generation of the cocoa-specific aroma precursors" (Janek et al., 2016) [1]. Three different protein substrates were partially digested with either the aspartic protease isolated from cocoa beans or commercial pepsin. The resulting peptide fragments were analyzed by matrix-assisted laser desorption/ionization time-of-flight tandem mass spectrometry (MALDI-TOF/TOF-MS/MS) and identified using the MASCOT server. The N- and C-terminal ends of the peptide fragments were compared with the amino acid sequences of the substrate proteins to identify the corresponding in-vitro cleavage sites. The same procedure was applied to the published amino acid sequences of oligopeptides isolated from fermented cocoa beans to identify the cleavage sites used by the cocoa aspartic protease during cocoa fermentation.
Keywords: Aspartic protease; Cleavage sites; Cocoa; In-vitro proteolysis; Mass spectrometry; Peptides
Year: 2016 PMID: 27508221 PMCID: PMC4950170 DOI: 10.1016/j.dib.2016.06.021
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
Specific and common cleavage sites of cocoa aspartic protease and pepsin in different protein substrates.
| EWQQ|VLNV | 7–14 | DGEW|QQVL | 5–12 | GEWQ|QVLN | 6–13 | |
| WQQV|LNVW | 8–15 | QQVL|NVWG | 9–16 | VLNV|WGKV | 11–18 | |
| FDKF|KHLK | 44–51 | LNVW|GKVE | 12–19 | NVWG|KVEA | 13–20 | |
| LKTE|AEMK | 50–57 | HGQE|VLIR | 25–32 | GKVE|ADIA | 16–23 | |
| EDLK|KHGT | 60–67 | GQEV|LIRL | 26–33 | KVEA|DIAG | 17–24 | |
| AIIH|VLHS | 111–118 | QEVL|IRLF | 27–34 | VEAD|IAGH | 18–25 | |
| IHVL|HSKH | 113–120 | LIRL|FTGH | 30–37 | AGHG|QEVL | 23–30 | |
| HVLH|SKHP | 114–121 | TVVL|TALG | 67–74 | GHGQ|EVLI | 24–31 | |
| VLHS|KHPG | 115–122 | PIKY|LEFI | 101–108 | EVLI|RLFT | 28–35 | |
| HPGD|FGAD | 120–127 | KYLE|FISD | 103–110 | PETL|EKFD | 38–45 | |
| FRND|IAAK | 139–146 | YLEF|ISDA | 104–111 | HLKT|EAEM | 49–56 | |
| AKYK|ELGF | 145–152 | FISD|AIIH | 107–114 | KTEA|EMKA | 51–58 | |
| YKEL|GFQG | 147–154 | ISDA|IIHV | 108–115 | EAEM|KASE | 53–60 | |
| MTKA|LELF | 132–139 | GGIL|KKKG | 74–81 | |||
| ALEL|FRND | 135–142 | EAEL|KPLA | 84–91 | |||
| KYKE|LGFQ | 146–153 | PGDF|GADA | 121–128 | |||
| QGAM|TKAL | 129–136 | |||||
| GAMT|KALE | 130–137 | |||||
| TKAL|ELFR | 133–140 | |||||
| KALE|LFRN | 134–141 | |||||
| LELF|RNDI | 136–143 | |||||
| AAKY|KELG | 144–151 | |||||
| ELGF|QG−− | 149–154 | |||||
| GGLA|LGRA | 57–64 | VANA|ANSP | 23–30 | GRAT|GQSC | 62–69 | |
| GLAL|GRAT | 58–65 | YYVL|SSIS | 45–52 | CPEI|VVQR | 69–76 | |
| ATGQ|SCPE | 64–71 | EIVV|QRRS | 71–78 | VRVS|TDVN | 98–105 | |
| GKWW|VTTD | 132–139 | IVVQ|RRSD | 72–79 | NIEF|VPIR | 105–112 | |
| GYKF|RFCP | 163–170 | PVIF|SNAD | 85–92 | PIRD|RLCS | 110–117 | |
| KFRF|CPSV | 165–172 | VIFS|NADS | 86–93 | TSTV|WRLD | 118–125 | |
| AGKW|WVTT | 131–138 | AGVL|GYKF | 159–166 | |||
| PNTL|CSWF | 147–154 | SVCD|SCTT | 171–178 | |||
| TLCS|WFKI | 149–156 | SDDD|GQIR | 187–194 | |||
| LCSW|FKIE | 150–157 | IRLA|LSDN | 193–200 | |||
| CSWF|KIEK | 151–158 | RLAL|SDNE | 194–201 | |||
| QIRL|ALSD | 192–199 | |||||
| ASKT|IKQV | 209–216 | |||||
| NDYR|LAMF | 50–57 | PKRR|SFQT | 17–24 | RSEE|EEGQ | 1–8 | |
| ENKE|SYNV | 91–98 | RRSF|QTRF | 19–26 | PYYF|PKRR | 13–20 | |
| TVYV|VSQD | 111–118 | EGNF|KILQ | 30–37 | YYFP|KRRS | 14–21 | |
| GMFR|KAKP | 190–197 | FKIL|QRFA | 33–40 | YFPK|RRSF | 15–22 | |
| KAKP|EQIR | 194–201 | LQRF|AENS | 36–43 | RSFQ|TRFR | 20–27 | |
| AKPE|QIRA | 195–202 | KGIN|DYRL | 47–54 | FQTR|FRDE | 22–29 | |
| KPEQ|IRAI | 196–203 | GIND|YRLA | 48–55 | QTRF|RDEE | 23–30 | |
| ERLA|INLL | 216–223 | DYRL|AMFE | 51–58 | KILQ|RFAE | 34–41 | |
| FKLN|QGAI | 257–264 | RLAM|FEAN | 53–60 | ILQR|FAEN | 35–42 | |
| VPHY|NSKA | 266–273 | CDAE|AIYF | 70–77 | NPNT|FILP | 60–67 | |
| GYAQ|MACP | 284–291 | EAIY|FVTN | 73–80 | DAEA|IYFV | 71–78 | |
| VTFF|ASKD | 343–350 | TITF|VTHE | 84–91 | AEAI|YFVT | 72–79 | |
| LVDN|IFNN | 395–402 | TVVS|VPAG | 102–109 | AIYF|VTNG | 74–81 | |
| SVPA|GSTV | 105–112 | GTIT|FVTH | 83–90 | |||
| STVY|VVSQ | 110–117 | VTHE|NKES | 88–95 | |||
| TIAV|LALP | 124–131 | KESY|NVQR | 93–100 | |||
| VLAL|PVNS | 127–134 | ESYN|VQRG | 94–101 | |||
| KYEL|FFPA | 137–144 | YNVQ|RGTV | 96–103 | |||
| ELFF|PAGN | 139–146 | VQRG|TVVS | 98–105 | |||
| NKPE|SYYG | 147–154 | RGTV|VSVP | 100–107 | |||
| YGAF|SYEV | 153–160 | GTVV|SVPA | 101–108 | |||
| YEVL|ETVF | 158–165 | VVSV|PAGS | 103–110 | |||
| REKL|EEIL | 169–176 | AGST|VYVV | 108–115 | |||
| KLEE|ILEE | 171–178 | GSTV|YVVS | 109–116 | |||
| EEIL|EEQR | 173–180 | LTIA|VLAL | 123–130 | |||
| QIRA|ISQQ | 199–206 | IAVL|ALPV | 125–132 | |||
| GERL|AINL | 215–222 | PGKY|ELFF | 135–142 | |||
| AINL|LSQS | 219–226 | GKYE|LFFP | 136–143 | |||
| NGRF|FEAC | 233–240 | YELF|FPAG | 138–145 | |||
| AVSA|FKLN | 253–260 | PESY|YGAF | 149–156 | |||
| NQGA|IFVP | 260–267 | YYGA|FSYE | 152–159 | |||
| KATF|VVFV | 272–279 | GAFS|YEVL | 154–161 | |||
| SGRQ|DRRE | 302–309 | AFSY|EVLE | 155–162 | |||
| GRQD|RREQ | 303–310 | FSYE|VLET | 156–163 | |||
| RQDR|REQE | 304–311 | EVLE|TVFN | 159–166 | |||
| EETF|GEFQ | 316–323 | ETVF|NTQR | 162–169 | |||
| TFGE|FQQV | 318–325 | QQGM|FRKA | 188–195 | |||
| FGEF|QQVK | 319–326 | QGMF|RKAK | 189–196 | |||
| GDVF|VAPA | 332–339 | LAIN|LLSQ | 218–225 | |||
| AVTF|FASK | 342–349 | INLL|SQSP | 220–227 | |||
| AVAF|GLNA | 355–362 | GRFF|EACP | 234–241 | |||
| QRIF|LAGK | 366–373 | FSQF|QNMD | 244–251 | |||
| KKNL|VRQM | 373–380 | VSAF|KLNQ | 254–261 | |||
| EAKE|LSFG | 383–390 | AFKL|NQGA | 256–263 | |||
| FSKL|VDNI | 392–399 | GAIF|VPHY | 262–269 | |||
| ESYF|MSFS | 405–412 | FVVF|VTDG | 275–282 | |||
| CPHL|SRQS | 290–297 | |||||
| SRQS|QGSQ | 294–301 | |||||
| RQSQ|GSQS | 295–302 | |||||
| SQGS|QSGR | 297–304 | |||||
| QGSQ|SGRQ | 298–305 | |||||
| GSQS|GRQD | 299–306 | |||||
| SQSG|RQDR | 300–307 | |||||
| EEET|FGEF | 315–322 | |||||
| PGDV|FVAP | 331–338 | |||||
| PLNA|VAFG | 352–359 | |||||
| NAVA|FGLN | 354–361 | |||||
| AFGL|NAQN | 357–364 | |||||
| FGLN|AQNN | 358–365 | |||||
| NNQR|IFLA | 364–371 | |||||
| RIFL|AGKK | 367–374 | |||||
| IFLA|GKKN | 368–375 | |||||
| FLAG|KKNL | 369–376 | |||||
| VRQM|DSEA | 377–384 | |||||
| RQMD|SEAK | 378–385 | |||||
| QMDS|EAKE | 379–386 | |||||
| MDSE|AKEL | 380–387 | |||||
| GVPS|KLVD | 390–397 | |||||
| DNIF|NNPD | 397–404 | |||||
| NNPD|ESYF | 401–408 | |||||
| PDES|YFMS | 403–410 | |||||
| SQQR|QRGD | 412–419 | |||||
| QQRQ|RGDE | 413–420 | |||||
Octapeptide sequences around the cleavage sites of the cocoa aspartic protease and pepsin, detected by partial proteolysis of myoglobin, the cocoa 21-kDa seed protein, and the cocoa vicilin-class (7S) globulin. Sites cleaved exclusively by the cocoa aspartic protease, sites cleaved exclusively by pepsin, and sites cleaved by both proteases (= unspecific cleavage sites) are listed separately.
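The step from an identified peptide to the octapeptides around its two termini can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' software: the function name `cleavage_windows`, the example peptide `QQVLNVW`, and the use of `-` for positions beyond the available sequence are hypothetical. The 16-residue stretch is myoglobin residues 5–20 as assembled from the overlapping octapeptides listed above.

```python
def cleavage_windows(fragment, fragment_start, peptide):
    """Return the P4-P4' octapeptides flanking both ends of `peptide`.

    fragment        -- stretch of the substrate sequence (one-letter code)
    fragment_start  -- residue number of fragment[0] in the full protein
    peptide         -- identified peptide; must occur within `fragment`
    """
    idx = fragment.find(peptide)
    if idx < 0:
        raise ValueError("peptide not found in substrate fragment")

    windows = {}
    # A cut sits between index cut-1 (P1) and index cut (P1').
    for end, cut in (("N-terminal", idx), ("C-terminal", idx + len(peptide))):
        lo = max(0, cut - 4)
        hi = min(len(fragment), cut + 4)
        left = fragment[lo:cut].rjust(4, "-")    # pad if fewer than 4 residues
        right = fragment[cut:hi].ljust(4, "-")
        windows[end] = (f"{left}|{right}", fragment_start + lo, fragment_start + hi - 1)
    return windows


# Myoglobin residues 5-20, assembled from the octapeptides listed above.
myoglobin_5_20 = "DGEWQQVLNVWGKVEA"
# Hypothetical peptide spanning residues 9-15.
result = cleavage_windows(myoglobin_5_20, 5, "QQVLNVW")
print(result["N-terminal"])  # ('DGEW|QQVL', 5, 12)
print(result["C-terminal"])  # ('LNVW|GKVE', 12, 19)
```

Both windows produced for this hypothetical peptide coincide with octapeptides and positions that appear in the table, which is the kind of agreement the comparison with the substrate sequence relies on.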
Putative cleavage sites of the cocoa aspartic protease predicted from oligopeptides isolated from fermented cocoa beans.
| VANA|ANSP | 23–30 | N-terminal | yes | ||
| SPVL|DTDG | 29–36 | C-terminal | no | ||
| YYVL|SSIS | 45–52 | N-terminal | yes | ||
| SSIS|GAGG | 49–56 | N-terminal | no | ||
| GGGL|ALGR | 56–63 | C-terminal | no | ||
| IVVQ|RRSD | 72–79 | N-terminal | yes | ||
| SDLD|NGTP | 78–85 | N-terminal | no | ||
| PVIF|SNAD | 85–92 | N- and C-terminal | no | ||
| FSNA|DSKD | 88–95 | N-terminal | no | ||
| DVVR|VSTD | 96–103 | N-terminal | no | ||
| TDVN|IEFV | 102–109 | N- and C-terminal | no | ||
| NIEF|VPIR | 105–112 | C-terminal | no | ||
| CSTS|TVWR | 116–123 | N-terminal | no | ||
| STVW|RLDN | 119–126 | N-terminal | no | ||
| WRLD|NYDN | 122–129 | C-terminal | no | ||
| LALS|DNEW | 195–202 | N-terminal | no | ||
| AWMF|KKAS | 203–210 | C-terminal | no | ||
| EGQQ|RNNP | 6–13 | N- and C-terminal | no | ||
| GQQR|NNPY | 7–14 | N-terminal | no | ||
| QQRN|NPYY | 8–15 | N-terminal | no | ||
| QRNN|PYYF | 9–16 | N-terminal | no | ||
| PYYF|PKRR | 13–20 | C-terminal+CP | no | ||
| YFPK|RRSF | 15–22 | N- and C-terminal | no | ||
| FPKR|RSFQ | 16–23 | N-terminal | no | ||
| RRSF|QTRF | 19–26 | C-terminal | yes | ||
| RSFQ|TRFR | 20–27 | N-terminal | no | ||
| TRFR|DEEG | 24–31 | N-terminal | no | ||
| RDEE|GNFK | 27–34 | N- and C-terminal | no | ||
| EEGN|FKIL | 29–36 | N-terminal | no | ||
| EGNF|KILQ | 30–37 | N- and C-terminal | yes | ||
| FKIL|QRFA | 33–40 | C-terminal | yes | ||
| KILQ|RFAE | 34–41 | C-terminal | no | ||
| SPPL|KGIN | 43–50 | N-terminal | no | ||
| KGIN|DYRL | 47–54 | C-terminal | yes | ||
| INDY|RLAM | 49–56 | N-terminal | no | ||
| RLAM|FEAN | 53–60 | C-terminal+CP | yes | ||
| NPNT|FILP | 60–67 | N-terminal | no | ||
| ILPH|HCDA | 65–72 | C-terminal | no | ||
| YFVT|NGKG | 76–83 | N-terminal | no | ||
| VTNG|KGTI | 78–85 | N-terminal | no | ||
| TITF|VTHE | 84–91 | C-terminal±CP | yes | ||
| THEN|KESY | 89–96 | N-terminal | no | ||
| YNVQ|RGTV | 96–103 | N- and C-terminal | no | ||
| TVVS|VPAG | 102–109 | C-terminal | yes | ||
| VLAL|PVNS | 127–134 | N-terminal | yes | ||
| LPVN|SPGK | 130–137 | N-terminal | no | ||
| PGKY|ELFF | 135–142 | C-terminal | no | ||
| FPAG|NNKP | 142–149 | N-terminal | no | ||
| AGNN|KPES | 144–151 | N-terminal | no | ||
| NKPE|SYYG | 147–154 | C-terminal | no | ||
| KPES|YYGA | 148–155 | N- and C-terminal | no | ||
| FSYE|VLET | 156–163 | N-terminal | no | ||
| YEVL|ETVF | 158–165 | C-terminal | yes | ||
| EVLE|TVFN | 159–166 | C-terminal | no | ||
| PRHR|GGER | 210–217 | N-terminal | no | ||
| ERLA|INLL | 216–223 | N-terminal | yes | ||
| AINL|LSQS | 219–226 | C-terminal+CP | yes | ||
| INLL|SQSP | 220–227 | C-terminal | no | ||
| NLLS|QSPV | 221–228 | C-terminal | no | ||
| VAVS|AFKL | 252–259 | N-terminal | no | ||
| AVSA|FKLN | 253–260 | N-terminal | yes | ||
| FKLN|QGAI | 257–264 | C-terminal+CP | yes | ||
| KLNQ|GAIF | 258–265 | N- and C-terminal | no | ||
| LNQG|AIFV | 259–266 | N-terminal | no | ||
| NQGA|IFVP | 260–267 | N- and C-terminal | yes | ||
| QGAI|FVPH | 261–268 | N-terminal | no | ||
| GAIF|VPHY | 262–269 | N-terminal | no | ||
| VPHY|NSKA | 266–273 | C-terminal+CP | yes | ||
| PHYN|SKAT | 267–274 | C-terminal | no | ||
| HYNS|KATF | 268–275 | C-terminal | no | ||
| KATF|VVFV | 272–279 | C-terminal+CP | yes | ||
| SQSG|RQDR | 300–307 | N-terminal | no | ||
| EQEE|ESEE | 309–316 | C-terminal | no | ||
| GEFQ|QVKA | 320–327 | N-terminal | no | ||
| QQVK|APLS | 323–330 | N-terminal | no | ||
| KAPL|SPGD | 326–333 | N- and C-terminal | no | ||
| APLS|PGDV | 327–334 | N-terminal | no | ||
| PLSP|GDVF | 328–335 | N-terminal | no | ||
| GDVF|VAPA | 332–339 | N- and C-terminal | yes | ||
| VFVA|PAGH | 334–341 | N-terminal | no | ||
| APAG|HAVT | 337–344 | N-terminal | no | ||
| AVTF|FASK | 342–349 | C-terminal | yes | ||
| VTFF|ASKD | 343–350 | N- and C-terminal | yes | ||
| FFAS|KDQP | 345–352 | N-terminal | no | ||
| FASK|DQPL | 346–353 | N-terminal | no | ||
| AVAF|GLNA | 355–362 | C-terminal+CP | yes | ||
| LNAQ|NNQR | 360–367 | N-terminal | no | ||
| NAQN|NQRI | 361–368 | N-terminal | no | ||
| AQNN|QRIF | 362–369 | N-terminal | no | ||
| QNNQ|RIFL | 363–370 | N-terminal | no | ||
| QRIF|LAGK | 366–373 | C-terminal | no | ||
| GKKN|LVRQ | 372–379 | N-terminal | no | ||
| NLVR|QMDS | 375–382 | C-terminal | no | ||
| AKEL|SFGV | 384–391 | N-terminal | no | ||
| KELS|FGVP | 385–392 | N-terminal | no | ||
| PSKL|VDNI | 392–399 | C-terminal+CP | no | ||
| NPDE|SYFM | 402–409 | N-terminal | no | ||
| ESYF|MSFS | 405–412 | C-terminal | no |
Octapeptide sequence (P4–P4′) around the putative cleavage site.
Position of the octapeptide in the amino acid sequence of the degraded seed protein.
Localization of the cleavage site at the N-terminal or C-terminal end of the oligopeptide from which the cleavage site was predicted. Because the peptides formed during cocoa fermentation are modified by a carboxypeptidase [2], [5], the N-terminal cleavage sites are more reliable than the C-terminal ones. For the C-terminal end of an oligopeptide, a cleavage site located further downstream was predicted whenever the resulting peptide fragment could be trimmed by the cocoa carboxypeptidase [6] to the finally detected oligopeptide (indicated by “+CP”).
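The “+CP” reasoning can be illustrated with a small sketch. It assumes, purely for illustration, that carboxypeptidase trimming can be modelled as the removal of C-terminal residues drawn from a fixed set. The set `REMOVABLE`, the function name `downstream_site`, the toy sequence, and the candidate sites are all hypothetical and are not taken from the data or from the known specificity of the cocoa enzyme.

```python
# Hypothetical set of residues the carboxypeptidase is assumed to release;
# the real enzyme specificity is not modelled here.
REMOVABLE = set("FLYWAVIM")


def downstream_site(sequence, peptide_end, candidate_cuts, removable=REMOVABLE):
    """Return the nearest downstream cut that explains a trimmed C-terminus.

    sequence       -- substrate sequence (0-based string)
    peptide_end    -- index one past the last residue of the detected peptide
    candidate_cuts -- indices of the P1' residues of possible cleavage sites
    A cut qualifies only if every residue between the detected C-terminus
    and the cut could have been removed by the carboxypeptidase.
    """
    later = [c for c in candidate_cuts if c > peptide_end]
    if not later:
        return None
    # Only the nearest cut needs checking: any farther cut would require
    # removing the same residues plus additional ones.
    cut = min(later)
    trimmed = sequence[peptide_end:cut]
    return cut if all(res in removable for res in trimmed) else None


toy = "TGNDKEQAFLSQ"  # hypothetical sequence
# Detected peptide ends after 'A' (index 7); candidate cuts lie downstream.
print(downstream_site(toy, 8, [10]))  # 10: 'FL' could have been trimmed
print(downstream_site(toy, 8, [11]))  # None: 'S' is not in REMOVABLE
```

A real analysis would replace `REMOVABLE` with the experimentally determined carboxypeptidase specificity [6].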
Abundance of different amino acid residues in the P4 to P4′ positions of the predicted and experimentally detected cleavage sites of the cocoa aspartic protease.
| 1.02 | 0.93 | 1.02 | 0.93 | 0.00 | 1.88 | 1.02 | 4.67 | |
| 4.08 | 1.88 | 4.08 | 3.76 | |||||
| 5.10 | 4.67 | 3.06 | 4.08 | 3.76 | 3.06 | 3.76 | ||
| 4.08 | 5.61 | 5.54 | ||||||
| 4.08 | 2.80 | 4.08 | 1.02 | 0.00 | ||||
| 0.00 | 0.93 | 0.00 | 0.93 | 1.02 | 0.00 | 1.02 | 0.93 | |
| 5.61 | 0.00 | 4.67 | ||||||
| 4.67 | ||||||||
| 3.06 | 3.76 | 5.10 | 0.00 | |||||
| 1.02 | 1.88 | 0.00 | 0.93 | 0.00 | 0.93 | 0.00 | 0.00 | |
| 5.10 | 3.06 | 3.76 | 4.08 | 5.54 | 2.04 | 0.93 | ||
| 2.80 | 4.67 | 5.10 | 3.76 | 3.76 | ||||
| 4.67 | 4.08 | 3.76 | 5.54 | |||||
| 3.76 | 2.80 | 4.67 | 2.80 | |||||
| 4.08 | 5.54 | |||||||
| 1.02 | 1.88 | 4.08 | 4.67 | 2.04 | 2.80 | 2.04 | 4.67 | |
| 1.02 | 2.80 | 2.04 | 0.93 | 2.04 | 1.88 | 1.02 | 1.88 | |
| 4.08 | 3.74 | 4.67 | 1.02 | 3.76 | ||||
| 5.10 | 4.08 | 3.06 | 1.88 | |||||
| 3.74 | 2.80 | 5.10 | 2.80 | 1.02 | 0.93 | |||
| 0.00 | 1.88 | 0.00 | 0.00 | 1.02 | 1.88 | 1.02 | 0.93 | |
| 4.08 | ||||||||
| 1.02 | 0.93 | 3.76 | 3.06 | 1.88 | 4.08 | 0.00 | ||
| 3.06 | 5.10 | 5.66 | ||||||
| 3.06 | 4.08 | 5.66 | 4.08 | 4.67 | ||||
| 1.02 | 1.88 | 1.02 | 0.93 | 0.00 | 1.88 | 2.04 | 0.93 | |
| 5.66 | 5.10 | 5.66 | ||||||
| 5.66 | 3.76 | 4.67 | ||||||
| 0.00 | 1.88 | 1.02 | 0.93 | 0.00 | 1.88 | 0.00 | 0.93 | |
| 3.06 | 0.93 | 4.08 | 4.67 | 5.10 | 3.76 | 3.06 | 3.76 | |
| 4.08 | 3.76 | 3.06 | 3.06 | 5.66 | ||||
| 2.80 | 3.76 | 4.08 | 4.67 | |||||
| 4.67 | 4.08 | 3.76 | 3.06 | |||||
| 1.88 | 3.06 | 0.93 | 1.88 | |||||
| 2.04 | 0.93 | 0.00 | 2.80 | 2.04 | 3.76 | 2.04 | 2.80 | |
| 3.74 | 5.10 | 5.10 | 2.80 | |||||
| 5.66 | 5.10 | 4.67 | 4.67 | 4.08 | ||||
| 5.10 | 1.88 | 1.88 | 5.10 | 3.76 | ||||
Amino acid positions around the cleavage sites.
Predicted from the N-terminal and C-terminal ends of oligopeptides isolated from fermented cocoa beans [3], [4].
Detected by in vitro digestion of three different protein substrates with the cocoa aspartic protease (compare Table 1).
Values are expressed in percent of all amino acids found in these positions. Values above 6% are marked in bold.
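The percentages described above can be computed from a list of octapeptides with a position-wise count. The sketch below is a minimal illustration, not the authors' procedure: the function name `position_frequencies` is an assumption, and the four example octapeptides are simply the first four entries of the first table, chosen only to make the arithmetic easy to follow.

```python
from collections import Counter

POSITIONS = ["P4", "P3", "P2", "P1", "P1'", "P2'", "P3'", "P4'"]


def position_frequencies(octapeptides):
    """Percent of each amino acid at P4..P4' over a list of 'XXXX|XXXX' strings."""
    counts = {pos: Counter() for pos in POSITIONS}
    for octa in octapeptides:
        left, right = octa.split("|")
        # Pad truncated windows so that residues stay aligned to the cut.
        residues = left.rjust(4, "-") + right.ljust(4, "-")
        for pos, res in zip(POSITIONS, residues):
            if res.isalpha():  # skip padding characters
                counts[pos][res] += 1

    freqs = {}
    for pos, counter in counts.items():
        total = sum(counter.values())
        freqs[pos] = {res: 100.0 * n / total for res, n in counter.items()} if total else {}
    return freqs


example = ["EWQQ|VLNV", "WQQV|LNVW", "FDKF|KHLK", "LKTE|AEMK"]
freqs = position_frequencies(example)
print(freqs["P4'"])  # {'V': 25.0, 'W': 25.0, 'K': 50.0}
print(freqs["P1"])   # {'Q': 25.0, 'V': 25.0, 'F': 25.0, 'E': 25.0}
```

Padding characters are excluded from the totals, so a truncated window at a protein terminus does not dilute the percentages at the positions it lacks.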