Sebastiana Angelaccio, Teresa Milano, Angela Tramonti, Martino Luigi Di Salvo, Roberto Contestabile, Stefano Pascarella.
Abstract
Detailed data from statistical analyses of the structural properties of the inter-domain linker peptides of the bacterial regulators of the MocR family are reported here. MocR regulators are a recently discovered subfamily of bacterial regulators possessing an N-terminal domain, 60 residues long on average, folded as the winged-helix-turn-helix architecture responsible for DNA recognition and binding, and a large C-terminal domain (350 residues on average) that belongs to the fold type-I pyridoxal 5'-phosphate (PLP) dependent enzymes such as aspartate aminotransferase. The data show the distribution of several structural characteristics of the linkers taken from bacterial species from five different phyla, namely Actinobacteria, Alpha-, Beta-, Gammaproteobacteria and Firmicutes. Interpretation and discussion of the reported data refer to the article "Structural properties of the linkers connecting the N- and C- terminal domains in the MocR bacterial transcriptional regulators" (T. Milano, S. Angelaccio, A. Tramonti, M. L. Di Salvo, R. Contestabile, S. Pascarella, 2016) [1].
Keywords: Dyad propensity; Flexibility; GabR; Hydrophobicity; Linker engineering; Linker length; Linker peptide; MocR regulators; PdxR; Residue propensity
Year: 2016 PMID: 27668276 PMCID: PMC5026710 DOI: 10.1016/j.dib.2016.08.064
Source DB: PubMed Journal: Data Brief ISSN: 2352-3409
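List of MocR regulators predicted to have linkers of length equal to or greater than 60 residues.
| Entry | Linker N-terminal position | Linker C-terminal position | Linker length |
| --- | --- | --- | --- |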
| A0A023C4T7_9PSED | 88 | 148 | 60 |
| A0A0B2AVS1_9ACTN | 85 | 145 | 60 |
| A0NP21_LABAI | 80 | 140 | 60 |
| I9W6R0_9RALS | 87 | 147 | 60 |
| W4CMK3_9BACL | 121 | 181 | 60 |
| A0A074LC92_PAEPO | 82 | 143 | 61 |
| I4N7I5_9PSED | 87 | 148 | 61 |
| A0A0D5NE20_9BACL | 85 | 147 | 62 |
| F8FPR4_PAEMK | 106 | 168 | 62 |
| G8QJ34_DECSP | 85 | 147 | 62 |
| V7DIJ8_9PSED | 88 | 150 | 62 |
| W4P2V0_9BURK | 87 | 149 | 62 |
| B9QZW6_LABAD | 80 | 143 | 63 |
| F3KUT6_9BURK | 118 | 181 | 63 |
| F7T5G0_9BURK | 85 | 148 | 63 |
| M2X958_9MICC | 80 | 143 | 63 |
| R9LS02_9BACL | 83 | 146 | 63 |
| S2WJB8_DELAC | 89 | 152 | 63 |
| A0A098SWK7_9PSED | 88 | 152 | 64 |
| A0A0J6J2M6_9PSED | 88 | 152 | 64 |
| A0A0F4KHT0_9ACTN | 101 | 166 | 65 |
| D5BN74_PUNMI | 82 | 147 | 65 |
| D7DQ74_METV0 | 90 | 156 | 66 |
| K0YXF4_9ACTN | 79 | 145 | 66 |
| A0A077LFC1_9PSED | 87 | 154 | 67 |
| A0A095YU49_9FIRM | 78 | 145 | 67 |
| H0BWG7_9BURK | 75 | 142 | 67 |
| A0A087DUC1_9BIFI | 78 | 146 | 68 |
| A0A090ZGE9_PAEMA | 83 | 152 | 69 |
| A0A0A6Q9N6_9BURK | 74 | 143 | 69 |
| F3JEN8_PSESX | 88 | 157 | 69 |
| W0HH53_PSECI | 88 | 157 | 69 |
| A0A0A6QBJ9_9BURK | 89 | 159 | 70 |
| A0A0B4DLS5_9MICC | 89 | 159 | 70 |
| A0A088Y9M0_BURPE | 88 | 159 | 71 |
| A0A0F4JB47_9ACTN | 62 | 135 | 73 |
| A0A069DE36_9BACL | 85 | 159 | 74 |
| A0A087EGV8_9BIFI | 105 | 181 | 76 |
| A0A089I7M0_9BACL | 82 | 158 | 76 |
| A8SVX0_9FIRM | 79 | 155 | 76 |
| A0A089N895_9BACL | 78 | 155 | 77 |
| A0A0F5JX35_9BURK | 84 | 161 | 77 |
| A0A0E4CZM5_9BACL | 90 | 168 | 78 |
| A0A061LXN0_9MICO | 84 | 163 | 79 |
| A0A0A8BLT7_9BURK | 89 | 168 | 79 |
| R6HHE8_9ACTN | 79 | 159 | 80 |
| X4ZGS7_9BACL | 84 | 164 | 80 |
| A0A089HPN9_PAEDU | 78 | 162 | 84 |
| D2PX75_KRIFD | 93 | 178 | 85 |
| D3F8U9_CONWI | 80 | 166 | 86 |
| A0A0A4HID4_9PSED | 88 | 179 | 91 |
| F2RK57_STRVP | 86 | 180 | 94 |
| C7MPD0_CRYCD | 79 | 174 | 95 |
| F4QXL0_BREDI | 83 | 179 | 96 |
| A0A087AB73_9BIFI | 78 | 175 | 97 |
| A0A087E7D4_9BIFI | 78 | 175 | 97 |
| A0A0B4DPH0_KOCRH | 85 | 183 | 98 |
| F2RA50_STRVP | 88 | 186 | 98 |
| V6KRX5_STRRC | 93 | 191 | 98 |
| M8D4I1_9BACL | 79 | 179 | 100 |
| A0A0A3JRX6_BURPE | 88 | 189 | 101 |
| M8DED6_9BACL | 80 | 183 | 103 |
| A0A087BLK1_BIFLN | 78 | 187 | 109 |
| A0A087CXD8_9BIFI | 78 | 187 | 109 |
| S6CDU1_9ACTN | 130 | 244 | 114 |
| A0A0A6SYE7_9BURK | 87 | 209 | 122 |
| F5LR05_9BACL | 82 | 209 | 127 |
| A0A089IZ38_PAEDU | 84 | 218 | 134 |
| A0A0B6S8F7_BURGL | 88 | 231 | 143 |
| A0A087A119_9BIFI | 78 | 222 | 144 |
| A0A089MC10_9BACL | 82 | 234 | 152 |
| A0A089KZI8_9BACL | 82 | 244 | 162 |
Linker N-terminal sequence position.
Linker C-terminal sequence position.
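In every row of the table, the last column equals the difference between the two listed positions (for example, 148 − 88 = 60). The following is a minimal Python sketch of that consistency check, assuming the rows are kept in the pipe-delimited layout shown above. The function names `parse_linker_row` and `length_is_consistent` are illustrative and do not come from the article.

```python
def parse_linker_row(row: str) -> tuple[str, int, int, int]:
    """Split one pipe-delimited table row into (entry, start, end, length)."""
    cells = [c.strip() for c in row.strip().strip("|").split("|")]
    return cells[0], int(cells[1]), int(cells[2]), int(cells[3])


def length_is_consistent(row: str) -> bool:
    """Return True when the reported length equals end - start."""
    _, start, end, length = parse_linker_row(row)
    return end - start == length


# Two rows copied from the table (first and last entries).
rows = [
    "| A0A023C4T7_9PSED | 88 | 148 | 60 |",
    "| A0A089KZI8_9BACL | 82 | 244 | 162 |",
]
all_consistent = all(length_is_consistent(r) for r in rows)
```

The same check can be run over the full table to catch transcription errors in any of the three numeric columns.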
Residue propensities in the linkers of length range 0–20.
a Amino acid one-letter code.
b Residue propensity; cells containing values ≥1.01 and ≤1.19 are shaded light grey and cells containing values ≥1.20 are shaded dark grey. In the latter case, numbers are in boldface.
c Number of residues in the sample.
Residue propensities in the linkers of length range 21–40.
a Amino acid one-letter code.
b Residue propensity; cells containing values ≥1.01 and ≤1.19 are shaded light grey and cells containing values ≥1.20 are shaded dark grey. In the latter case, numbers are in boldface.
c Number of residues in the sample.
Residue propensities in the linkers of length range 41–60.
a Amino acid one-letter code.
b Residue propensity; cells containing values ≥1.01 and ≤1.19 are shaded light grey and cells containing values ≥1.20 are shaded dark grey. In the latter case, numbers are in boldface.
c Number of residues in the sample.
Residue propensities in the linkers of length range 61–200.
a Amino acid one-letter code.
b Residue propensity; cells containing values ≥1.01 and ≤1.19 are shaded light grey and cells containing values ≥1.20 are shaded dark grey. In the latter case, numbers are in boldface.
c Number of residues in the sample.
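The propensity tables above do not restate how the values were computed. One common definition, assumed here and not confirmed by this data description, is the ratio of a residue's fractional frequency in the linker sample to its fractional frequency in a reference sample. The Python sketch below illustrates that assumed definition together with the shading classes given in footnote b. The function names and toy sequences are hypothetical.

```python
from collections import Counter


def residue_propensity(linkers: list[str], background: list[str]) -> dict[str, float]:
    """Ratio of each residue's frequency in `linkers` to that in `background`.

    Assumed definition (not stated in this data description): values above 1
    mean the residue is over-represented in the linker sample.
    """
    link_counts = Counter("".join(linkers))
    back_counts = Counter("".join(background))
    link_total = sum(link_counts.values())
    back_total = sum(back_counts.values())
    return {
        aa: (link_counts[aa] / link_total) / (back_counts[aa] / back_total)
        for aa in link_counts
        if back_counts[aa] > 0  # skip residues missing from the reference
    }


def shade(propensity: float) -> str:
    """Map a two-decimal propensity to the shading classes of footnote b."""
    if propensity >= 1.20:
        return "dark grey, bold"
    if 1.01 <= propensity <= 1.19:
        return "light grey"
    return "unshaded"


# Toy sequences, invented for illustration only.
toy = residue_propensity(["PPGS"], ["PGSAPGSA"])
```

In this toy case proline makes up half of the linker sample but a quarter of the reference, so its propensity is 2.0 and falls in the dark grey class.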
Average number of residue pairs in each data set.
| 53.5±93.1 | 9.2±17.6 | 29.2±53.8 | 10.0±16.4 | 5.0±8.5 | |
| 45.7±56.7 | 6.0±9.0 | 20.3±28.5 | 18.9±22.4 | 0.5±0.8 | |
| 57.1±78.2 | 3.2±5.1 | 25.5±35.1 | 25.1±34.8 | 3.0±5.8 | |
| 83.0±63.5 | 6.4±6.8 | 39.9±34.8 | 32.4±25.0 | 4.4±6.4 | |
| 82.0±81.9 | 8.7±9.4 | 50.8±54.1 | 20.9±20.6 | 1.5±3.5 | |
Fig. 1. Dipeptide propensity for the entire set of linkers. The vertical and horizontal sides of each matrix indicate the N- and C-side residue of each dyad, respectively. Cells containing propensity values ≥1.1 and ≤1.99, ≥2.0 and ≤3.99, or ≥4.0 are shaded very light, light or dark grey, respectively, and the numbers they contain are in boldface. A, B, C, D and E denote propensities for Actinobacteria, Alphaproteobacteria, Betaproteobacteria, Firmicutes and Gammaproteobacteria, respectively.
Fig. 2. Dipeptide propensity for the 0–20 residue length linker set. For interpretation, refer to the legend of Fig. 1.
Fig. 3. Dipeptide propensity for the 21–40 residue length linker set. For interpretation, refer to the legend of Fig. 1.
Fig. 4. Dipeptide propensity for the 41–60 residue length linker set. For interpretation, refer to the legend of Fig. 1.
Fig. 5. Dipeptide propensity for the 61–200 residue length linker set. For interpretation, refer to the legend of Fig. 1.
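The dyad matrices in Figs. 1–5 are indexed by the N-side and C-side residue of each adjacent pair. As a rough illustration, the Python sketch below counts overlapping adjacent pairs and forms a propensity as the ratio of dyad frequencies between a linker sample and a reference sample. That ratio is an assumed definition, not one given in this data description, and the function names and toy sequences are hypothetical. The shading classes follow the Fig. 1 legend.

```python
from collections import Counter


def dyad_counts(seqs: list[str]) -> Counter:
    """Count overlapping adjacent residue pairs; pairs never span two sequences."""
    counts = Counter()
    for seq in seqs:
        counts.update(zip(seq, seq[1:]))  # keys are (N-side, C-side) tuples
    return counts


def dyad_propensity(linkers: list[str], background: list[str]) -> dict[tuple[str, str], float]:
    """Assumed definition: dyad frequency in linkers over dyad frequency in background."""
    link, back = dyad_counts(linkers), dyad_counts(background)
    link_total, back_total = sum(link.values()), sum(back.values())
    return {
        pair: (n / link_total) / (back[pair] / back_total)
        for pair, n in link.items()
        if back[pair] > 0  # skip dyads missing from the reference
    }


def dyad_shade(value: float) -> str:
    """Shading classes from the Fig. 1 legend, for two-decimal values."""
    if value >= 4.0:
        return "dark grey"
    if value >= 2.0:
        return "light grey"
    if value >= 1.1:
        return "very light grey"
    return "unshaded"


# Toy sequences, invented for illustration only.
toy = dyad_propensity(["PPG"], ["PPGAPGAA"])
```

Here the pair (P, P) is one of two linker dyads but only one of seven reference dyads, giving a propensity of 3.5, which would be shaded light grey.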
Fraction of predicted secondary structure in linker regions.
| 0.14 | 0.02 | 0.86 | |
| 0.19 | 0.03 | 0.78 | |
| 0.30 | 0.01 | 0.69 | |
| 0.02 | 0.06 | 0.92 | |
| 0.26 | 0.02 | 0.72 | |
Fig. 6. Box plots of the distribution of the average linker flexibility (index #425 of Table 2 in [1] and code VINM940101 in AAindex [10]). The horizontal axis indicates the average flexibility distribution in the wHTH and AAT domains, in all linkers, and in linkers belonging to different length intervals: 0–20, 21–40, 41–60 and >60 residues. The vertical axis reports the flexibility scale (label AI stands for Average Index). A, B, and C denote Alphaproteobacteria, Betaproteobacteria, and Gammaproteobacteria, respectively.
Fig. 7. Box plots of the distribution of the average linker hydrophobicity (index #58 of Table 2 in [1] and code CIDH920105 in AAindex [10]). For interpretation of plots, refer to the Fig. 6 caption.
Fig. 8. Box plots of the distribution of the average linker propensity index (#491 of Table 2 in [1] and code GEOR03010 in AAindex [10]). For interpretation of plots, refer to the Fig. 6 caption.
Fig. 9. Box plots of the distribution of the average normalized β-turn propensity (index #37 of Table 2 in [1] and code CHOP780101 in AAindex [10]). For interpretation of plots, refer to the Fig. 6 caption.
Fig. 10. Box plots of the distribution of the average Chou–Fasman coil propensity (#24 of Table 2 in [1] and code CHAM830101 in AAindex [10]). For interpretation of plots, refer to the Fig. 6 caption.
Fig. 11. Box plots of the distribution of the average normalized α-helix propensity (index #38 of Table 2 in [1] and code CHOP780102 in AAindex [10]). A, B, C, D and E denote Actinobacteria, Alphaproteobacteria, Betaproteobacteria, Firmicutes and Gammaproteobacteria, respectively.
Fig. 12. Box plots of the distribution of the average normalized β-sheet propensity (index #39 of Table 2 in [1] and code CHOP780103 in AAindex [10]). Letter interpretation is as in the Fig. 11 caption.
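The box plots in Figs. 6–12 summarise a per-region average of an amino acid index. A minimal Python sketch of such an average is given below, assuming it is the arithmetic mean of the per-residue scale values over the region. The toy scale is hypothetical and does not contain real AAindex values; the function name `average_index` is likewise illustrative.

```python
def average_index(sequence: str, scale: dict[str, float]) -> float:
    """Mean per-residue scale value over a sequence region.

    Residues absent from `scale` (e.g. 'X') are skipped, which is an assumption.
    """
    values = [scale[aa] for aa in sequence if aa in scale]
    if not values:
        raise ValueError("no residue in the sequence is covered by the scale")
    return sum(values) / len(values)


# Hypothetical toy scale: these numbers are NOT real AAindex values.
toy_scale = {"G": 1.0, "P": 0.5, "A": 0.0}
linker_ai = average_index("GGPA", toy_scale)
```

Applying the same function to the wHTH domain, the AAT domain and each linker of a sequence set would yield the per-region values whose distributions a box plot displays.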
GabR and PdxR sequences retrieved from RegPrecise data bank.
| A0A098SFD5 | ||
| A0A0H3LKN1 | ||
| A0A0H2XDM4 | ||
| A0A0H2VKR4 | ||
| A0A0Q1AKJ7 | ||
Fig. 13. Histogram of the linker length distribution in the MocR subgroups GabR and PdxR. Horizontal axis labels indicate length intervals: 20 corresponds to 0–20, 30 to 21–30, 40 to 31–40, 50 to 41–50, 60 to 51–60 and >60 to linkers longer than 60 residues. The percentage (%) on the vertical axis indicates the fraction of linkers in each length interval. Sequences were retrieved from the reference proteomes data bank available at the Hmmer web server [17] using an E-value significance threshold of 10⁻¹²⁰. With this threshold, 885 and 334 sequences were retrieved for GabR and PdxR, respectively.
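The binning described in the Fig. 13 caption can be sketched in Python as follows. The interval labels and boundaries come from the caption; the function names and toy lengths are hypothetical.

```python
from collections import Counter

# Upper bound and axis label of each closed interval, as in the Fig. 13 caption.
BINS = [(20, "20"), (30, "30"), (40, "40"), (50, "50"), (60, "60")]
BIN_LABELS = [label for _, label in BINS] + [">60"]


def length_bin(length: int) -> str:
    """Assign a linker length to its Fig. 13 interval label."""
    for upper, label in BINS:
        if length <= upper:
            return label
    return ">60"


def length_histogram(lengths: list[int]) -> dict[str, float]:
    """Percentage of linkers falling in each interval."""
    counts = Counter(length_bin(n) for n in lengths)
    return {label: 100.0 * counts[label] / len(lengths) for label in BIN_LABELS}


# Toy lengths, invented for illustration only.
toy_hist = length_histogram([5, 20, 21, 45, 61, 162, 30, 60])
```

Because the first interval spans 0–20 while the next four span ten residues each, the bars are not directly comparable as densities; the sketch reports plain percentages, as the caption describes.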