Yang Shen¹, Ad Bax. ¹Laboratory of Chemical Physics, National Institute of Diabetes and Digestive and Kidney Diseases, National Institutes of Health, Bethesda, MD 20892-0520, USA.
Abstract
We present an empirical method for identifying distinct structural motifs in proteins on the basis of experimentally determined backbone and ¹³Cβ chemical shifts. Elements identified include the N-terminal and C-terminal helix capping motifs and five types of β-turns: I, II, I', II' and VIII. Using a database of proteins of known structure, the NMR chemical shifts, together with the PDB-extracted amino acid preference of the helix capping and β-turn motifs, are used as input data for training an artificial neural network algorithm, which outputs the statistical probability of finding each motif at any given position in the protein. The trained neural networks, contained in the MICS (motif identification from chemical shifts) program, also provide a confidence level for each of their predictions, and the Matthews correlation coefficients of these predictions, ranging from ca. 0.7 to 0.9, far exceed those attainable by sequence analysis. MICS is anticipated to be useful both in the conventional NMR structure determination process and for enhancing ongoing efforts to determine protein structures solely on the basis of chemical shift information, where it can aid in identifying protein database fragments suitable for use in building such structures.
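The Matthews correlation coefficient quoted in the abstract summarizes how well binary per-residue predictions (motif present or absent) agree with the reference structure. The following is a minimal sketch of the standard formula, not code from MICS; the function names and the example labels are hypothetical and chosen only for illustration.

```python
import math


def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    tp = tn = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth and pred:
            tp += 1
        elif not truth and not pred:
            tn += 1
        elif pred:
            fp += 1
        else:
            fn += 1
    return tp, tn, fp, fn


def matthews_corrcoef(tp, tn, fp, fn):
    """Standard MCC: (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN)).

    Returns 0.0 when the denominator is zero, a common convention.
    """
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical per-residue labels: 1 = residue belongs to the motif.
reference = [0, 0, 1, 1, 1, 0, 0, 0, 1, 0]
predicted = [0, 0, 1, 1, 0, 0, 0, 0, 1, 1]

mcc = matthews_corrcoef(*confusion_counts(reference, predicted))
print(f"MCC = {mcc:.3f}")
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and -1 indicates complete disagreement. Unlike raw accuracy, the coefficient stays informative when the motif is rare, which is the usual situation for helix caps and β-turns within a protein sequence.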