A Cavallo, A C R Martin. Department of Biochemistry and Molecular Biology, University College London, Gower Street, London WC1E 6BT, UK.
Abstract
MOTIVATION: Data on both single nucleotide polymorphisms and disease-related mutations are being collected at ever-increasing rates. To understand the structural effects of missense mutations, we group both classes under the term single amino acid polymorphisms (SAAPs) and map them to protein structures, where their effects can be analyzed. Our initial aim is therefore to create a database, maintained entirely automatically, of SAAPs mapped to individual residues in the Protein Data Bank (PDB) and updated as new mutations or structures become available.
RESULTS: We present an integrated pipeline for the automated mapping of SAAP data from HGVbase to individual PDB residues. Achieving this in a completely automated and reliable manner is a complex task. Data extracted from HGVbase are mapped to EMBL entries to confirm whether the mutation occurs in an exon and, if so, where in the sequence it occurs. From there we map to Swiss-Prot entries and thence to the PDB.
AVAILABILITY: The resulting database may be accessed over the web at http://www.bioinf.org.uk/saap/ or http://acrmwww.biochem.ucl.ac.uk/saap/
CONTACT: a.martin@biochem.ucl.ac.uk
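The exon check described in the abstract (is the mutation in an exon and, if so, where in the sequence?) can be illustrated with a minimal sketch. This is not the authors' pipeline: it assumes a forward-strand gene, exon boundaries already known as 1-based inclusive coordinates, and the standard genetic code. The function names `genomic_to_cds_offset` and `classify_snp` and the toy sequence are hypothetical, and no EMBL, Swiss-Prot or PDB parsing is attempted.

```python
# Illustrative sketch only: forward strand, 1-based inclusive exon coordinates.
# All names here are hypothetical, not part of the published pipeline.

BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
# Standard genetic code, built from the conventional TCAG ordering.
CODON_TABLE = dict(zip(
    (a + b + c for a in BASES for b in BASES for c in BASES), AMINO))


def genomic_to_cds_offset(pos, exons):
    """Return the 0-based offset of genomic position `pos` within the
    spliced coding sequence, or None if `pos` is not inside any exon."""
    offset = 0
    for start, end in exons:
        if start <= pos <= end:
            return offset + (pos - start)
        offset += end - start + 1
    return None


def classify_snp(genomic_seq, exons, pos, alt_base):
    """Return (residue_number, ref_aa, alt_aa) for a SNP at `pos`,
    or None when the SNP lies outside the exons."""
    offset = genomic_to_cds_offset(pos, exons)
    if offset is None:
        return None
    # Splice the exons together to obtain the coding sequence.
    cds = "".join(genomic_seq[s - 1:e] for s, e in exons)
    in_codon = offset % 3
    codon_start = offset - in_codon
    ref_codon = cds[codon_start:codon_start + 3]
    alt_codon = ref_codon[:in_codon] + alt_base + ref_codon[in_codon + 1:]
    residue_number = offset // 3 + 1  # 1-based residue numbering
    return residue_number, CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]


# Toy gene: 2 flanking bases, exon 3-8, intron 9-12, exon 13-18.
toy_seq = "CC" + "ATGGCT" + "GTAG" + "GAATAA"
toy_exons = [(3, 8), (13, 18)]

missense = classify_snp(toy_seq, toy_exons, 14, "T")    # GAA -> GTA
synonymous = classify_snp(toy_seq, toy_exons, 8, "C")   # GCT -> GCC
intronic = classify_snp(toy_seq, toy_exons, 10, "A")    # inside the intron
```

A result whose two amino acids differ is a candidate SAAP. Carrying its residue number on to a Swiss-Prot entry and then to a PDB residue would additionally require a sequence alignment between the translated sequence and each target, which this sketch leaves out.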