Arumugam Gandhimathi, Pritha Ghosh, Sridhar Hariharaputran, Oommen K Mathew, R Sowdhamini.
Abstract
Structure-based sequence alignment is an essential step in assessing and analysing the relationships between distantly related proteins. PASS2 is a database that records such alignments for protein domain superfamilies and has been updated periodically. This update, named PASS2.5, corresponds directly to the SCOPe 2.04 release. All SCOPe structural domains that share less than 40% sequence identity, as defined by the ASTRAL compendium of protein structures, are included. The current version includes 1977 superfamilies and has been assembled using a structure-based sequence alignment protocol. An initial alignment is obtained with MATT and then refined with the COMPARER program. The JOY program has been used for structural annotation of the alignments. In this update, we have automated the protocol and focused on new features: mapping of GO terms, identification of absolutely conserved residues among the domains in a superfamily, and inclusion of PDB entries that are absent from SCOPe 2.04, which are identified using HMM profiles built from the alignments of the superfamily members and are provided as a separate list. We have also implemented a more user-friendly presentation of the data and options for downloading more features. PASS2.5 is available at http://caps.ncbs.res.in/pass2/.
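As an illustration of what "absolutely conserved residues among the domains in a superfamily" could mean in practice, the sketch below scans a toy alignment for columns in which every member carries the same residue. The function name, the toy sequences and the case-insensitive comparison (JOY-style alignments use letter case to encode solvent accessibility) are assumptions made for this example; this is not the PASS2.5 implementation.

```python
def absolutely_conserved_columns(alignment, gap="-"):
    """Return (column_index, residue) pairs for columns where every
    aligned sequence carries the same residue.

    Assumption: residues are compared case-insensitively, because letter
    case may encode solvent accessibility rather than residue identity.
    Columns that consist only of gaps are not reported.
    """
    if not alignment:
        return []
    length = len(alignment[0])
    if any(len(seq) != length for seq in alignment):
        raise ValueError("all aligned sequences must have the same length")

    conserved = []
    for col in range(length):
        # Collect the distinct (upper-cased) symbols seen in this column.
        column = {seq[col].upper() for seq in alignment}
        if len(column) == 1:
            residue = next(iter(column))
            if residue != gap:
                conserved.append((col, residue))
    return conserved


# Hypothetical three-member alignment, for illustration only.
toy_alignment = [
    "MKV-LGA",
    "MRVALGS",
    "MKI-LGA",
]
print(absolutely_conserved_columns(toy_alignment))
```

Column indices are zero-based here; a real pipeline would probably also map each column back to the residue numbering of the individual PDB entries.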
Year: 2015 PMID: 26553811 PMCID: PMC4702857 DOI: 10.1093/nar/gkv1205
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Flow-chart for rigorous structure-based sequence alignment of distantly related proteins. The flow-chart describes three phases: the initial alignment phase, the final alignment phase and the alignment assessment phase. The features are listed at the end (outside the rectangular boxes). In particular, features marked in green are in-house developed annotation programs.
Figure 2. Conserved residues mapped onto part of the alignment of the Nuclear receptor ligand-binding domain superfamily (SCOPe code: 48508). In the alignment, amino acids in uppercase are solvent inaccessible and residues in lowercase are solvent accessible. Residues with a hydrogen bond to a main-chain amide are shown in bold, residues with a hydrogen bond to a main-chain carbonyl are underlined, and a hydrogen bond to another side chain is indicated by a tilde (∼) over the amino acid concerned. By default, the consensus secondary structure is shown underneath the alignment. A 'consensus' is defined as a fraction of >0.7 of the residues at a given position being in a particular conformational state.
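The consensus rule in the Figure 2 legend (a fraction of >0.7 in one conformational state at a given position) can be sketched as follows. The state letters H/E/C, the "." placeholder for positions without a consensus, and the choice to count gapped members in the denominator are assumptions made for this example; the legend does not specify them, and this is not the JOY implementation.

```python
from collections import Counter


def consensus_secondary_structure(ss_rows, threshold=0.7, gap="-", no_consensus="."):
    """Derive a per-column consensus from aligned secondary-structure strings.

    A state is reported only if the fraction of rows in that state is
    strictly greater than `threshold`. Assumption: rows with a gap at a
    column still count in the denominator but can never be the consensus.
    """
    if not ss_rows:
        return ""
    length = len(ss_rows[0])
    if any(len(row) != length for row in ss_rows):
        raise ValueError("all rows must have the same length")

    consensus = []
    for col in range(length):
        # Count the non-gap states observed in this column.
        counts = Counter(row[col] for row in ss_rows if row[col] != gap)
        if counts:
            state, n = counts.most_common(1)[0]
            if n / len(ss_rows) > threshold:
                consensus.append(state)
                continue
        consensus.append(no_consensus)
    return "".join(consensus)


# Hypothetical secondary-structure strings for four aligned domains
# (H = helix, E = strand, C = coil; the letters are illustrative only).
toy_rows = [
    "HHHHC",
    "HHHEC",
    "HHECC",
    "HEE-C",
]
print(consensus_secondary_structure(toy_rows))
```

With four members, a state needs at least three occurrences (0.75) to pass the 0.7 cut-off, so the second column yields a helix consensus while the evenly split third column does not.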
Figure 3. Whole-genome search for putative homologues in the proteomes of four different model organisms. Results obtained by searching with the PASS2 HMM are shown in blue, and results obtained with the HMM derived from the most populated RRM-1 Pfam family are shown in red. In all the proteomes studied, higher coverage is obtained when searching with HMMs of PASS2 superfamilies.