Zheng Zhang, Cheng Xing, Lushan Wang, Bin Gong, Hui Liu.
Abstract
Insertion/deletion (indel) is one of the most common forms of protein sequence variation. Recent studies have shown that indels can affect their flanking regions and that they are important for protein function and evolution. Here, we describe the Indel Flanking Region Database (IndelFR, http://indel.bioinfo.sdu.edu.cn), which provides sequence and structure information about indels and their flanking regions in known protein domains. The indels were obtained through the pairwise alignment of homologous structures in SCOP superfamilies. The IndelFR database contains 2,925,017 indels with flanking regions extracted from 373,402 structural alignment pairs of 12,573 non-redundant domains from 1,053 superfamilies. IndelFR provides access to information about indels and their flanking regions, including amino acid sequences, lengths, locations, secondary structure constitutions, hydrophilicity/hydrophobicity, domain information and 3D structures. IndelFR has already been used for molecular evolution studies and may help to promote future functional studies of indels and their flanking regions.
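As an illustration of what extracting indels and their flanking regions from a pairwise alignment can involve, the following is a minimal Python sketch. It is not the IndelFR pipeline: the input representation (two equal-length aligned strings), the gap character `-`, the default flank length of 10 residues, and the names `Indel` and `find_indels` are all assumptions made for this example. Flanks are taken from the sequence that does not carry the gap, by alignment column, with any gap characters removed.

```python
from dataclasses import dataclass
from typing import List

GAP = "-"  # assumed gap character; real match files may use another convention


@dataclass
class Indel:
    start: int        # 0-based alignment column where the gap run begins
    length: int       # number of consecutive gap columns
    gapped_in: str    # "a" or "b": which aligned sequence carries the gap
    inserted: str     # residues present in the other sequence at those columns
    left_flank: str   # residues before the indel (from the ungapped sequence)
    right_flank: str  # residues after the indel (from the ungapped sequence)


def find_indels(aln_a: str, aln_b: str, flank: int = 10) -> List[Indel]:
    """Locate gap runs in a pairwise alignment and collect their flanks.

    `flank` is a hypothetical window size in alignment columns.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")

    indels: List[Indel] = []
    n = len(aln_a)
    i = 0
    while i < n:
        a_gap = aln_a[i] == GAP
        b_gap = aln_b[i] == GAP
        if a_gap == b_gap:
            # Both residues (or both gaps): not an indel column.
            i += 1
            continue

        # Exactly one sequence has a gap here; extend over the whole gap run.
        if a_gap:
            gapped, other, label = aln_a, aln_b, "a"
        else:
            gapped, other, label = aln_b, aln_a, "b"
        j = i
        while j < n and gapped[j] == GAP and other[j] != GAP:
            j += 1

        left = other[max(0, i - flank):i].replace(GAP, "")
        right = other[j:j + flank].replace(GAP, "")
        indels.append(Indel(i, j - i, label, other[i:j], left, right))
        i = j
    return indels
```

For example, `find_indels("MKTAY--IAKQR", "MKTAYGGIAKQR", flank=3)` reports a single indel of length 2 starting at column 5, gapped in sequence `a`, with inserted residues `GG` and flanks `TAY` and `IAK`. A structure-based workflow would additionally carry residue numbering and secondary structure annotations alongside each column, which this sketch omits.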
Year: 2011 PMID: 22127860 PMCID: PMC3245007 DOI: 10.1093/nar/gkr1107
Source DB: PubMed Journal: Nucleic Acids Res ISSN: 0305-1048 Impact factor: 16.971
Figure 1. Data collection for IndelFR database. (A) Selection of superfamilies and non-redundant protein domains. (B) Structural alignment by PDBeFold and generation of match files. (C) Locating and extracting indels and their flanking regions from matches.
Figure 2. Samples of the IndelFR interfaces. (A) The ‘SCOP Tree’ interface. All indels and matches in the superfamilies can be browsed through this interface. (B) The ‘Indel Information’ interface. The indel files can be browsed online and downloaded. (C) The ‘Match Information’ interface. The match files can be browsed online and downloaded. (D) The format of indel files, containing detailed information about indels and their flanking regions. (E) The ‘Indel Search’ interface. Target indels can be searched in three ways. (F) The ‘Match Search’ interface. Target matches can be searched in four ways. (G) The ‘Download’ interface. Users can download the entire dataset stored in the database in this interface. (H) The ‘Online Indel Creation’ interface. Users can submit match files and extract indels and their flanking regions in this interface.
Figure 3. Display of some special qualities of indels and their flanking regions in a protein domain using data in IndelFR. (A) Comparison of amino acid composition between indel regions and flanking regions. (B) Amino acid hydrophilicity/hydrophobicity of indel flank sites. (C) Secondary structure element composition of indel flank sites. (D) Sequence identity of indel flank sites. (E) Tertiary structure shift of indel flank sites.