Simon K Chan, Michael Hsing, Fereydoun Hormozdiari, Artem Cherkasov.
Abstract
BACKGROUND: In a previous study, we demonstrated that some essential proteins from pathogenic organisms contain sizable insertions/deletions (indels) when aligned to human proteins of high sequence similarity. Such indels may provide sufficient spatial differences between the pathogen protein and its human counterpart to allow selective targeting. In one example, an indel difference was targeted via large-scale in silico screening. This yielded selective antibodies and small compounds capable of binding to the deletion-bearing essential pathogen protein without any cross-reactivity to the highly similar human protein. The objective of the current study was to investigate whether indels are found more frequently in essential than in non-essential proteins.
Year: 2007 PMID: 17598914 PMCID: PMC1925122 DOI: 10.1186/1471-2105-8-227
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Selected query species. The three query species for which completed genome projects and complete global knockout data were available.
| Bacteria | 224308 | 4105 | 271/271 |
| Bacteria | 83333 | 4237 | 299/303 |
| Eukaryote | 4932 | 5872 | 1050/1105 |
Figure 1. Sample alignment and pipeline. A) Sample alignment: gaps were reported as insertions/deletions with respect to the query sequence. There are seven insertions (red) and two deletions (blue) in this sample alignment. B) Pipeline: a summary of the steps taken to calculate the mean insertion and deletion frequencies for essential and non-essential proteins in B. subtilis, E. coli, and S. cerevisiae.
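The gap-counting convention described for panel A can be sketched in Python. This is a minimal illustration and not the authors' pipeline: the function name `count_indels`, the `-` gap character, and the direction of the convention (a gap run in the subject row counted as an insertion in the query, a gap run in the query row counted as a deletion) are all assumptions made for the example.

```python
import re

def count_indels(query_aln, subject_aln, min_len=1):
    """Count gap runs of at least min_len in a pairwise alignment.

    Assumed convention (not confirmed by the source): a gap run in the
    subject row means the query carries extra residues (insertion); a gap
    run in the query row means the query lacks residues (deletion).
    """
    if len(query_aln) != len(subject_aln):
        raise ValueError("aligned sequences must have equal length")
    # Each maximal run of '-' is one indel; keep runs that are long enough.
    insertions = sum(1 for m in re.finditer(r"-+", subject_aln)
                     if len(m.group()) >= min_len)
    deletions = sum(1 for m in re.finditer(r"-+", query_aln)
                    if len(m.group()) >= min_len)
    return insertions, deletions

if __name__ == "__main__":
    # Hypothetical toy alignment, not taken from the paper.
    print(count_indels("MKT--AYLLG", "MKTGG-YLLG", min_len=1))
    print(count_indels("MKT--AYLLG", "MKTGG-YLLG", min_len=2))
```

Raising `min_len` mirrors the "minimum indel length" axis used in the later figures: shorter gap runs simply stop being counted.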
Figure 2. Mean insertion and deletion frequencies in essential and non-essential proteins plotted against minimum indel length. Mean insertion and deletion frequencies were calculated for essential and non-essential query proteins aligned to proteins from the 22 bacteria or 15 eukaryote species. The t-test statistic is shown for the minimum indel lengths that were found significantly more often in essential (blue bars) than non-essential (purple bars) proteins. Significance was set at P < 0.05. Note that no such difference was observed in insertions within B. subtilis proteins.
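A two-sample t statistic of the kind reported in this figure can be computed as sketched below. The source does not say which t-test variant was used, so Welch's unequal-variance form is an assumption here, as are the function name `welch_t` and the toy frequency values.

```python
import math
from statistics import mean, variance

def welch_t(sample_a, sample_b):
    """Welch's t statistic for two independent samples.

    Assumption: the unequal-variance (Welch) form; the source only says
    "t-test" without naming the variant.
    """
    na, nb = len(sample_a), len(sample_b)
    # Standard error of the difference in means, using sample variances.
    se = math.sqrt(variance(sample_a) / na + variance(sample_b) / nb)
    return (mean(sample_a) - mean(sample_b)) / se

if __name__ == "__main__":
    # Hypothetical per-protein indel frequencies, for illustration only.
    essential = [2.0, 4.0, 6.0]
    non_essential = [1.0, 2.0, 3.0]
    print(welch_t(essential, non_essential))
```

A positive statistic means the first sample has the larger mean; turning it into a P value additionally requires the t distribution with the Welch-Satterthwaite degrees of freedom, which this sketch leaves out.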
Figure 3. Proportion of essential and non-essential proteins with indels plotted against minimum indel length. Insertions are represented by blue bars while deletions are represented by purple bars.
Figure 4. Approximation of abundance of indels with the Weibull distribution. r² values close to 1.0 indicated that the abundance of insertions (blue points and blue line) and deletions (purple points and purple line) in essential and non-essential proteins of the three query species could be accurately modeled by the Weibull distribution.
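One common way to fit a Weibull distribution and obtain an r² value is to linearize its cumulative distribution function, $F(x) = 1 - e^{-(x/\lambda)^k}$, as $\ln(-\ln(1 - F)) = k \ln x - k \ln \lambda$ and run an ordinary least-squares fit. The sketch below does this. The source does not describe the fitting procedure, so the linearized approach, the function name `fit_weibull_cdf`, and the choice of inputs (indel lengths and the cumulative fraction of indels up to each length) are assumptions.

```python
import math

def fit_weibull_cdf(xs, cdf_values):
    """Fit shape k and scale lam of a Weibull CDF by linear regression.

    Requires x > 0 and 0 < F < 1 for every point. The returned r2 is
    computed on the linearized scale, which is an assumption about how
    such a value might be obtained.
    """
    u = [math.log(x) for x in xs]
    v = [math.log(-math.log(1.0 - f)) for f in cdf_values]
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    sxx = sum((a - mu) ** 2 for a in u)
    syy = sum((b - mv) ** 2 for b in v)
    sxy = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    k = sxy / sxx                    # slope is the shape parameter
    intercept = mv - k * mu          # intercept equals -k * ln(lam)
    lam = math.exp(-intercept / k)
    r2 = sxy ** 2 / (sxx * syy)      # squared Pearson correlation
    return k, lam, r2

if __name__ == "__main__":
    # Synthetic points generated from a known Weibull CDF (k=1.5, lam=5).
    xs = [1, 2, 4, 8, 16]
    cdf = [1 - math.exp(-((x / 5.0) ** 1.5)) for x in xs]
    print(fit_weibull_cdf(xs, cdf))
```

The linearized fit needs only the standard library, but it weights the tails differently from a direct nonlinear fit, so parameter estimates on real data may differ between the two approaches.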
Summary of mean connectivity and betweenness of S. cerevisiae proteins with and without indels. The mean connectivity and betweenness of indel-containing proteins were significantly greater than those of proteins without indels. Significance was set at P < 0.05.
| Min indel length (aa) | Number of proteins with at least one indel of at least the minimum length | Mean connectivity of proteins with such an indel | Number of proteins without such an indel | Mean connectivity of proteins without such an indel | Betweenness of proteins with such an indel | Betweenness of proteins without such an indel |
| 4 | 907 | 4.194 | 562 | 3.986 | 15354 | 15133 |
| 10 | 381 | 4.394 | 1088 | 4.017 | 15712 | 15115 |
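The two network measures in this table can be illustrated on a small undirected interaction graph. The sketch below treats connectivity as node degree and betweenness as the unnormalized count of shortest paths passing through a node, computed with Brandes' algorithm. Both readings are assumptions, since the source does not define either measure, and the function name and example edges are hypothetical.

```python
from collections import deque

def degree_and_betweenness(edges):
    """Return (degree, betweenness) dicts for an undirected, unweighted graph."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    degree = {n: len(nbrs) for n, nbrs in adj.items()}
    bet = dict.fromkeys(adj, 0.0)
    for s in adj:
        # Breadth-first search from s, counting shortest paths (sigma).
        order, preds = [], {n: [] for n in adj}
        sigma = dict.fromkeys(adj, 0)
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Accumulate dependencies from the farthest nodes back to s.
        delta = dict.fromkeys(adj, 0.0)
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bet[w] += delta[w]
    # Every unordered pair was visited from both endpoints, so halve.
    return degree, {n: b / 2 for n, b in bet.items()}

if __name__ == "__main__":
    # Hypothetical three-protein chain: A interacts with B, B with C.
    print(degree_and_betweenness([("A", "B"), ("B", "C")]))
```

Averaging these per-protein values separately over the proteins with and without a qualifying indel would give group means comparable in form to the rows above.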