Jouhyun Jeon, Roland Arnold, Fateh Singh, Joan Teyra, Tatjana Braun, Philip M Kim.
Abstract
BACKGROUND: The identification of structured units in a protein sequence is an important first step for most biochemical studies. For this study in particular, identifying stable structured regions is a crucial first step in generating novel synthetic antibodies. While many approaches exist to find domains or predict structured regions, important limitations remain, such as the optimization of domain boundaries and the identification of non-domain structured units. Moreover, no integrated tool exists to find and optimize structural domains within protein sequences.
Keywords: Antibody target molecules; Phage display; Protein domain; Protein sequence; Putative structural unit; Structural domains; Synthetic antibody
Year: 2016 PMID: 27039071 PMCID: PMC4818438 DOI: 10.1186/s12859-016-1001-1
Source DB: PubMed Journal: BMC Bioinformatics ISSN: 1471-2105 Impact factor: 3.169
Fig. 1 A flow scheme of the PAT pipeline. (a) Procedure to find protein domain regions. (b) Procedure to find putative structural units. (c) Identification of two types of structured units
Comparative performance of PAT and TargetTrack
| Progress | PAT [a] | TargetTrack |
|---|---|---|
| Number of targets | 210 | 49,227 |
| Expression | 173 (82.38 %) | 32,904 (66.84 %) |
| Purification | 145 (69.05 %) | 15,617 (31.72 %) |

[a] Protein domains were expressed in 1.4 ml 2YT media at 30 °C overnight, and the soluble His-tagged proteins were purified by affinity
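The percentages in the comparison table are each stage's count divided by the number of targets for that pipeline. The short sketch below recomputes them from the counts in the table; the dictionary layout and the names `COUNTS`, `success_rates`, and `RATES` are illustrative choices, not part of PAT or TargetTrack.

```python
# Stage counts copied from the comparison table above.
# The variable and function names here are illustrative only.
COUNTS = {
    "PAT": {"targets": 210, "expression": 173, "purification": 145},
    "TargetTrack": {"targets": 49_227, "expression": 32_904, "purification": 15_617},
}


def success_rates(counts):
    """Return each stage's count as a percentage of the number of targets."""
    total = counts["targets"]
    return {
        stage: 100.0 * n / total
        for stage, n in counts.items()
        if stage != "targets"
    }


RATES = {name: success_rates(c) for name, c in COUNTS.items()}

if __name__ == "__main__":
    for name, rates in RATES.items():
        for stage, pct in rates.items():
            print(f"{name:12s} {stage:13s} {pct:6.2f} %")
```

Note that both rates use the full target count as the denominator, so the purification rate is relative to all targets rather than to the expressed subset.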
Performance of PAT to identify putative structural units
| Target score prediction | Success: Positive | Success: Negative |
|---|---|---|
| Positive | 2,370 (TP) | 1,888 (FP) |
| Negative | 799 (FN) | 3,601 (TN) |

| Metric | Value (%) |
|---|---|
| Sensitivity | 74.79 |
| Specificity | 65.60 |
| Accuracy | 68.97 |
| Balanced accuracy | 70.20 |
| Precision | 55.66 |
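The reported metrics follow from the four confusion-matrix counts using the standard definitions. As a check, the sketch below recomputes them; the function name `confusion_metrics` is hypothetical, and balanced accuracy is taken to be the mean of sensitivity and specificity, which matches the reported value.

```python
# Confusion-matrix counts copied from the table above.
TP, FP, FN, TN = 2370, 1888, 799, 3601


def confusion_metrics(tp, fp, fn, tn):
    """Return standard binary-classification metrics as percentages."""
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    balanced_accuracy = (sensitivity + specificity) / 2
    precision = tp / (tp + fp)     # positive predictive value
    return {
        "sensitivity": 100 * sensitivity,
        "specificity": 100 * specificity,
        "accuracy": 100 * accuracy,
        "balanced_accuracy": 100 * balanced_accuracy,
        "precision": 100 * precision,
    }


METRICS = confusion_metrics(TP, FP, FN, TN)

if __name__ == "__main__":
    for name, value in METRICS.items():
        print(f"{name:18s} {value:6.2f} %")
```

Precision is the lowest of the five values because false positives (1,888) are numerous relative to true positives (2,370), even though most true successes are recovered.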
Fig. 2 Identification of putative structural units. Putative structural units of (a) tumor necrosis factor receptor (DR6, PDB ID: 2DBH) and (b) phosphatidylinositol 3-kinase regulatory subunit alpha (PIK3R1, PDB ID: 2V1Y) are shown as blue bars. The regions colored blue on the structures are the putative structural units that correspond to the blue bars in the graph. Gray bars represent regions whose known structures are not listed as domains. Black arrows indicate protein domains. Because protein domain regions were excluded when calculating target scores, they have a target score of zero
Fig. 3 Distribution of the reciprocal overlap between PAT predictions and known experimental constructs that produce synthetic antibodies
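This record does not state how reciprocal overlap is computed. A common definition, assumed here, takes the length of the shared region as a fraction of each of the two segments and reports the smaller fraction, so a value near 1 means the two segments nearly coincide. The sketch below applies that assumed definition to residue ranges with inclusive 1-based coordinates; the function name and coordinate convention are illustrative, not taken from PAT.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Reciprocal overlap of two residue ranges (inclusive coordinates).

    Assumed definition: overlap length divided by the longer of the
    two segment lengths, i.e. min(overlap/len_a, overlap/len_b).
    This is a common convention, not necessarily the one used by PAT.
    """
    if a_end < a_start or b_end < b_start:
        raise ValueError("each range must satisfy start <= end")
    len_a = a_end - a_start + 1
    len_b = b_end - b_start + 1
    # Number of residues shared by both ranges (0 if they are disjoint).
    overlap = max(0, min(a_end, b_end) - max(a_start, b_start) + 1)
    return min(overlap / len_a, overlap / len_b)


if __name__ == "__main__":
    # Hypothetical predicted unit vs. hypothetical experimental construct.
    print(reciprocal_overlap(1, 100, 51, 150))
```

Taking the minimum of the two fractions penalizes both a prediction that is much longer than the construct and one that covers only a small part of it.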
Fig. 4 PAT webserver outputs. (a) Output page of PAT. (b) Schematic view and boundary information of structured units. Structured units are colored red (known domains) and blue (putative structural units). (c) Plot of the target score. The putative structural unit is colored blue. Only residues outside known protein domains are considered when calculating the target score; known protein domain regions are scored as 0. (d) Summary information for the structured units. (e) Intermediate results created by the PAT pipeline