Hailiang Huang, Joel S Bader.
Abstract
MOTIVATION: Yeast two-hybrid screens are an important method to map pairwise protein interactions. This method can generate spurious interactions (false discoveries), and true interactions can be missed (false negatives). Previously, we reported a capture-recapture estimator for bait-specific precision and recall. Here, we present an improved method that better accounts for heterogeneity in bait-specific error rates.
RESULTS: For yeast, worm and fly screens, we estimate the overall false discovery rates (FDRs) to be 9.9%, 13.2% and 17.0% and the false negative rates (FNRs) to be 51%, 42% and 28%. Bait-specific FDRs and the estimated protein degrees are then used to identify protein categories that yield more (or fewer) false positive interactions and more (or fewer) interaction partners. While membrane proteins have been suggested to have elevated FDRs, the current analysis suggests that intrinsic membrane proteins may actually have reduced FDRs. Hydrophobicity is positively correlated with decreased error rates and fewer interaction partners. These methods will be useful for future two-hybrid screens, which could use ultra-high-throughput sequencing for deeper sampling of interacting bait-prey pairs.
AVAILABILITY: All software (C source) and datasets are available as supplemental files and at http://www.baderzone.org under the Lesser GPL v. 3 license.
Year: 2008 PMID: 19091773 PMCID: PMC2639075 DOI: 10.1093/bioinformatics/btn640
Source DB: PubMed Journal: Bioinformatics ISSN: 1367-4803 Impact factor: 6.937
Network properties and parameter estimations for the beta error rate model with TPL degree distributions
| Properties | Yeast | Worm | Fly |
|---|---|---|---|
| Network | | | |
| N (number of baits) | 1532 | 729 | 3639 |
| Average preys sampled per bait | 7.65 | 20.08 | 14.79 |
| Average unique preys per bait | 2.97 | 5.55 | 5.69 |
| Average singletons per bait | 1.97 | 3.71 | 3.57 |
| Parameter | | | |
| ɛ | 1.61729 | 0.84128 | 0.62162 |
| | 0.00354 | 0.06187 | 0.11412 |
| β1 | 0.76185 | 1.39727 | 0.76634 |
| β2 | 9.21670 | 8.43195 | 4.26594 |
| FDR per clone | 0.09930 | 0.13187 | 0.16999 |
| Estimates | | | |
| Estimated preys per bait | 4.49 | 5.02 | 4.43 |
| Estimated false positives per bait | 0.76 | 2.65 | 2.51 |
| FNR (%) | 51 | 42 | 28 |
| FDR per unique interaction (%) | 26 | 48 | 44 |
| FDR per singleton (%) | 39 | 71 | 70 |
| Bootstrap wins | 96/100 | 100/100 | 100/100 |
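N, the first Network row, is the number of baits. The other three Network rows give, in order, the average number of preys sampled per bait, the average number of unique preys and the average number of singletons. The first two Estimates rows give the estimated number of preys per bait and the estimated number of false positives per bait. Consistent with the tabulated values, the FDR per clone (last Parameter row) is the estimated number of false positives per bait divided by the average number of preys sampled per bait; the FDR per unique interaction and the FDR per singleton divide the same numerator by the average number of unique preys and the average number of singletons, respectively.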
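The arithmetic behind the three FDR rows can be checked directly from the table. The following is a minimal sketch, not the authors' C implementation: the dictionary keys and the function name `error_rates` are hypothetical, and the FNR expression (one minus observed true positives over estimated preys per bait) is an assumption that happens to reproduce the tabulated FNR values rather than a formula stated in the text.

```python
# Per-bait averages and model estimates copied from the table above.
# Key names are illustrative only; they do not come from the source.
TABLE = {
    "yeast": {"clones": 7.65, "unique": 2.97, "singletons": 1.97,
              "est_preys": 4.49, "false_pos": 0.76},
    "worm": {"clones": 20.08, "unique": 5.55, "singletons": 3.71,
             "est_preys": 5.02, "false_pos": 2.65},
    "fly": {"clones": 14.79, "unique": 5.69, "singletons": 3.57,
            "est_preys": 4.43, "false_pos": 2.51},
}


def error_rates(row):
    """Return error rates as fractions for one species row."""
    fp = row["false_pos"]
    return {
        # Estimated false positives per bait over each per-bait average.
        "fdr_per_clone": fp / row["clones"],
        "fdr_per_unique": fp / row["unique"],
        "fdr_per_singleton": fp / row["singletons"],
        # Assumption: observed true positives = unique preys - false positives,
        # and FNR is the fraction of estimated true partners not observed.
        "fnr": 1 - (row["unique"] - fp) / row["est_preys"],
    }


RATES = {species: error_rates(row) for species, row in TABLE.items()}

for species, rates in RATES.items():
    print(species, {name: round(100 * value, 1) for name, value in rates.items()})
```

Because the inputs are rounded to two decimals, the recomputed FDR per clone agrees with the tabulated parameter only to about three decimal places.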
Fig. 1. The distributions of FDRs for baits are displayed for Beta/TPL (solid line), Beta/PL (dashed line) and Mixture/TPL (black impulses) for yeast (A), worm (B) and fly (C). Posterior maximum likelihood estimates are displayed for the Beta/TPL model (open circles). Baits with a single clone do not contribute to the estimator and are not included in the histograms.
Comparison with previous studies using computational predictions, overlap with gold standards and capture–recapture theory
| Method | FNR (%) | FDR (%) | Reference |
|---|---|---|---|
| Yeast | |||
| Prediction | – | 72–84 | (Deane |
| Overlap | >70 | >50 | (von Mering |
| Overlap | 43–71 | – | (Edwards |
| Overlap | 76–96 | – | (Edwards |
| Overlap | – | 50 | (Sprinzak |
| Overlap | 80–85 | – | (Salwinski |
| Overlap | 50 | 70–90 | (Hart |
| Recap | 52 | 24 | (Huang |
| | |||
| | |||
| Worm | |||
| Prediction | 22–100 | – | (Salwinski |
| Recap | 47 | 44 | (Huang |
| | |||
| Fly | |||
| Prediction | 74–96 | – | (Salwinski |
| Recap | 32 | 41 | (Huang |
| | | | |
(a) Estimated using crystal structure data.
(b) Estimated using MIPS complexes data.
(c) Overlap from comparison with data from Yu et al. (2008).
Bold values indicate results from this work.