Kevin Drew, Christian L Müller, Richard Bonneau, Edward M Marcotte.
Abstract
Determining the three-dimensional arrangement of proteins in a complex is highly beneficial for uncovering mechanistic function and interpreting genetic variation in coding genes comprising protein complexes. There are several methods for determining co-complex interactions between proteins, among them co-fractionation / mass spectrometry (CF-MS), but it remains difficult to identify directly contacting subunits within a multi-protein complex. Correlation analysis of CF-MS profiles shows promise in detecting protein complexes as a whole but is limited in its ability to infer direct physical contacts among proteins in sub-complexes. To identify direct protein-protein contacts within human protein complexes, we learn a sparse conditional dependency graph from approximately 3,000 CF-MS experiments on human cell lines. We show substantial performance gains in estimating direct interactions compared to correlation analysis on a benchmark of large protein complexes with solved three-dimensional structures. We demonstrate the method's value in determining the three-dimensional arrangement of proteins by making predictions for complexes without known structure (the exocyst and tRNA multi-synthetase complex) and by establishing evidence for the structural position of a recently discovered component of the core human EKC/KEOPS complex, GON7/C14ORF142, providing a more complete 3D model of the complex. Direct contact prediction provides easily calculable additional structural information for large-scale protein complex mapping studies and should be broadly applicable across organisms as more CF-MS datasets become available.
Year: 2017 PMID: 29023445 PMCID: PMC5638211 DOI: 10.1371/journal.pcbi.1005625
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Fig 1. Overview of direct contact prediction between protein complex subunits.
Co-fractionation / mass spectrometry (CF-MS) aims to repeatedly separate mixtures of native protein complexes (True Network) by non-denaturing chromatography. Protein elution profiles are generated by mass spectrometry identification of proteins across all chromatography fractions collected. Correlation between proteins’ elution profiles (left side) performs well for identifying the subunit composition of complexes [4, 6, 7], but suffers from indirect associations among proteins that inhibit its ability to identify directly contacting subunits within each complex. We predict direct contacts (right side) by effectively inverting the correlation matrix to discriminate between conditionally dependent and conditionally independent associations, which correspond to direct and indirect protein interactions, respectively. Specifically, we incorporate pseudo-counts, scale and transform the correlation matrix, and use a sparse graphical model learning framework to compute conditionally dependent partial correlations. StARS stability analysis [29] then re-scores the resulting conditional dependency matrix such that each entry corresponds to the frequency with which it is supported by subsample trials. We retain non-zero scores between subunits within each pre-defined human protein complex [32] as our prediction of direct contacts.
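The distinction between correlation and conditional dependence can be illustrated with a minimal sketch. This is not the authors' pipeline. It replaces the sparse graphical model with a plain ridge-regularized matrix inverse followed by a hard threshold, and it imitates the subsample-frequency scoring only in spirit. The function names, the `ridge`, `threshold`, and `frac` parameters, and the toy chain of three "proteins" (A contacts B, B contacts C, A and C never touch) are all assumptions made for illustration.

```python
import numpy as np


def partial_correlations(profiles, ridge=1e-3):
    """Partial correlations from the inverse of the correlation matrix.

    profiles: array of shape (n_fractions, n_proteins).
    ridge: small diagonal term (an assumption) that keeps the matrix invertible.
    """
    corr = np.corrcoef(profiles, rowvar=False)
    precision = np.linalg.inv(corr + ridge * np.eye(corr.shape[0]))
    d = np.sqrt(np.diag(precision))
    pcorr = -precision / np.outer(d, d)  # rho_ij = -K_ij / sqrt(K_ii * K_jj)
    np.fill_diagonal(pcorr, 1.0)
    return pcorr


def edge_stability(profiles, threshold=0.2, n_subsamples=50, frac=0.8, seed=0):
    """Fraction of random subsamples in which |partial correlation| > threshold.

    A simplified stand-in for subsample-based re-scoring; the threshold
    plays the role that the sparsity penalty plays in a graphical model.
    """
    rng = np.random.default_rng(seed)
    n, p = profiles.shape
    size = int(frac * n)
    counts = np.zeros((p, p))
    for _ in range(n_subsamples):
        idx = rng.choice(n, size=size, replace=False)
        counts += np.abs(partial_correlations(profiles[idx])) > threshold
    np.fill_diagonal(counts, 0.0)
    return counts / n_subsamples


# Toy data: a chain A - B - C, so A and C are linked only through B.
rng = np.random.default_rng(1)
n = 2000
a = rng.normal(size=n)
b = a + 0.5 * rng.normal(size=n)
c = b + 0.5 * rng.normal(size=n)
profiles = np.column_stack([a, b, c])

corr = np.corrcoef(profiles, rowvar=False)
pcorr = partial_correlations(profiles)
stability = edge_stability(profiles)

print("correlation A-C:        ", round(corr[0, 2], 3))
print("partial correlation A-C:", round(pcorr[0, 2], 3))
print("edge stability:\n", stability)
```

In this toy setting A and C are strongly correlated, yet their partial correlation is close to zero once B is accounted for, so only the A-B and B-C edges are supported across subsamples. A real analysis would use a sparse estimator rather than a fixed threshold.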