Deciphering interacting networks of the extracellular matrix is a major challenge. We describe an affinity purification and mass spectrometry strategy that has provided new insights into the molecular interactions of elastic fibers, essential extracellular assemblies that provide elastic recoil in dynamic tissues. Using cell culture models, we defined primary and secondary elastic fiber interaction networks by identifying molecular interactions with the elastic fiber molecules fibrillin-1, MAGP-1, fibulin-5, and lysyl oxidase. The sensitivity and validity of our method were confirmed by identification of known interactions with the bait proteins. Our study revealed novel extracellular protein interactions with elastic fiber molecules and delineated secondary interacting networks with fibronectin and heparan sulfate-associated molecules. This strategy is a novel approach to define the macromolecular interactions that sustain complex extracellular matrix assemblies and to gain insights into how they are integrated into their surrounding matrix.
Mass spectrometry is emerging as a powerful approach to identify protein interaction partners in molecular complexes. We have developed an affinity purification and mass spectrometry strategy that is applicable to the analysis of molecular interactions of extracellular matrix complexes. The extracellular matrix provides structural support to tissues and profoundly influences cell survival, proliferation, migration, and phenotypic state. It is a complex multimolecular and three-dimensional milieu that comprises assembled networks of tissue-specific combinations of structural and cell-adhesive glycoproteins, proteoglycans, and cross-linking enzymes. The matrix also sequesters numerous growth factors and cytokines, thereby controlling their bioavailability. Delineating the molecular nature of the fundamental interacting networks within complex extracellular matrices is a challenging task. Here, mass spectrometry has given new insights into elastic fiber interactions.

Elastic fibers are essential structural elements of the extracellular matrix of dynamic connective tissues such as blood vessels, lungs, skin, and ligaments, endowing these tissues with elastic recoil (1, 2). Their importance is emphasized by elastic fiber defects that cause severe acquired diseases such as aortic aneurysms and pulmonary emphysema and life-threatening heritable disorders such as Marfan syndrome, supravalvular stenosis, and cutis laxa. These fibers are extensive multimolecular assemblies that adopt intricate tissue-specific architectural arrangements. At the morphological level, the fibers comprise a cross-linked elastin core and an outer mantle of fibrillin microfibrils. It has proved challenging to define the composition of tissue elastic fibers biochemically. Cross-linked elastin is highly insoluble and its isolation from tissues requires extreme conditions of hot alkali, which destroys other proteins (2).
The efficient extraction of tissue microfibrils requires collagenase and other proteolytic activities that may destroy associated molecules (3). Despite these difficulties, a number of associated proteins, including MAGP-1, βigH3, fibulins, and lysyl oxidases (LOX and LOXL (also known as LOXL1)), as well as latent TGFβ-binding proteins (LTBPs), collagen VIII, and emilin-1 have been identified in biochemical and/or colocalization studies (1).

Fibrillins are very large glycoproteins (350 kDa) containing 43 calcium-binding epidermal growth factor-like domains and seven TGFβ-binding protein-like (8-cysteine) domains (4). Fibrillin-1 is the more abundant isoform; fibrillin-2 is mainly expressed during development (5, 6). Tropoelastin, the secreted soluble form of elastin, comprises alternating hydrophobic and lysine-rich cross-linking domains. LOX and LOXL are copper-dependent amine oxidases that cross-link elastin through the oxidative deamination of specific lysines (7–9). Elastin is mainly expressed and deposited early in life and undergoes very little turnover in healthy tissues (2). MAGP-1 is a microfibril-associated glycoprotein that binds fibrillin-1 and elastin (10, 11) but is not essential for elastic fiber formation (12). βigH3 was originally identified as a matrix protein, MP78/70, in tissue extracts that solubilized elastin-associated microfibrils (13, 14). Fibulin-4 and -5 play essential roles in elastic fiber formation (15, 16), most likely by regulating elastin deposition onto microfibrils (17, 18). Fibulin-2 interacts with fibrillin-1 (19) but is not essential for elastic fiber formation (20). Fibulin-1-null mice, among other symptoms, display anomalies of aortic arch arteries and hemorrhagic blood vessels, suggesting some involvement in elastic fiber biology (21). Fibulin-3 (also known as Efemp1)-deficient mice exhibit early aging and herniation associated with reduced elastic fiber integrity (22).
Collagen VIII and emilin-1 also colocalize to elastic fibers (23, 24).

The assembly of microfibrils and elastic fibers remains incompletely understood. We and others recently showed that assembly of the microfibril component is orchestrated by the cell surface through interactions with fibronectin and integrin receptors (25, 26). Heparan sulfate, an abundant pericellular glycosaminoglycan chain attached to syndecan and glypican proteoglycan receptors, also critically influences microfibril formation (27–29). Elastin deposition and stabilization on microfibrils require fibulins and the cross-linking enzymes LOX and/or LOXL.

To obtain new insights into the molecular interactions of elastic fibers and how they are integrated into their surrounding matrix, we conducted a detailed affinity capture LC-MS/MS analysis of molecules that interact in culture specifically with four His6-tagged recombinant human elastic fiber molecules (fibrillin-1, MAGP-1, fibulin-5, and LOX). Tropoelastin was not used as bait because of its highly adhesive nature. Our protocol proved to be an effective strategy for defining specific interactions of elastic fiber molecules in the extracellular matrix. Efficacy was demonstrated through confirmation of known interactions and validation of novel extracellular matrix protein-protein interactions. This approach further allowed us to predict secondary elastic fiber interactions, giving powerful insights into the molecular networks that sustain elastic fibers within higher order extracellular matrices.
EXPERIMENTAL PROCEDURES
Cell Culture
The ARPE-19 human retinal pigmented epithelial cell line was obtained from ATCC (Manassas, VA). Human dermal fibroblasts (HDFs) were obtained from Cascade Biologics (Portland, OR). Cells were cultured in Dulbecco's modified Eagle's medium supplemented with 10% fetal bovine serum, 100 units/ml penicillin, and 100 units/ml streptomycin. For some experiments, serum-free medium, which consisted of Dulbecco's modified Eagle's medium/F-12 (1:1) with GlutaMAX (Invitrogen), penicillin (100 units/ml), and streptomycin (100 units/ml), was used.
Recombinant Protein Production
The cloning, expression, and purification of human fibrillin-1 fragments PF1, PF2, PF5, PF7, PF8, PF11, PF12, and PF13 (Fig. 1) using the mammalian expression vector pCEP-pu/AC7 and 293-EBNA cells have been described previously (27, 30, 31). Full-length human MAGP-1 and fibulin-5 were expressed in the same mammalian expression system (31–33). All proteins, which contained an N-terminal His6 tag, were purified using nickel chromatography (HisTrap FF, GE Healthcare) and further purified by size exclusion chromatography (Superdex 200 10/300 GL, GE Healthcare) as described (28). MAGP-1 was refolded as described (31). Similarly, human collagen VIII (α2(VIII)3) was expressed as described previously (34).
Fig. 1.
Schematic diagram of recombinant fibrillin-1 protein fragments. Domain structures of fibrillin-1 fragments are shown with a key of the different domains, N-glycosylation sites, and the C-terminal furin cleavage site. All fibrillin-1 protein fragments are color-coded, and those used in this study are shown. TB, TGFβ-binding protein-like domain; (cb)EGF, calcium-binding epidermal growth factor-like domain.
Full-length human LOX (residues 22–417) was generated from a LOX construct (a gift from Dr. P. Sommer, Lyon, France), expressed, and purified using the same mammalian expression system as outlined above. A His6 tag was included at the N terminus of the propeptide sequence. The full-length protein was analyzed on a 4–12% bis-Tris gel (supplemental Fig. 1) and by multiangle laser light scattering (MALLS), which revealed that it was a single monodispersed species of 50 kDa corresponding to a monomer. The protein was confirmed to be unprocessed and contained the N-terminal propeptide.

The extracellular region of human calsyntenin-1 was cloned from mRNA obtained from ARPE-19 cells as a fragment that included residues 29–859. This fragment began after the signal peptide and terminated before the start of the transmembrane region. The recombinant protein, which had an N-terminal His6 tag, was expressed and purified using the same expression system as described above and was analyzed on a 4–12% bis-Tris gel (supplemental Fig. 1). Further analysis by MALLS, as described previously (35), revealed a single monodispersed 99-kDa species that had a hydrodynamic radius of 4.8 nm (data not shown). Small angle x-ray scattering (SAXS) data were collected on European Molecular Biology Laboratory beamline X33 at the light source facilities DORISIII at Deutsches Elektronen-Synchrotron (36). Data were collected on a MAR345 image plate detector using a 120-s exposure time and 2.4-m sample-to-detector distance to cover a momentum transfer interval 0.10 Å⁻¹ < q < 0.50 Å⁻¹.
SAXS analysis calculated the recombinant calsyntenin to have a radius of gyration (Rg) of 4.1 nm and to have a molecular mass of 100 kDa.

Human proepithelin (acrogranin, granulin) (residues 18–576) was cloned from ARPE-19 cells and found to have a sequence identical to GenBank™ accession number X62320. The recombinant protein, which had an N-terminal His6 tag, was expressed and purified using the same system as described above. The protein was found to have an apparent mass of 75 kDa on a 4–12% bis-Tris gel (supplemental Fig. 1). MALLS revealed the monomer to be a 76-kDa species with a hydrodynamic radius of 3.8 nm, although higher ordered species were also seen. SAXS analysis showed that the recombinant protein could be dimeric with a molecular mass of 150 kDa and Rg of 4.6 nm.
Affinity Capture LC Fishing
We defined “bait” proteins as the purified tagged recombinant fragments added to the cultures and “prey” proteins as the proteins that were pulled down with the bait proteins. Prey proteins were sourced from two types of sample: conditioned serum-free media and solubilized matrix. Cell lysate was initially used but was found to be too complex and not a good source of extracellular proteins. Mammalian cells were grown as described above in 75-cm² flasks until confluent, then medium was removed, and cells were washed with PBS. Serum-free medium (5 ml) and 50–100 μg of bait protein (His6-tagged pure recombinant elastic fiber protein) were added to the cells. After 2 days, the conditioned medium was removed and kept for the next stage. The cell layer was gently lysed using PBS + 1% Nonidet P-40 for 1–2 min, and the cell lysate was then gently removed. The resulting cell matrix layer was washed with PBS and either removed using a cell scraper and then sonicated or solubilized using 2 ml of 0.5 mg/ml bacterial collagenase type 1A (Sigma) in 0.15 M NaCl, 0.05 M Tris-HCl, pH 7.4 at 4 °C overnight. 50–100 μg of bait protein were then added to the extracellular matrix extracts after sonication or collagenase treatment for 1 h at 4 °C prior to purification.

Purification of bait-prey protein complexes was carried out using an AKTA Purifier 10 (GE Healthcare) to ensure reproducibility between runs using the following procedure. Samples were loaded onto a 1-ml HisTrap FF column (GE Healthcare), which was then washed with 20 mM Tris, pH 8.0, 150 mM NaCl, 0.5 mM CaCl2 containing 25 mM imidazole to reduce nonspecifically bound proteins. The bait and prey proteins were eluted using a two-step process. Elution step 1 used 6 M urea, 1 M NaCl and was followed by a step 2 elution using 20 mM Tris, pH 8.0, 250 mM NaCl, 0.5 mM CaCl2 containing 400 mM imidazole.
The two-step elution was chosen to separate bound prey proteins from bait proteins using the first elution, thereby allowing detection of low abundance prey species that may otherwise have been swamped by the large amount of bait protein found in the second elution. Fractions from each elution were pooled separately, desalted in 20 mM ammonium bicarbonate using a 5-ml HiTrap Desalt column, and freeze-dried.
Analysis of Nonspecifically Bound Proteins and Starting Material Composition
To test the robustness of the affinity capture LC fishing experimental procedures (to define a benchmark), two types of control experiments were carried out. First, ARPE-19 and HDF cells were treated with serum-free media with no bait protein added in exactly the same manner as the affinity capture LC experiments described above. After removal of the serum-free media, the matrix layer from both cell types was also prepared by scraping and sonicating as described above. These media and matrix samples without any bait added were used in affinity capture LC experiments to identify any proteins that nonspecifically bound to the HisTrap column and other experimental contaminants. This process was carried out several times throughout the full series of experiments.

The second type of control experiment was to analyze the composition (without HisTrap chromatography) of the most abundant proteins of the four starting materials (serum-free media and cell layer extracts from HDF and ARPE-19 cultures). In this case, the conditioned media and matrix layer were removed from the cells and concentrated, after desalting, by freeze drying in the same manner as the affinity capture LC elutions. Tryptic digestion and mass spectrometry were performed on the total media and matrix samples in the same manner as the affinity capture LC experiments.
Tryptic Digestion and Mass Spectrometry
Tryptic digestion of samples was carried out as described previously (37). Briefly, samples were resuspended in 8 M urea, 400 mM ammonium bicarbonate before reduction in 9 mM DTT at 50 °C for 30 min followed by alkylation with 20 mM iodoacetamide at room temperature (20 °C) for 15 min. Samples were diluted until the concentration of urea was 2 M before addition of 1 μg of purified trypsin (Promega) and incubation overnight at 37 °C. Trypsinized samples were analyzed using an Ultimate 3000 LC system (LC Packings) coupled to an HCT Ultra ion trap mass spectrometer (Bruker Daltonics). 5 μl of sample were concentrated/desalted on a precolumn (5 mm × 300-μm inner diameter; LC Packings). The peptides were then separated using a gradient from 98% A (0.1% formic acid in water), 1% B (0.1% formic acid in acetonitrile) to 75% A, 25% B over 40 min at 300 nl/min using a C18 PepMap column (150 mm × 75-μm inner diameter; LC Packings). Peak lists were created by Data Analysis 4.0 (Build 234) (Bruker Daltonics). The top 600 compounds from each run were extracted from the raw data with a threshold of 100,000 counts. Spectra were deconvoluted and deisotoped with a low mass cutoff of 300 Da and high cutoff of 3000 Da with a maximum charge of 4+, and no smoothing was applied. Peak lists were exported as Mascot generic format (mgf) files and limited to deconvoluted peaks plus the most abundant non-deconvoluted compounds.
Database Searching and Protein Identification Using Mascot, X! Tandem, and Scaffold
All peak list (mgf) files were analyzed using X! Tandem (The Global Proteome Machine Organization; version 2007.01.01.1) and Mascot (Matrix Science, London, UK; version 2.2.03) to validate samples and search the Swiss-Prot database (selected for Homo sapiens, release 54.3, 17,400 entries) assuming the digestion enzyme trypsin (Fig. 2). Mascot was searched with a fragment ion mass tolerance of 0.80 Da and a parent ion tolerance of 0.80 Da. X! Tandem was searched with a fragment ion mass tolerance of 0.100 Da. The iodoacetamide derivative of cysteine was specified in Mascot and X! Tandem as a fixed modification. Oxidation of methionine was specified in Mascot and X! Tandem as a variable modification.
Fig. 2.
Flow diagram showing experimental details and data analysis. Affinity capture LC was conducted using 16 bait proteins on four cellular starting materials (HDF media and matrix and ARPE-19 media and matrix). The resulting captured proteins were concentrated and digested with trypsin prior to analysis using LC-MS/MS. Database searching of the peak list files was performed with Mascot and Phenyx. Mascot output files were further validated using Scaffold, which incorporated a further search using X! Tandem. Protein and peptide identification data were combined with bait and source material details (“Experimental Details”) along with published interaction data from BioGRID using Microsoft Access. BEPro3 was used to statistically validate bait-prey pairs, and the resulting interactions were visualized using Cytoscape. The numbers of proteins, peptides, and unique peptide sequences detected in each process are shown. The numbers of primary interactions between bait and prey, potential secondary interaction between prey and prey before statistical analysis, and interactions with Bayes' odds >0 after BEPro3 analysis are shown.
Scaffold (version Scaffold_2_00_03, Proteome Software Inc., Portland, OR) was used to validate MS/MS-based peptide and protein identifications (Fig. 2). To ensure the highest confidence in the data set, peptide and protein identifications by Scaffold were filtered to show only those with high probability. Peptide identifications were accepted if they could be established at greater than 95.0% probability as specified by the Peptide Prophet algorithm (38). Protein identifications were accepted for each affinity capture experiment if they could be established at greater than 99.0% probability and contained at least two unique identified peptides. Protein probabilities were assigned by the Protein Prophet algorithm (39).
Proteins that contained similar peptides and could not be differentiated based on MS/MS analysis alone were grouped to satisfy the principles of parsimony.
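The acceptance criteria above amount to a simple two-level filter. The following is a minimal sketch in Python, assuming a hypothetical in-memory representation of protein and peptide hits; the class names, function names, and example accessions are illustrative only and are not part of Scaffold or its export format.

```python
from dataclasses import dataclass

# Thresholds quoted in the text; everything else here is a hypothetical data model.
PEPTIDE_MIN_PROB = 0.95   # peptide accepted if probability > 95.0%
PROTEIN_MIN_PROB = 0.99   # protein accepted if probability > 99.0%
MIN_UNIQUE_PEPTIDES = 2   # and at least two unique identified peptides


@dataclass(frozen=True)
class PeptideHit:
    sequence: str
    probability: float  # peptide-level probability on a 0..1 scale


@dataclass(frozen=True)
class ProteinHit:
    accession: str
    probability: float  # protein-level probability on a 0..1 scale
    peptides: tuple     # tuple of PeptideHit


def accepted_peptides(protein):
    """Return the unique peptide sequences that pass the peptide threshold."""
    return {p.sequence for p in protein.peptides
            if p.probability > PEPTIDE_MIN_PROB}


def accept_protein(protein):
    """Apply the protein probability and unique-peptide criteria."""
    return (protein.probability > PROTEIN_MIN_PROB
            and len(accepted_peptides(protein)) >= MIN_UNIQUE_PEPTIDES)


def filter_hits(hits):
    """Keep only the protein hits that satisfy both criteria."""
    return [h for h in hits if accept_protein(h)]
```

Repeated observations of the same sequence count once, so a protein supported by a single peptide seen twice is rejected.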
Database Searching and Protein Identification Using Phenyx
Peptide identification by Phenyx version 2.2 (Geneva Bioinformatics (GeneBio)) was performed using Swiss-Prot database release 50.5 with the searches restricted to H. sapiens protein entries only (Fig. 2). The parent ion tolerance was set to 0.8 Da assuming digestion with trypsin with iodoacetamide derivative of cysteine as a fixed modification and oxidation of methionine as a variable modification. Only proteins that had at least two unique peptide sequences detected were included.
Data Analysis
Each experiment (biological replicate) generated two peak lists, one for each elution step; and the data from each experiment were treated independently (intersection) through the analysis process up to the final global statistical analysis. To combine data exported from Scaffold and Phenyx and to link it with the experimental details of each affinity capture LC experiment, a database was constructed using Access 2003 (Microsoft) (Fig. 2). This database facilitated merging of data from the two peak lists generated for each experiment (union) as well as visualization of the combined data for each bait protein or each source material. Data from Scaffold were exported as protein report and peptide report formats in an Excel spreadsheet (2003, Microsoft), which was imported as a separate table into Access. Data from Phenyx were exported for each search as an Excel spreadsheet, and the protein data and peptide data were combined for all experiments and imported into Access as two new tables. A spreadsheet containing experimental details (which was linked to the mass spectrometry data by the file name from the mass spectrometry sample) was imported. A further table was constructed containing a list of the proteins detected along with their Swiss-Prot accession numbers, accession names, and gene identities; this table allowed the combining of data from Phenyx and Scaffold for each protein. To match published interactions for each protein with protein interactions detected in the study, interaction data from BioGRID (40) were extracted from the complete downloaded data set and then added into Access as a further table, and the interaction data set was further expanded using five other interaction databases and manual insertion of published data as described below. 
A flow diagram showing the bait proteins and cell culture source materials, the number of experiments, and software used for each step of the data analysis, along with the number of proteins and peptides detected, is shown in Fig. 2.

To calculate the percent coverage of the unique peptide sequences for each respective protein, a list of peptide sequences was generated (see supplemental Tables 4 and 5) along with a FASTA file containing the relevant protein sequences. The percent coverage was then calculated for each protein using the application “Protein Coverage Summarizer,” which was obtained from the Pacific Northwest National Laboratory.
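Percent coverage can be illustrated with a short sketch. This is not the Protein Coverage Summarizer implementation; it assumes exact substring matching of each peptide against the protein sequence, and the function names and the toy sequence in the usage note are hypothetical.

```python
def covered_positions(protein_seq, peptides):
    """Mark every residue of protein_seq that lies inside at least one peptide match."""
    covered = [False] * len(protein_seq)
    for pep in peptides:
        if not pep:
            continue  # ignore empty strings
        start = protein_seq.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            # look for further (possibly overlapping) occurrences
            start = protein_seq.find(pep, start + 1)
    return covered


def percent_coverage(protein_seq, peptides):
    """Percentage of residues covered by the supplied peptide sequences."""
    if not protein_seq:
        return 0.0
    return 100.0 * sum(covered_positions(protein_seq, peptides)) / len(protein_seq)
```

For a toy 10-residue sequence `"ABCDEFGHIJ"` and peptides `"ABC"` and `"CDE"`, the overlapping residue is counted once, so five residues are covered and the result is 50.0.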
Statistical Analysis of Bait-Prey Interactions using BEPro3
To assign a statistical probability that each bait-prey interaction was specific, an application for analyzing multiple bait and multiple replicate pulldown experiments called BEPro3 (41, 42) (Fig. 2) was used. BEPro3 characterizes each prey protein as either specific or nonspecific and undertakes Bayesian analysis to calculate the posterior probabilities of each protein-protein association. Data were inputted as a cross-tab table generated from the MS Access database (see above), which contained a complete list of proteins detected and the number of peptides detected for each pulldown experiment (combined from both elution steps). A second table, which was known as a pedigree file, contained the list of bait proteins and their corresponding experiment number. This table was generated for data from both Phenyx and Scaffold. The analysis was carried out using the default parameters, the resulting interaction table containing the Bayes' odds for each interaction was imported into the database, and data from the two search engines were combined. To allow easy visualization of combined interaction networks, Cytoscape (v2.6) (43) was used. As part of the analysis process, BEPro3 also gives a global value for the false positive rate of each prey protein detected; this is a value between 0 and 1 with 1 being totally false positive. The percent ubiquity is also calculated from the global data with a high percent ubiquity indicating that the protein was detected in a large number of experiments, indicating that the protein was either bound readily to the HisTrap beads or was a common contaminant of the overall process.
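The cross-tab input and the percent ubiquity described above can be sketched as follows. This is an illustration only: the real BEPro3 file layouts are not reproduced here, combining the two elution steps is assumed to mean taking the union of peptide sequences, and percent ubiquity is assumed to be the fraction of experiments in which a protein was detected. All names are hypothetical.

```python
from collections import defaultdict


def build_crosstab(observations):
    """Build {protein: {experiment: n_unique_peptides}}.

    observations: iterable of (experiment, elution_step, protein, peptide_sequence).
    Peptides from both elution steps of one experiment are pooled as a set (union).
    """
    peptides = defaultdict(set)
    for experiment, _step, protein, peptide in observations:
        peptides[(protein, experiment)].add(peptide)
    crosstab = defaultdict(dict)
    for (protein, experiment), seqs in peptides.items():
        crosstab[protein][experiment] = len(seqs)
    return dict(crosstab)


def percent_ubiquity(crosstab, n_experiments):
    """Percentage of experiments in which each protein was detected (assumed definition)."""
    return {protein: 100.0 * len(row) / n_experiments
            for protein, row in crosstab.items()}


# A pedigree in this sketch is simply a mapping from experiment number to bait.
example_pedigree = {1: "fibrillin-1 PF1", 2: "MAGP-1"}
```

A protein with high percent ubiquity under this definition appears in most experiments regardless of bait, which is the pattern the text associates with bead binders and common contaminants.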
Measurements of Data Set Quality
A gold standard (GS) interaction data set was constructed using the interaction database BioGRID (40) as the starting point. To this initial data set, interactions identified from five other databases were added. These were DIP (Database of Interacting Proteins), BIND (Biomolecular Interaction Network Database), MINT (Molecular INTeraction database (44)), IntAct (45), and HPRD (Human Protein Reference Database (46)). Finally, manual collation of interactions from published literature was carried out using the criteria that the interactions were binary in nature and had been rigorously demonstrated in vitro. The numbers of interactions for each bait protein that were identified from each database and the GS data set are listed in Table III. The full GS interaction data set is shown in supplemental Table 6.
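Merging interaction lists from several databases into one gold standard set is essentially a union of unordered protein pairs. A minimal sketch follows, assuming each source has already been reduced to pairs of protein identifiers; the function names are hypothetical, and the assignment of the example pairs to particular databases in the usage note is invented for illustration.

```python
def normalize(pair):
    """Treat an interaction as an unordered, case-insensitive pair of identifiers."""
    a, b = pair
    return frozenset((a.strip().upper(), b.strip().upper()))


def merge_sources(sources):
    """Union interactions across sources.

    sources: dict mapping source name -> iterable of (protein_a, protein_b).
    Returns a dict mapping each normalized pair -> set of source names reporting it.
    """
    merged = {}
    for name, pairs in sources.items():
        for pair in pairs:
            merged.setdefault(normalize(pair), set()).add(name)
    return merged
```

For example, if one source lists `("FBN1", "MFAP2")` and another lists `("mfap2", "FBN1")`, the merged set contains a single interaction attributed to both sources, so the GS total is not inflated by the same pair being reported in different orientations.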
Table III
Measurements of data set quality
Interactions for the seven bait proteins were queried using six interaction databases along with manually added interactions from the literature to form the GS interaction data set (supplemental Table 6). The number of interactions is shown for each database and for each bait protein with interactions confirmed only by two-hybrid experiments shown in parentheses. The precision (TP/D) and sensitivity (TP/P) values were then calculated for each bait protein and for the entire data set as described under “Experimental Procedures.” The false positive rate was calculated using two methods using either a GS negative data set or the LRT-Bayes algorithm, and the average reproducibility rate was calculated as described under “Experimental Procedures.” Coll, collagen; FBLN5, fibulin-5; DIP, Database of Interacting Proteins; BIND, Biomolecular Interaction Network Database; MINT, Molecular INTeraction database; HPRD, Human Protein Reference Database.
                                        Bait protein
Database                                Fibrillin-1   MAGP-1        LOX           FBLN5         Coll (VIII)   Calsyntenin   Proepithelin  Totals
BioGRID                                 5 (0)         3 (0)         1 (0)         3 (1)         4 (3)         3             10 (4)        29 (8)
BIND                                    2             0             1             0             2             3             5 (3)         13 (3)
DIP                                     1             1             0             0             0             0             0             2 (0)
MINT                                    0             0             0             3 (2)         1 (1)         2             5 (4)         11 (8)
IntAct                                  0             0             1 (1)         3 (3)         5 (5)         2 (1)         7 (5)         18 (15)
HPRD                                    12 (0)        5 (1)         4 (1)         6 (3)         7 (4)         13 (8)        14 (7)        58 (24)
GS (P)                                  31 (0)        9 (0)         6 (1)         9 (1)         6 (3)         3 (0)         14 (4)        78 (9)
True positives (TP)                     8             2             1             1             0             0             0             12
Total interactions (D)                  104           51            15            29            31            14            4             248
Precision                               0.077         0.039         0.067         0.034         0.000         0.000         0.000         0.048
Sensitivity                             0.258         0.222         0.167         0.111         0.000         0.000         0.000         0.154
False positive rate (GS negative)                                                                                                         26.8%
False positive rate (using LRT-Bayes)                                                                                                     8.4%
False negative rate                                                                                                                       84.6%
Average reproducibility rate                                                                                                              22.0%
Precision, sensitivity, and false positive rate were calculated using the methods described in Yu et al. (47) using interactions found in this study and in the generated GS interaction data set (supplemental Table 6) according to the equations

$$\text{precision} = \frac{TP}{D}, \qquad \text{sensitivity} = \frac{TP}{P}$$
where TP (true positives) is the number of interactions detected in this study that match the GS interaction data set, D is the total number of interactions in a data set, P is the total number of positives in the GS interaction data set, and FP (false positives) is the number of interactions detected in this study that match the GS negative set. The GS negative set was constructed using the premise described by Yu et al. (47) and Jansen and Gerstein (48) that proteins from a different cellular compartment are unlikely to interact. As all the bait proteins are extracellular in nature, interactions involving intracellular proteins were assigned to the GS negative set.

The false positive rate was also calculated for each interaction using a likelihood ratio test (LRT)-Bayes algorithm using the application BEPro3 (41, 42). This approach uses no prior knowledge of interacting partners but is calculated statistically from the global experimental data set. To estimate the overall false positive rate for the data set, the average false positive rate was calculated. The false negative rate was determined for the entire data set by using the following equation.
The reproducibility rate for each interaction with each bait protein was calculated as the number of experiments in which the prey protein was identified divided by the number of experiments completed, expressed as a percentage (supplemental Table 3). The overall reproducibility rate was then calculated by averaging the reproducibility rates of all interactions. The values for the bait reproducibility rates were removed before the final calculation was made.
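The quality measures in Table III can be recomputed from its TP, D, and P rows. The sketch below uses precision = TP/D and sensitivity = TP/P as given in the Table III legend. The false negative rate is written here as (P − TP)/P, which is an assumption: that form is not stated in the text, although it reproduces the reported 84.6%. The reproducibility helpers follow the description above; all function names are hypothetical.

```python
# (TP, D, P) per column, copied from Table III.
TABLE_III = {
    "Fibrillin-1": (8, 104, 31),
    "MAGP-1": (2, 51, 9),
    "LOX": (1, 15, 6),
    "FBLN5": (1, 29, 9),
    "Totals": (12, 248, 78),
}


def precision(tp, d):
    """Fraction of detected interactions that match the gold standard (TP/D)."""
    return tp / d if d else 0.0


def sensitivity(tp, p):
    """Fraction of gold standard interactions that were detected (TP/P)."""
    return tp / p if p else 0.0


def false_negative_rate(tp, p):
    """Assumed form: gold standard interactions that were missed, (P - TP)/P."""
    return (p - tp) / p if p else 0.0


def reproducibility_rate(n_detected, n_experiments):
    """Percentage of a bait's experiments in which a given prey was identified."""
    return 100.0 * n_detected / n_experiments


def average_reproducibility(rates):
    """Mean of per-interaction rates (bait self-detections excluded by the caller)."""
    rates = list(rates)
    return sum(rates) / len(rates) if rates else 0.0


if __name__ == "__main__":
    for name, (tp, d, p) in TABLE_III.items():
        print(name, round(precision(tp, d), 3), round(sensitivity(tp, p), 3))
```

With the Totals column (TP = 12, D = 248, P = 78) this gives a precision of 0.048 and a sensitivity of 0.154, matching the table.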
Statistical Analysis of Prey-Prey Interactions Using BEPro3
As bait proteins could pull down individual interacting proteins or protein complexes, analysis was also conducted to predict secondary interactions (prey-prey interactions). Using the premise that each pulled down prey protein could possibly be a bait for all the other prey proteins seen in a particular experiment, a list was generated that contained the experiment number, each protein detected, and the number of peptides detected. The data were analyzed using BEPro3 (Fig. 2) using the single pedigree file method, which then globally analyzed each possible interaction and gave Bayes' odds for each potential interaction. Pairs of prey proteins that were detected in more than one experiment would therefore have a greater Bayes' odds than pairs of proteins that only appeared together once. The secondary interaction data network generated was then further analyzed and visualized using Cytoscape.
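The enumeration of candidate prey-prey pairs can be sketched as below. This only lists and counts co-occurring pairs; it does not reproduce the Bayesian scoring performed by BEPro3, and the input structure and function name are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(experiments):
    """Count how many experiments each unordered pair of prey proteins shares.

    experiments: dict mapping experiment number -> set of prey protein names.
    Returns a Counter keyed by alphabetically sorted (protein_a, protein_b) tuples.
    """
    counts = Counter()
    for preys in experiments.values():
        # every prey is treated as a potential partner of every other prey
        for pair in combinations(sorted(preys), 2):
            counts[pair] += 1
    return counts
```

Pairs that recur across experiments accumulate higher counts, which mirrors the statement above that pairs detected together in more than one experiment receive greater Bayes' odds than pairs seen together once.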
BIAcore Analysis of Calsyntenin and Proepithelin Interactions
The calsyntenin-1 extracellular region was immobilized by amine coupling onto a CM5 sensor chip (GE Healthcare) in 50 mM sodium acetate buffer, pH 5.5 at a concentration of 30 μg/ml, giving typical immobilization of 3000 response units. An analyte scan using all the fibrillin-1 fragments was performed as described previously (31), and only the N-terminal fragment PF1 (see Fig. 1) was found to interact. For kinetic studies, fibrillin-1 fragment PF1 was injected at concentrations ranging from 0 to 500 nM at a flow rate of 30 μl/min for 6 min, followed by dissociation for 10 min. Regeneration was performed by a single injection of 0.4 M NaCl. Binding was calculated independently using equilibrium analysis. The equilibrium response was plotted against concentration, and non-linear regression was used to calculate KD using the equation for one-site binding.

Kinetic binding studies of proepithelin with fibrillin-1 fragments did not show any significant binding. However, because interaction of fibrillin with heparin has been found to be important for the formation of the extracellular matrix (27, 28), proepithelin interactions with heparin were investigated. For the kinetic binding studies, a heparin saccharide consisting of 24 sugar moieties (dp24) (Iduron) was biotinylated and immobilized onto SA sensor chips (GE Healthcare) as described previously (28). Heparin was used at 1 μM, and 400 response units were immobilized. All binding experiments were performed in 10 mM HEPES, pH 7.4, 0.1 M NaCl, 0.005% surfactant P20 (designated HBS-P). Proepithelin was injected at concentrations ranging from 0 to 200 nM at a flow rate of 30 μl/min for 6 min, followed by dissociation for 10 min. Regeneration was performed by two 30-s injections of 0.5 mM NaOH, 1 M NaCl. Curves were fitted using the 1:1 Langmuir association/dissociation model (BIAevaluation 4.1, GE Healthcare).
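The equilibrium analysis can be illustrated with a small fitting sketch. It assumes the standard one-site binding equation, Req = Rmax × C / (KD + C), and is not the software used in the study. The fitting routine (a golden-section search over log10 KD with Rmax solved in closed form), the function names, and the synthetic concentrations and responses in the usage note are all hypothetical.

```python
import math


def one_site(conc, rmax, kd):
    """Equilibrium response for one-site binding: Rmax * C / (KD + C)."""
    return rmax * conc / (kd + conc)


def best_rmax(concs, responses, kd):
    """Least-squares Rmax for a fixed KD (the model is linear in Rmax)."""
    f = [c / (kd + c) for c in concs]
    den = sum(x * x for x in f)
    return sum(y * x for y, x in zip(responses, f)) / den if den else 0.0


def sse(concs, responses, kd):
    """Sum of squared residuals at a given KD, using the best Rmax for that KD."""
    rmax = best_rmax(concs, responses, kd)
    return sum((y - one_site(c, rmax, kd)) ** 2
               for c, y in zip(concs, responses))


def fit_one_site(concs, responses, kd_lo=1e-3, kd_hi=1e6, iterations=200):
    """Golden-section search over log10(KD); returns (Rmax, KD).

    KD is in the same units as concs. Assumes the residual profile has a
    single minimum between kd_lo and kd_hi.
    """
    phi = (math.sqrt(5) - 1) / 2
    a, b = math.log10(kd_lo), math.log10(kd_hi)
    c = b - phi * (b - a)
    d = a + phi * (b - a)
    for _ in range(iterations):
        if sse(concs, responses, 10 ** c) < sse(concs, responses, 10 ** d):
            b = d
        else:
            a = c
        c = b - phi * (b - a)
        d = a + phi * (b - a)
    kd = 10 ** ((a + b) / 2)
    return best_rmax(concs, responses, kd), kd
```

As a usage example, noise-free responses generated with `one_site(c, 100.0, 50.0)` over a twofold dilution series from 500 nM down to 0 should return values close to Rmax = 100 and KD = 50 nM. Real sensorgram data would also need reference subtraction and a check that each injection reached equilibrium.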
RESULTS
Using a molecular fishing strategy (Fig. 2), we identified a number of extracellular proteins that interact directly or indirectly with elastic fiber molecules, thereby highlighting the molecular complexity of elastic fiber matrices. Our strategy involved detailed mass spectrometry analysis and validation of extracellular proteins from cultures that specifically co-purified with four His6-tagged recombinant human elastic fiber proteins (fibrillin-1, fibulin-5, MAGP-1, and LOX). Tropoelastin was not utilized as an affinity ligand because of its unique chemical structure that renders this soluble form highly adhesive. The molecular interactions identified were independent of the His6 tag because they were molecule-specific, and the tag was free to interact with nickel beads during purification of affinity complexes.
Control Strategy and Composition of Starting Materials
To test the robustness of the affinity capture LC experimental procedures (to define a benchmark), two types of control experiments were carried out. For no-bait protein control affinity capture LC experiments, very few proteins were detected after tryptic digestion and mass spectrometry (see supplemental Table 1A). These results indicated that the LC conditions were sufficiently stringent, and the inclusion of 25 mM imidazole in the initial wash was needed to reduce nonspecific binding of protein to the beads to a negligible level. The proteins that were detected included histones H3 and H4 and keratins (type I cytoskeletal 9 and type II cytoskeletal 1). These proteins were found to have a high percent ubiquity throughout all the experiments.
We also analyzed the composition (with no bait added and without HisTrap chromatography) of the most abundant proteins of the four starting materials (serum-free media and cell layer extracts from HDF and ARPE-19 cultures) (supplemental Table 1B). Tryptic digestion and mass spectrometry revealed that both media and matrix sources still contained large amounts of serum albumin, and the matrix sources also contained actin as a major abundant protein. When compared with the most abundant proteins that were detected in the affinity capture LC-MS/MS experiments (which included mainly extracellular matrix proteins; Table I and supplemental Table 2), the list of proteins from the no bait added and without chromatography controls was remarkably different with virtually no extracellular matrix proteins detected, and only the keratins were detected in both, probably due to a high percent ubiquity. The composition of the starting material differed markedly from that of the pulled down proteins, showing that the affinity capture LC was specific in pulling down extracellular proteins instead of the more abundant serum albumin and actin in the initial starting materials.
Table I
Total number of peptides and unique peptide sequences for each protein detected after Scaffold validation
Scaffold was used to validate protein and peptide identifications. Only proteins with >2 unique peptide sequences were counted, and each peptide had greater than 95.0% probability as specified by the Peptide Prophet algorithm (38). The proteins were then ranked by total number of peptides detected. The cell culture compartments in which the identified peptides were detected are shown. Also shown is the number of unique peptide sequences in parentheses and the total percent coverage of the unique peptide sequences (calculated as described under “Experimental Procedures”). Peptides detected from bait proteins were included (shown in bold). The global percent ubiquity was calculated for each detected protein using BEPro3; proteins with a percent ubiquity greater than 40% were treated as nonspecific contaminants. Shown are all proteins with a total peptide count >5. A complete list is shown in supplemental Table 2, which also includes a complete breakdown of percent coverage from each source. The sequences of the peptides identified for each protein are shown in supplemental Table 4.
Protein identity | Swiss-Prot entry name | Peptides by compartment (column order: ARPE matrix; ARPE media; HDF matrix; HDF media; blank cells omitted) | Protein peptide total | Coverage (%) | Ubiquity (%)
Fibrillin-1 | FBN1_HUMAN | 394 (76); 760 (115); 334 (58); 649 (118) | 2137 (145) | 65.8 | 51.4
Keratin type II cytoskeletal 1 (a) | K2C1_HUMAN | 167 (34); 130 (25); 154 (32); 181 (33) | 632 (42) | 49.7 | 86.9
Plasminogen activator inhibitor 1 | PAI1_HUMAN | 33 (15); 320 (21); 2 (2); 54 (18) | 409 (22) | 65.4 | 38.0
Perlecan | PGBM_HUMAN | 117 (71); 256 (96); 23 (20); 2 (2) | 398 (113) | 36.5 | 18.6
Keratin type I cytoskeletal 10 (a) | K1C10_HUMAN | 96 (32); 37 (20); 63 (21); 123 (31) | 319 (40) | 63.9 | 48.0
Keratin type I cytoskeletal 9 (a) | K1C9_HUMAN | 46 (16); 45 (15); 63 (14); 49 (21) | 203 (25) | 47.4 | 50.2
βigH3 | BGH3_HUMAN | 7 (6); 123 (23); 2 (2) | 132 (23) | 47.0 | 7.3
Histone H4 | H4_HUMAN | 66 (7); 18 (5); 32 (7); 12 (4) | 128 (7) | 52.4 | 14.6
Fibronectin | FINC_HUMAN | 2 (2); 57 (37); 26 (24); 19 (16) | 104 (49) | 30.6 | 17.8
Stanniocalcin-2 | STC2_HUMAN | 8 (6); 44 (6); 6 (6); 45 (6) | 103 (7) | 30.1 | 38.2
Thrombospondin-1 | TSP1_HUMAN | 93 (28); 5 (4) | 98 (28) | 33.2 | 2.0
Proepithelin | GRN_HUMAN | 21 (11); 46 (15); 5 (4); 17 (9) | 89 (19) | 34.2 | 8.2
Keratin type II (a) | K22E_HUMAN | 15 (18); 6 (9); 25 (22); 41 (27) | 87 (36) | 45.0 | 48.6
Sulfhydryl oxidase 1 | QSOX1_HUMAN | 67 (28); 5 (4) | 72 (29) | 50.3 | 0.4
Serum albumin | ALBU_HUMAN | 10 (7); 15 (5); 18 (8); 22 (8) | 65 (12) | 18.2 | 22.2
Collagen α1(XVIII) chain | COIA1_HUMAN | 26 (10); 36 (13) | 62 (15) | 13.0 | 3.4
Lamin-A/C | LMNA_HUMAN | 17 (8); 42 (17) | 59 (19) | 34.5 | 8.1
Calsyntenin-1 | CSTN1_HUMAN | 29 (16); 16 (14) | 45 (19) | 24.1 | 2.9
Agrin | AGRIN_HUMAN | 42 (21) | 42 (21) | 12.8 | 6.4
Collagen α1(I) chain | CO1A1_HUMAN | 5 (4); 36 (21) | 41 (22) | 25.3 | 11.4
Insulin-like growth factor-binding protein 7 | IBP7_HUMAN | 2 (2); 38 (11) | 40 (11) | 51.8 | 0.4
Keratin type I cytoskeletal 14 | K1C14_HUMAN | 2 (10); 2 (3); 15 (16); 18 (14) | 37 (25) | 38.1 | 8.7
Collagen α1(IV) chain | CO4A1_HUMAN | 30 (8); 5 (4) | 35 (8) | 7.5 | 0.3
Insulin-like growth factor-binding protein 4 | IBP4_HUMAN | 2 (2); 30 (7) | 32 (7) | 31.0 | 0.4
MAGP-1 | MFAP2_HUMAN | 2 (2); 13 (5); 17 (6) | 32 (6) | 35.5 | 4.1
Collagen α2(IV) chain | CO4A2_HUMAN | 31 (13) | 31 (13) | 16.1 | 0.4
Calsyntenin-2 | CSTN2_HUMAN | 30 (16) | 30 (16) | 21.4 | 2.3
Collagen α2(I) chain | CO1A2_HUMAN | 28 (18) | 28 (18) | 23.4 | 3.3
Annexin A2 | ANXA2_HUMAN | 2 (2); 24 (15) | 26 (15) | 50.7 | 4.2
Keratin type II cytoskeletal 6 | K2C6A_HUMAN | 5 (15); 15 (17); 6 (16) | 26 (34) | 43.1 | 7.5
Insulin-like growth factor-binding protein 3 | IBP3_HUMAN | 21 (7); 4 (3) | 25 (7) | 24.4 | 5.4
TGFβ-2 | TGFB2_HUMAN | 3 (3); 22 (10) | 25 (10) | 30.9 | 0.6
Histone H2B | H2B1M_HUMAN | 18 (5); 4 (3); 2 (2) | 24 (6) | 52.4 | 2.4
Retinoic acid receptor responder protein 2 | RARR2_HUMAN | 22 (5) | 22 (5) | 39.3 | 1.7
Myosin-9 | MYH9_HUMAN | 16 (12); 2 (2); 3 (3) | 21 (13) | 9.5 | 0.4
Protein NOV homolog | NOV_HUMAN | 21 (9) | 21 (9) | 36.1 | 0.4
Extracellular matrix protein 1 | ECM1_HUMAN | 20 (11) | 20 (11) | 29.1 | 0.3
Histone H2A type 2-A | H2A2A_HUMAN | 11 (4); 6 (4); 2 (2) | 19 (5) | 67.7 | 0.4
Collagen α1(XII) chain | COCA1_HUMAN | 18 (15) | 18 (15) | 6.4 | 0.4
Keratin type II cytoskeletal 5 | K2C5_HUMAN | 2 (13); 2 (11); 12 (20) | 16 (29) | 33.2 | 7.4
Titin | TITIN_HUMAN | 3 (3); 11 (11); 2 (2) | 16 (16) | 0.9 | 0.4
Protein S100-A7 | S10A7_HUMAN | 4 (3); 10 (5) | 14 (5) | 54.5 | 2.1
Vimentin | VIME_HUMAN | 2 (2); 11 (11) | 13 (13) | 31.5 | 4.9
Keratin type I cytoskeletal | K1C16_HUMAN | (9); (1); 10 (18); (10) | 10 (24) | 38.5 | 7.5
Annexin V | ANXA5_HUMAN | 2 (2); 4 (4); 3 (3) | 9 (6) | 21.9 | 4.1
Actin cytoplasmic 1 | ACTB_HUMAN | 5 (3); 3 (3) | 8 (5) | 14.9 | 1.6
Follistatin-related protein | FSTL1_HUMAN | 8 (6) | 8 (6) | 31.5 | 0.3
Lysyl oxidase | LYOX_HUMAN | 3 (3); 5 (4) | 8 (5) | 15.3 | 4.1
Zinc-α2-glycoprotein | ZA2G_HUMAN | 8; (2) | 8 (9) | 1.5 | 0.3
Fibulin-1 | FBLN1_HUMAN | 7 (5) | 7 (5) | 8.1 | 3.6
Fibrillin-3 | FBN3_HUMAN | (3); (1); 6 (2) | 6 (5) | 1.6 | 0.3
Histone H3 | H33_HUMAN | 2 (2); 2 (2); 2 (2) | 6 (3) | 16.2 | 0.4
Insulin-like growth factor-binding protein 5 | IBP5_HUMAN | 6 (4) | 6 (4) | 18.0 | 0.4
LTBP-2 | LTBP2_HUMAN | 6 (5) | 6 (5) | 3.5 | 3.1
Single-stranded DNA-binding protein | ALBU_HUMAN | (7); 6 (5); (8); (8) | 6 (12) | 18.2 | 0.3
Spectrin β chain brain 1 | SPTB2_HUMAN | 6 (3) | 6 (3) | 1.7 | 0.3
Cell culture compartment total (for complete table (supplemental Table 2)) | | 1200 (470); 2432 (683); 925 (374); 1475 (500) | 6032 (1211) | |
(a) Proteins with a percent ubiquity greater than 40% treated as nonspecific contaminants.
Identification of Proteins That Interact with Elastic Fiber Molecular Baits
Molecular fishing experiments were conducted in HDF and ARPE-19 cell cultures using purified His-tagged human fibrillin-1 fragments, MAGP-1, fibulin-5, and LOX (Fig. 2). We conducted a total of 69 affinity experiments on ARPE-19 cells and 67 experiments on HDF cultures with 16 molecular baits and at least three biological repeats for each molecular fishing experiment. Using Scaffold, a total of 6032 peptides (a total of 1211 unique peptide sequences) from proteins that interact with these molecules were detected, 3632 from ARPE-19 cells and 2400 from the HDF cultures (Fig. 2, Table I, and supplemental Tables 2 and 4). These peptides were derived from 112 different proteins with 48 from the extracellular matrix, 17 membrane proteins, four growth factors/cytokines, 34 other cellular proteins, and nine types of keratins (Table II). A comparison of all proteins that interacted with elastic fiber proteins revealed 37 identical proteins in both culture types and 37 and 38 proteins uniquely detected within either ARPE-19 or HDF cultures, respectively (Table II). Several interacting molecules were identified in all ARPE-19 and HDF culture compartments, including fibrillin-1, fibronectin, proepithelin, perlecan, plasminogen activator inhibitor 1 (PAI-1), and stanniocalcin-2. The percent ubiquity was calculated using BEPro3 for all proteins detected (Table I and supplemental Table 2). Four of the detected keratin proteins (keratin type I cytoskeletal 9 and cytoskeletal 10, keratin type II, and keratin type II cytoskeletal 1) were found to have a percent ubiquity greater than 40%. These proteins were also detected in the negative controls (supplemental Table 1A). These proteins were removed from further analysis as they were considered nonspecific contaminants. Fibrillin-1 also had a high percent ubiquity (51.4%), but this was to be expected as it was added as a bait to approximately half the experiments.
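The ubiquity filter described above can be sketched in a few lines of Python. The sketch assumes that percent ubiquity is the share of experiments in which a protein is detected; BEPro3 may compute it differently, so this is an approximation for illustration rather than the program's algorithm. The function names and the example runs are hypothetical.

```python
def percent_ubiquity(experiments):
    """Percentage of experiments in which each protein was detected.

    experiments: list of sets of protein names, one set per experiment.
    Assumption: ubiquity is taken as detection frequency across experiments.
    """
    n = len(experiments)
    proteins = set().union(*experiments)
    return {p: 100.0 * sum(p in exp for exp in experiments) / n for p in proteins}


def flag_contaminants(ubiquity, threshold=40.0, expected=()):
    """Proteins above the threshold, excluding those expected to be frequent (e.g. baits)."""
    return {p for p, u in ubiquity.items() if u > threshold and p not in expected}


# Hypothetical runs, not data from this study.
runs = [
    {"K2C1_HUMAN", "FBN1_HUMAN"},
    {"K2C1_HUMAN", "FBN1_HUMAN", "PAI1_HUMAN"},
    {"K2C1_HUMAN", "FBN1_HUMAN"},
    {"K2C1_HUMAN", "PGBM_HUMAN"},
    {"PGBM_HUMAN"},
]
ubiquity = percent_ubiquity(runs)
contaminants = flag_contaminants(ubiquity, expected={"FBN1_HUMAN"})
```

The `expected` argument mirrors the reasoning applied to fibrillin-1: a protein that is frequent because it was added as bait is not treated as a contaminant, even though it exceeds the 40% threshold.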
Table II
Total number of proteins detected in each cellular or extracellular protein category after Scaffold validation
Proteins detected in Table I were characterized into five categories. The number of proteins detected is shown for each cell culture compartment. Also shown (in parentheses) are the number of proteins detected that were unique to either cell line and cell culture compartment. Common protein numbers between cell lines or cell culture compartments are shown in italics.
Cell line: ARPE (37) (37) | HDF (38)
Category | ARPE matrix (10) (4) | ARPE media (23) | HDF matrix (18) (1) | HDF media (19) | Category total
Extracellular matrix | 19 | 32 | 10 | 23 | 48
Growth factors/cytokines | 2 | 3 | 1 | 2 | 4
Membrane proteins | 0 | 4 | 9 | 4 | 17
Other cellular proteins | 14 | 13 | 13 | 14 | 34
Keratins | 7 | 5 | 8 | 8 | 9
Cell culture compartment total | 42 | 57 | 41 | 51 | 112
Of the reported elastic fiber component and associated molecules (1), ARPE-19 cultures gave hits for perlecan, βigH3, fibronectin, LTBP-2, MAGP-1, LOX and LOXL, collagen VIII, fibrillin-2, and fibulin-3, whereas the HDF cultures gave hits for fibronectin, perlecan, fibulin-1, MAGP-1, LOX, βigH3, and elastin. There were several hits for collagens, which in ARPE-19 cultures were collagen chains α1(XVIII), α1(IV), α2(IV), α3(IV), α1(VIII), α1(XI), and α1(XII) and in HDF cultures were collagen chains α1(I), α2(I), α1(IV), α3(V), and α3(IX). Of other extracellular matrix molecules, notable hits were agrin, thrombospondin-1, and tenascin-C. Both cultures gave numerous hits for PAI-1 (ubiquity of 38%), whereas hits for stanniocalcin-2 (ubiquity of 38.2%) and calsyntenin-1 and -2 were frequent. Proepithelin and insulin-like growth factor-binding proteins were significant growth factor-type interactions in both cultures.
More proteins and peptides were pulled down from media samples in both cell types with 2432 and 1475 peptides from ARPE-19 and HDF media, respectively, compared with 1200 and 925 peptides from the ARPE-19 and HDF matrix, respectively (Table I). The media samples also contained more growth factor-related proteins, such as βigH3, TGFβ-2, and the insulin-like growth factor-binding proteins, along with calsyntenin-1 and -2, thrombospondin, and agrin. Proteins from the list of the most abundant proteins (Table I) that were found exclusively in the matrix were collagen chains α1(IV) and α2(IV), lamin-A/C, and annexin A2.
Primary Interactions of Elastic Fiber Proteins
Microfibrillar Proteins
The microfibrillar component of elastic fibers comprises assembled fibrillins and associated MAGP-1 (37). Interactions with fibrillin-1 were identified using overlapping fibrillin-1 fragments encompassing the entire molecule (Fig. 1). The eight overlapping fibrillin-1 fragments interacted significantly with a total of 21 proteins (Fig. 3). Seven of these interactions were with extracellular matrix molecules, five were with growth factors and hormone-related molecules, and the remainder were with cellular proteins; no membrane proteins interacted significantly. The N-terminal PF1 fragment bound to PAI-1 as well as to lamin-A/C and stanniocalcin-2 but proved not to be as interactive as predicted from in vitro binding assays (28, 30). PF2 bound to endogenous fibrillin-1, insulin-like growth factor-binding protein 3, βigH3, stanniocalcin-2, PAI-1, perlecan, thrombospondin-1, and LOXL. PF5 bound to fibronectin, annexins A2 and V, lamin-A/C, and stanniocalcin-2. PF8 bound to PAI-1 and stanniocalcin-2. PF11 bound to endogenous fibrillin-1, stanniocalcin-2, perlecan, and PAI-1. PF12 bound to TGFβ-2, PAI-1, endogenous fibrillin-1, insulin-like growth factor-binding protein 7, and proepithelin. PF13 bound to endogenous fibrillin-1 and -2. A complete list of all the prey proteins detected with each bait protein, along with the corresponding Bayes' odds, false positive rates, and percent reproducibility, is presented in supplemental Table 3. The overall reproducibility rate, calculated by averaging the reproducibility rate for each interaction, was 25% (Table III).
Fig. 3.
Interaction map showing significant interactions of prey proteins using fibrillin-1 fragments as bait. After statistical analysis of bait-prey interactions using BEPro3, interactions were mapped using Cytoscape for the fibrillin-1 fragment bait proteins. Interactions shown had Bayes' odds >0 and were detected using both Scaffold and Phenyx search processes. The schematic diagram of fibrillin-1 and the recombinant protein fragments are as described in Fig. 1. Bait proteins are indicated as squares, and prey proteins are indicated as circles. The width of each interaction line is proportional to the Bayes' odds (average of Scaffold and Phenyx). A complete list of all prey proteins is shown in supplemental Table 3.
Table III
Measurements of data set quality
Interactions for the seven bait proteins were queried using six interaction databases along with manually added interactions from the literature to form the GS interaction data set (supplemental Table 6). The number of interactions is shown for each database and for each bait protein with interactions confirmed only by two-hybrid experiments shown in parentheses. The precision (TP/D) and sensitivity (TP/P) values were then calculated for each bait protein and for the entire data set as described under “Experimental Procedures.” The false positive rate was calculated using two methods using either a GS negative data set or the LRT-Bayes algorithm, and the average reproducibility rate was calculated as described under “Experimental Procedures.” Coll, collagen; FBLN5, fibulin-5; DIP, Database of Interacting Proteins; BIND, Biomolecular Interaction Network Database; MINT, Molecular INTeraction database; HPRD, Human Protein Reference Database.
Using Scaffold, peptides from native fibrillin-1 could be distinguished from those of the bait proteins as they were from the whole length of fibrillin-1 rather than just the area of the bait protein. Native fibrillin-1 was pulled down in large amounts with recombinant fibrillin-1 fragments PF2 and PF13 and to a lesser extent with PF12. Because of the homologous nature of fibrillin-1 and fibrillin-2, several peptides assigned to fibrillin-2 could also be from fibrillin-1. Further analysis using the similarity function of Scaffold, which identified peptides that are exclusive to each fibrillin, confirmed that fibrillin-2-exclusive peptides were present in more than one experiment when the recombinant fibrillin-1 N- and C-terminal fragments (PF1 and PF13) were used as baits and were also seen when fibrillin fragment PF5 and fibulin-5 (see below) were used. The interactors of MAGP-1 with the most significant Bayes' odds were fibrillin-1 as well as PAI-1 and perlecan (Fig. 4).
Fig. 4.
Interaction map showing significant interactions of prey proteins using non-fibrillin-1 elastic fiber proteins as bait. Using the same process as described for Fig. 3, the interactions shown had Bayes' odds >0 and were detected using both Scaffold and Phenyx search processes. The interactions of recombinant baits fibulin-5, LOX, calsyntenin-1 extracellular (Ex) region, MAGP-1, collagen α1(VIII) and α2(VIII) NC2 domains, and full-length (FL) collagen α2(VIII) are shown. The width of each interaction line is proportional to the Bayes' odds (average of Scaffold and Phenyx). A complete list of all bait proteins is shown in supplemental Table 3.
Proteins Associated with Elastin Deposition
Elastin deposition on microfibrils involves fibulin-5 interactions and requires LOX cross-linking (1, 2). We investigated interactions with fibulin-5, LOX, and the elastic fiber-associated molecule collagen VIII (23) (Fig. 4). Fibulin-5 was found to interact with PAI-1 with the highest Bayes' odds, whereas LOX bound to fibrillin-1, perlecan, PAI-1, and stanniocalcin-2 (Fig. 4). In addition, we detected significant interactions between collagen VIII and fibronectin (Fig. 4).
Measurement of Data Set Quality
To assess the data set quality, precision, sensitivity, and false positive rates were calculated as described under “Experimental Procedures.” The overall precision of the whole data set, a measure of the true positive interactions, as defined by the GS data set over the total interactions detected, was 0.048. This figure rose to 0.077 for fibrillin-1 alone, which may be because more published interactions are known for this molecule (Table III). The overall sensitivity, the number of true positive interactions over the total positive interactions in the GS data set, was 0.154 (15.4% of all published interactions). This figure rose to 0.258 (25.8% of all published interactions) for fibrillin-1 interactions. These two values are of course restricted due to the fact that not all the true interacting prey proteins are expressed in each cell line, reflecting the tissue-specific nature of gene expression in mammalian cell lines, compared with the more homogeneous nature of protein expression in a yeast cell. The precision score of 0.048 for the experiment is also lower due to the fact that protein complexes containing primary and secondary interactions are pulled down as a consequence of the macromolecular interactions of extracellular proteins.
The false positive rate is more complex to calculate, so two approaches were used. The first method used a generated GS negative set, which has been described as difficult to construct (see Ref. 47 supplemental material). GS negative sets are based on cellular location of the proteins (47, 48), so using the knowledge that all bait proteins were extracellular in nature, all interactions with intracellular proteins should be considered part of the negative interaction set. Using this method, the false positive rate was calculated to be 26.8%, but this value dropped to 8.4% if the false positive rate was calculated by the second method using the LRT-Bayes algorithm using BEPro3 (Table III). The false negative rate was calculated for the whole data set using the sensitivity calculations and was found to be 84.6% (Table III).
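As a worked check of these figures, the short Python sketch below recomputes precision (TP/D), sensitivity (TP/P), and the false negative rate from the interaction counts reported in this study (12 published interactions detected, 236 detected interactions that were not published, and 78 interactions in the GS data set; 8 of 31 for fibrillin-1). Taking the false negative rate as 1 minus the sensitivity is an assumption that is consistent with the reported 84.6%; the function names are illustrative.

```python
def precision(true_positives, detected):
    """TP/D: published interactions recovered over all interactions detected."""
    return true_positives / detected


def sensitivity(true_positives, gold_standard_positives):
    """TP/P: published interactions recovered over all published interactions."""
    return true_positives / gold_standard_positives


# Counts reported in this study for the whole data set.
tp_all = 12            # published interactions seen
unpublished_all = 236  # detected interactions not in the GS data set
gs_all = 78            # interactions in the GS data set

overall_precision = precision(tp_all, tp_all + unpublished_all)
overall_sensitivity = sensitivity(tp_all, gs_all)
# Assumption: false negative rate = 1 - sensitivity.
false_negative_rate = 1 - overall_sensitivity

# Fibrillin-1 alone: 8 of 31 published interactions were seen.
fibrillin_sensitivity = sensitivity(8, 31)

print(round(overall_precision, 3), round(overall_sensitivity, 3),
      round(false_negative_rate, 3), round(fibrillin_sensitivity, 3))
```

The rounded values (0.048, 0.154, 0.846, and 0.258) agree with those given in the text and in Table III.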
Secondary Interactions
We explored whether some identified prey proteins might be associated with elastic fiber molecular bait proteins by secondary interactions. To examine possible secondary interactions, all proteins detected in a particular experiment were treated as potential bait proteins and potential prey proteins. Global statistical analysis, using BEPro3, was then performed, and the potential interactors with the highest Bayes' odds were mapped (Fig. 5). Many of the statistically significant secondary interactions grouped around fibronectin. Other interactions identified included those between insulin-like growth factor-binding proteins 3, 4, and 5 and insulin-like growth factor II and between collagens.
Fig. 5.
Secondary interaction map showing highly statistically significant potential interactions. A list of potential secondary interactions was generated as described under “Experimental Procedures.” Briefly, each protein detected in a single affinity capture LC experiment was treated as a bait protein and prey protein with a potential interaction between each pair of proteins. Each interaction was analyzed using BEPro3, and the Bayes' odds were assigned for each interaction. Shown are all potential secondary interactions that had Bayes' odds >0.4 (on a scale of 0–1), the top 2.5% of potential interactions. The network was then visualized using Cytoscape. The width of each interaction line is proportional to the averaged Bayes' odds for proteins identified with both Scaffold and Phenyx search processes. Green nodes indicate heparin binding molecules.
The secondary interaction network described above (Fig. 5) was further expanded to include all experimentally identified interactions, not just those with high statistical significance (Fig. 6). These possible secondary interactions were matched to published elastic fiber interactions in the constructed GS interaction data set (supplemental Table 6). This interaction map of published interactions shows several hubs centered around specific proteins, including fibrillin-1, the basement membrane proteoglycan perlecan, fibronectin, and thrombospondin. These “hub” proteins provide links to known interactions with collagens and growth factors. Information about glycan chain binding was also mapped (Fig. 6). It was found that many of the published possible secondary interacting proteins had the ability to bind to heparan sulfate; notably, the three basement membrane proteoglycans detected (perlecan, agrin, and collagen XVIII) all contain heparan sulfate chains. Together, these approaches highlighted the likely existence of an extensive interacting network of elastic fiber-associated molecules with major “hubs” on fibronectin and heparan sulfate.
Fig. 6.
Secondary interaction map showing published interactions. A list of potential secondary interactions (prey-prey), including interactions with bait proteins (bait-prey, primary interactions), was generated as described under “Experimental Procedures.” Each interaction was analyzed using BEPro3, and the Bayes' odds were assigned for each interaction. Each interaction was then cross-referenced with the GS data set for published interactions, and those interactions seen in the literature were visualized using Cytoscape. Interactions shown in red involve bait proteins (square nodes, primary interactions), and secondary interactions are shown in blue. The width of each interaction line is proportional to the Bayes' odds, and the direction of the arrow indicates the interaction with the highest Bayes' odds. Where the Bayes' odds were equal to 0 in both directions, no arrow is shown. Interactions with proteins only detected using one of the search engines (Scaffold and Phenyx) are shown as a dotted line. Each interaction number (shown in red) indicates the corresponding reference (10, 19, 25, 26, 31, 32, 49, 50, 65–107). Green nodes indicate heparin binding molecules, and diamond nodes indicate heparan sulfate proteoglycans.
Validation of Novel Primary Interactions
Some of the molecular associations detected have previously been identified using in vitro binding assays; they include homotypic fibrillin-1 interactions (30, 49) and fibrillin-1 interactions with fibronectin, βigH3, and perlecan (13, 25, 26, 50). Using sensitivity calculations (Table III), 15.4% (12 of 78) of the validated published interactions were seen in this study, but this figure rose to 25.8% (8 of 31) for the fibrillin-1 interactions. However, using precision measurements, 12 published interactions were seen in this study, but there were 236 interactions seen that were not published. There could be several reasons for this high figure. The first reason could be that the number of published interactions is far from complete with some proteins such as collagen VIII and calsyntenin having very little information available. A second reason could be that most of the proteins detected may result from secondary protein-protein or protein-glycan interactions, which are common in the extracellular matrix. A third reason is possibly that the remaining interactions are novel interactions not yet reported. These considerations make calculation of the exact number of novel interactions identified in this study difficult.
We recombinantly expressed two of the novel interactor proteins to validate these interactions. Calsyntenin-1 was pulled down by fibrillin-1 fragments PF1 and PF2, and calsyntenin-2 was pulled down by fibrillin-1 (fragment PF2), although the Bayes' odds were found to be 0. When used as bait, recombinant calsyntenin-1 pulled down fibrillin-1 and perlecan. Calsyntenin-1 also bound directly to fibrillin-1 fragment PF1 in BIAcore surface plasmon resonance binding studies (Fig. 7A) with an equilibrium binding constant of 240 ± 18 nM. In contrast, although proepithelin was pulled down by fibrillin-1 (fragment PF12), when used as bait, recombinant proepithelin did not pull down any proteins, so this interaction may be indirect. Proepithelin was found to bind to heparin with an equilibrium binding constant of 10.5 ± 1.1 nM (Fig. 7B); heparin in turn strongly binds multiple fibrillin-1 sites (27, 28).
Fig. 7.
BIAcore analysis of calsyntenin-1 binding to fibrillin-1 fragment PF1 (A) and proepithelin binding to heparin (B). A, N-terminal fibrillin-1 protein fragment PF1 was injected over the immobilized calsyntenin-1 extracellular region at concentrations of 0–500 nM, including one duplicate. The saturated (Sat.) response level was plotted against concentration (inset), and the equilibrium binding constant was calculated to be 240 ± 18 nM. B, proepithelin was injected over the heparin-oligosaccharide-immobilized surface at concentrations ranging from 0 to 200 nM. Curves were fitted using the 1:1 Langmuir association/dissociation model, and the equilibrium binding constant was calculated to be 10.5 ± 1.1 nM. One typical response curve is shown for each interaction, showing response difference (Resp. Diff.) plotted against time. Each experiment was repeated three times.
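The steady-state analysis described for panel A (saturated response plotted against concentration) can be illustrated with a small sketch. It assumes a simple 1:1 steady-state model, R_eq = R_max · C / (K_D + C), which the source does not explicitly name for panel A. The concentration series, the R_max value, and the grid-search fit are hypothetical choices for illustration, and the responses are synthetic, noise-free values generated from the model rather than measured data.

```python
def steady_state_response(conc_nM, kd_nM, rmax):
    """1:1 steady-state response: R_eq = Rmax * C / (KD + C)."""
    return rmax * conc_nM / (kd_nM + conc_nM)


def fit_kd(concs_nM, responses, kd_grid):
    """Grid-search KD. For each candidate KD the model is linear in Rmax,
    so Rmax has a closed-form least-squares solution."""
    best = None
    for kd in kd_grid:
        x = [c / (kd + c) for c in concs_nM]
        sxx = sum(v * v for v in x)
        rmax = sum(v * r for v, r in zip(x, responses)) / sxx
        sse = sum((r - rmax * v) ** 2 for v, r in zip(x, responses))
        if best is None or sse < best[0]:
            best = (sse, kd, rmax)
    return best[1], best[2]


# Hypothetical dilution series spanning the 0-500 nM range in the caption.
concs = [0, 31.25, 62.5, 125, 250, 500]
# Synthetic responses generated with KD = 240 nM and an arbitrary Rmax.
true_kd, true_rmax = 240.0, 100.0
resp = [steady_state_response(c, true_kd, true_rmax) for c in concs]

kd_fit, rmax_fit = fit_kd(concs, resp, kd_grid=range(1, 1001))
print(f"Fitted KD = {kd_fit} nM, Rmax = {rmax_fit:.1f}")
```

With real sensorgram data the responses would carry noise, and a nonlinear least-squares routine with error estimates would normally replace the grid search.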
DISCUSSION
We conducted a comprehensive LC-MS/MS analysis of the molecular interactions of elastic fibers using two cell culture models. We developed a novel His6 tag affinity purification protocol that enabled affinity capture with a panel of recombinant human elastic fiber proteins (fibrillin-1, MAGP-1, fibulin-5, and LOX). This strategy proved highly effective for identifying specific extracellular matrix protein interactions and complex secondary interaction networks and also gave insights into how elastic fibers are integrated with their surrounding matrix. Efficacy was confirmed by identification of known interactions. Novel protein-protein interactions were validated by recombinant expression of novel interactors followed by in vitro interaction analyses and “reverse” affinity experiments.
Recently, a strategy for identifying specific protein interaction partners using quantitative mass spectrometry and bead proteomes was described (51). In that study, the issue of nonspecific binding was addressed by effectively depleting the targeted complex using a green fluorescent protein binder (52). Here, we have demonstrated the efficacy of our His6 tag affinity purification strategy to probe the molecular interactions of extracellular elastic fiber proteins in cell culture. Advantages of our protocol include controllability of the affinity LC method using an HPLC system rather than manual spin columns. The use of two (three including X! Tandem) search engines, combined with the use of Scaffold to filter the data set to include peptides that were assigned the highest identification probability (95%) and a high probability for the protein identifications (90%), improved validation of peptide detection and greatly improved the confidence of the data set. Using this approach, the 317 potential proteins identified by Mascot were reduced to 112 using Scaffold (Fig. 2). The quality of the data set allowed better global statistical analysis using BEPro3 to exclude ubiquitously pulled down proteins. The ability to link mass spectrometry data to a published interaction database using an Access database allowed us to highlight novel interactions and confirm reported interactions that we detected by affinity LC and to increase understanding of their molecular networks. Furthermore, many extracellular proteins were detected, showing that the method minimized intracellular protein contaminants. Our strategy thus offers the means to probe directly the interactions of recombinant tagged proteins, including extracellular matrix proteins.
Our previous examination of the composition of tissue-isolated microfibrils by mass spectrometry identified the known microfibril molecules fibrillin-1 and MAGP-1 (37). However, difficulties in isolating intact elastic fibers precluded a similar approach for these complex insoluble higher order extracellular matrix assemblies. Therefore, we adopted a novel approach of detecting extracellular interactions of secreted extracellular matrix molecules with soluble recombinant elastic fiber molecules. This approach is consistent with various studies that have shown that exogenous recombinant extracellular matrix molecules can assemble in cell layers and that recombinant molecules such as elastin, fibulin-5, and LTBP-1 associate efficiently with microfibrils deposited by cultured cells (17, 53). Our new approach has contributed new knowledge of elastic fiber composition.
Differences between HDF and ARPE-19 cultures in profiles of prey proteins identified by our affinity capture LC experiments using elastic fiber bait proteins may reflect the different tissue-specific origins of each cell type. Whereas HDFs from skin are of mesenchymal origin, ARPE-19 cells from the pigmented epithelium of the retina are of epithelial origin. These differences may be significant in regard to the very different elastic fiber architectures within these tissues.
As well as confirming the major known components and associated molecules, we identified a number of novel associations that have potentially important functional significance. Six fragments of fibrillin-1 (PF1, PF2, PF5, PF8, PF11, and PF12) as well as LOX and MAGP-1 were found to bind stanniocalcin-2, a secreted homodimeric glycoprotein hormone that has potential roles in carcinogenesis and may influence apoptosis (54). Five fibrillin-1 fragments (PF1, PF2, PF8, PF11, and PF12), MAGP-1, LOX, and fibulin-5 bound to PAI-1, a 47-kDa glycoprotein that inhibits tissue plasminogen activator, thereby regulating the fibrinolytic system. Thus, microfibrils may regulate bioavailability of PAI-1 and blood clotting. It is of interest that fibrillin microfibrils can also modulate platelet adhesion during thrombus formation in shear flow (55). Two fibrillin-1 fragments (PF2 and PF5) showed statistically significant binding to insulin-like growth factor-binding proteins, which are members of a family of proteins that bind insulin-like growth factors and may thus regulate cell proliferation. Insulin-like growth factor-binding proteins 3 and 5 bind heparin and extracellular matrix (56). We found that fibrillin-1 fragment PF2 bound thrombospondin-1, supporting early reports indicating that thrombospondin may be associated with subendothelial microfibrils (57). It is also known that the chondroitin sulfate proteoglycan versican both binds thrombospondin-1 and colocalizes with microfibrils after induction by inflammation on vascular smooth muscle cells (58, 59). We have previously reported interactions of fibrillin-1 with annexins (37).
An important outcome of our study was the identification of secondary elastic fiber-associated networks indexing on fibronectin and heparan sulfate, respectively. It is known that fibrillin-1 interacts directly with fibronectin, and we and others have shown that microfibril assembly is dependent upon integrin-mediated assembly of fibronectin, itself a major heparan sulfate binding molecule (25, 26, 28). Fibrillin-1 is also a major heparin/heparan sulfate binding molecule, and cell surface heparan sulfate is a critical determinant of microfibril deposition because culture supplementation with heparin, blocking heparan sulfate attachment to core proteins, or disrupting sulfation all block assembly (29, 60, 61). Furthermore, we previously showed that fibronectin is critical for the deposition of LTBP-1, which is associated with microfibrils and regulates TGFβ bioavailability (53), whereas heparin also interacts with elastin (62, 63), and fibronectin interacts with LOX (64). Through their interactions with many other extracellular molecules, these two key matrix molecules, fibronectin and heparan sulfate, may thus integrate elastic fibers within the surrounding extracellular matrix. Given numerous reports in the literature of the production of tagged recombinant extracellular matrix molecules for in vitro structure/function studies, our affinity purification and mass spectrometric protocol offers the potential to rapidly resolve many novel biological interactions and interacting networks that contribute to diverse and complex extracellular matrix assemblies.