Identification of a putative protein profile associated with tamoxifen therapy resistance in breast cancer.

Arzu Umar, Hyuk Kang, Annemieke M Timmermans, Maxime P Look, Marion E Meijer-van Gelder, Michael A den Bakker, Navdeep Jaitly, John W M Martens, Theo M Luider, John A Foekens, Ljiljana Pasa-Tolić.

Abstract

Tamoxifen resistance is a major cause of death in patients with recurrent breast cancer. Current clinical factors can correctly predict therapy response in only half of the treated patients. Identification of proteins that are associated with tamoxifen resistance is a first step toward better response prediction and tailored treatment of patients. In the present study we intended to identify putative protein biomarkers indicative of tamoxifen therapy resistance in breast cancer using nano-LC coupled with FTICR MS. Comparative proteome analysis was performed on approximately 5,500 pooled tumor cells (corresponding to approximately 550 ng of protein lysate/analysis) obtained through laser capture microdissection (LCM) from two independently processed data sets (n = 24 and n = 27) containing both tamoxifen therapy-sensitive and therapy-resistant tumors. Peptides and proteins were identified by matching mass and elution time of newly acquired LC-MS features to information in previously generated accurate mass and time tag reference databases. A total of 17,263 unique peptides were identified that corresponded to 2,556 non-redundant proteins identified with ≥2 peptides. 1,713 overlapping proteins between the two data sets were used for further analysis. Comparative proteome analysis revealed 100 putatively differentially abundant proteins between tamoxifen-sensitive and tamoxifen-resistant tumors. The presence and relative abundance for 47 differentially abundant proteins were verified by targeted nano-LC-MS/MS in a selection of unpooled, non-microdissected discovery set tumor tissue extracts. ENPP1, EIF3E, and GNB4 were significantly associated with progression-free survival upon tamoxifen treatment for recurrent disease. Differential abundance of our top discriminating protein, extracellular matrix metalloproteinase inducer, was validated by tissue microarray in an independent patient cohort (n = 156). 
Extracellular matrix metalloproteinase inducer levels were higher in therapy-resistant tumors and significantly associated with an earlier tumor progression following first line tamoxifen treatment (hazard ratio, 1.87; 95% confidence interval, 1.25-2.80; p = 0.002). In summary, comparative proteomics performed on laser capture microdissection-derived breast tumor cells using nano-LC-FTICR MS technology revealed a set of putative biomarkers associated with tamoxifen therapy resistance in recurrent breast cancer.

Year:  2009        PMID: 19329653      PMCID: PMC2690491          DOI: 10.1074/mcp.M800493-MCP200

Source DB:  PubMed          Journal:  Mol Cell Proteomics        ISSN: 1535-9476            Impact factor:   5.911


Tamoxifen is an antiestrogenic agent that has been widely and successfully used in the treatment of breast cancer over the past decades (1). Tamoxifen targets and inhibits the estrogen receptor-α, which is expressed in ∼70% of all primary breast tumors and is known to be important in the development and course of the disease. When diagnosed at an early stage, adjuvant systemic tamoxifen therapy can cure ∼10% of the patients (1). In recurrent disease, ∼50% of patients have no benefit from tamoxifen (intrinsic resistance). Of the other half of patients, who initially respond to therapy with an objective response (OR) or no change (NC), a majority eventually develop progressive disease (PD) due to acquired tamoxifen resistance (2, 3). The markers available to date predict therapy response insufficiently. Therefore, identification of new biomarkers that can more effectively predict response to treatment and that can potentially function as drug targets is a major focus of research. The search for new biomarkers has been enhanced by the introduction of microarray technology. Gene expression studies have resulted in a whole spectrum of profiles for, e.g., molecular subtypes, prognosis, and therapy prediction in breast cancer (4–10). Corresponding studies at the protein level are lagging behind because of immature technology. However, protein-level information is crucial for the functional understanding and the ultimate translation of molecular knowledge into clinical practice, and proteomics technologies continue to progress at a rapid pace. Proteomics studies reported so far have mainly been performed with breast cancer cell lines using either two-dimensional gel electrophoresis (11–14) or LC-MS for protein separation (15–17). However, it is known that the proteomic makeup of a cultured cell is rather different from that of a tumor cell surrounded by its native microenvironment (18). 
Furthermore, cell lines lack the required follow-up information for answering important clinical questions. In addition, tumor tissues in general and breast cancer tissues in particular are very heterogeneous in the sense that they harbor many different cell types, such as stroma, normal epithelium, and tumor cells. LCM technology has emerged as an ideal tool for selectively extracting cells of interest from their natural environment (19) and has therefore been an important step forward in the context of genomics and proteomics cancer biomarker discovery research. LCM-derived breast cancer tumor cells have been used for comparative proteomics analyses in the past using both two-dimensional gel electrophoresis (20, 21) and LC-MS (22). This has resulted in the identification of proteins involved in breast cancer prognosis (21) and metastasis (20, 22). Although these studies demonstrated that proteomics technology has advanced to the level where it can contribute to biomarker discovery, major drawbacks persist, such as large sample requirements (42–700 μg) and low proteome coverage (50–76 proteins) for small amounts of starting material (∼1 μg). Because clinical samples are often available in limited quantities, in-depth analysis of minute amounts of material (<1 μg) necessitates advanced technologies with sufficient sensitivity and depth of coverage. Recently we demonstrated the applicability of nano-LC-FTICR MS in combination with the accurate mass and time (AMT) tag approach for proteomics characterization of ∼3,000 LCM-derived breast cancer cells (23). This study showed that proteome coverage was improved compared with conventional techniques. The AMT tag approach initially utilizes conventional LC-MS/MS measurements to establish a reference database of AMT tags specific for a particular proteome sample (e.g. breast cancer tissue). 
Each tag consists of a theoretical mass calculated from the peptide sequence, an LC normalized elution time (NET) value, and an indicator of quality. The AMT tag database serves as a “lookup table” for identifying peptides in subsequent quantitative LC-MS analyses. Substituting routine LC-MS/MS analyses (shotgun approach) with LC-FTICR MS analyses (AMT tag approach) significantly increases overall throughput and sensitivity while reducing sample requirements. Additionally quantitative intensity information related to the abundance of the protein can be discerned from these MS analyses (24). In the present study, we used the same strategy to analyze eight pools of tumor cells in duplicate or triplicate (resulting in 19 samples) derived from 51 fresh frozen primary invasive breast carcinomas that appeared to be either sensitive or resistant to tamoxifen treatment after recurrence. This work resulted in the identification of a putative protein profile associated with tamoxifen therapy resistance. In addition, the top discriminating protein of the putative profile, extracellular matrix metalloproteinase inducer (EMMPRIN), was validated in an independent patient cohort and was significantly associated with resistance to tamoxifen therapy and shorter time to progression upon tamoxifen treatment in recurrent breast cancer.

EXPERIMENTAL PROCEDURES

Patients and Tumor Tissues—

For the discovery phase of the study, 51 different fresh frozen primary breast cancer tissues from our liquid N2 tissue bank were used. Primary tumors were selected from patients who did not receive any systemic adjuvant hormonal therapy and were treated with the antiestrogen tamoxifen as first line therapy upon detection of recurrent breast cancer. Furthermore tumors were selected on the basis of positive estrogen receptor-α expression as assessed by ligand binding assay or enzyme-linked immunosorbent assay (≥10 fmol/mg of cytosolic protein). Tumor tissues were divided into two classes based on the type of response to tamoxifen therapy. 24 tumors were sensitive to tamoxifen therapy, showing either complete remission (CR) or partial remission (PR), and were assigned as OR. 27 tumors were resistant to therapy, showing an increase in tumor size, and were designated as PD. Clinical response was defined by standards of the International Union against Cancer criteria of tumor response (25). 20 of the above mentioned tumor tissues were selected for the verification study. Tissues were included based on their high tumor cell content of >70%. Tumor cell content was judged after hematoxylin/eosin stain of a separately cut 4-μm tissue section. For immunohistochemical validation, a primary breast tissue microarray (TMA) containing 0.6-mm cores of formalin-fixed paraffin-embedded tumors was used. Within the TMA, there were 156 tumor tissues from patients who received tamoxifen as first line treatment upon recurrence. Median follow-up of patients alive after primary surgery was 103 months (range, 16–222 months) and 51 months after the onset of tamoxifen treatment (range, 9–136 months). Included patients showed CR, PR, PD, and NC of >6 and ≤6 months. Further patient and tumor characteristics are summarized in Table IV.
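Table IV. Patient characteristics

Patient and tumor characteristics for samples included in the validation set are shown.

ER, estrogen receptor α; PgR, progesterone receptor.

| Characteristics | Numbers | Median | Percent |
|---|---|---|---|
| Patients | 130 | | 100 |
| Age (years) | | | |
| Primary surgery | | 53.5 | |
| Start first line | | 56.5 | |
| Menopausal status at start first line | | | |
| Pre | 40 | | 30.8 |
| Post | 90 | | 69.2 |
| ER (fmol/mg protein) | | 97 | |
| PgR (fmol/mg protein) | | 54.5 | |
| Response | | | |
| Clinical benefit: CR, PR, S.D. ≥ 6 months | 77 | | 59.2 |
| No clinical benefit: S.D. < 6 months, PD | 53 | | 40.8 |
| Dominant site of relapse | | | |
| Local regional relapse | 15 | | 11.5 |
| Bone | 65 | | 50.0 |
| Other | 50 | | 38.5 |
| Disease-free interval (months) | | | |
| ≤12 | 16 | | 12.3 |
| 12–36 | 59 | | 45.4 |
| ≥36 | 55 | | 42.3 |
| Nodal status | | | |
| N0 | 62 | | 47.7 |
| N1–3 | 30 | | 23.1 |
| N > 3 | 34 | | 26.2 |
| Unknown | 4 | | 3.1 |
| Tumor size | | | |
| ≤2 cm | 67 | | 51.5 |
| >2 cm | 63 | | 48.5 |
| Tumor grade | | | |
| Poor | 46 | | 35.4 |
| Unknown | 52 | | 40.0 |
| Good/moderate | 32 | | 24.6 |
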
This study was approved by the Medical Ethics Committee of the Erasmus Medical Center Rotterdam, The Netherlands (MEC 02.953) and was performed in accordance with the Code of Conduct of the Federation of Medical Scientific Societies in The Netherlands, and wherever possible we adhered to the Reporting Recommendations for Tumor Marker Prognostic Studies (REMARK) (26).

Laser Capture Microdissection—

LCM was performed on 8-μm tissue cryosections that were fixed in ice-cold 70% ethanol and stained with hematoxylin as described previously (27). Briefly slides were washed in Milli-Q water, stained for 30 s in hematoxylin, washed again in Milli-Q water, subsequently dehydrated twice in 50, 70, 95, and 100% ethanol for 30 s each, and air-dried. Laser microdissection and pressure catapulting was performed directly after staining. Tumor epithelial cells were collected using a P.A.L.M. LCM device, type P-MB (P.A.L.M. Microlaser Technologies AG, Bernried, Germany). From each cryosection an area of ∼500,000 μm² that corresponds to ∼4,000 cells (area × slide thickness/1,000-μm³ cell volume) was collected in P.A.L.M. tube caps containing 10 μl of 0.1% RapiGest (Waters Corp., Milford, MA) and then spun down into 0.5-ml Eppendorf Protein LoBind tubes (Eppendorf, Hamburg, Germany). Collected cells were stored at −80 °C until further processing. Because we used small numbers of microdissected cells in this study, the protein concentration was typically below the detection limit of any protein assay. Hence the protein concentration for samples undergoing LC-MS analysis was estimated based on microdissected tissue area and extrapolations from protein assays performed on whole tissue lysates (i.e. ∼4,000 cells corresponds to ∼400 ng of total protein).
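The cell-number and protein estimates described above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not the authors' own calculation tool; the function names are hypothetical, and the per-cell protein value of 0.1 ng is simply the ratio implied by the stated ∼400 ng per ∼4,000 cells.

```python
def estimate_cells(area_um2, thickness_um=8.0, cell_volume_um3=1000.0):
    """Approximate cell count: dissected area x section thickness / cell volume."""
    return area_um2 * thickness_um / cell_volume_um3


def estimate_protein_ng(cells, ng_per_cell=0.1):
    """Approximate protein mass; 0.1 ng/cell is implied by ~400 ng per ~4,000 cells."""
    return cells * ng_per_cell


# One cryosection: ~500,000 um^2 collected from an 8-um section.
cells = estimate_cells(500_000)
protein_ng = estimate_protein_ng(cells)
print(f"~{cells:.0f} cells, ~{protein_ng:.0f} ng of total protein")
```

Both defaults (8-μm section thickness and 1,000-μm³ cell volume) are taken from the text; other tissues or section thicknesses would need their own values.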

Sample Preparation—

Microdissected cell batches were pooled into OR and PD tumor groups (corresponding to ∼25,000 cells/pool) prior to sample preparation. Briefly cells were lysed by sonication directly in RapiGest solution using an Ultrasonic Disruptor Sonifier II (Model W-250/W-450, Branson Ultrasonics, Danbury, CT) for 1 min at 60% amplitude. Proteins were subsequently equilibrated for 2 min at 37 °C, denatured at 99 °C for 5 min, and processed for overnight trypsin digestion according to the instructions of the manufacturer using MS-grade porcine modified trypsin gold (Promega, Madison, WI) at a 1:20 (w/v) ratio as described previously (23). Digestion was stopped by incubation with 0.5% TFA at 37 °C for 30 min. Remaining cellular debris was spun down for 20 min at 10,600 × g, and supernatant was transferred to a new Eppendorf LoBind cup. Peptides were lyophilized and stored at −80 °C until further analysis. Prior to FTICR MS analysis, samples were reconstituted in 18 μl of NH4HCO3, vortexed briefly, and spun down again for 10 min at 10,600 × g to pellet any contaminating particulate material. For the verification study, whole tissue lysates were prepared from 20 tumor tissues from which 6 × 4-μm cryosections per sample were cut. Tissue cryosections were placed in a Teflon container, frozen in liquid N2, and then pulverized in a frozen state in a microdismembrator (Braun Biotech International). The resulting powder was resuspended in 100 μl of 0.1% RapiGest. Cell lysis and trypsin digestion were performed as described above. Prior to trypsin digestion, a BCA protein assay (Pierce) was performed to determine protein concentration. From each total tissue sample, 50 μg of protein lysate was used for trypsin digestion at a trypsin:protein ratio of 1:50 (w/w) and further handled as described above.

Nano-LC-FTICR MS—

Nano-LC-FTICR MS was performed using a slightly modified procedure as described previously (23, 28). Each pooled sample was analyzed in triplicate by injecting 4 μl (equivalent to ∼5,500 cells or ∼550 ng) directly via a 3-μl sample loop onto a custom-built reversed-phase (RP) 80-cm × 50-μm-inner diameter fused silica capillary column (Polymicro Technologies, Phoenix, AZ) packed in house with 3-μm C18 particles (300-Å pore size; Jupiter, Phenomenex, Torrance, CA) and subjected to an applied pressure of 10,000 p.s.i. through a high pressure syringe pump (ISCO, Lincoln, NE). Flow rate over the column was ∼250 nl/min. After an injection period of 45 min, peptides were eluted from the column using a gradient from 100% mobile phase A (99.75% H2O, 0.2% acetic acid, 0.05% TFA) to ∼70% mobile phase B (90% acetonitrile, 9.9% H2O, 0.1% TFA) over a ∼200-min period. The nano-LC column outlet was coupled on line to a 7-tesla FTICR mass spectrometer through a nano-ESI emitter; 4,000 mass spectra were acquired in each LC-MS analysis using 0.3-s ion accumulation time and 50-μs gas pulse (29).
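The stated per-injection load (∼5,500 cells or ∼550 ng) follows from the pool size and the reconstitution volume given under "Sample Preparation". The sketch below only illustrates that arithmetic; it assumes the whole ∼25,000-cell pool ends up in the 18-μl reconstitution volume without losses and reuses the ∼0.1 ng/cell figure implied by the microdissection estimate.

```python
pool_cells = 25_000        # approximate cells per pool
reconstitution_ul = 18.0   # volume used to reconstitute the lyophilized peptides
injection_ul = 4.0         # volume injected per LC-MS analysis
ng_per_cell = 0.1          # assumption: ~400 ng per ~4,000 cells

# Fraction of the pool consumed by one injection (assumes no sample loss).
fraction = injection_ul / reconstitution_ul
cells_injected = pool_cells * fraction
protein_ng = cells_injected * ng_per_cell

print(f"~{cells_injected:.0f} cells, ~{protein_ng:.0f} ng per injection")
```

Under these assumptions one 18-μl sample supports four full 4-μl injections, which is consistent with triplicate analysis of each pool.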

LC-MS/MS—

In the verification study, tryptic digests of 20 different whole tissue lysates (8 OR and 12 PD) were analyzed on a custom-built RPLC system via ESI utilizing an ion funnel (30) coupled to a ThermoFisher Scientific LTQ-Orbitrap mass spectrometer (ThermoFisher Scientific, San Jose, CA). Separation was performed using a custom-made column (60 cm × 75-μm inner diameter) packed in house with Jupiter particles (C18 stationary phase, 5-μm particles, 300-Å pore size). The capillary RPLC system used for peptide separations has been described previously (23, 28). Mobile phase A consisted of 0.1% formic acid in water, and mobile phase B consisted of 100% acetonitrile. The column was equilibrated at 10,000 p.s.i. with 100% mobile phase A. A mobile phase selection valve was switched 50 min after injection to create a near exponential gradient as mobile phase B displaced mobile phase A in a 2.5-ml mixer. A split was used to provide an initial flow rate through the column of ∼400 nl/min. The column was coupled to the mass spectrometer using an in-house manufactured ESI interface with homemade 20-μm-inner diameter chemically etched emitters (31). The heated capillary temperature and spray voltage were 200 °C and 2.2 kV, respectively. Mass spectra were acquired for 80 min over the m/z range 400–2,000 at a resolving power of 100,000. An inclusion list with m/z values corresponding to peptide masses of 100 target proteins was used to select precursor ions. In cases when no targeted precursor ion was present, a maximum of six data-dependent LTQ tandem mass spectra were recorded for the most intense peaks in each survey mass spectrum.

Protein Identification and Quantitation—

FT mass spectra, acquired with the 7-tesla FTICR or LTQ-Orbitrap, were processed using ICR-2LS, Decon2LS (32), and VIPER v3.39 software developed in house (33). The output data files were visualized as two-dimensional displays of peptide monoisotopic mass versus LC elution time (i.e. spectrum number). Next MS peaks with similar measured neutral masses and LC elution times were clustered to form LC-MS features (or unique mass classes). LC elution times were converted into NET to make multiple LC-MS runs comparable (34). The assembled set of LC-MS features was then searched against the human mammary epithelial cell line AMT tag database (35), MCF-7 epithelial breast carcinoma cell line AMT tag database (36), and a composite database for a mixture of human mammary epithelial cells and MCF-7-c18, BT-474, MDA-231, and SKBR-3 breast cancer cell lines (37) using stringent filtering criteria: Xcorr ≥1.5, 2.7, and 3.3 for 1+, 2+, and 3+ fully tryptic peptides, respectively, and Xcorr ≥3.0, 3.7, and 4.5 for 1+, 2+, and 3+ partially tryptic peptides (with a minimum length of 6 amino acids), respectively, as reported previously (23). The LCMSWARP (liquid chromatography-based mass spectrometric warping and alignment of retention times of peptides) algorithm (38) was used to match LC-MS features to AMT tags. A tolerance window of mass measurement accuracy <6 ppm and NET error <0.025 was applied to ensure reliable peptide identification with false discovery rate of ≤10%. Identified peptides were coupled to their corresponding proteins using the human International Protein Index (IPI) databases, 2006 version 3.20 including 61,255 protein entries (discovery phase) and 2008 version 3.39 including 69,731 protein entries (verification phase), and in-house built Qrollup v2.2 software. Two or more constituent peptides were required to confidently identify a protein. 
In the case of proteins with multiple splice isoforms, these isoforms were only specifically listed if they were identified by at least one unique peptide (in addition to overlapping peptide sequences). For average abundance calculation, only highly abundant and, where possible, unique peptides were used. Protein names and descriptions were then converted to TrEMBL, NCBI (National Center for Biotechnology Information), and Swiss-Prot database formats. Protein information was retrieved from European Molecular Biology Laboratory-European Bioinformatics Institute databases. Proteins identified from all available AMT tag databases were assembled into a single list, giving rise to some redundancy. A final non-redundant protein list was generated using ProteinProphet software (SourceForge, Inc.). MS peak intensities were used as a measure of the relative peptide abundances. The mean abundance of the LC-MS features was used, and the relative abundances of constituent peptides were averaged to derive the relative abundance of the parent protein. Tandem mass spectra acquired with the LTQ-Orbitrap were searched against the human IPI 2008 database using TurboSEQUEST v27. We used in-house developed DeconMSn software to correct the monoisotopic masses prior to generation of the dta files used for subsequent database search. Peptide sequences were considered confident with the following filtering criteria: Xcorr of 1.9, 2.2, and 3.75 for 1+, 2+, and ≥3+ peptides and ΔC ≥ 0.1. We also applied the AMT tag strategy to identify peptides in survey mass spectra acquired with the LTQ-Orbitrap by matching the accurate masses and elution times against the composite breast cancer cell line AMT database. Peak intensities measured in high resolution survey spectra were used to retrieve relative abundance information as described above.
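The matching of LC-MS features to AMT tags within the stated tolerance window (mass measurement accuracy <6 ppm and NET error <0.025) can be illustrated as follows. This is a simplified sketch and not the VIPER or LCMSWARP implementation: it omits alignment and warping, and the `AmtTag` class, the function names, and the example peptides and masses are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AmtTag:
    peptide: str   # hypothetical placeholder sequence
    mass: float    # theoretical monoisotopic mass (Da)
    net: float     # normalized elution time


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


def match_feature(mass, net, tags, max_ppm=6.0, max_net=0.025):
    """Return all tags whose mass and NET both fall inside the tolerance window."""
    return [
        t for t in tags
        if abs(ppm_error(mass, t.mass)) < max_ppm and abs(net - t.net) < max_net
    ]


# Toy database with invented entries.
tags = [
    AmtTag("PEPTIDEA", 1000.0000, 0.300),
    AmtTag("PEPTIDEB", 1000.0100, 0.310),
    AmtTag("PEPTIDEC", 1500.5000, 0.300),
]

# An observed feature about 3 ppm from PEPTIDEA and about 7 ppm from PEPTIDEB.
hits = match_feature(1000.0030, 0.305, tags)
print([t.peptide for t in hits])
```

A linear scan is used here for clarity; with tens of thousands of features and tags, sorting the database by mass and searching a narrow window would be the more practical design.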

Immunohistochemistry—

Immunohistochemical validation was performed with an in-house prepared TMA. The TMA was established in close collaboration with a dedicated pathologist (M. A. d. B.) who evaluated all tissues for histology, grade, and Bloom and Richardson scoring (39). Tissue sections of 4 μm were stained overnight at 4 °C for EMMPRIN using a 1:100 diluted antibody directed against the C terminus of the protein (8D6, sc-21746, Santa Cruz Biotechnology, Inc., Santa Cruz, CA). Antigen retrieval was performed prior to antibody incubation for 40 min at 95 °C using DAKO retrieval solution, pH 6 (DakoCytomation, Carpinteria, CA) after which the slides were cooled down to room temperature. Staining was visualized using the anti-mouse EnVision+® System-HRP (DAB) (DakoCytomation) according to the instructions provided by the manufacturer. Scoring of immunostaining was performed by two independent observers who recorded both percentage of positive tumor cells and staining intensity.

Data Analysis and Statistics—

Relative abundance levels of all identified proteins in one sample were intra- and intersample normalized by log2 transformation using in-house developed MultiAlign software v1.1. Subsequently Z-score normalization was applied to each protein across the samples using the formula (value − mean)/standard deviation. Sample sets 1 and 2 were separately Z-score-normalized to correct for time and experimental variation. Normalized values were subjected to class comparison and prediction analysis using BRB-ArrayTools version 3.5.0 beta1 developed by Dr. Richard Simon and Amy Peng Lam. Class comparison involved finding differentially abundant proteins between therapy-sensitive (OR) and therapy-resistant (PD) tumors using a univariate two-sample t test with a significance threshold of 0.05. All data from sample sets 1 and 2 were combined to create a general list of differentially abundant proteins between OR and PD tumors and subjected to a Mann-Whitney Wilcoxon rank sum test performed with the STATA statistical package, release 10.0 (STATA, College Station, TX). Hierarchical clustering of the data was performed using the OmniViz Desktop 3.8.0 package. For clustering, average linkage and the Euclidean similarity metric were used. Principal component analysis (PCA) was performed using Spotfire DecisionSite 8.1, version 14.3. Kaplan-Meier survival analysis as a function of time to progression after the onset of first line tamoxifen treatment as well as correlation with response and other clinical parameters was performed using STATA. The primary end point for the Cox proportional hazards model was disease progression after the onset of tamoxifen treatment.
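The log2 transformation followed by per-protein Z-score normalization, (value − mean)/standard deviation, can be sketched as below. This is an illustration only and does not reproduce MultiAlign; the function name is hypothetical, the abundance values are invented, and the use of the sample standard deviation (n − 1 denominator) is an assumption because the text does not specify which estimator was used.

```python
import math
import statistics


def zscore_log2(abundances):
    """Log2-transform one protein's abundances across samples, then Z-score them."""
    logs = [math.log2(a) for a in abundances]
    mean = statistics.mean(logs)
    sd = statistics.stdev(logs)  # assumption: sample standard deviation
    return [(v - mean) / sd for v in logs]


# Invented abundances for one protein across five samples of a single sample set.
z = zscore_log2([1.0, 2.0, 4.0, 8.0, 16.0])
print([round(v, 3) for v in z])
```

Because sample sets 1 and 2 were normalized separately, the function would be applied once per protein within each set before the sets are combined.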

RESULTS

Protein Identification by Nano-LC-FTICR MS—

Large scale protein identification is a pivotal step in the discovery of a predictive protein profile. We have previously shown that nano-LC-FTICR MS coupled to AMT tag-based protein identification provides the sensitivity and proteome coverage required to achieve this goal (23). In the present study, we describe the clinical applicability of this approach by analyzing the proteome of eight pools of tumor cells procured by LCM from breast cancer tissues derived from 24 tamoxifen therapy-sensitive and 27 therapy-resistant patients. Fig. 1 summarizes our study design.
Fig. 1. Experimental flow chart. All steps from sample preparation, MS analysis, and protein identification are described in the text. nLC, nano-LC; DB, database.

Tryptic peptides corresponding to ∼550 ng of protein lysate were analyzed using nano-LC-FTICR MS. Resulting data sets were visualized in the form of a two-dimensional plot, displaying monoisotopic mass versus spectrum number (NET) as shown in supplemental Fig. 1. On average ∼40,000 LC-MS features were detected in each analysis. These features were matched against previously established breast (cancer) cell line AMT tag databases. On average, ∼20% of LC-MS features matched with peptides in the database and were thus identified as illustrated in supplemental Fig. 1B. For this study, two sample sets were independently prepared and analyzed, using a different set of tumors, as shown in Fig. 2. Sample set 1 consisted of 24 tumors of which 11 were sensitive (OR) and 13 were resistant (PD) to tamoxifen treatment. Sample set 2 contained 27 tumors, 13 OR and 14 PD tissues. Microdissected cells were pooled to average sample heterogeneity and to enable triplicate analysis and were analyzed by nano-LC-FTICR MS. Replicate MS analyses for which technical problems such as clogged tips were observed were excluded from further data analysis, leaving 19 LC-MS data sets for further analysis (Table I). In total, 17,263 peptides corresponding to 2,556 proteins were identified through AMT tag database matching. Between the two sample sets 1,713 proteins, identified by 13,729 peptides, were identical, corresponding to an overlap of 67% (Table I). Protein abundance was computed by averaging intensities of the highly abundant peptides identified for the given protein and, where possible, using unique peptide sequences to account for multiple splice isoforms. Note that it is difficult to correctly assess average protein abundance of highly homologous proteins that may have different abundance levels if these proteins are identified through identical peptides. In those cases, the additional use of unique peptide sequences may partly overcome this problem. 
Information on protein identification, such as filtering scores, assigned peptides and number of peptides used for abundance, mass and NET errors, and additional information is reported in supplemental Table S1. Normalized protein abundances for 1,713 proteins are displayed in supplemental Table S2.
Fig. 2. Data analysis flow chart. Tryptic digests from two independently processed sample sets were analyzed in triplicate by nano-LC-FTICR. MS peak intensity-derived peptide/protein abundances were subjected to statistical analysis to determine differentially abundant proteins between OR and PD samples in both sample sets combined as well as in the two sample sets separately. Subsequently hierarchical clustering and class prediction were performed. nLC, nano-LC.

Table I. FTICR MS summary

Peptide and protein information for LC-MS analyses that were used for further statistical analysis are summarized for sample set 1, set 2, the combined set, and the overlap.

| Data set | Tumor set 1 | Tumor set 2 | Total | Overlap sets 1 + 2 |
|---|---|---|---|---|
| Number of analyzed samples | 5 OR; 4 PD | 5 OR; 5 PD | 10 OR; 9 PD | |
| Total unique peptides | 14,933 | 16,059 | 17,263 | 13,729 |
| Total unique proteins | 1,998 | 2,271 | 2,556 | 1,713 (67%) |
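
The totals and the overlap percentage in Table I are mutually consistent, which can be checked with inclusion-exclusion (total = set 1 + set 2 − overlap). The short sketch below only re-derives the reported numbers; the variable names are illustrative, and it assumes the 67% is the overlap expressed relative to the combined total.

```python
# Unique protein counts reported in Table I.
proteins_set1, proteins_set2, proteins_overlap = 1998, 2271, 1713
# Unique peptide counts reported in Table I.
peptides_set1, peptides_set2, peptides_overlap = 14933, 16059, 13729

# Inclusion-exclusion: size of the union of the two sample sets.
union_proteins = proteins_set1 + proteins_set2 - proteins_overlap
union_peptides = peptides_set1 + peptides_set2 - peptides_overlap

# Overlap as a percentage of all proteins identified in either set.
pct = 100 * proteins_overlap / union_proteins

print(union_proteins, union_peptides, f"{pct:.1f}%")
```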

Discovery of Tamoxifen Therapy Response-associated Proteins—

For the discovery of proteins that were associated with tamoxifen resistance, the 1,713 overlapping proteins were subjected to statistical analysis. The univariate two-sample t test from BRB-ArrayTools was used to search for differentially abundant proteins between OR and PD samples (Fig. 2). Protein abundances from all OR and PD samples from the two sample sets were analyzed together and compared with each other. The BRB analysis resulted in a list of 153 discriminating proteins using a significance threshold of p < 0.05 (the complete BRB analysis list is provided in supplemental Table S3). These 153 proteins were subsequently subjected to a Wilcoxon rank sum test, which narrowed the list down to 100 proteins with a p value <0.05. These 100 differentially abundant proteins were designated as a putative protein profile associated with the type of response to tamoxifen therapy. In this putative protein profile, 46 proteins had higher relative abundance in PD, and 54 had higher abundance in OR tissues. Protein information as well as OR:PD ratios and p values are listed in Table II in which the order and numbering of proteins corresponds to the order in Fig. 4. Our top discriminating protein in the putative protein profile was splice isoform 2 of basigin precursor (number 13 in Table II), also described in the literature as CD147 or EMMPRIN.
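
The two quantities reported per protein, the OR:PD ratio of geometric means and a rank sum p value, can be sketched in plain Python as follows. This is an illustration under stated assumptions and not the BRB-ArrayTools or STATA computation: the two-sample t test step is not reproduced, the p value uses a normal approximation without tie or continuity correction (an exact test is preferable for samples this small), and the function names and abundance values are invented.

```python
import math
import statistics


def geometric_mean_ratio(or_values, pd_values):
    """OR:PD ratio of geometric means, computed from raw (non-log) abundances."""
    log_or = statistics.mean([math.log(v) for v in or_values])
    log_pd = statistics.mean([math.log(v) for v in pd_values])
    return math.exp(log_or - log_pd)


def rank_sum_p(x, y):
    """Two-sided Mann-Whitney p value via the normal approximation (assumption)."""
    n1, n2 = len(x), len(y)
    # U counts pairs with x > y; ties contribute one half.
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))


# Invented abundances for a protein that is higher in resistant (PD) samples.
or_vals = [1.0, 2.0, 4.0, 8.0]
pd_vals = [16.0, 32.0, 64.0, 128.0]

ratio = geometric_mean_ratio(or_vals, pd_vals)
p = rank_sum_p(or_vals, pd_vals)
keep = p < 0.05  # same threshold as used for the putative profile
print(f"OR:PD ratio = {ratio:.4f}, p = {p:.4f}, keep = {keep}")
```

A ratio below 1 indicates higher abundance in PD samples, which is the direction reported for EMMPRIN.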
Table II. Tamoxifen-response protein profile

Name, ratio, p value, IPI number, molecular mass, localization are given on the putative 100-protein profile. The order and numbering are identical to Fig. 4. EPH, ephrin; snRNA, small nucleolar RNA; GDNF, glial cell-derived neurotrophic factor. Bracketed letters are footnote markers.

| No. [a] | Protein description | Ratio of geometric means, OR:PD | Wilcoxon rank sum | IPI | Gene symbol | Molecular mass (kDa) | Localization [b] |
|---|---|---|---|---|---|---|---|
| 1 | EPH receptor B2 | 0.458 | 0.0057 | IPI00252979.7 | EPHB2 (ERK, EPTH3) | 110 | Membrane |
| 2 | Splice isoform 1 of protein kinase C and casein kinase substrate in neurons protein 2 | 0.538 | 0.0366 | IPI00027009.2 | PACSIN2 | 56 | Cytoplasm |
| 3 | 40 S ribosomal protein S4, X isoform [c,d] | 0.478 | 0.0412 | IPI00217030.5 | RPS4X (SCAR) | 29 | Ribosome |
| 4 | Calponin-2 [c] | 0.478 | 0.0134 | IPI00015262.9 | CNN2 | 33.5 | Cytoskeleton |
| 5 | Calgranulin B [c,d] | 0.483 | 0.0127 | IPI00027462.1 | S100A9 (CAGB, MRP14) | 13 | Cytoplasm |
| 6 | Anchor attachment protein 1 | 0.513 | 0.047 | IPI00021594.2 | GPAA1 | 68 | ER |
| 7 | Epididymal secretory protein E1 precursor | 0.54 | 0.0085 | IPI00301579.3 | NCP2 | 16.5 | Secreted protein |
| 8 | Pyrroline-5-carboxylate reductase 1 [c,d] | 0.49 | 0.0274 | IPI00550882.2 | PYCR1 (P5CR1) | 33 | u |
| 9 | Nucleolar protein NOP5 [c] | 0.453 | 0.0202 | IPI00006379.1 | NOP5 (HSPC120) | 60 | Nucleus |
| 10 | Annexin A8 | 0.394 | 0.0127 | IPI00218835.4 | ANXA8 (ANX8) | 37 | u |
| 11 | Lysyl-tRNA synthetase [c,d] | 0.539 | 0.0127 | IPI00014238.2 | KARS (KIAA0070) | 68 | Cytoplasm |
| 12 | Syntaxin 7 | 0.484 | 0.0338 | IPI00289876.2 | STX7 | 30 | Endosome |
| 13 | Splice isoform 2 of basigin precursor [e] | 0.342 | 0.0004 | IPI00019906.1 | BSG (EMMPRIN, CD147) | 42 | Cell membrane |
| 14 | FLJ20625 protein | 0.457 | 0.0411 | IPI00016670.2 | FLJ20625 | 18 | u |
| 15 | Eukaryotic translation initiation factor 5 | 0.491 | 0.003 | IPI00022648.2 | EIF5 | 49 | Cytosol |
| 16 | Splice isoform 1 of Surfeit locus protein 4 [c,d] | 0.449 | 0.0097 | IPI00005737.1 | SURF4 | 30 | ER |
| 17 | Splice isoform 1 of calumenin precursor [c,d] | 0.514 | 0.0222 | IPI00014537.1 | CALU | 38 | ER/Golgi |
| 18 | Coronin-1A [c,d] | 0.507 | 0.0179 | IPI00010133.1 | CORO1A (CLIPINA) | 51 | Actin cytoskeleton |
| 19 | RAS-related protein RAB-10 [c] | 0.516 | 0.0412 | IPI00016513.3 | RAB10 | 22.5 | Cell membrane |
| 20 | Splice isoform long of potential phospholipid-transporting ATPase IIA | 0.519 | 0.0221 | IPI00024368.1 | ATP9A (ATPIIA, KIAA0611) | 119 | Membrane |
| 21 | DNA replication licensing factor MCM2 | 0.528 | 0.0292 | IPI00184330.5 | MCM2 (BM28, CDCL1, KIAA0030) | 102 | Nucleus |
| 22 | Splice isoform long of splicing factor, proline- and glutamine-rich [c,d] | 0.475 | 0.0179 | IPI00010740.1 | SFPQ (PSF) | 76 | Nucleus |
| 23 | Collagen-binding protein 2 precursor [c,d] | 0.534 | 0.0114 | IPI00032140.2 | SERPINH1 (SERPINH2, CBP2, HSP47, Colligin) | 46 | ER |
| 24 | Small nuclear ribonucleoprotein SM D2 [c,d] | 0.495 | 0.0412 | IPI00017963.1 | SNRPD2 (Sm-D2) | 13.5 | Nucleus |
| 25 | 4F2 cell surface antigen heavy chain [c,d] | 0.546 | 0.05 | IPI00027493.1 | SLC3A2 (MDU1) | 58 | Membrane |
| 26 | Growth factor receptor-bound protein 7 | 0.483 | 0.038 | IPI00448767.3 | GRB7 | 60 | u |
| 27 | Copine I | 0.507 | 0.0221 | IPI00018452.1 | CPNE1 (CPN1) | 59 | u |
| 28 | Serum amyloid A protein precursor | 0.442 | 0.0055 | IPI00022368.1 | SAA1 (SAA2) | 13.5 | u |
| 29 | Ephrin type-A receptor 2 precursor | 0.503 | 0.0221 | IPI00021267.1 | EPHA1 (ECK) | 108 | Membrane |
| 30 | T-complex protein 1, η subunit [c,d] | 0.492 | 0.05 | IPI00018465.1 | CCT7 (TCP-1η, CCTH) | 59 | Cytoplasm |
| 31 | Guanine nucleotide-binding protein β subunit 4 [c,d] | 0.53 | 0.0403 | IPI00012451.1 | GNB4 | 37 | u |
| 32 | Metalloprotease 1 | 0.494 | 0.022 | IPI00219613.3 | PITRM1 (hMP1) | 117 | Mitochondrion |
| 33 | C-1-Tetrahydrofolate synthase, cytoplasmic [c] | 0.485 | 0.0275 | IPI00218342.9 | MTHFD1 (MTHFC) | 101 | Cytoplasm |
| 34 | Predicted: septin 8 | 0.54 | 0.0179 | IPI00022082.4 | SEPT8 (KIAA0202) | 50 | u |
| 35 | Acetolactate synthase homolog | 0.465 | 0.0135 | IPI00549240.1 | OR10B1P (ILVBL) | 68 | u |
| 36 | Predicted: hypothetical protein XP_114317 | 0.511 | 0.0085 | IPI00145623.1 | RPL22L1 | 15–21 | u |
| 37 | Prefoldin subunit 6 | 0.512 | 0.0395 | IPI00005657.1 | PFDN6 (HKE2) | 14.5 | Cytosol |
| 38 | NADH-cytochrome b5 reductase [c,d] | 0.395 | 0.0071 | IPI00328415.8 | CYB5R3 (DIA1) | 34 | ER/mitochondrion/cytoplasm |
| 39 | Adenylate kinase isoenzyme 4, mitochondrial | 0.428 | 0.0036 | IPI00016568.1 | AK3L1 (AK3, AK4) | 25 | Mitochondrion |
| 40 | Phosphoprotein enriched in astrocytes 15 [c] | 0.481 | 0.0363 | IPI00014850.3 | PEA15 (PED) | 15 | Cytoplasm |
| 41 | Thioredoxin domain-containing protein 5 | 0.51 | 0.0496 | IPI00171438.2 | TXNDC5 (TLP46, ERp46) | 48 | ER |
| 42 | Coronin-1B [c,d] | 0.425 | 0.0055 | IPI00007058.1 | CORO1B | 54 | Leading edge |
| 43 | Ephrin type-B receptor 3 precursor | 0.455 | 0.0077 | IPI00289329.1 | EPHB3 (ETK2, HEK2) | 110 | Membrane |
| 44 | RAB11 family-interacting protein 1B [c] | 0.518 | 0.0191 | IPI00419433.1 | RAB11 FIP1 (RCP) | 137 | Membrane |
| 45 | Splice isoform 1 of exocyst complex component SEC6 | 0.467 | 0.0231 | IPI00157734.2 | EXOC3 (SEC6, SEC6L1) | 87 | u |
| 46 | Splice isoform 1 of protein C20ORF116 precursor | 0.54 | 0.0266 | IPI00028387.3 | C20ORF116 | 36 | Secreted protein |
| 47 | Hypothetical protein DKFZP434E248 | 0.525 | 0.0221 | IPI00300094.5 | LSG1 | 75 | u |
| 48 | Adenylate kinase 2 isoform A [c,d] | 2.036 | 0.0338 | IPI00215901.1 | AK2 (ADK2) | 26 | Mitochondrion |
| 49 | Trifunctional enzyme α subunit, mitochondrial precursor [c,d] | 2.105 | 0.0071 | IPI00031522.2 | HADHA (HADH) | 83 | Mitochondrion |
| 50 | Nucleosome assembly protein 1-like 1 [c,d] | 1.847 | 0.0275 | IPI00023860.1 | NAP1L1 (NRP) | 45 | Nucleus |
| 51 | Secretory carrier-associated membrane protein 1 | 2.301 | 0.0055 | IPI00005129.6 | SCAMP1 | 40 | Membrane |
| 52 | Sphingosine-1-phosphate lyase 1 [d] | 1.924 | 0.0135 | IPI00099463.2 | SGPL1 | 64 | ER membrane |
| 53 | Splice isoform 1 of glucosamine-fructose-6-phosphate aminotransferase (isomerizing) 1 [c,d] | 2.377 | 0.0101 | IPI00217952.6 | GFPT1 (GFAT) | 79 | u |
| 54 | Ubiquinol-cytochrome c reductase iron-sulfur subunit, mitochondrial precursor [c,d] | 1.991 | 0.0236 | IPI00026964.1 | UQCRFS1 | 30 | Mitochondrion |
55U6 snRNA-associated SM-like protein LSM21.9060.0394IPI00032460.3LSM2 (G7B)10Nucleus
56Lisch protein, isoform 22.1780.0084IPI00409640.1LSR (LISCH)71Membrane
57Splice isoform 1 of epsin 42.3850.0064IPI00291930.5CLINT1 (EPN4)68Cytoplasm
58Endothelial protein C receptor precursor1.9320.0178IPI00009276.1PROCR (EPCR)30Membrane
59Annexin VI isoform 2c,d2.0350.0114IPI00002459.3ANXA675u
60Pyridoxine-5′-phosphate oxidase2.0810.0238IPI00018272.3PNPO30u
61Ectonucleotide pyrophosphatase/phosphodiesterase 1d2.0870.0193IPI00184311.2ENPP1 (NPPS, PC1)105Membrane
62Protein C20ORF178, charged multivesicular body protein 4bc,d1.9970.0275IPI00025974.3CHMP4B (SHAX1)25Cytoplasm
63Occludind1.8480.0178IPI00003373.1OCLN59Membrane
64Adipose most abundant gene transcript 2c2.0990.0141IPI00020017.1APM2(C10ORF116)8u
65Eukaryotic translation initiation factor 3 subunit 42.4130.0062IPI00290460.3EIF3F36u
66Hypothetical protein MGC5395c,d1.8960.05IPI00031605.1AHNAK16u
67Splice isoform 2 of methylcrotonoyl-CoA Carboxylase β chain, mitochondrial precursor1.9020.0066IPI00294140.4MCCC2 (MCCB)58Mitochondrion
68Tubulin β-3 chainc,d1.9070.009IPI00013683.2TUBB3 (TUBB4)50u
69KIAA2014 protein (formin-like protein 1)2.0910.0236IPI00385874.4KIAA2014117u
70Hypothetical protein FLJ906972.3480.0377IPI00329600.3u
71Hypothetical protein, isoform 1 of protein CDV3 homolog1.9860.0193IPI00014197.1CDV322–27u
72ATP synthase oligomycin sensitivity conferral protein, mitochondrial precursorc,d2.1530.0179IPI00007611.1ATP5O (ATPO)23Mitochondrion
73Ubiquilin-21.8430.0412IPI00409659.1UBQLN2 (PLIC2)66Cytoplasm/nucleus
74Ubiquitin and ribosomal protein S27Ac,d2.2730.0071IPI00179330.5RP27A18Ribosome
75Tubulin α-1 chainc,d1.9530.0222IPI00007750.1TUBA150u
76ATP synthase α chain, mitochondrial precursorc,d1.9470.0412IPI00440493.2ATP5O (ATPO)60Mitochondrion
77Chaperonin containing TCP1, subunit 3c,d1.8720.0412IPI00290770.2CCT360Cytoplasm
78Nascent polypeptide-associated complex α subunitc,d2.1590.0143IPI00023748.3NACA (HSD48)23Cytoplasm/nucleus
79Emerin2.1890.0178IPI00032003.1EMD29Nuclear inner membrane
80Hypothetical protein KIAA0152c,d1.9740.0412IPI00029046.1KIAA015232Membrane
81Histone H1.5c,d2.5330.05IPI00217468.2HIST1H1B (H1F5)23Nucleus
82Cation channel TRPM4B2.030.0465IPI00294933.6TRPM4B (TRPM4)134Membrane
83Calcyclinc,d2.1220.0363IPI00027463.1S100A6 (CACY)10Cytoplasm/nucleus
84Splice isoform 2 of GDNF family receptor α 1 precursor2.260.0184IPI00220291.1GFRA1 (GDNFRA, TRNR1)51Cell membrane
85Complement component 1, Q subcomponent-binding protein, mitochondrial precursorc,d2.2220.0274IPI00014230.1C1QBP (GC1QBP)31Mitochondrion
86Chloride intracellular channel protein 4c,d2.0230.0275IPI00001960.2CLIC429Cytoplasm/mitochondrion
87Eukaryotic translation initiation factor 3 subunit 6c,d2.0760.0178IPI00013068.1EIF3E (INT6)52Cytoplasm
88Protein-disulfide isomerase A4 precursorc,d2.1570.0275IPI00009904.1PDIA4 (ERP70)73ER
89Hypothetical protein MGC5352c,d1.8670.0394IPI00063242.3PGAM528u
90Splice isoform 1 of polypeptide N-acetylgalactosaminyltransferase 3d2.2290.0066IPI00004670.1GALNT373Golgi
91OTTHUMP00000028732 (thioredoxin, mitochondrial precursor)1.8230.0462IPI00017799.3TXN2 (TRX2)18–22Mitochondrion
92Fatty acid-binding protein, epidermal2.2030.0075IPI00007797.1FABP515Cytoplasm
93Programmed cell death 6-interacting protein, PDCD6IP proteinc,d2.2020.0199IPI00246058.3PDCD6IP (AIP1)97Cytoplasm
94Ezrin-radixin-moesin-binding phosphoprotein 50c,d2.420.0025IPI00003527.3SLC9A3R1 (EBP50, NHERF1)39Intracytoplasmic membrane, actin cytoskeleton
95Splice isoform 1 of ubiquitin thiolesterase protein1.9740.05IPI00549574.2OTUB1u
96Endozepinec,d2.1840.0211IPI00010182.3ACBP (DBI, EZ)10u
97Phosphoribosylformylglycinamidine synthase2.2210.0177IPI00004534.3PFAS (KIAA0361)145Cytoplasm
98Histidine triad nucleotide-binding protein 1c1.8530.0274IPI00239077.4HINT1 (PKCI1)14Cytoplasm/nucleus
99BAG family molecular chaperone regulator-31.970.0175IPI00000644.3u
100Exocyst complex component SEC8d1.8560.0109IPI00059279.5EXOC4 (KIAA1699, SEC8)110u

[a] Numbering according to Fig. 4.

[b] u, data unknown in database; ER, endoplasmic reticulum.

[c] Presence verified in individual tumors by MS/MS.

[d] Presence verified in individual tumor MS survey spectrum and quantified by AMT database match.

[e] Validated by immunohistochemistry.

Fig. 4.

Hierarchical clustering of OR and PD samples. Red and blue colors indicate relative high and low protein abundance, respectively, and white equals median abundance. Gray bars represent sample and protein clusters. The length of the tree arms is inversely correlated with similarity. Proteins are listed vertically from top to bottom and numbered from 1 to 100 in the same order as in Table II.

Multiple isoforms of EMMPRIN have been described that are identical in their C-terminal sequence but vary in length and sequence at the N-terminal part of the protein (EntrezGene 682). In our final, non-redundant protein list we report the identification of isoforms 1 and 2 by five and six peptides, respectively (supplemental Table S1). Only one of the six peptides (AAGTVFTTVEDLGSK) was unique for isoform 2. Isoform 1 is the longer variant of 385 amino acids, whereas isoform 2 lacks amino acids 24–139. Peptide AAGTVFTTVEDLGSK spans the splice junction: its first two amino acids (Ala-Ala) are positioned at residues 22 and 23, and its third amino acid (Gly) is positioned at residue 140 of the full-length sequence. Therefore, this peptide sequence is specific for isoform 2. The raw mass spectrum for EMMPRIN peptide AAGTVFTTVEDLGSK (Mr = 1,494.75 and m/z = 748.38) showed a 3-fold higher intensity for the PD sample (Fig. 3B) in comparison with the OR sample (Fig. 3A). The spectra also showed that there is no significant difference in peak intensity between OR and PD for the second feature appearing at m/z 749.76, suggesting that the observed difference in peak intensity for the AAGTVFTTVEDLGSK peptide is not an artifact introduced by, for example, loading differences. It needs to be mentioned, however, that we did not use single spectra to determine abundance ratios of peptides but LC-MS feature intensity, which is defined as the sum of intensities of all members of the unique mass class. Using LC-MS feature intensity, we investigated the relative abundance of three EMMPRIN peptides across all of the samples. The peptides AAGTVFTTVEDLGSK and GGVVLKEDALPGQK were present in virtually all samples and clearly showed a 2–3-fold increase in abundance in PD samples. The SESVPPVTDWAWYK peptide was only present in a few samples but showed the same increase in PD (Fig. 3C). This increase in relative peptide abundance correlated very well with the observed 2-fold increase of EMMPRIN at the protein level (Fig. 3D).
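The relation between the peptide mass and the doubly charged m/z quoted above can be checked with a few lines of arithmetic. The sketch below assumes an unmodified peptide and standard monoisotopic residue masses; the function names are illustrative and not part of the original analysis pipeline. Under these assumptions the neutral mass comes out near 1,494.76 Da and the [M + 2H]2+ ion near m/z 748.39, in line with the reported m/z of 748.38.

```python
# Monoisotopic residue masses (Da) for the amino acids in the peptide.
RESIDUE_MASS = {
    "A": 71.03711, "G": 57.02146, "T": 101.04768, "V": 99.06841,
    "F": 147.06841, "E": 129.04259, "D": 115.02694, "L": 113.08406,
    "S": 87.03203, "K": 128.09496,
}
WATER = 18.01056   # added once per peptide (free N and C termini)
PROTON = 1.00728   # mass of the charge carrier


def monoisotopic_mass(sequence):
    """Neutral monoisotopic mass of an unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(neutral_mass, charge):
    """m/z of the [M + zH]z+ ion."""
    return (neutral_mass + charge * PROTON) / charge


peptide = "AAGTVFTTVEDLGSK"  # isoform 2-specific EMMPRIN peptide from the text
mass = monoisotopic_mass(peptide)
print(f"{mass:.2f} Da, [M+2H]2+ at m/z {mz(mass, 2):.2f}")
```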
Fig. 3.

EMMPRIN differential peptide and protein abundance. Representative mass spectra of an LC-MS feature identified as EMMPRIN peptide AAGTVFTTVEDLGSK in OR (A) and PD (B) indicate a 3-fold increase in intensity for the PD sample. C, relative abundance ratios of three EMMPRIN peptides in OR (gray) and PD (black) samples. D, average relative abundance of EMMPRIN protein in all OR and PD samples, shown as a box-whisker plot in which each dot represents the value of a sample and the error bars show the highest and lowest value. The line in the box represents the mean value. The p value was calculated using the Wilcoxon rank sum test.

To test the predictive power of the putative profile of 100 proteins within the two sample sets, unsupervised hierarchical clustering was performed, represented as a tree-shaped dendrogram (Fig. 4). Vertically the different proteins are listed numbered from 1 to 100 from top to bottom. Horizontally the different samples are listed. Based on their average relative abundances, OR and PD samples were effectively separated from each other as illustrated by the two main clusters in the dendrogram (Fig. 4). Separation of the samples was based on higher (red) and lower (blue) than median abundance of each protein within all samples. Furthermore, the length of the dendrogram arms shows that, as expected, some samples (replicates) are more similar to each other than to the rest of the samples. The order of the proteins numbered from 1 to 100 is identical to the order and numbering in Table II. Similar results were obtained by PCA (supplemental Fig. 2). In the PCA, complex information is reduced to three principal components, represented by the x, y, and z axes. Samples are visualized in a three-dimensional plot and cluster according to their relative protein abundance. From this PCA it is clear that, in this sample set, OR (green squares) and PD samples (red squares) were completely separated from each other based on their protein abundance profile. To verify that individual peptides showed differential abundance similar to that of their corresponding proteins, we performed hierarchical clustering on all peptides corresponding to the putative 100-protein profile. As expected, clustering based on peptides resembled the results of protein clustering (data not shown).
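As a rough illustration of how hierarchical clustering groups samples by abundance profile, the following sketch merges the closest pair of clusters until two remain. It is not the software or the settings used in the study: Euclidean distance, average linkage, and the toy median-centered values are all assumptions chosen for brevity, and the sample names are hypothetical.

```python
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two abundance profiles."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def average_distance(profiles, cluster_a, cluster_b):
    """Mean pairwise distance between the members of two clusters."""
    total = sum(euclidean(profiles[x], profiles[y])
                for x in cluster_a for y in cluster_b)
    return total / (len(cluster_a) * len(cluster_b))


def agglomerate(profiles, n_clusters=2):
    """Merge the closest pair of clusters until n_clusters remain."""
    clusters = [[name] for name in profiles]
    while len(clusters) > n_clusters:
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_distance(profiles, *pair))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(a + b)
    return sorted(sorted(c) for c in clusters)


# Hypothetical median-centered log abundances for three proteins per sample.
toy_profiles = {
    "OR_1": [1.0, 0.8, -0.9],
    "OR_2": [0.9, 1.1, -1.0],
    "OR_3": [1.2, 0.7, -0.8],
    "PD_1": [-1.0, -0.9, 1.1],
    "PD_2": [-0.8, -1.2, 0.9],
}
groups = agglomerate(toy_profiles)
print(groups)
```

With these toy values the two remaining clusters coincide with the OR and PD labels, which is the kind of separation the dendrogram is meant to show.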

Verification of Differential Protein Abundance—

Our next goal was to verify the presence and abundance level of all profile proteins in separate tumor samples. Because we used pooled microdissected tumor cells for the discovery study, information on the single tumor level as well as the relation with clinical factors was lost. To verify our putative profile proteins, we performed targeted LC-MS/MS analyses using an inclusion list (supplemental Table S4) compiled from the m/z values of the peptides that corresponded to the 100 putative profile proteins. We prepared whole tissue protein lysates from tumors (eight OR and 12 PD) with a high tumor cell content (>70%) so that microdissection could be omitted. Using this approach, we identified and therefore verified the presence of 50 proteins from the inclusion list. In addition, peak intensities of survey mass spectra (on average ∼14,000 LC-MS features per sample) were used for quantitation. In this case, peptide identity was derived by matching LC-MS features from survey spectra to the composite breast cancer cell line AMT tag database. This resulted in the identification and quantitation of 47 target proteins of which 42 were also identified by MS/MS sequencing (Fig. 5). Overall a total of 55 proteins (50 by MS/MS sequencing and five additional by LC-MS feature (survey mass spectra) matching with the AMT database of the 100-putative protein list) were verified in an independent targeted LC-MS/MS experiment. The 47 proteins for which relative abundance was available were used in further analyses. Surprisingly the top discriminating protein in the original profile, EMMPRIN, was not identified through this targeted approach. Raw MS/MS data obtained for verified proteins and relative abundance ratios for verified proteins are listed in supplemental Tables S5 and S6, respectively.
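The idea of an inclusion list built from peptide m/z values can be sketched as a simple tolerance match. This is only an illustration: the 10 ppm window, the function names, and the second list entry are assumptions, and the actual instrument control software is not described here. Only the EMMPRIN m/z of 748.38 is taken from the text.

```python
def ppm_error(observed_mz, target_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - target_mz) / target_mz * 1e6


def match_targets(observed_mz, inclusion_list, tol_ppm=10.0):
    """Names of inclusion-list entries within tol_ppm of the observed m/z."""
    return [name for name, target in inclusion_list.items()
            if abs(ppm_error(observed_mz, target)) <= tol_ppm]


inclusion_list = {
    "EMMPRIN AAGTVFTTVEDLGSK (2+)": 748.38,  # m/z reported in the text
    "hypothetical target peptide": 650.30,   # placeholder value
}

hit = match_targets(748.384, inclusion_list)   # about 5 ppm from the target
miss = match_targets(748.400, inclusion_list)  # about 27 ppm from the target
print(hit, miss)
```

Matching on accurate mass alone, as here, cannot distinguish co-eluting species of similar mass, which is one reason an additional elution time criterion would be expected to help.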
Fig. 5.

Verification of putative profile proteins. Putative profile proteins were verified in non-microdissected tumor samples through targeted MS/MS. Peptide abundance information was retrieved from peak intensities of MS survey spectra. For protein identification MS survey spectra were matched with the AMT database (DB).

Relative abundances of the 47 verified proteins were statistically analyzed using either Wilcoxon rank sum or Student's t test depending on the outcome of a test for normality based on skewness and kurtosis. Three proteins, ectonucleotide pyrophosphatase/phosphodiesterase 1 (ENPP1; number 61 in Table II), guanine nucleotide-binding protein β subunit 4 (GNB4; number 31 in Table II), and ubiquinol-cytochrome c reductase iron-sulfur subunit mitochondrial precursor (UQCRFS1; number 54 in Table II) were significantly differentially abundant between OR and PD with p values of 0.043 (Fig. 6A), 0.026 (Fig. 6B), and 0.036 (not shown), respectively (Table III). ENPP1 was not detected in any of the OR samples but in five of 12 PD samples (Fig. 6A), whereas GNB4 (Fig. 6B) and UQCRFS1 were higher in OR samples (Table III). In addition, eukaryotic translation initiation factor 3 subunit 6/E (EIF3E) (Fig. 6C), occludin (OCLN), splice isoform 1 of surfeit locus protein 4 (SURF4), thioredoxin domain-containing protein 5 precursor (TXNDC5), and ubiquitin and ribosomal protein S27A (RP27A) showed a trend toward differential abundance (0.05 < p < 0.1). Mean abundance and 95% confidence intervals (CIs) are listed in Table III. It needs to be mentioned that analysis groups for verification were rather small (eight OR versus 12 PD); thus the outcomes may change when more samples are analyzed in future studies. Subsequently relative abundance of all 47 verified proteins was coupled to clinical end points of patients. Of these 47 proteins, ENPP1, EIF3E, and GNB4 showed significant association with progression-free survival, whereas UQCRFS1 did not, although it did associate with response as described above. Kaplan-Meier analysis as a function of ENPP1 status showed that the presence of ENPP1 was significantly correlated with shorter progression-free survival after the start of tamoxifen treatment with a hazard ratio (HR) of 1.63 (95% CI, 1.15–2.32; p = 0.005) (Fig. 6D). Survival analyses as a function of EIF3E and GNB4 levels were performed after dividing the relative abundance levels into low + median versus high because low and median level survival curves were superimposable. High levels of EIF3E and GNB4 were significantly associated with prolonged progression-free survival with HRs of 0.22 (95% CI, 0.07–0.71; p = 0.01) (Fig. 6F) and 0.24 (95% CI, 0.07–0.79; p = 0.02) (Fig. 6E), respectively. In conclusion, we were able to associate high GNB4 and EIF3E levels with a favorable outcome and ENPP1 with an adverse outcome on tamoxifen therapy.
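The ENPP1 comparison is a useful worked case for the rank sum test because the statistic depends only on ranks: eight OR samples without detectable ENPP1 against 12 PD samples, five of which have detectable signal. The sketch below is a generic two-sided Wilcoxon rank sum test using the normal approximation with tie correction and no continuity correction; these choices are assumptions, as the exact variant used in the study is not stated. The nonzero values are placeholders (assumed distinct), since only their order matters. Under these assumptions the test gives p of about 0.043, in agreement with the value reported for ENPP1.

```python
import math


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank sum test (normal approximation, tie-corrected)."""
    pooled = sorted(x + y)
    n = len(pooled)
    ranks = {}      # value -> average rank of its tie group
    tie_term = 0    # sum of t^3 - t over tie groups
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j] == pooled[i]:
            j += 1
        ranks[pooled[i]] = (i + 1 + j) / 2   # mean of ranks i+1 .. j
        tie_term += (j - i) ** 3 - (j - i)
        i = j
    n1, n2 = len(x), len(y)
    w = sum(ranks[v] for v in x)             # rank sum of the first group
    mean = n1 * (n + 1) / 2
    var = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (w - mean) / math.sqrt(var)
    p = math.erfc(abs(z) / math.sqrt(2))
    return w, z, p


or_values = [0.0] * 8                              # ENPP1 not detected in OR
pd_values = [0.0] * 7 + [1.0, 2.0, 3.0, 4.0, 5.0]  # placeholder nonzero values
w, z, p = rank_sum_test(or_values, pd_values)
print(f"W = {w:.0f}, z = {z:.2f}, p = {p:.3f}")
```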
Fig. 6.

Clinical association of verified proteins. Differences in relative abundance ratios between OR (red) and PD (green) tumors for ENPP1 (A), GNB4 (B), and EIF3E (C) are shown. Shown is the Kaplan-Meier survival analysis of time to progression upon tamoxifen treatment for recurrent breast cancer patients according to LC-MS abundance levels. For ENPP1, absence (abs) (green line) and presence (pres) (red line) of abundance was compared (D). For GNB4 (E) and EIF3E (F) low abundance and medium abundance were grouped (green line) and compared with high abundance (red line). The number of patients at risk in each group is displayed together with the hazard ratio, 95% confidence interval, and p value. Avg, average; Cum, cumulative; CI, confidence interval.

Table III.

Verified differentially abundant proteins

Shown are a subset of putative profile proteins verified in targeted MS/MS experiment with a p value <0.1.

Protein description | Gene symbol | Higher in | Δ mean/median (95% CI) | p value
Guanine nucleotide-binding protein β subunit 4 | GNB4 | OR | −35.1 (−65.2 to −4.8) | 0.026
Ubiquinol-cytochrome c reductase iron-sulfur subunit, mitochondrial precursor | UQCRFS1 | OR | −31.6 (−61.0 to −2.3) | 0.036
Ectonucleotide pyrophosphatase/phosphodiesterase 1 [a] | ENPP1 | PD | 0 (0–1.3) | 0.043
Thioredoxin domain-containing protein 5 precursor [a] | TXNDC5 | PD | 2.8 (−0.02 to 20.3) | 0.081
Eukaryotic translation initiation factor 3 subunit 6 | EIF3E | OR | −2.2 (−4.8 to 0.3) | 0.085
Occludin [a] | OCLN | OR | 0 (−1.6 to 0) | 0.087
Splice isoform 1 of O15260 Surfeit locus protein 4 | SURF4 | PD | 3.7 (−0.8 to 8.3) | 0.098
Ribosomal protein S27A | RP27A | OR | −168.1 (−376.3 to 40.2) | 0.100

[a] Wilcoxon rank sum.

Validation of EMMPRIN and Association with Clinical End Points—

A pivotal step in the process of biomarker discovery is the validation of putative markers in independent patient cohorts and preferably by using a different methodology, such as using immunohistochemistry (IHC). In our case, validation was only performed for the top discriminating protein, EMMPRIN, because there are no appropriate antibodies available for ENPP1, EIF3E, and GNB4 or for any of the other differentially abundant proteins we discovered. The antibody we used in this study was directed against the C-terminal part of EMMPRIN and therefore recognizes all splice isoforms. To independently validate differential EMMPRIN protein abundance between OR and PD patients, IHC was performed using our primary breast cancer TMA. Among the different tissues, there were 156 breast tumors of patients who received first line tamoxifen therapy after recurrence. This set of tumors had no overlap with the discovery set tumors. In total, 130 tumors showed reproducible IHC staining on the TMA when assays were performed in triplicate. Patient and tumor characteristics are described in Table IV. Different staining outcomes were categorized as undetectable, weak, medium, and strong membrane staining. Weak membrane staining, present in <10% of tumor cells, was scored as 1+. Medium membrane staining, present in 10–50% of tumor cells, was scored as 2+. Strong membrane staining, observed in >50% of tumor cells, was assigned score 3+ (Fig. 7). These scoring outcomes were subsequently related to clinical endpoints. We observed that none of the CR tumors displayed EMMPRIN staining, whereas highest EMMPRIN staining (3+) was observed in PD tumors (Table V). This finding, originally indicated using LC-MS-based technology, was thus confirmed by IHC. For comparison, we defined a “clinical benefit” group composed of tumors showing NC for >6 months, CR, and PR and a “no clinical benefit” group representing NC for ≤6 months and PD tumors. 
Absence of detectable EMMPRIN levels showed a significant clinical benefit with an odds ratio of 2.98 (95% CI, 1.32–6.73; p = 0.009). The presence of detectable EMMPRIN levels was more frequently observed in premenopausal women (χ2 = 11.7; p < 0.001) and in patients with a shorter disease-free interval (χ2 = 11.2; p = 0.004) defined as the time from primary diagnosis to recurrence (Table VI). In addition, Cox regression analysis showed that presence of EMMPRIN significantly correlated with shorter progression-free survival from the start of tamoxifen treatment (HR, 1.87; 95% CI, 1.25–2.80; p = 0.002) (Fig. 8). Thus, high EMMPRIN levels correlate with poor outcome on first line tamoxifen treatment.
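The odds ratio for clinical benefit can be recomputed from the IHC score counts in Table V by collapsing CR, PR, and NC > 6 months into "benefit" and NC ≤ 6 months and PD into "no benefit". The sketch below uses the Woolf (log) method for the confidence interval, which is an assumption; the interval in the text may have been derived differently (for example, by logistic regression). It gives an odds ratio of about 2.98 with an interval of roughly 1.32 to 6.74, close to the reported 1.32–6.73.

```python
import math

# Table V counts per EMMPRIN score, in the column order
# CR, PR, NC > 6 months, NC <= 6 months, PD.
scores = {
    0: [4, 20, 40, 8, 25],
    1: [0, 4, 6, 3, 12],
    2: [0, 0, 2, 0, 4],
    3: [0, 0, 1, 0, 1],
}


def benefit_split(rows):
    """Return (clinical benefit, no clinical benefit) counts for the rows."""
    benefit = sum(sum(r[:3]) for r in rows)      # CR + PR + NC > 6 months
    no_benefit = sum(sum(r[3:]) for r in rows)   # NC <= 6 months + PD
    return benefit, no_benefit


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio (a*d)/(b*c) with a Woolf log-scale confidence interval."""
    estimate = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(estimate) - z * se)
    high = math.exp(math.log(estimate) + z * se)
    return estimate, low, high


absent = benefit_split([scores[0]])                       # (64, 33)
present = benefit_split([scores[s] for s in (1, 2, 3)])   # (13, 20)
odds, low, high = odds_ratio_ci(*absent, *present)
print(f"OR = {odds:.2f} (95% CI, {low:.2f} to {high:.2f})")
```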
Fig. 7.

Immunohistochemical staining of EMMPRIN. EMMPRIN immunohistochemical staining was performed on an independent sample set of 156 tissues using TMA. A, overview of TMA; B, negatively stained tissue; C, 1+ membrane stain; D, 2+ stain; E, 3+ stain. Overview picture was taken at 5× magnification; other pictures were taken at 100× magnification.

Table V.

IHC score of EMMPRIN

The average (Avg) EMMPRIN score in tumors grouped by therapy response is shown.

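Avg score | CR | PR | NC > 6 months | NC ≤ 6 months | PD | Total
0 | 4 | 20 | 40 | 8 | 25 | 97
1 | 0 | 4 | 6 | 3 | 12 | 25
2 | 0 | 0 | 2 | 0 | 4 | 6
3 | 0 | 0 | 1 | 0 | 1 | 2
Total | 4 | 24 | 49 | 11 | 42 | 130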
Table VI.

EMMPRIN correlation with clinical factors

EMMPRIN protein abundance correlated with menopausal status and disease-free interval is shown.

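EMMPRIN | n (%) | Menopausal status: Pre (%) | Menopausal status: Post (%) | Disease-free interval ≤12 months (%) | Disease-free interval 12–36 months (%) | Disease-free interval ≥36 months (%)
Absent | 97 (74.6) | 22 (55.0) | 75 (83.3) | 9 (56.3) | 39 (66.1) | 49 (89.1)
Present | 33 (25.4) | 18 (45.0) | 15 (16.7) | 7 (43.7) | 20 (33.9) | 6 (10.9)
Total | 130 (100) | 40 (30.7) | 90 (69.2) | 16 (12.3) | 59 (45.3) | 55 (42.3)
Pearson χ2 |  | 11.7 (menopausal status) |  | 11.2 (disease-free interval) |  |
p value |  | <0.001 |  | 0.004 |  |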
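The Pearson χ2 values in Table VI can be reproduced directly from the counts. The sketch below is a plain implementation of the statistic without continuity correction (an assumption about how the reported values were obtained); the closed-form p values are valid only for 1 and 2 degrees of freedom, which is all these two tables need.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_p_value(stat, df):
    """Upper-tail p value; closed forms exist for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("only df = 1 or 2 handled in this sketch")


# Rows: EMMPRIN absent, present. Columns follow Table VI.
menopausal = [[22, 75], [18, 15]]                # pre, post
disease_free = [[9, 39, 49], [7, 20, 6]]         # <=12, 12-36, >=36 months

for label, table in (("menopausal status", menopausal),
                     ("disease-free interval", disease_free)):
    stat, df = pearson_chi2(table)
    print(f"{label}: chi2 = {stat:.1f}, df = {df}, p = {chi2_p_value(stat, df):.4f}")
```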
Fig. 8.

Kaplan-Meier survival analysis. EMMPRIN abundance was measured by IHC using TMA and was correlated to time to progression after the onset of first line tamoxifen treatment. Absence (abs) of detectable EMMPRIN (green line) was compared with presence (pres) (1+, 2+, and 3+) of EMMPRIN staining (red line).

DISCUSSION

We performed a comparative proteomics study using nano-LC-FTICR MS analyses of tamoxifen therapy-resistant and therapy-responsive tumor cells isolated from breast cancer tissue by LCM. This approach proved to be extremely powerful as exemplified by identification of several thousand unique proteins from sub-μg quantities of clinically relevant samples. These efforts resulted in the identification of a putative protein profile that is associated with the type of response to tamoxifen therapy. Furthermore we validated our top discriminating protein, EMMPRIN, in an independent patient cohort and confirmed its association with tamoxifen therapy resistance in recurrent breast cancer. Many different proteomics technologies are available nowadays that all aid in the quest for cancer biomarkers. The method of choice will depend on the type of question asked, the type of material being investigated, and the availability of resources. Several studies have shown that the combination of dedicated nano-LC separation coupled to high end FT MS offers the best potential for in-depth analysis of limited sample quantity, which is usually the case with clinical material (23, 28, 36, 37, 40). In the present study, we used nano-LC-FTICR MS and a composite breast cancer cell line AMT tag database for the identification of peptides from as little as ∼550 ng of protein lysate. Overall we identified over 17,000 unique peptides corresponding to over 2,500 unique proteins, a significantly larger fraction of the proteome than attainable with more conventional proteomics techniques (20, 22). Furthermore we believe there is more to gain if a breast cancer tissue-specific AMT tag database becomes available. Although breast cancer cell lines represent aspects of normal and malignant breast tissue, it is well known that cultured cell lines have quite a distinct proteomic profile compared with primary cells or tissues. This was clearly demonstrated by Ornstein et al. (18) who compared proteomes of microdissected prostate tumor cells with proteomes of matching cell lines from the same patient. They showed that protein expression was strikingly altered in cultured cells, which had less than 20% of proteins in common with uncultured cells (18). Therefore, it is very well possible that proteins involved in therapy resistance of breast tumors are not expressed in cell lines and thus are missing from the AMT tag database used in this study. To overcome this problem, we are currently constructing an AMT tag database from breast cancer tissues using a selection of tumors that have distinct phenotypic characteristics. A breast cancer tissue-specific AMT tag database will most likely increase the number of identified peptides (i.e. proteome coverage) in LC-MS analyses, thus increasing our chances of identifying relevant biomarkers. Proteome coverage could even be further improved using “smart MS/MS,” e.g. by fragmenting currently unidentified LC-MS features.

Discovery and Verification of Putative Tamoxifen Therapy Response-associated Proteins—

The putative protein profile described in this study consists of 100 proteins involved in a variety of biological processes. These proteins can be categorized into different functional classes, such as structural proteins, signaling proteins and kinases, metabolic enzymes, proteins involved in apoptosis, and others (see Table II). Several of the putative profile proteins (NAP1L1, pyridoxine-5′-phosphate oxidase, and UQCRFS1) have been previously associated with tamoxifen therapy resistance in breast cancer (41, 42) or chemotherapy resistance (SGPL1 and TUBB3) in vitro and in clinical specimens (43–45) and with aggressiveness of breast cancer (S100A6, S100A9, CLIC4, EBP50, and OCLN) (46–51). Because the discovery of putative tamoxifen response-predictive proteins was performed in pooled samples, it was important to verify the presence and relative abundance of these proteins in each individual tumor tissue. Using a targeted MS/MS approach, we successfully identified 55 profile proteins in individual, non-microdissected tumor lysates and retrieved quantitative information for 47 of these proteins. Clearly 45 putative proteins were left unverified in individual tumor samples, including our top discriminating protein, EMMPRIN. The relatively low verification rate can be justified by the use of different samples and LC-MS platforms for the discovery and verification part of the study. Microdissected tumor cell lysates were analyzed by ultranarrow LC coupled to FTICR for discovery, whereas whole tissue lysates representing a mixture of cell types were analyzed by a standardized LC-MS/MS platform for verification. Nano-LC-FTICR analysis yielded an average of ∼40,000 LC-MS features, whereas LC-MS/MS Orbitrap analysis detected on average ∼14,000 LC-MS features. Therefore, the nano-LC-FTICR platform yielded ∼3× higher proteome coverage and, one can speculate, resulted in a similar improvement in sensitivity (i.e. limit of detection). 
Similarly, we only used information on accurate mass in targeted MS/MS experiments because it was not possible to use NET information as an inclusion criterion with the software version available at the time. The addition of NET information as an inclusion criterion will most likely increase the success rate of target peptide identification through MS/MS in future studies using updated instrument control software. The compilation of these effects (i.e. LC-MS platform with lower overall sensitivity and inadequate targeted MS/MS strategy) resulted in a failure to confirm the identity of our top discriminating protein as EMMPRIN in the verification study. Nevertheless, the presence of 55 putative profile proteins was verified, and based on the abundance ratios, ENPP1, UQCRFS1, and GNB4 were confirmed to be significantly differentially abundant between OR and PD tumors. In addition, ENPP1, EIF3E, and GNB4 were significantly associated with time to progression upon first line tamoxifen treatment of recurrent breast cancer. So far, no link between ENPP1 or GNB4 and breast cancer or response to tamoxifen has been described, although ENPP1 overexpression and polymorphisms have been repeatedly associated with insulin resistance and obesity (52, 53). Obesity is a risk factor for breast cancer (54), and insulin resistance may be linked to tamoxifen therapy resistance. EIF3E protein expression has been shown to be significantly decreased in breast cancer, which was frequently associated with loss of heterozygosity at the Int-6/eIF3-p48 locus (55). EIF3E is ubiquitously expressed and highly conserved, and it encodes the p48 subunit of the translation initiation factor eIF3, also named INT6. In a multiplex tissue immunoblotting study by Traicoff et al. (56), EIF3E expression was determined in 124 breast cancer tissues. It was shown that breast tissues clustered according to high or low EIF3E expression, and this segregation was not dependent on tumor stage. 
Furthermore EIF3E expression positively correlated with tumor suppressors, such as p53, suggesting a function in the same signaling pathway (56). It was postulated that EIF3E has diverse functions in cell growth in addition to translation initiation, including tumor suppressive properties. This was particularly clearly shown in studies where truncation or knockdown of EIF3E induced angiogenesis and tumor formation (57, 58). This tumor-suppressive role correlates well with the elevated abundance of EIF3E in OR tumors and its contribution to prolonged progression-free survival upon tamoxifen treatment.

Validation of EMMPRIN—

The validation study was focused on our top discriminating protein, EMMPRIN, which is known to be involved in breast cancer and for which an appropriate antibody is available. EMMPRIN has been previously described to play a role in tumor cell invasion and metastasis (59). In particular, it acts through up-regulation of the urokinase-type plasminogen activator system, thereby promoting tumor cell invasion (60). In an immunohistochemical study using high density breast cancer tissue microarrays, it was shown that positive EMMPRIN staining correlated with various histopathological parameters, in particular with decreased tumor-specific survival in postmenopausal patients (61). EMMPRIN is up-regulated in many types of cancer (62), supporting the previous findings that the involvement of EMMPRIN in urokinase-type plasminogen activator deregulation may be a universal phenomenon in tumorigenesis and is not restricted to breast cancer. In addition, EMMPRIN has been recently shown to predict response and survival following cisplatin-containing chemotherapy in patients with advanced bladder cancer (63). An IHC analysis in 101 advanced bladder cancer patients showed that high EMMPRIN expression strongly correlated with shorter survival time, in particular in patients with metastatic tumors, and that response to chemotherapy could also be predicted with an odds ratio of 4.41 (63). In our study, high expression of EMMPRIN was more frequently observed in PD than OR tumors, and it was significantly associated with early tumor progression after the onset of first line tamoxifen treatment in recurrent breast cancer. Combining our results with previous findings, one can speculate that EMMPRIN-induced tumor aggressiveness may result in therapy resistance in general (i.e. tamoxifen and chemotherapy) and that this mechanism is not restricted to breast cancer.

Concluding Remarks—

In this study we demonstrated quantitative analysis of minute amounts of clinically relevant tumor tissues using ultrasensitive nano-LC-FTICR technology. These analyses have put forward a putative protein profile that may predict the outcome of response for tamoxifen therapy in breast cancer patients. Whether this profile as a whole is a good predictor for tamoxifen therapy response in a larger, independent group of patients and whether it is applicable to chemotherapy as well will be the subject of further investigations.
  63 in total

1.  Genes associated with breast cancer metastatic to bone.

Authors:  Marcel Smid; Yixin Wang; Jan G M Klijn; Anieta M Sieuwerts; Yi Zhang; David Atkins; John W M Martens; John A Foekens
Journal:  J Clin Oncol       Date:  2006-04-24       Impact factor: 44.544

2.  Quantitative profiling of drug-associated proteomic alterations by combined 2-nitrobenzenesulfenyl chloride (NBS) isotope labeling and 2DE/MS identification.

Authors:  Keli Ou; Djohan Kesuma; Kumaresan Ganesan; Kun Yu; Sou Yen Soon; Suet Ying Lee; Xin Pei Goh; Michelle Hooi; Wei Chen; Hiroyuki Jikuya; Tetsuo Ichikawa; Hiroki Kuyama; Ei-ichi Matsuo; Osamu Nishimura; Patrick Tan
Journal:  J Proteome Res       Date:  2006-09       Impact factor: 4.466

3.  High incidence of EMMPRIN expression in human tumors.

Authors:  Sabine Riethdorf; Natalie Reimers; Volker Assmann; Jan-Wilhelm Kornfeld; Luigi Terracciano; Guido Sauter; Klaus Pantel
Journal:  Int J Cancer       Date:  2006-10-15       Impact factor: 7.396

4.  Breast cancer proteomics by laser capture microdissection, sample pooling, 54-cm IPG IEF, and differential iodine radioisotope detection.

Authors:  Hans Neubauer; Susan E Clare; Raffael Kurek; Tanja Fehm; Diethelm Wallwiener; Karl Sotlar; Alfred Nordheim; Wojciech Wozny; Gerhard P Schwall; Slobodan Poznanović; Chaturvedula Sastri; Christian Hunzinger; Werner Stegmann; André Schrattenholz; Michael A Cahill
Journal:  Electrophoresis       Date:  2006-05       Impact factor: 3.535

5.  Multicenter validation of a gene expression-based prognostic signature in lymph node-negative primary breast cancer.

Authors:  John A Foekens; David Atkins; Yi Zhang; Fred C G J Sweep; Nadia Harbeck; Angelo Paradiso; Tanja Cufer; Anieta M Sieuwerts; Dmitri Talantov; Paul N Span; Vivianne C G Tjan-Heijnen; Alfredo F Zito; Katja Specht; Heinz Hoefler; Rastko Golouh; Francesco Schittulli; Manfred Schmitt; Louk V A M Beex; Jan G M Klijn; Yixin Wang
Journal:  J Clin Oncol       Date:  2006-02-27       Impact factor: 44.544

6.  Development and evaluation of a micro- and nanoscale proteomic sample preparation method.

Authors:  Haixing Wang; Wei-Jun Qian; Heather M Mottaz; Therese R W Clauss; David J Anderson; Ronald J Moore; David G Camp; Arshad H Khan; Daniel M Sforza; Maria Pallavicini; Desmond J Smith; Richard D Smith
Journal:  J Proteome Res       Date:  2005 Nov-Dec       Impact factor: 4.466

7.  Quantitative proteome analysis of breast cancer cell lines using 18O-labeling and an accurate mass and time tag strategy.

Authors:  Anil J Patwardhan; Eric F Strittmatter; David G Camp; Richard D Smith; Maria G Pallavicini
Journal:  Proteomics       Date:  2006-05       Impact factor: 3.984

8.  Laser microdissection and microarray analysis of breast tumors reveal ER-alpha related genes and pathways.

Authors:  F Yang; J A Foekens; J Yu; A M Sieuwerts; M Timmermans; J G M Klijn; D Atkins; Y Wang; Y Jiang
Journal:  Oncogene       Date:  2006-03-02       Impact factor: 9.867

9.  Pathological prognostic factors in breast cancer. I. The value of histological grade in breast cancer: experience from a large study with long-term follow-up.

Authors:  C W Elston; I O Ellis
Journal:  Histopathology       Date:  1991-11       Impact factor: 5.087

10.  Proteomic analysis in human breast cancer: identification of a characteristic protein expression profile of malignant breast epithelium.

Authors:  Gernot Hudelist; Christian F Singer; Kerstin I D Pischinger; Klaus Kaserer; Mahmood Manavi; Ernst Kubista; Klaus F Czerwenka
Journal:  Proteomics       Date:  2006-03       Impact factor: 3.984

Cited by (52 in total)

1.  Data-independent proteomic screen identifies novel tamoxifen agonist that mediates drug resistance.

Authors:  Shawna Mae Hengel; Euan Murray; Simon Langdon; Larry Hayward; Jean O'Donoghue; Alexandre Panchaud; Ted Hupp; David R Goodlett
Journal:  J Proteome Res       Date:  2011-09-21       Impact factor: 4.466

2.  Making sense out of massive data by going beyond differential expression.

Authors:  Patrick R Schmid; Nathan P Palmer; Isaac S Kohane; Bonnie Berger
Journal:  Proc Natl Acad Sci U S A       Date:  2012-03-23       Impact factor: 11.205

Review 3.  Translational control in cancer.

Authors:  Deborah Silvera; Silvia C Formenti; Robert J Schneider
Journal:  Nat Rev Cancer       Date:  2010-04       Impact factor: 60.716

4.  Plasma proteomics analysis of tamoxifen resistance in breast cancer.

Authors:  Keivan Majidzadeh-A; Javad Gharechahi
Journal:  Med Oncol       Date:  2013-10-26       Impact factor: 3.064

5.  Quantitative proteomic analysis of single pancreatic islets.

Authors:  Leonie F Waanders; Karolina Chwalek; Mara Monetti; Chanchal Kumar; Eckhard Lammert; Matthias Mann
Journal:  Proc Natl Acad Sci U S A       Date:  2009-10-21       Impact factor: 11.205

6.  Roles of Small GTPases in Acquired Tamoxifen Resistance in MCF-7 Cells Revealed by Targeted, Quantitative Proteomic Analysis.

Authors:  Ming Huang; Yinsheng Wang
Journal:  Anal Chem       Date:  2018-11-30       Impact factor: 6.986

7.  Proteomics of mouse BRCA1-deficient mammary tumors identifies DNA repair proteins with potential diagnostic and prognostic value in human breast cancer.

Authors:  Marc Warmoes; Janneke E Jaspers; Thang V Pham; Sander R Piersma; Gideon Oudgenoeg; Maarten P G Massink; Quinten Waisfisz; Sven Rottenberg; Epie Boven; Jos Jonkers; Connie R Jimenez
Journal:  Mol Cell Proteomics       Date:  2012-02-24       Impact factor: 5.911

8.  The chromosome 3q26 OncCassette: A multigenic driver of human cancer.

Authors:  Alan P Fields; Verline Justilien; Nicole R Murray
Journal:  Adv Biol Regul       Date:  2015-12-23

9.  Antiestrogen Resistance and the Application of Systems Biology.

Authors:  Kerrie B Bouker; Yue Wang; Jianhua Xuan; Robert Clarke
Journal:  Drug Discov Today Dis Mech       Date:  2012-12-01

10.  Detection of gene pathways with predictive power for breast cancer prognosis.

Authors:  Shuangge Ma; Michael R Kosorok
Journal:  BMC Bioinformatics       Date:  2010-01-01       Impact factor: 3.169
