Discovery of proteins associated with a predefined genomic locus via dCas9-APEX-mediated proximity labeling.

Samuel A Myers¹, Jason Wright², Ryan Peckner², Brian T Kalish^3,4, Feng Zhang², Steven A Carr⁵.

Abstract

Regulation of gene expression is primarily controlled by changes in the proteins that occupy genes' regulatory elements. We developed genomic locus proteomics (GLoPro), in which we combine CRISPR-based genome targeting, proximity labeling, and quantitative proteomics to discover proteins associated with a specific genomic locus in native cellular contexts.

Year: 2018 PMID: 29735997 PMCID: PMC6202184 DOI: 10.1038/s41592-018-0007-1

Source DB: PubMed Journal: Nat Methods ISSN: 1548-7091 Impact factor: 28.547

Transcriptional regulation is a highly coordinated process largely controlled by changes in protein occupancy at regulatory elements of the modulated genes. Chromatin immunoprecipitation (ChIP) has been invaluable for our understanding of transcriptional regulation and chromatin structure at both the individual locus and genome-wide levels[1-3]. However, because ChIP requires the use of antibodies, its utility can often be limited by the presupposition of a suspected protein’s occupancy and by the lack of highly specific, high-affinity reagents. Previously developed “inverse ChIP” methods have had limited utility in mammalian systems due to one of several drawbacks, including loss of cellular and/or chromatin context, extensive engineering and locus disruption, reliance on repetitive DNA sequences, or chemical crosslinking[4-10], which often requires extensive optimization for mass spectrometry-based applications[11]. We sought to develop a method to identify proteins associated with a specific, non-repetitive genomic locus in the native cellular context without the need for crosslinking or genomic alterations. Here, we utilized recent advances in sequence-specific DNA targeting and affinity labeling in cells to develop genomic locus proteomics (GLoPro) to characterize proteins associated with a particular genomic locus. We fused the catalytically dead RNA-guided nuclease Cas9 (dCas9)[12, 13] to the engineered peroxidase APEX2[14] to biotinylate proteins at a defined genomic locus for subsequent enrichment and identification by liquid chromatography-mass spectrometry (LC-MS/MS) (Figure 1A and Supplemental Figure 1). dCas9 was employed due to the reprogrammable nature of the single guide RNAs (sgRNA)[15]. APEX2, in the presence of hydrogen peroxide, oxidizes the phenol moiety of biotin-phenol compounds to phenoxyl radicals that react with surface-exposed tyrosine residues[16, 17], labeling nearby proteins with biotin derivatives[14, 17–19].
APEX2 was chosen for its small labeling radius and short reaction time[20-22]. The dCas9-APEX2 (Caspex) gene was cloned in-frame with the self-cleaving T2A peptide and Gfp under the control of a tetracycline response element into a puromycin-selectable piggybac plasmid[23] (Supplemental Figure 2). The inducibility of Caspex expression provides temporal control, minimizing both the time CASPEX occupies the targeted locus and the accumulation of excess CASPEX, which leads to the higher background biotinylation common with proximity labeling[24-28].

Figure 1

Genomic locus proteomics of hTERT. A) Illustration of CASPEX targeting and affinity labeling reaction. i) A genomic locus of interest is identified. ii) A targeting sequence for the sgRNA is designed (red bar). iii) CASPEX expression is induced with doxycycline and, after association with sgRNA, binds the region of interest. iv) After biotin-phenol incubation, H2O2 induces the CASPEX-mediated labeling of proximal proteins, where the “labeling radius” of the reactive biotin-phenol is represented by the red cloud. v) Proteins proximal to CASPEX are labeled with biotin (orange stars) for subsequent enrichment. B) ChIP-qPCR against biotin (blue boxes) and FLAG (green boxes) in 293T-Caspex cells expressing either no sgRNA (far right) or T092 sgRNA. White boxes indicate ChIP probes for regions amplified and detected by qPCR, all of which are in Supplemental Figure 4. hTERT is shown below to indicate the gene structure with respect to the sgRNA target (red box). C) UCSC Genome Browser representation of hTERT (hg19). sgRNAs (colored bars) are shown to scale relative to the transcription start site (black arrow). D) Multi-scatter plots for log2 fold enrichment values of proteins quantified, and the corresponding Pearson correlation coefficients between all pair-wise hTERT-293T-Caspex cell line comparisons, using the no sgRNA control line as the denominator (n = 5 independent sgRNA lines). E) Volcano plot of proteins quantified across the four overlapping hTERT-293T-Caspex cell lines (n = 4 independent sgRNA lines) compared to the no sgRNA control. Data points representing proteins enriched with an adjusted p-value of less than 0.05, FE > 0, are labeled in red. Proteins known to associate with hTERT and identified as enriched by GLoPro are highlighted. TP53, a known hTERT binder, had an adj. p val. = 0.058 and is highlighted in blue. F) Mean GLoPro enrichment values for V5-tagged ORFs selected for ChIP-qPCR corroboration.
Red indicates the protein was enriched at hTERT, blue that the protein was detected in the analysis but not statistically enriched. Grey proteins were not detected. G) Correlation between ChIP-qPCR mean log2 fold-enrichment over input of the four primer pairs spanning the sgRNA targets (biological quadruplicates, measurement singlicate), and GLoPro enrichment of the four overlapping sgRNAs at hTERT. Black, open circles indicate that the protein was not identified by GLoPro. Blue, open circles indicate the protein was identified but was not statistically enriched. Red open circles indicate proteins that are enriched according to the GLoPro analysis. Previously described hTERT binders are labeled. Dotted line separates ChIP-qPCR data tested for statistical significance via the Mann-Whitney test, and the p-value is shown.

To test whether the CASPEX protein correctly localized to the genomic site of interest, we created a single-colony HEK293T line, stably integrated with the Caspex plasmid (293T-Caspex), and expressed a sgRNA targeting 92 base pairs (bp) 3’ of the transcription start site (TSS) (denoted as T092) of the TERT gene. We focused on the TERT promoter (hTERT) as TERT expression is a hallmark of cancer and recurrent promoter mutations in hTERT have been shown to re-activate TERT expression[29]. Biotinylation in T092 sgRNA-expressing 293T-Caspex cells was induced, followed by anti-FLAG or anti-biotin ChIP-quantitative PCR (qPCR) with probes tiling hTERT. ChIP-qPCR showed proper localization of CASPEX, with the peak of the anti-FLAG signal overlapping with the destination of the sgRNA. The anti-biotin ChIP-qPCR signal showed a similar trend of enrichment, indicating that CASPEX biotinylates proteins within approximately 400 base pairs on either side of its target locus. As expected, no enrichment was observed around T092 for the non-spatially constrained no sgRNA control (Figure 1B). We then tested four additional sgRNAs tiling hTERT (Figure 1C). After labeling in stable sgRNA-expressing 293T-Caspex lines, Western blot analysis showed that biotinylation was CASPEX-dependent, but no difference in biotin patterns was observed between sgRNA lines (Supplemental Figure 3). ChIP-qPCR against FLAG or biotin showed that all constructs correctly targeted and labeled the region of interest, where the peak of enrichment resided at the sgRNA site (Supplemental Figure 4). To assess off-target binding of CASPEX, we performed anti-FLAG ChIP-qPCR probing the top predicted off-target site for each respective sgRNA. No two off-target sites were within 5 Mb of each other (Supplemental Table 1).
Each on-target site was occupied by CASPEX 3- to 40-fold more than the predicted off-target site, and the cumulative enrichment of the sgRNA-expressing 293T-Caspex lines with overlapping labeling radii (430T, 107T, T092, and T266) at hTERT was 50-fold (Supplemental Figure 5). These data demonstrate that CASPEX targeting can be accurately reprogrammed by substitution of the sgRNAs and that proximal protein biotinylation is CASPEX mediated. To test whether CASPEX could be used to identify proteins associated with hTERT, we enriched biotinylated proteins from hTERT-targeted 293T-Caspex lines, followed by analysis by quantitative LC-MS/MS. Biotinylation was initiated in the five individual hTERT-targeting 293T-Caspex lines that tiled the genomic locus of interest, along with the no-guide control 293T-Caspex line. Tiling is an important feature of this method, as “noise” from off-target binding of dCas9 from each individual line will be diluted and only reproducibly enriched proteins from on-target occupancy contribute to the “signal”[30, 31]. Tiling may also circumvent the loss of native protein identification if Cas9 binding precludes their occupancy, though evidence suggests minor effects[30]. Whole cell lysates from each line were incubated with streptavidin-coated beads, stringently washed, and subjected to on-bead trypsin digestion. Digests of the enriched proteins were labeled with isobaric tandem mass tags (TMT)[32] for relative quantitation, multiplexed, and analyzed by LC-MS/MS (Supplemental Figure 1). We used a ratiometric approach for each individual sgRNA 293T-Caspex line compared to the no-guide control line[28]. The no-guide control[13, 30, 33] was chosen over a random genomic site to improve chances of finding “housekeeping” chromatin proteins that may reside in many places in the genome. Enrichment from the four overlapping hTERT Caspex lines showed high correlation of protein enrichment (Figure 1D).
The T959 Caspex line, which lies ≥ 2 labeling radii from its closest neighbor, showed decreased correlation of protein enrichment. We performed a moderated one-sample T-test by treating the four overlapping sgRNA lines as replicates, using the non-spatially constrained no sgRNA 293T-Caspex line as the background biotinylation and enrichment control. 387 of the 3,199 proteins identified with at least two peptides were significantly enriched (adj. p value < 0.05, FE > 0) at hTERT over the no sgRNA control, including five proteins known to occupy hTERT in various cell types (MAZ;[34, 35], CTNNB1;[36-38], ETV3;[39], CTBP1;[40], and TP53 [adj. p = 0.056][41, 42]) (Figure 1E, Supplemental Table 2). Histones and subunits of RNA polymerase II, while detected, were not significantly enriched, likely due to the inverse relationship between a protein’s abundance and the ability to determine that it is enriched over background (Supplemental Figure 6)[43, 44]. These results demonstrate that GLoPro is able to enrich and identify proteins at a particular genomic locus from the native cellular context. In addition to several known proteins, we identified a number of candidate proteins associated with hTERT. To corroborate the GLoPro results, we performed ChIP-qPCR for candidates spanning the GLoPro enrichment range (Figure 1F). Many of these proteins do not have ChIP-grade antibodies, so we turned to V5-tagged ORF expression in unmodified HEK293T cells[45]. We individually transfected 16 V5-tagged candidate ORFs, four V5-tagged ORFs for proteins not significantly enriched at hTERT, and three proteins that were not detected as negative controls. Comparing anti-V5 ChIP-qPCR signals from each individual ORF to their respective GLoPro enrichment values, we found that all proteins enriched in the GLoPro analysis were, as a group, statistically enriched by ChIP-qPCR (Mann-Whitney test, p = 0.0008) (Figure 1G).
Most candidates deemed statistically enriched according to the GLoPro analysis were separated in the ChIP enrichment space from those not enriched or not detected. Two proteins previously described to bind hTERT, CTBP1 and MAZ[34, 35, 40], were found in a regime of high ChIP enrichment and low GLoPro enrichment, suggesting ChIP-qPCR provides orthogonal information to GLoPro for protein occupancy at a genomic locus. These data show that GLoPro can identify known and novel proteins that associate with hTERT and that can be corroborated by ChIP-qPCR. To explore the generalizability of GLoPro, we created 293T-Caspex cells expressing individual sgRNAs tiling the c-MYC promoter (Figure 2A). ChIP-qPCR against CASPEX verified the proper localization of each c-MYC 293T-Caspex line (Supplemental Figure 7) and off-target binding analysis showed a cumulative 32-fold enrichment at the c-MYC promoter over any predicted off-target site (Supplemental Figure 8). GLoPro analysis of the c-MYC promoter identified 66 proteins as significantly enriched (adj. p val < 0.05) compared to the no-guide control (Figure 2B, Supplemental Table 3). We applied a machine learning algorithm to identify association of GLoPro-enriched proteins with canonical pathways from the Molecular Signature Database[46, 47] (http://apps.broadinstitute.org/genets). We identified 21 statistically enriched networks (adj. p val. < 0.01), including the “MYC_Active_Pathway”, a gene set of validated targets responsible for activating c-MYC transcription[47] (Figure 2C). ChIP-qPCR confirmed the presence of pathway components at the c-MYC promoter, including HUWE1, RUVBL1, and ENO1 for MYC_active_pathway; RBMX for mRNA_splicing_pathway; and MAPK14 for the Lymph_angiogenesis pathway (Figure 2D).
Taken together, these results illustrate that GLoPro enriches and identifies proteins associated with multiple pathways that are known to activate c-MYC expression, while directly implicating specific proteins potentially involved in regulating c-MYC transcription through association with its promoter.
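The ratiometric comparison against the no-guide control and the pair-wise Pearson correlation between sgRNA lines can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names and the toy reporter intensities are hypothetical, and real TMT data would need normalization steps that are not shown here.

```python
import math

def log2_fold_enrichment(guide, control):
    """Per-protein log2(guide / control) for proteins quantified in both lines.

    `guide` and `control` map protein names to positive reporter intensities
    (hypothetical inputs; the no-guide line is the denominator).
    """
    return {
        protein: math.log2(guide[protein] / control[protein])
        for protein in guide
        if protein in control and guide[protein] > 0 and control[protein] > 0
    }

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

# Toy intensities for two sgRNA lines and the no-guide control (made up).
control = {"P1": 2.0, "P2": 2.0, "P3": 4.0}
line_a = {"P1": 8.0, "P2": 2.0, "P3": 2.0}
line_b = {"P1": 4.0, "P2": 2.0, "P3": 1.0}

fe_a = log2_fold_enrichment(line_a, control)
fe_b = log2_fold_enrichment(line_b, control)
shared = sorted(set(fe_a) & set(fe_b))
r = pearson([fe_a[p] for p in shared], [fe_b[p] for p in shared])
```

Lines whose labeling radii overlap would be expected to give a higher `r` than a distant line such as T959, which is the pattern reported for Figure 1D.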

Figure 2

Genomic locus proteomic analysis of the c-MYC promoter. A) UCSC Genome Browser representation (hg19) of the c-MYC promoter and the location of sgRNA sites relative to the TSS. B) Volcano plot of proteins quantified across the five c-MYC-Caspex cell lines compared to the no sgRNA control Caspex line (n = 5 independent sgRNA lines). Data points representing positively enriched proteins with an adjusted p-value of less than 0.05 are labeled green. C) Significantly enriched gene sets from proteins identified to associate with the c-MYC promoter by GLoPro. Only gene sets with an adjusted p-value of less than 0.01 are shown. MYC_ACTIVE_PATHWAY is highlighted in red and discussed in the text. D) ChIP-qPCR of candidate proteins identified by GLoPro at the c-MYC promoter. V5-tagged dsRED served as the negative control for V5-tagged proteins ENO1, RBMX, RUVBL1 and MAPK14, whereas HA-tagged HUWE1 was used for MYC-tagged HUWE1. * indicates T-test p-value < 0.05, ** p < 0.01. Mean and standard error are plotted (transfection duplicates, measurement triplicates).

GLoPro relies on the localization of the affinity labeling enzyme APEX2 directed by the catalytically dead CRISPR/Cas9 system to biotinylate proteins in close proximity to a specific site in the genome. Other than the expression of Caspex and its associated sgRNA, no genome engineering or cell disruption is required to capture a snapshot of proteins associated with the locus of interest. This advantage, in combination with the generalizability of dCas9 and APEX2, suggests that GLoPro can be used in a wide variety of cell types and at any dCas9-targetable genomic element. Beyond circumventing the need for antibodies for discovery, LC-MS/MS analysis using isobaric peptide labeling allows for sample multiplexing, enabling multiple sgRNA lines and/or replicates to be measured in a single experiment with few or no missing values for relative quantitation of enrichment. GLoPro-derived candidate proteins can and should be validated for association with the genomic region by ChIP or another orthogonal assay, as the current version of the method is not yet comprehensive and is still subject to false positives/negatives. While GLoPro in this initial work only identifies association with a locus and not locus-specificity (a protein bound at the queried site likely binds elsewhere) or functional relevance, we expect that analyzing promoters or enhancer elements during relevant perturbations may provide novel functional insights into transcriptional regulation. In addition, we envision that CASPEX can be used for enrichment of genomic locus entities such as locus-associated RNAs or DNA elements in close three-dimensional space within the nucleus (e.g., enhancers). Further work will be needed to assess the extended capabilities of CASPEX. In summary, we describe genomic locus proteomics, a novel approach to identify proteins associated with a particular genomic locus in live cells without modifying the site of interest.
This method provides an orthogonal and highly complementary approach to ChIP for the unbiased discovery of proteins that may regulate gene expression and chromatin structure. We applied GLoPro to identify proteins associated with the hTERT and c-MYC promoters. Both well-established and previously unreported interactors of the respective promoter regions identified by GLoPro were corroborated using ChIP-qPCR, demonstrating that this method enables the discovery of proteins and pathways that potentially regulate a gene of interest.

Methods

Plasmid construction

The Caspex construct (dox-inducible dCas9-APEX2-T2A-GFP) was created by subcloning 3×FLAG-dCas9 and T2A-Gfp from pLV-hUBC-dCas9-VP64-T2A-GFP (Addgene #53192), and V5-APEX2-NLS from mito-V5-APEX2 (Addgene #42607) into an all-in-one piggybac, TREG/Tet-3G plasmid (Church lab) via ligation independent cloning (InFusion, Clontech). Guide sequences were selected and cloned as previously described[49, 50]. Guides were designed irrespective of sense/anti-sense strand; a spacing of 100–200 bp between guides and on-target scores were used as selection criteria. All V5 ORF constructs were purchased through the Broad Genetics Perturbation Platform and were expressed from the pLX-TRC_317 backbone. V5 ORFs were only selected for validation if the construct was available, had protein homology >99%, and had an in-frame V5 tag. All constructs were individually transfected into unmodified HEK293T cells for anti-V5 ChIP experiments, at one-fourth the recommended DNA amount to mitigate gross overexpression. The Caspex plasmid is available through Addgene (plasmid # 97421).

Cell Line construction and culture

HEK293T cells were grown in DMEM supplemented with 10% heat-inactivated fetal bovine serum, glutamine and non-essential amino acids (Gibco). All constructs were prepared using Zyppy Maxi Prep kits (Zymo Research) and transfected with Lipofectamine 2000 (Thermo). After Caspex transfection, puromycin was added to a final concentration of 4 ug/ml and cells were selected for two weeks. Single colonies were picked, expanded and tested for doxycycline inducibility of the Caspex construct, monitored by GFP detection and by anti-FLAG Western blotting. The HEK293T cell line with the best inducibility (now referred to as 293-Caspex cells) was expanded and used for all subsequent experiments. For stable sgRNA expression, single sgRNA constructs were transfected into 293-Caspex cells and were selected for stable incorporation by hygromycin treatment at 200 ug/ml for two weeks. CASPEX binding was tested using ChIP followed by digital droplet PCR (ddPCR) or Sybr qPCR, as described below.

APEX-mediated labeling

Prior to labeling, doxycycline dissolved in 70% ethanol was added to cell culture media to a final concentration of either 500 ng/mL for 18–24 hours (hTERT) or 1 ug/mL for 12 hours (c-MYC) for proteomic experiments. For off-target CASPEX binding analysis, cells were treated with 1 ug/mL dox for 20 hours. Biotin tyramide phenol (Iris Biotech) in DMSO, stock concentration 500 mM, was added directly to cell culture media, which was swirled until the precipitate dissolved, to a final concentration of 500 uM. After 30 minutes at 37°C, hydrogen peroxide was diluted in media to 100 mM before being added to the cell culture media to a final concentration of 1 mM to induce biotinylation. After 60 seconds of very gentle swirling, the media was decanted as quickly as possible and the cells were washed three times with 15 ml ice cold PBS containing 100 mM sodium azide, 100 mM sodium ascorbate and 50 mM TROLOX (6-hydroxy-2,5,7,8-tetramethylchroman-2-carboxylic acid). Cells were scraped and transferred to 15 ml Falcon tubes with ice cold PBS, spun at 500g for 3 minutes, flash frozen in liquid nitrogen and stored at −80°C.
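The stock-to-final dilutions in this protocol follow the usual C1·V1 = C2·V2 relationship. The short sketch below works through the two labeling reagents; the 20 mL culture volume is an assumption made only for illustration (the text does not state the media volume), and the small volume added by the reagent itself is ignored.

```python
def stock_volume_ul(stock_mM, final_mM, final_volume_ml):
    """Volume of stock (in uL) needed so that C1 * V1 = C2 * V2.

    Treats the culture volume as the final volume, i.e. ignores the
    volume contributed by the added stock.
    """
    if stock_mM <= 0 or final_mM < 0 or final_volume_ml <= 0:
        raise ValueError("concentrations and volume must be positive")
    return final_mM / stock_mM * final_volume_ml * 1000.0

# Hypothetical media volume per plate; not given in the protocol text.
MEDIA_ML = 20.0

# Biotin-phenol: 500 mM stock to 500 uM (0.5 mM) final, a 1:1000 dilution.
biotin_phenol_ul = stock_volume_ul(500.0, 0.5, MEDIA_ML)

# Hydrogen peroxide: 100 mM working dilution to 1 mM final, a 1:100 dilution.
peroxide_ul = stock_volume_ul(100.0, 1.0, MEDIA_ML)

print(f"biotin-phenol stock: {biotin_phenol_ul:.1f} uL per {MEDIA_ML:.0f} mL")
print(f"100 mM H2O2:         {peroxide_ul:.1f} uL per {MEDIA_ML:.0f} mL")
```

Pre-diluting the peroxide to 100 mM keeps the added volume at roughly 1% of the media, which makes rapid, even mixing during the 60-second labeling window easier.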

Chromatin immunoprecipitation followed by quantitative PCR

For confirmation of CASPEX binding and labeling, cells were trypsinized to single cell suspension and fresh formaldehyde was added to a final concentration of 1% and incubated at 37°C for 10 minutes, being inverted several times every two minutes. Formaldehyde was quenched with 5% glycine at 37°C for 5 minutes and the samples were aliquoted into 3e6 cell aliquots, spun down and flash frozen in 0.5 mL Axygen tubes. Chromatin was sheared using a QSonica Q800R2 Sonicator at an amplitude of 50 for 30 seconds on/30 off, for 7.5 minutes, until 60% of fragments were between 150 and 700 bp, with an average size of ~350 bp. Lysis buffer was comprised of 1% SDS, 10 mM EDTA and Tris HCl, pH 8.0. For off-target CASPEX analysis, one approximately 80% confluent 15 cm2 plate was washed once with 20 mL ice cold PBS and then incubated with room temperature PBS with 1% formaldehyde (freshly prepared from 16% stock, Thermo) at room temperature for ten minutes with gentle rocking. After crosslinking, 1.5 mL 2M glycine in PBS was added to each dish and rocked at room temperature for 5 minutes. Cells were then washed twice with ice cold PBS containing protease inhibitors (Roche), where the second wash was allowed to sit at 4°C for 5 minutes. Cells were then scraped in 4 mL PBS plus protease inhibitors, spun at 4°C for 5 minutes at 500g, and the pellet was flash frozen and stored at −80°C. Cell pellets were allowed to thaw on ice, incubated with Cell Lysis Buffer (20 mM TRIS pH 8.0, 85 mM KCl, 0.5% NP-40, plus protease inhibitors) for ten minutes, and spun at 5,000g for 5 minutes at 4°C. The nuclear pellet was treated the same way a second time. The nuclear pellet was resuspended in Nuclear Lysis Buffer (10 mM TRIS pH 7.8, 1% NP40, 0.5% sodium deoxycholate, 0.1% SDS, plus protease inhibitors) for sonication. Sonication was performed on a Branson microtip sonicator for 8 minutes, 0.3 seconds on, 1.7 off, at 4°C.
All sonicated chromatin was assessed for fragment size, requiring more than 60% of the size distribution to be between 150 and 700 bp long, with the average around 350 bp (Agilent TapeStation). Off-target sites were predicted via crispr.mit.edu. For ChIP, streptavidin (SA) conjugated to magnetic beads (Thermo), or M2 anti-FLAG antibody (Sigma) or anti-V5 antibodies (MBL Life Sciences) conjugated to a 50:50 mix of Protein A:Protein G Dynabeads (Invitrogen), was incubated with sheared chromatin at 4°C overnight. qPCR was performed with either Roche 2× Sybr mix (biological duplicates or triplicates, measurement triplicates) on a Lightcycler (Agilent) or via digital droplet PCR (biological quadruplicates, measurement singlicate) (BioRad). All primers were synthesized by Integrated DNA Technologies (IDT).
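The fragment-size acceptance criterion stated above (more than 60% of fragments between 150 and 700 bp, average around 350 bp) can be written as a small check. This is a sketch under stated assumptions: it takes a plain list of fragment lengths, whereas a TapeStation reports an electropherogram, and the function name and the example sizes are hypothetical.

```python
from statistics import mean

def fragment_qc(sizes_bp, lo=150, hi=700, min_fraction=0.60):
    """Return (fraction_in_range, mean_size_bp, passes) for fragment lengths.

    `passes` is True when strictly more than `min_fraction` of fragments
    fall within [lo, hi] bp, mirroring the 60% criterion in the text.
    The mean is reported for inspection; no hard cutoff is applied to it
    because the text only says "around 350 bp".
    """
    if not sizes_bp:
        raise ValueError("no fragment sizes supplied")
    in_range = sum(1 for size in sizes_bp if lo <= size <= hi)
    fraction = in_range / len(sizes_bp)
    return fraction, mean(sizes_bp), fraction > min_fraction

# Made-up fragment lengths for a well-sheared and an under-sheared sample.
good = [120, 200, 280, 350, 420, 500, 650, 900]
poor = [90, 800, 950, 1200, 1500, 300, 2000, 110]

print(fragment_qc(good))
print(fragment_qc(poor))
```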

Western blot analysis

sgRNA-293-Caspex cells were labeled as described above. 40 ug of whole cell lysate was separated by SDS-PAGE, transferred to nitrocellulose, and blotted against FLAG (Sigma, 1:2000 dilution) or biotin (Li-Cor IRdye 800 CW Streptavidin and IRdye 680RD anti-Mouse IgG, both at 1:10,000 dilution).

Enrichment of biotinylated proteins for proteomic analysis

Eight 15 cm2 plates of each sgRNA-293-Caspex line (~3e8 cells per line), or no-guide as a negative control, were used for proteomic experiments. Labeled whole cell pellets were lysed with RIPA (50 mM TRIS pH 8.0, 150 mM NaCl, 1% NP-40, 0.5% sodium deoxycholate and 0.1% sodium dodecyl sulfate) with protease inhibitors (Roche) and probe sonicated to shear genomic DNA. Whole cell lysates were clarified by centrifugation at 14,000g for 30 minutes at 4°C and protein concentration was determined by Bradford. 500 uL SA magnetic bead slurry (Thermo) was used for each sgRNA line (60–90 mg of protein per state). Lysates of equal protein concentrations were incubated with SA for 120 minutes at room temperature, washed twice with cold lysis buffer, once with cold 1M KCl, once with cold 100 mM Na2CO3, and twice with cold 2 M urea in 50 mM ammonium bicarbonate (ABC). Beads were resuspended in 50 mM ABC and 300 ng trypsin and digested at 37°C overnight. 10 mM TCEP and 10 mM iodoacetamide were added after digestion and allowed to incubate at room temperature for 30 minutes in the dark. The inherent sensitivity limits of current mass spectrometers and the unavoidable sample losses at each sample handling step require that a large amount of input material be used per guide. Fortunately, these input requirements are readily attainable with many cell culture systems but may prove more challenging with recalcitrant or limited passaging cells.

Isobaric labeling and liquid chromatography tandem mass spectrometry

On-bead digests were desalted via Stage tip[51] and labeled with TMT (Thermo) using an on-column protocol. For on-column TMT labeling, Stage tips were packed with one punch C18 mesh (Empore), washed with 50 uL methanol, 50 uL 50% acetonitrile (ACN)/0.1% formic acid (FA), and equilibrated with 75 uL 0.1% FA twice. The digest was loaded by spinning at 3,500g until the entire digest passed through. The bound peptides were washed twice with 75 uL 0.1% FA. One uL of TMT reagent in 100% ACN was added to 100 uL freshly made HEPES, pH 8, and passed over the C18 resin at 2,000g for 2 minutes. The HEPES and residual TMT were washed away with 75 uL 0.1% FA twice and peptides were eluted with 50 uL 50% ACN/0.1% FA followed by a second elution with 50% ACN/20 mM ammonium hydroxide, pH 10. Peptide concentrations were estimated using an absorbance reading at 280 nm and mixed at equal ratios. Mixed TMT-labeled peptides were step fractionated by basic reverse phase on a sulfonated divinylbenzene (SDB-RPS, Empore) packed Stage tip into 6 fractions (5, 10, 15, 20, 30 and 55% ACN in 20 mM ammonium hydroxide, pH 10). Each fraction was dried via vacuum centrifugation and resuspended in 0.1% formic acid for subsequent LC-MS/MS analysis. Chromatography was performed using a Proxeon EASY-nLC at a flow rate of 200 nl/min. Peptides were separated at 50°C using a 75 micron i.d. PicoFit (New Objective) column packed with 1.9 um AQ-C18 material (Dr. Maisch) to 20 cm in length over an 84 min effective gradient. Mass spectrometry was performed on a Thermo Scientific Q Exactive Plus (hTERT data) or a Lumos (c-MYC data) mass spectrometer. After a precursor scan from 300 to 2,000 m/z at 70,000 resolution, the top 12 most intense multiply charged precursors were selected for HCD at a resolution of 35,000. Data were searched with Spectrum Mill (Agilent) using the Uniprot Human database, to which the CASPEX protein sequence was added.
A fixed modification of carbamidomethylation of cysteine and variable modifications of N-terminal protein acetylation, oxidation of methionine, and TMT-10plex labels were searched. The enzyme specificity was set to trypsin and a maximum of three missed cleavages was used for searching. The maximum precursor-ion charge state was set to 6. The precursor mass tolerance and MS/MS tolerance were set to 20 ppm. The peptide and protein false discovery rates were set to 0.01.
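As a worked illustration of what a 20 ppm mass tolerance means in practice, the sketch below computes the parts-per-million error between an observed and a theoretical m/z and checks it against the tolerance. The function names and the example m/z values are hypothetical; this does not reproduce Spectrum Mill's matching logic.

```python
def ppm_error(observed_mz, theoretical_mz):
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6

def within_tolerance(observed_mz, theoretical_mz, tol_ppm=20.0):
    """True if the absolute ppm error does not exceed the tolerance."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm

def tolerance_window(theoretical_mz, tol_ppm=20.0):
    """Lower and upper m/z bounds implied by a symmetric ppm tolerance."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta

# Example values chosen for illustration only.
theoretical = 500.0
print(within_tolerance(500.005, theoretical))  # error of about 10 ppm
print(within_tolerance(500.020, theoretical))  # error of about 40 ppm
print(tolerance_window(theoretical))           # about +/- 0.01 m/z at 500 m/z
```

Because the tolerance is relative, the absolute window widens with m/z: at 20 ppm it is about ±0.01 at 500 m/z and about ±0.02 at 1,000 m/z.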

Data analysis

All non-human proteins and human proteins identified with only one peptide were excluded from downstream analyses. Human keratins were included in all analyses but were removed from the figures. The moderated T-test (http://software.broadinstitute.org/cancer/software/genepattern/) was used to determine proteins statistically enriched in the sgRNA-293-Caspex lines compared to the no sgRNA control. After correcting for multiple comparisons (Benjamini-Hochberg procedure), any proteins with an adjusted p-value of less than 0.05 were considered statistically enriched. Pathway analysis was performed using the Quack algorithm incorporated into Genets (http://apps.broadinstitute.org/genets) to test for enrichment of canonical pathways in the Molecular Signature Database (MSigDB). Proteins identified as significantly enriched (adj. p-val. < 0.05) by GLoPro were input into Genets and were queried against MSigDB. Pathways enriched (FDR < 0.05) were investigated manually for specific proteins for follow-up.
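The multiple-comparison step can be sketched as follows. This is a minimal, pure-Python illustration of the Benjamini-Hochberg adjustment combined with the enrichment call used in the text (adjusted p-value < 0.05 and fold enrichment > 0). It assumes per-protein p-values have already been computed; the moderated T-test itself is not implemented here, and the function names and example values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def enriched_proteins(names, log2_fe, pvalues, alpha=0.05):
    """Proteins with adjusted p < alpha and positive log2 fold enrichment."""
    adjusted = benjamini_hochberg(pvalues)
    return [
        name
        for name, fe, adj_p in zip(names, log2_fe, adjusted)
        if adj_p < alpha and fe > 0
    ]

# Made-up example: four proteins with raw p-values and log2 enrichment.
names = ["P1", "P2", "P3", "P4"]
log2_fe = [1.5, 0.2, -0.8, 2.1]
raw_p = [0.01, 0.04, 0.03, 0.005]

print(benjamini_hochberg(raw_p))
print(enriched_proteins(names, log2_fe, raw_p))
```

Note that P3 is excluded in this toy example despite a small adjusted p-value, because its fold enrichment is negative; only proteins enriched over the no sgRNA control are of interest.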

Data availability

The original mass spectra may be downloaded from MassIVE (http://massive.ucsd.edu), PXD009187. The data are directly accessible via ftp://massive.ucsd.edu/MSV000082154.

49 in total

1. The hTERT and hTERC telomerase gene promoters are activated by the second exon of the adenoviral protein, E1A, identifying the transcriptional corepressor CtBP as a potential repressor of both genes.

Authors: Rosalind M Glasspool; Sharon Burns; Stacey F Hoare; Catharina Svensson; W Nicol Keith
Journal: Neoplasia Date: 2005-06 Impact factor: 5.715

2. Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells.

Authors: Xuebing Wu; David A Scott; Andrea J Kriz; Anthony C Chiu; Patrick D Hsu; Daniel B Dadon; Albert W Cheng; Alexandro E Trevino; Silvana Konermann; Sidi Chen; Rudolf Jaenisch; Feng Zhang; Phillip A Sharp
Journal: Nat Biotechnol Date: 2014-04-20 Impact factor: 54.908

3. Adenoviral expression of p53 represses telomerase activity through down-regulation of human telomerase reverse transcriptase transcription.

Authors: T Kanaya; S Kyo; K Hamada; M Takakura; Y Kitagawa; H Harada; M Inoue
Journal: Clin Cancer Res Date: 2000-04 Impact factor: 12.531

4. Human telomerase reverse transcriptase (hTERT) is a target gene of β-catenin in human colorectal tumors.

Authors: Stefanie Jaitner; Jana A Reiche; Achim J Schäffauer; Elke Hiendlmeyer; Hermann Herbst; Thomas Brabletz; Thomas Kirchner; Andreas Jung
Journal: Cell Cycle Date: 2012-08-16 Impact factor: 4.534

5. Multidimensional Tracking of GPCR Signaling via Peroxidase-Catalyzed Proximity Labeling.

Authors: Jaeho Paek; Marian Kalocsay; Dean P Staus; Laura Wingler; Roberta Pascolutti; Joao A Paulo; Steven P Gygi; Andrew C Kruse
Journal: Cell Date: 2017-04-06 Impact factor: 41.582

6. An Approach to Spatiotemporally Resolve Protein Interaction Networks in Living Cells.

Authors: Braden T Lobingier; Ruth Hüttenhain; Kelsie Eichel; Kenneth B Miller; Alice Y Ting; Mark von Zastrow; Nevan J Krogan
Journal: Cell Date: 2017-04-06 Impact factor: 41.582

7. Purification of proteins associated with specific genomic Loci.

Authors: Jérôme Déjardin; Robert E Kingston
Journal: Cell Date: 2009-01-09 Impact factor: 41.582

8. A promiscuous biotin ligase fusion protein identifies proximal and interacting proteins in mammalian cells.

Authors: Kyle J Roux; Dae In Kim; Manfred Raida; Brian Burke
Journal: J Cell Biol Date: 2012-03-12 Impact factor: 10.539

9. Proximity biotinylation and affinity purification are complementary approaches for the interactome mapping of chromatin-associated protein complexes.

Authors: Jean-Philippe Lambert; Monika Tucholska; Christopher Go; James D R Knight; Anne-Claude Gingras
Journal: J Proteomics Date: 2014-10-02 Impact factor: 4.044

10. RNA-guided gene activation by CRISPR-Cas9-based transcription factors.

Authors: Pablo Perez-Pinera; D Dewran Kocak; Christopher M Vockley; Andrew F Adler; Ami M Kabadi; Lauren R Polstein; Pratiksha I Thakore; Katherine A Glass; David G Ousterout; Kam W Leong; Farshid Guilak; Gregory E Crawford; Timothy E Reddy; Charles A Gersbach
Journal: Nat Methods Date: 2013-07-25 Impact factor: 28.547
