An RNA tagging approach for system-wide RNA-binding proteome profiling and dynamics investigation upon transcription inhibition.

Zheng Zhang1, Tong Liu1, Hangyan Dong1, Jian Li1, Haofan Sun1, Xiaohong Qian1, Weijie Qin1,2.   

Abstract

RNA-protein interactions play key roles in epigenetic, transcriptional and posttranscriptional regulation. To reveal the regulatory mechanisms of these interactions, global investigation of RNA-binding proteins (RBPs) and monitoring of their changes under various physiological conditions are needed. Herein, we developed a psoralen probe (PP)-based method for RNA tagging and ribonucleic-protein complex (RNP) enrichment. Isolation of both coding and noncoding RNAs and mapping of 2986 RBPs, including 782 unknown candidate RBPs, from HeLa cells were achieved by PP enrichment, RNA-sequencing and mass spectrometry analysis. The dynamics study of RNPs by PP enrichment after the inhibition of RNA synthesis provides the first large-scale distribution profile of RBPs bound to RNAs with different decay rates. Furthermore, the remarkably greater decreases in the abundance of the RBPs obtained by PP enrichment than by global proteome profiling suggest that PP enrichment after transcription inhibition offers a valuable way for large-scale evaluation of the candidate RBPs.
© The Author(s) 2021. Published by Oxford University Press on behalf of Nucleic Acids Research.

Year:  2021        PMID: 33693821      PMCID: PMC8216453          DOI: 10.1093/nar/gkab156

Source DB:  PubMed          Journal:  Nucleic Acids Res        ISSN: 0305-1048            Impact factor:   16.971


INTRODUCTION

Eukaryotic genomes encode a large number of RNA-binding proteins (RBPs) that interact with RNAs to form dynamic complexes to execute biological functions. These RBPs extensively control every stage of the life cycle of RNA (1,2), including the synthesis and maturation of mRNA, the activation of noncoding RNAs (ncRNAs) in chromatin remodelling and epigenetic regulation and the final decay of RNAs. Thus, RNA–protein interactions are increasingly considered a diverse and critical layer of transcriptome and proteome regulation (3,4). Considering their functional importance, the precise composition and dynamics of RBPs in ribonucleic-protein complexes (RNPs) under cell stress are receiving increasing attention. The disruption of RBPs may lead to cellular dysfunction and numerous diseases, including metabolic disorders, muscular atrophies, neurological disorders, autoimmune disease and cancer (5,6). Therefore, the systematic identification of RBPs and the quantification of their dynamic changes in RNPs are of great significance for understanding their diverse roles in regulating fundamental cellular functions and disease states (7). The comprehensive mapping of RBPs and the elucidation of their dynamic assembly in RNPs demand efficient tools for the specific and unbiased enrichment of RBPs. Extensive efforts have been devoted to obtaining a global view of RBPs and RNPs in recent years (8). Benefitting from UV irradiation-induced cross-linking that ‘freezes’ natural RNPs in intact cells, cross-linking and immunoprecipitation (CLIP)-based strategies have enabled the large-scale analysis of RBPs in different kinds of cells (9,10). Hundreds to thousands of RBPs have been identified by mass spectrometry analysis using enrichment strategies that exploit the polyA tails of mRNA (11,12), alkynyl uridine analog labelling (13,14), solubility of RNPs in organic solvents (15–17) or silica affinity (18,19). 
These methods have greatly promoted high-throughput profiling of RBPs and revealed potential new functions of RBPs in regulating cellular activity. Despite the rapidly expanding RNA-binding proteome, the current methods still have some limitations. For alkynyl uridine analog-based RNP enrichment, metabolic labelling may interfere with the physiological processes of the cell. Although they do not involve non-canonical uridine, the solubility- or silica affinity-based methods may suffer from glycoprotein contamination, owing to the similar solubility or silica affinity of RNPs and glycoproteins (16,20–22). Therefore, we developed a photoreactive psoralen probe (PP) for the global enrichment of in vivo UV cross-linked RNPs (Scheme 1) and RBP identification by mass spectrometry (MS). In this strategy, the specific isolation of RNPs is achieved by psoralen tagging of RNA upon 365 nm UV irradiation. Psoralens are a class of planar, three-ring heterocyclic compounds that can form covalent bonds through cycloaddition reactions with uridine in RNA upon irradiation with 320–410 nm UV (23–25). Although DNA can also react with psoralen via thymine and be captured by streptavidin beads, UV crosslinking of proteins to DNA is much less efficient than that of proteins to RNA (15,26) and requires energy doses 10–100 times higher than those used for protein-RNA crosslinking (27). Furthermore, RNase A treatment is adopted for elution of the enriched RBPs by RNA degradation, further eliminating contamination by DNA-binding proteins. Through the cycloaddition reaction between psoralen and uracil, the comprehensive isolation of both coding and noncoding RNPs is achieved without the need for metabolic labelling of RNAs. In total, 2986 RBPs, including 782 putative novel RBPs, were identified by MS analysis after PP enrichment from HeLa cells. 
These proteins cover ∼70% of the RBPs obtained from the same cell line by previously reported polyA-dependent and polyA-independent strategies in five different works (11,13–15,28). The 782 candidate RBPs include 178 metabolism-related proteins distributed among a variety of pathways, indicating a possibly broader interplay between RNA and metabolism than previously anticipated, which deserves further exploration. Furthermore, the large-scale dynamic investigation of the PP-enriched RBPs in the RNPs during RNA transcription inhibition reveals distinct decreasing patterns compared with the corresponding RBPs in global proteome profiling, thereby providing further evidence to support the candidate RBPs. This finding suggests the potential of this strategy as a high-throughput way to evaluate candidate RBPs. Considering the different enrichment mechanisms of the PP, phase-separation and silica affinity methods (covalent binding, dissolution properties and affinity adsorption, respectively), the combined application of these complementary methods is advantageous for achieving more comprehensive enrichment and deeper coverage of the RNA-binding proteome.
Scheme 1.

Schematic overview of the PP-based RNP enrichment for large-scale RNA sequencing and RBP identification by MS.

MATERIAL AND METHODS

Synthesis of the psoralen probe (PP)

The synthesis steps and characterization of PP are shown in the Supplementary Figure S1A–C, Supporting Information.

Cell and culture conditions

HeLa cells were grown in DMEM (Gibco) supplemented with 10% FBS (Gibco), 100 U/ml penicillin, and 100 μg/ml streptomycin (Gibco) at 37°C in a 5% CO2 atmosphere. For the actinomycin D (ActD) treatment experiment, HeLa cells were treated with 5 μM ActD (J&K Scientific) at 37°C in DMEM (10% FBS, 100 U/ml penicillin, and 100 μg/ml streptomycin) for 0, 2, 5 and 9 h before collecting the cells.

Isolation of RNPs by in vivo cross-linking and RNA tagging by PP

All the buffers were prepared using RNase-free H2O. The HeLa cells were washed three times with 5 ml cold PBS, followed by irradiation with 254-nm UV light at 0.25 J/cm2 on ice using a UV cross-linker (CL-1000; UVP). After adding 2 ml cold PBS, the cells were harvested using a cell lifter (Corning) and collected in a 1.5-ml RNase-free tube. After centrifugation and discarding the supernatant, the cells were re-suspended in 250 μl lysis buffer I [PBS, 0.5% SDS, EDTA-free protease inhibitor mixture (Thermo), ribonucleoside vanadyl complex (New England Biolabs)] and homogenized by passing through a narrow needle, followed by incubation at 4°C with gentle rotation for 20 min. Next, the homogenate was adjusted with 1 ml lysis buffer II containing PBS, 0.1% Triton X-100, EDTA-free protease inhibitor mixture (Roche) and ribonucleoside vanadyl complex (New England Biolabs) and homogenized by passing through a narrow needle, followed by incubation at 4°C with gentle rotation for 20 min. The cell lysate was centrifuged at 16 000 g for 10 min and the supernatant was transferred to another 1.5 ml RNase-free tube. The supernatant was treated with 5 μM PP at 4°C for 20 min, followed by irradiation on ice with 365-nm UV light at 2 J/cm2 using a UV cross-linker (CL-1000; UVP) for 4 min. The supernatant was concentrated to a final volume of 100 μl by an Amicon Ultra-2 centrifugal filter unit (molecular weight cutoff of 10 kDa, Millipore) and washed three times with 500 μl cold PBS. The PP-treated lysate was adjusted to a volume of 500 μl with 2 M urea, and mixed with 100 μl precleared streptavidin magnetic beads (Thermo), followed by incubation with gentle rotation at 4°C for 1 h. After discarding the supernatant, the beads were washed twice with 200 μl PBS containing 0.2% SDS for 1 min, twice with 200 μl PBS containing 8 M urea for 1 min, twice with 200 μl PBS for 1 min and twice with 50 mM NH4HCO3 buffer for 1 min. 
For RBP elution, the beads were incubated with 20 μl 0.01 μg/μl RNase A at 37°C for 1 h. The eluted RBPs were analysed by SDS-PAGE with silver staining or MS-based proteomic analysis. For control samples in MS-based proteomic analysis, the cells and lysates were subjected to the same treatment, except that the beads were incubated with 0.1 μg/μl RNase A at 37°C for 1 h before the washing step, and the washing solution was discarded.

Analysis of the PP captured RNA

Preparation of the RNA sample

The RNPs from HeLa cells were captured by streptavidin magnetic beads using the method described above. Subsequently, the beads were resuspended in 400 μl elution buffer (12.5 mM biotin, 75 mM NaCl, 7.5 mM Tris-HCl, pH 7.5, 1.5 mM EDTA, 0.15% SDS, 0.075% sarkosyl and 0.02% Na-deoxycholate dissolved in RNase-free H2O) and incubated at RT for 20 min on a shaker, followed by heating at 65°C for 10 min. The solution was collected and the beads were eluted again to give 800 μl solution in total. Then, 800 μl proteinase buffer (100 mM Tris-HCl, pH 7.5, 12.5 mM EDTA, 150 mM NaCl and 2% SDS dissolved in RNase-free H2O) and 2 mg/ml Proteinase K (Ambion) were added and the mixture was incubated at 55°C for 1 h. The eluted RNAs were further purified by TRIzol following the manufacturer's instructions. Next, the RNA samples were prepared with the TruSeq RNA Library Prep Kit v2 (Illumina, not stranded). All samples were barcoded and sequenced on a HiSeq 2500.

Analysis of RNA-Seq Data

For estimation of the rRNA content of the libraries, reads were aligned to a collection of human ribosomal sequences retrieved from the NCBI nucleotide database. All reads were aligned to those sequences using bowtie2, and reads that failed to align were saved as a non-rRNA reads file. Reads in this file were aligned to the complete hg38 assembly using STAR. The percentage of rRNA content was estimated by comparing the number of reads aligning to the rRNA sequences with the number of residual reads aligning to the complete hg38 assembly. For estimating the content of RNA biotypes in the libraries, reads were aligned to the hg38 assembly using STAR. Subsequently, counting was performed with HTSeq-count using the gene set annotated by GENCODE Release 21 (GRCh38) and the GTF feature ‘gene’ for counting.
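The rRNA estimate described above reduces to a simple ratio once the two read counts are available. The following is a minimal sketch, assuming that the percentage is taken as rRNA-aligned reads over the sum of rRNA-aligned reads and residual genome-aligned reads; the exact formula is not stated in the text, and the function name and example counts are hypothetical.

```python
def rrna_percentage(rrna_aligned_reads: int, residual_genome_aligned_reads: int) -> float:
    """Estimate rRNA content (%) from two read counts.

    rrna_aligned_reads: reads aligned to the ribosomal sequence collection (bowtie2 step).
    residual_genome_aligned_reads: non-rRNA reads that aligned to hg38 (STAR step).
    Assumption: the denominator is the sum of both counts.
    """
    if rrna_aligned_reads < 0 or residual_genome_aligned_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = rrna_aligned_reads + residual_genome_aligned_reads
    if total == 0:
        raise ValueError("no aligned reads")
    return 100.0 * rrna_aligned_reads / total


# Hypothetical counts for illustration only (not data from this study).
example = rrna_percentage(rrna_aligned_reads=6_000_000,
                          residual_genome_aligned_reads=2_000_000)
print(f"estimated rRNA content: {example:.1f}%")
```

In practice the two counts would be read from the bowtie2 and STAR alignment summaries rather than typed in by hand.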

Proteomic analysis of the enriched RBPs

After releasing the PP-enriched RBPs by RNase A elution, the RBPs were digested by the filter-aided sample preparation (FASP) method (29). Stable isotopic dimethyl labelling was conducted as previously described (30) for quantitative comparison between the enrichment group of the RBPs and the control. Detailed experimental procedures are provided in the Supporting Information.

Proteomic identification by LC-MS/MS

The FASP-digested and dimethyl-labelled peptide samples were dissolved in 0.1% FA and loaded onto an in-house made 15 cm reverse-phase column (150 μm id) packed with Ultimate XB-C18 1.9 μm resin (Welch materials). An Easy nLC 1000 system (Thermo) was used to separate the peptides using the following gradient: 5–8% B for 8 min, 8–22% B for 50 min, 22–32% B for 12 min, 32–90% B for 1 min, and 90% B for 7 min (A is 0.1% formic acid in water and B is 0.1% formic acid in acetonitrile). A constant flow rate of 600 nl/min was applied. The eluted peptides were sprayed into an Orbitrap Fusion™ Tribrid™ mass spectrometer (Thermo) equipped with a nanoelectrospray ionization source. The mass spectrometer was operated in data-dependent mode with a full MS scan (300–1400 m/z) at a resolution of 120 000, a maximum injection time of 100 ms and an AGC target value of 5e5, followed by higher-energy collisional dissociation (HCD) with 32% normalized collision energy. The MS2 spectra were acquired in the ion trap with an AGC target value of 5000 and a maximum injection time of 35 ms. The dynamic exclusion was set to 18 s.

Mass spectrometric data analysis

The raw data files were searched using MaxQuant (version 1.5.2.8) against the UniProt database (2015 release, 20 198 entries). Trypsin was set as the digestion enzyme with a maximum of two missed cleavages, and the minimal peptide length was set to six amino acids. Carbamidomethyl cysteine was set as a fixed modification, and methionine oxidation and N-terminal acetylation were set as variable modifications. For peptide identification, the mass tolerances were 20 ppm for precursor ions and 0.5 Da for fragment ions. The false discovery rate was set to ≤1% at the spectrum and protein levels. Missing values were imputed using the minimum values in each dataset (31). For RBP filtering, proteins with a minimum of two identified unique peptides in at least two tests and a fold change of two or greater with P < 0.01 in the experimental groups compared with the control groups were considered as RBPs. The abundance changes of the PP-enriched RBPs at different time points after ActD treatment were acquired based on their label-free intensity provided by MaxQuant and were further normalized by their corresponding reduction at the global proteome level to acquire the actual binding and crosslinking abundance (BCA) variation of the RBPs.
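The RBP filtering criteria and the minimum-value imputation described above can be expressed compactly. The following is a minimal sketch, not the authors' pipeline: the `ProteinQuant` record, the function names and the example proteins are hypothetical, and the thresholds are simply those stated in the text (at least two unique peptides in at least two tests, fold change of two or greater, P < 0.01).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ProteinQuant:
    name: str
    unique_peptides_per_test: List[int]  # unique peptides identified in each test
    fold_change: float                   # experiment / control ratio
    p_value: float


def impute_minimum(values: List[Optional[float]]) -> List[float]:
    """Replace missing intensities (None) with the minimum observed value."""
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute: no observed values")
    floor = min(observed)
    return [floor if v is None else v for v in values]


def passes_rbp_filter(p: ProteinQuant,
                      min_peptides: int = 2,
                      min_tests: int = 2,
                      min_fold_change: float = 2.0,
                      max_p: float = 0.01) -> bool:
    """Apply the peptide-count, fold-change and P-value cut-offs together."""
    tests_ok = sum(n >= min_peptides for n in p.unique_peptides_per_test)
    return (tests_ok >= min_tests
            and p.fold_change >= min_fold_change
            and p.p_value < max_p)


# Hypothetical entries for illustration only (not data from this study).
candidates = [
    ProteinQuant("protein_A", [5, 4, 6], 8.0, 0.0004),   # meets all criteria
    ProteinQuant("protein_B", [3, 3, 2], 1.5, 0.001),    # fold change too low
    ProteinQuant("protein_C", [2, 1, 1], 4.0, 0.002),    # too few peptides in 2 of 3 tests
]
accepted = [p.name for p in candidates if passes_rbp_filter(p)]
print(accepted)
```

Keeping the thresholds as keyword arguments makes it easy to see how the candidate list changes when the cut-offs are tightened or relaxed.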

RESULTS

Establishment of the psoralen probe (PP)-based RNP enrichment strategy

In this work, we developed a PP-based RNA tagging strategy for the highly specific enrichment of RNPs by exploiting the photoreactivity of psoralen analogs with uracil under 365 nm UV irradiation (Scheme 1). First, 254 nm UV irradiation was applied to living cells to establish cross-linking between RNA and RBPs. After cell lysis, PP was introduced to the cell lysate to covalently tag both coding and noncoding RNAs under 365 nm UV irradiation. The PP-tagged RNPs and RNAs were next captured by streptavidin beads and subjected to either RNA sequencing or mass spectrometry analysis. The synthetic route of the bifunctional PP is shown in Supplementary Figure S1A (Supporting Information). PP is composed of a uracil-reactive psoralen, a biotin and a spacer arm. 1H NMR was carried out to confirm the successful synthesis of PP (Supplementary Figure S1B and C). The feasibility of PP-based RNP isolation was first demonstrated by gel electrophoresis and RNA staining (Supplementary Figure S2). RNA bands are observed after treating the isolation products with proteinase K to remove proteins from RNAs. In contrast, the RNA bands are completely abolished upon RNase A digestion or by the omission of PP, indicating the binding and isolation of RNPs by PP. The PP-enriched RNPs were further characterized by SDS-PAGE after eluting the bead-captured RBPs by RNase A treatment. Clear protein bands of the PP enrichment products are displayed in Figure 1A. In contrast, protein bands can barely be observed without PP or 254/365 nm UV treatment, indicating a high selectivity of this enrichment method with only marginal nonspecific adsorption of non-RBPs. Using RNase A-containing buffer in the washing step results in the dissociation of the RNPs and the loss of the RBPs before elution, therefore serving as a stringent negative control for this method. 
As expected, the protein bands are almost completely abolished by RNase A washing before elution, confirming the RNA dependence of the observed protein bands (Supplementary Figure S3). The successful enrichment of the RBPs was further demonstrated by the detection of known RBPs in western-blotting analysis, as shown in Figure 1B. ELAVL1, nucleolin and PTBP1 were clearly enriched by PP and were undetectable in the negative controls without PP or 254/365 nm UV treatment. Similar to the large-scale enrichment, RNase A treatment depletes the protein bands of the three RBPs, confirming that these RBPs were directly bound and cross-linked to RNAs. The high selectivity of this method was also shown by β-tubulin and β-actin. No obvious sign of non-specific adsorption of these two non-RBPs was found in the PP enrichment products. Next, the concentration of PP and the reaction time with cell lysates were evaluated. As shown in Supplementary Figure S4A and B, optimized enrichment was obtained with 5 μM PP and 4 min of 365 nm UV irradiation. No obvious degradation of RNA was observed under these conditions (Supplementary Figure S4C).
Figure 1.

(A) SDS-PAGE characterization of PP enriched RBPs. (B) Western-blotting analysis of PP enriched ELAVL1, Nucleolin and PTBP1, β-tubulin and β-actin.

RNA sequencing and proteome identification of the RNPs obtained by PP enrichment

To further explore the type and relative distribution of the PP-isolated RNAs, the captured RNPs were treated with proteinase K to digest and remove the RBPs. Control experiments were also conducted by extracting RNAs using the traditional TRIzol method. The resulting RNAs were analyzed by RNA sequencing (Supplementary Figure S5). The PP enrichment products and the TRIzol controls exhibit consistent RNA distribution patterns. In addition to rRNA, other major types of RNA were also found by PP enrichment, including protein-coding mRNAs and various types of noncoding RNAs, such as lincRNA, antisense RNA, snRNA and miRNA. These results demonstrate the comprehensiveness of PP tagging and its capability to enrich RBPs bound to various kinds of RNAs, although psoralen analogs were originally used for probing double-stranded regions of RNAs (23). We noticed the relatively high content of mRNA in the PP enrichment product, which may be attributed to underestimation of tRNA, since it is small, highly modified and difficult to sequence by standard methods (13,32). The presence of tRNA was revealed using gel electrophoresis by the peak around 100 nt in Supplementary Figure S6. For large-scale RBP profiling by mass spectrometry, it is crucial to differentiate and remove the non-RBPs in the results. To evaluate the proper and stringent control conditions for obtaining reliable RBPs, quantitative proteomics analysis was carried out using the Experiment and Control design illustrated in Supplementary Table S1. As shown in Supplementary Figure S7, Control 4 showed stronger intensity than the other three controls, indicating that it included more false positive hits. This result was not unexpected, since Control 4 had gone through all the treatments that may introduce false positive hits (254 nm UV irradiation, PP treatment and 365 nm UV irradiation), while the other three controls had gone through only two of the three treatments. 
Therefore, Control 4 was chosen for further investigation of the PP-assay using the scheme shown in Figure 2A. Washing with RNase A before elution in the control group degrades RNA and removes the RNPs from the sample. In contrast, RNase A is not applied in the washing step, but in the elution step for the experiment group, which leads to specific elution of the beads-PP captured RBPs without the possible co-elution of DNA-binding proteins. Furthermore, a stringent screening cut-off was applied, which required a fold change >2 and a P value <0.01 with a minimum of two identified unique peptides in at least two tests. In this way, we identified 2986 highly confident RBPs with good reproducibility (Pearson correlations > 0.85 in three technical replicates) (Figure 2B, Supplementary Figure S8 and Table S2 of the Supporting Information).
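The reproducibility criterion quoted above (Pearson correlations > 0.85 between replicates) can be checked with a few lines of code. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the replicate vectors are made-up log2 intensities rather than data from this study, and the text does not specify on which quantity the correlations were computed.

```python
import math
from itertools import combinations
from typing import List, Sequence


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / (sx * sy)


def replicates_reproducible(replicates: List[Sequence[float]],
                            threshold: float = 0.85) -> bool:
    """True if every pair of replicates correlates above the threshold."""
    return all(pearson(a, b) > threshold for a, b in combinations(replicates, 2))


# Hypothetical log2 intensities for five proteins in three replicates.
rep1 = [20.1, 22.4, 25.0, 18.7, 23.3]
rep2 = [20.4, 22.0, 24.6, 19.1, 23.0]
rep3 = [19.8, 22.9, 25.3, 18.2, 23.6]
print(replicates_reproducible([rep1, rep2, rep3]))
```

The pairwise check mirrors how three technical replicates yield three correlation values, all of which must exceed the cut-off.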
Figure 2.

(A) Experimental design of the quantitative differential proteomic comparison between the experiment group and control group for RBPs identification. (B) Scatter plot displaying the log2 fold change (x-axis) and –log P values (y-axis) for RBPs identification by quantitative differential proteomic comparison between the experiment group and control group.

Domain analysis of the identified RBPs shows that most of the top represented domains comprise classical RBDs, such as RRM, KH and DEAD domains, as well as other non-classical RBDs (Figure 3A). Interestingly, we found that the ‘SPEC’ domain was also enriched in our RBPs, although it has not previously been reported as an RBD. We conducted nucleotide-crosslinked peptide analysis of the enriched RBPs to locate the RNA-binding site and to further determine the identity of ‘SPEC’ using reported methods (15). MS analysis of the enriched cyclic-U modified peptides identified seven RBPs containing the SPEC domain (Supplementary Table S4). The strong association of the identified RBPs with RNA is confirmed by the top enriched pathways, which are almost all associated with RNA processing, such as ribosomal subunit formation and hydrolysis, rRNA processing, nonsense-mediated decay and mRNA translation (Figure 3B). Consistently, gene ontology (GO) analysis in Figure 3C reveals highly enriched GO terms related to RNA. The most enriched molecular function and biological process terms exhibit an over-representation of RNA binding and other RNA-related GO terms, suggesting that the dataset contains a large number of known RBPs. Other GO terms, such as ‘Catalytic activity’ and ‘Ubiquitin-specific protease activity’, are also enriched in the molecular function category. In the ‘catalytic activity’ category, many nucleotide-binding and ATP-binding proteins were found, which is consistent with previous studies (13,14). 
For ‘ubiquitin-specific protease activity’, more than twenty E3 ubiquitin ligases were previously found to overlap with the RNA-binding proteome (33,34), and we further expanded this category by identifying >50 E3 ubiquitin ligases/transferases and related proteins. Interestingly, extracellular exosome was found to be the fourth most enriched term in the cellular component analysis. Further comparison with the exosome protein database (ExoCarta) showed that 83 of the top 100 most identified exosome proteins were found among our RBPs. Such high coverage of exosome proteins was not reported in other studies using either polyA-dependent or polyA-independent enrichment strategies. Considering that RNAs in exosomes have been extensively studied in recent years as functional regulators and biomarkers for disease diagnostics and that few reports on their interacting proteome have been published, the PP-based strategy may facilitate the deeper exploration of RBPs in extracellular exosomes.
Figure 3.

(A) Number of identified RBPs with classical (left) and non-classical RBDs (right). (B) Pathway and (C) GO enrichment analysis of the obtained RBPs.

Compared with the reported RBPs enriched from HeLa cells by polyA-dependent and polyA-independent strategies (11,13–15,28), the products of PP enrichment show overlapping but distinct results. As displayed in Figure 4, PP-based enrichment not only covered the majority of the reported RBPs in HeLa cells, but also provided >1000 additional RBPs, presumably due to the facile and efficient RNA tagging via the cycloaddition reaction of the psoralen group of PP, which is capable of pulling down RBPs bound to both coding and non-coding RNAs. We noticed that even after combining three large-scale RBP datasets obtained by polyA-independent methods with ours, 121 RBPs were still exclusively identified by the polyA-dependent enrichment methods (Figure 4C). Failure to cover these RBPs by the polyA-independent methods may be attributed to the low abundance of coding RNA in the total RNA. Coding RNA-binding proteins may be overwhelmed by the more abundant non-coding RNA-binding proteins and lost in mass spectrometry analysis. Next, we compiled an MS-identified RBP dataset of 4758 RBPs from studies using different cell lines. Although the size of the RBP dataset has been substantially expanded by the recently reported RBP enrichment methods, our PP-based enrichment still provided 782 candidate RBPs that were not found in previous large-scale RBP profiling (Supplementary Table S3). We compared the physicochemical characteristics of all PP-identified RBPs, the 782 candidate RBPs and the literature-reported RBPs, including positively charged amino acid content, isoelectric point, hydrophobicity and disorder. The RNA-binding proteomes were divided into polyA interactome and non-polyA interactome as described previously (15). 
As shown in Supplementary Figure S9, the distribution curves of both all PP-identified RBPs and the candidate RBPs resemble those of the previously known RBPs and fall within the range of those of the RBPs obtained by phase separation-based methods (OOPS and XRNAX) (15,16) or metabolic labelling-based methods (RICK and CARIC) (13,14). Interestingly, the non-poly(A)-binding proteins are clearly distinct from the poly(A)-binding proteins. The poly(A)-binding RBPs are less hydrophobic and obviously more disordered, whereas the non-poly(A)-binding RBPs do not have a bimodal isoelectric point (IP) distribution and generally have more acidic IPs and a lower content of positively charged amino acids. The 782 candidate RBPs (red) exhibit an obviously lower abundance distribution in Supplementary Figure S10A, while the previously reported RBPs (blue) found in this work are more concentrated in the region of higher abundance. Similar trends are also found in Supplementary Figure S10B. The median abundance of the candidate RBPs is ∼10-fold lower than that of the reported RBPs. To further explore the function of these candidate RBPs, we conducted GO enrichment analysis of the 782 candidate RBPs that were not reported by other works (11–16,35,36) (Supplementary Figure S11A). Interestingly, we found that the GO terms ‘membrane’ and ‘GTPase activity’ are among the top enriched GO terms in the cellular component and molecular function categories. Although not previously reported as RBPs, proteins in both terms regulate protein transport and membrane trafficking via binding with RNA for correct targeting towards specific organelles (37,38). DNA-binding RBPs were previously reported via enrichment using the serIC method (39). In total, 75% of the DNA-binding RBPs enriched by serIC were also identified by our PP-assay (Supplementary Table S4). 
Further inspection of the 782 candidate RBPs revealed an additional 55 DNA-binding proteins (DBPs) not discovered in previous work, including poly [ADP-ribose] polymerase 2 (PARP-2), that are involved in diverse functions. Although the RNA-binding abilities of these DBPs have not been discovered in previous large-scale RBP identification, the SAP domain-containing PARP-2 was reported to bind to specific transcripts for RNA regulation in an individual protein study (40), supporting the reliability of these candidate RBPs. Furthermore, GO (biological process) and pathway analysis in Supplementary Figure S11A both exhibited overrepresentation of ‘Metabolism’ among the candidate RBPs, especially the citric acid (TCA) cycle. The TCA cycle has long been related to RNA-binding activities. The PP enrichment approach identified 70 RBPs that are associated with the TCA cycle and covered more than 80% of the previously reported TCA cycle-related RBPs (4,5) (Supplementary Figure S11B). The binding between RNA and numerous metabolic enzymes has also been previously documented to regulate the expression of the bound mRNA in response to metabolite levels or, conversely, to control the activity of the bound enzyme (41). Among the RBPs identified by the PP-assay, 79 of the 104 reported RNA-binding enzymes (42,43) were covered (Supplementary Table S4). Based on the Reactome pathway analysis (44), we found that the 782 candidate RBPs include 178 metabolism-related proteins (Supplementary Figure S11C) that are distributed among a variety of pathways, including the TCA cycle and carbohydrate and lipid metabolism. These candidates expand the scale of the RBP repertoire, suggesting that the interplay between RNA, DNA, membrane, protein and metabolite may be broader than previously realized and deserves further exploration.
Figure 4.

UpSet plots comparing the RBPs identified by PP-assay and other polyA independent (A) and dependent (B) RBP enrichment methods. (C) Venn diagram showing overlap of the RBPs identified by PP, reported polyA dependent and independent enrichment methods.

Next, we selected WDR82, ARL2, ARF5 and ELP3 from the 782 candidate RBPs that were not reported in previous large-scale RBP investigations for validation via cross-linking and immunoprecipitation (CLIP) and western blotting (45,46). The four proteins were selected from different functional categories and subcellular locations to avoid bias towards a particular subgroup. WDR82 is a chromatin-binding protein and is mainly located in the nucleus (47). ARL2 regulates the formation of new microtubules and centrosome integrity and is located in the cytoplasm, cytoskeleton, mitochondrion and nucleus (48). ARF5 is a GTP-binding protein involved in protein trafficking that may modulate vesicle budding, and is located in the cytoplasm, Golgi apparatus and membrane (49). ELP3 is an RNA polymerase II core binding protein involved in transcriptional elongation and is located in the cytoplasm and nucleus (50). FLAG-tagged candidate RBPs were expressed in HeLa cells and irradiated with 254-nm UV light. The tagged RBPs were pulled down together with the cross-linked RNA and characterized by western blotting, after shortening the RNA and labelling it with biotin. As shown in Supplementary Figure S12A, similar to the positive control using a known RBP (ELAVL1), the presence of RNA was revealed by the corresponding gel bands for the four candidate RBPs, demonstrating their RNA-binding activity. To further evaluate our validation result, we performed the PP isolation assay to pull down all the RNPs and characterized the four candidate RBPs (ELP3, WDR82, ARF5 and ARL2) by antibody-based western blotting. 
As shown in Supplementary Figure S12B, corresponding protein bands were found only in the experiment group (with 254, 365 nm UV-irradiation, PP treatment, but not RNase A washing) and the input group. No false positive bands were discovered in any of the control groups.

Actinomycin D (ActD)-assisted RNP stability/dynamics study and large-scale evaluation of the candidate RBPs

RBPs regulate various biological processes by associating with and disassociating from RNAs; therefore, the stability and dynamics of the corresponding RNPs under different cellular states hold great significance. We conducted a large-scale investigation of the stability/dynamics of RNPs under ActD treatment using our PP-based enrichment strategy. ActD blocks RNA transcription and leads to decreasing levels of RNAs and disassociation of the RNPs in the cell. As shown in Supplementary Figure S13, the levels of various kinds of RNAs in the cell continually decrease after ActD treatment, with protein-coding RNA displaying the most distinct drop of ∼60%. Similarly, the inhibition of RNA transcription results in a decreased number of PP-enriched RBPs, with 2137, 1921, 1742 and 786 RBPs (Supplementary Table S2) identified at 0, 2, 5 and 9 h after ActD treatment, respectively, indicating increased disassembly of RNPs due to the reduced RNA levels. The dynamics of RBPs in the RNPs were determined by quantitative comparison of the PP-enriched RBPs in the ActD-treated cells with the corresponding RBPs in the untreated cells. The quantity of the RBPs obtained by the PP-assay is controlled by UV crosslinking between RNA and protein, which may be influenced by multiple factors, such as the binding geometry of the RNA, nucleotide and amino acid composition, and the duration of the interaction (51). Therefore, the same UV-irradiation condition must be applied to the cells treated under the different ActD conditions to minimize the variation induced by UV crosslinking. Furthermore, to exclude the contribution of reduced RBP abundance caused by protein degradation and RNA synthesis inhibition, the global proteome level of each RBP at different time points during ActD treatment was determined by proteome profiling. After normalizing the change in abundance of the PP-enriched RBPs by the corresponding reduction in their global proteome levels to obtain their actual binding and crosslinking abundance (BCA) variation, the true dynamic trends of RBPs in the RNP are shown in Figure 5A. Since our PP strategy actually isolates the RNPs, the dynamic trend of each identified RBP can be used as an indication of the stability of the RNPs. The stability/abundance of RNPs can be divided into three categories. Approximately 5% of the RBPs are in relatively stable RNPs, which are present in almost the same abundance throughout the ActD treatment (blue zone). A markedly greater descending trend is displayed in the green zone, with most of the RBPs decreasing below the detection limit of the PP enrichment method after 5 h of RNA synthesis inhibition, indicating that these RBPs are present in unstable RNPs with high decay rates. The rest of the RBPs are in RNPs with intermediate decay rates, which disassociate from two to nine hours after ActD treatment.
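The normalization and three-way grouping described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function names, the `stable_tol` tolerance, the use of `None` to mark an RBP that falls below the PP detection limit, and the rule that "unstable" means undetected at 5 h are all assumptions made for illustration, and the intensities in the example are invented.

```python
from typing import List, Optional, Sequence

TIME_POINTS_H = (0, 2, 5, 9)  # ActD treatment durations reported in the text


def normalized_remaining(
    pp: Sequence[Optional[float]], wcl: Sequence[float]
) -> List[Optional[float]]:
    """Fraction of the 0 h PP-enriched signal remaining at each time point,
    divided by the fraction of the 0 h whole-proteome signal remaining.

    `pp` holds PP-enriched intensities (None = not detected, assumed to mean
    below the detection limit); `wcl` holds global proteome intensities.
    The RBP is assumed to be detected at 0 h in both measurements.
    """
    result: List[Optional[float]] = []
    for pp_t, wcl_t in zip(pp, wcl):
        if pp_t is None:
            result.append(None)
        else:
            result.append((pp_t / pp[0]) / (wcl_t / wcl[0]))
    return result


def classify(norm: Sequence[Optional[float]], stable_tol: float = 0.2) -> str:
    """Assign an RBP to one of three hypothetical stability groups."""
    idx_5h = TIME_POINTS_H.index(5)
    # Assumed rule: lost from the PP-enriched fraction by 5 h -> unstable RNP.
    if norm[idx_5h] is None:
        return "unstable"
    # Assumed rule: detected at every time point and close to the 0 h level.
    if all(v is not None and abs(v - 1.0) <= stable_tol for v in norm):
        return "stable"
    return "intermediate"


if __name__ == "__main__":
    # Invented intensities for one RBP at 0, 2, 5 and 9 h.
    pp_signal = [100.0, 70.0, 40.0, None]
    proteome_signal = [50.0, 50.0, 50.0, 50.0]
    print(classify(normalized_remaining(pp_signal, proteome_signal)))
```

Dividing by the global proteome change is what separates loss of RNA binding from loss of the protein itself; the thresholds would need to be chosen from the actual data distribution.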
Figure 5.

(A) heat map and (B) scatter plot of normalized BCA variation of the RBPs after treating cells with ActD for 0, 2, 5 and 9 h (T0–T9). Normalized BCA of RBPs (T0 versus T9) = fold change (T0 versus T9) of RBPs in PP-assay/fold change (T0 versus T9) of the corresponding RBPs in the whole cell lysate.

Although RNA binding of the candidate RBPs can be validated individually by the traditional CLIP and western-blotting strategy described above, the low throughput of this method is a severe bottleneck for the large-scale evaluation of the hundreds to thousands of candidate RBPs discovered in recent years. Inspired by the ActD-induced inhibition of RNA transcription and the resulting dynamic changes of RNPs, we propose a high-throughput way to further evaluate RNA binding of the 782 candidate RBPs that were not discovered by other large-scale RBP identification studies. As shown in Figure 5B, close to 90% (red) of the candidate RBPs display normalized BCA >0 after nine hours of ActD treatment. This may be attributed either to the decrease of the available RNA and disassembly of RNPs after inhibition of RNA synthesis or to changes in the binding activity of the RBPs, although these possibilities cannot be distinguished with the currently available data.
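As a rough illustration of this evaluation, the sketch below computes a normalized BCA value (T0 versus T9) from the Figure 5 definition and reports the fraction of candidates above zero. Two points are assumptions rather than statements from the paper: the fold change is taken as T0 divided by T9, and the ratio is log2-transformed so that a value >0 means the RBP decreased more in the PP-enriched fraction than in the whole cell lysate. The function names and the example intensities are hypothetical.

```python
import math
from typing import Iterable


def normalized_bca_log2(
    pp_t0: float, pp_t9: float, wcl_t0: float, wcl_t9: float
) -> float:
    """log2 of (PP-assay fold change) / (whole cell lysate fold change).

    Fold change is assumed to be T0 / T9, so a positive result means a
    larger decrease after PP enrichment than in the global proteome.
    """
    pp_fold = pp_t0 / pp_t9
    wcl_fold = wcl_t0 / wcl_t9
    return math.log2(pp_fold / wcl_fold)


def fraction_positive(values: Iterable[float]) -> float:
    """Fraction of candidates whose normalized BCA is above zero."""
    values = list(values)
    return sum(v > 0 for v in values) / len(values)


if __name__ == "__main__":
    # Invented (PP T0, PP T9, lysate T0, lysate T9) intensities per candidate.
    candidates = {
        "candidate_1": (8.0, 2.0, 4.0, 4.0),
        "candidate_2": (6.0, 3.0, 5.0, 5.0),
        "candidate_3": (4.0, 4.0, 6.0, 3.0),
    }
    scores = {name: normalized_bca_log2(*v) for name, v in candidates.items()}
    print(scores)
    print(fraction_positive(scores.values()))
```

A candidate that is not detected at T9 after PP enrichment has no finite fold change under this definition and would need separate handling, for example by imputing the detection limit.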

DISCUSSION

In this work, we developed a photo-induced RNA tagging and RNP enrichment strategy. We demonstrated the capability of PP tagging to enrich RBPs bound to various kinds of RNAs, although psoralen analogs were originally used for probing the double-stranded regions of RNAs (23). Using this strategy, we identified 2986 RBPs from HeLa cells, including 782 candidate RBPs that were not discovered in previous large-scale studies. The PP strategy is simple and does not need metabolic labelling. Its high enrichment specificity and freedom from possible glycoprotein contamination (16) make this approach advantageous for discovering new candidate RBPs. Interestingly, GO analysis of the 782 candidate RBPs reveals that metabolism and related terms are overrepresented in molecular function and biological process. The 178 metabolic pathway-related proteins among the candidate RBPs indicate a possibly deeper involvement of RBPs in the regulation of metabolic processes than previously known, and the mechanisms of this regulation are yet to be investigated. Furthermore, PP can be used as a powerful tool to study the dynamic association of RNPs in the cell. Our large-scale investigation of the perturbations of RNPs during RNA transcription inhibition using the PP-based enrichment strategy reveals the first distribution patterns of RBPs bound to RNA with different decay rates. More importantly, this strategy provides a high-throughput way to evaluate candidate RBPs by exploiting their markedly greater decrease in abundance in the RNPs than in the global proteome, which results from the continued disassociation of the RNPs after the inhibition of RNA synthesis. Although the PP-assay was successfully applied in a polyA-independent manner and is theoretically capable of enriching both coding and non-coding RNA-binding proteins at the same time, it still has some limitations. Considering that coding RNAs make up only 3% of total RNA, many of the coding RNA-binding proteins may be overwhelmed by the more abundant non-coding RNA-binding proteins in the enrichment products, leading to biased RBP identification. Therefore, for studies targeting mRNA-binding proteins, polyA-dependent enrichment methods should be a better choice (51).

DATA AVAILABILITY

The mass spectrometry proteomics data have been deposited to the ProteomeXchange Consortium via the PRIDE partner repository with the dataset identifier PXD023162. All RNA-seq data used in this manuscript have been deposited in Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo). The data about the type and relative distribution of the PP isolated RNAs were deposited under accession number GSE163230. The data about the dynamic investigation of the RNA upon transcription inhibition by ActD treatment were deposited under accession number GSE151805. All other data supporting the findings of this study are available from the corresponding author upon reasonable request.
  51 in total

Review 1.  Ribosome-associated GTPases: the role of RNA for GTPase activation.

Authors:  Nina Clementi; Norbert Polacek
Journal:  RNA Biol       Date:  2010-09-01       Impact factor: 4.652

Review 2.  A census of human RNA-binding proteins.

Authors:  Stefanie Gerstberger; Markus Hafner; Thomas Tuschl
Journal:  Nat Rev Genet       Date:  2014-11-04       Impact factor: 53.242

3.  m(6)A RNA methylation is regulated by microRNAs and promotes reprogramming to pluripotency.

Authors:  Tong Chen; Ya-Juan Hao; Ying Zhang; Miao-Miao Li; Meng Wang; Weifang Han; Yongsheng Wu; Ying Lv; Jie Hao; Libin Wang; Ang Li; Ying Yang; Kang-Xuan Jin; Xu Zhao; Yuhuan Li; Xiao-Li Ping; Wei-Yi Lai; Li-Gang Wu; Guibin Jiang; Hai-Lin Wang; Lisi Sang; Xiu-Jie Wang; Yun-Gui Yang; Qi Zhou
Journal:  Cell Stem Cell       Date:  2015-02-12       Impact factor: 24.633

Review 4.  RNA Regulation by Poly(ADP-Ribose) Polymerases.

Authors:  Florian J Bock; Tanya T Todorova; Paul Chang
Journal:  Mol Cell       Date:  2015-06-18       Impact factor: 17.970

5.  The Reactome Pathway Knowledgebase.

Authors:  Antonio Fabregat; Steven Jupe; Lisa Matthews; Konstantinos Sidiropoulos; Marc Gillespie; Phani Garapati; Robin Haw; Bijay Jassal; Florian Korninger; Bruce May; Marija Milacic; Corina Duenas Roca; Karen Rothfels; Cristoffer Sevilla; Veronica Shamovsky; Solomon Shorser; Thawfeek Varusai; Guilherme Viteri; Joel Weiser; Guanming Wu; Lincoln Stein; Henning Hermjakob; Peter D'Eustachio
Journal:  Nucleic Acids Res       Date:  2018-01-04       Impact factor: 16.971

6.  The Cardiomyocyte RNA-Binding Proteome: Links to Intermediary Metabolism and Heart Disease.

Authors:  Yalin Liao; Alfredo Castello; Bernd Fischer; Stefan Leicht; Sophia Föehr; Christian K Frese; Chikako Ragan; Sebastian Kurscheid; Eloisa Pagler; Hao Yang; Jeroen Krijgsveld; Matthias W Hentze; Thomas Preiss
Journal:  Cell Rep       Date:  2016-07-21       Impact factor: 9.423

7.  Global changes of the RNA-bound proteome during the maternal-to-zygotic transition in Drosophila.

Authors:  Vasiliy O Sysoev; Bernd Fischer; Christian K Frese; Ishaan Gupta; Jeroen Krijgsveld; Matthias W Hentze; Alfredo Castello; Anne Ephrussi
Journal:  Nat Commun       Date:  2016-07-05       Impact factor: 14.919

8.  Comprehensive Identification of RNA-Binding Domains in Human Cells.

Authors:  Alfredo Castello; Bernd Fischer; Christian K Frese; Rastislav Horos; Anne-Marie Alleaume; Sophia Foehr; Tomaz Curk; Jeroen Krijgsveld; Matthias W Hentze
Journal:  Mol Cell       Date:  2016-07-21       Impact factor: 17.970

Review 9.  Metabolic Enzymes Enjoying New Partnerships as RNA-Binding Proteins.

Authors:  Alfredo Castello; Matthias W Hentze; Thomas Preiss
Journal:  Trends Endocrinol Metab       Date:  2015-10-28       Impact factor: 12.015

10.  Conserved mRNA-binding proteomes in eukaryotic organisms.

Authors:  Ana M Matia-González; Emma E Laing; André P Gerber
Journal:  Nat Struct Mol Biol       Date:  2015-11-23       Impact factor: 15.369
