A Pipier1,2, A Devaux3, T Lavergne3, A Adrait4, Y Couté4, S Britton1,2, P Calsou1,2, J F Riou5, E Defrancq3, D Gomez6,7.
Abstract
G-quadruplexes (G4) are non-canonical secondary structures consisting of stacked tetrads of hydrogen-bonded guanine bases. An essential feature of G4 is their intrinsic polymorphic nature, which is characterized by the equilibrium between several conformations (also called topologies) and the presence of different types of loops with variable lengths. In cells, G4 functions rely on protein or enzymatic factors that recognize and promote or resolve these structures. To characterize new G4-dependent mechanisms, extensive research has aimed at identifying new G4-binding proteins. Using G-rich single-stranded oligonucleotides that adopt non-controlled G4 conformations, a large number of G4-binding proteins have been identified in vitro, but their specificity towards G4 topology remained unknown. Constrained G4 structures are biomolecular objects based on the use of a rigid cyclic peptide scaffold as a template for directing the intramolecular assembly of the anchored oligonucleotides into a single and stabilized G4 topology. Here, using various constrained RNA or DNA G4 as baits in human cell extracts, we establish the topology preference of several well-known G4-interacting factors. Moreover, we identify new G4-interacting proteins such as the NELF complex involved in the RNA-Pol II pausing mechanism, and we show that it impacts the clastogenic effect of the G4-ligand pyridostatin.
Year: 2021 PMID: 34188089 PMCID: PMC8241873 DOI: 10.1038/s41598-021-92806-8
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. (A) Schematic representation of the constrained DNA structures used in the pull-down assay. (B) Global strategy to identify constrained G4-interacting proteins from human cells. Biotin-functionalized G4-constrained molecules (1a and 2) and the biotin-functionalized duplex-DNA control 8 were individually mixed with a semi-total human protein extract from HeLa cells, then trapped by streptavidin magnetic beads to isolate interacting proteins. Proteins were identified by MS-based quantitative proteomic analysis and further characterized by western blotting (arrow), or analysed directly by western blotting (dashed arrow). (C) Diagram showing the differential enrichment of human proteins on constrained G4 structures relative to control duplex DNA. G4-enriched proteins are proteins found enriched on the 1a and/or 2 G4 constructs relative to the duplex control 8. 214 of the 425 proteins found enriched on constrained G4 have been shown to interact with nucleic acids. Differentially interacting proteins were selected using a fold change ≥ 2 and a p-value < 0.05, which yields a false discovery rate (FDR) below 5% according to the Benjamini–Hochberg procedure.
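The selection rule quoted in the Figure 1 caption (fold change ≥ 2, p-value < 0.05, FDR checked with the Benjamini–Hochberg procedure) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: the `Hit` type, the `select_enriched` helper and all protein names and numbers are hypothetical, and reporting the largest adjusted p-value among the kept hits is only one plausible way to check the FDR.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    fold_change: float  # bait abundance / control abundance
    p_value: float      # raw p-value of the differential test


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest to the smallest p-value so that the
    # adjusted values never increase as the raw p-value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_enriched(hits, min_fold_change=2.0, max_p=0.05):
    """Keep hits passing both cut-offs (assumed reading of the caption).

    Also returns the largest adjusted p-value among the kept hits,
    used here as a rough FDR estimate for the selection.
    """
    q_values = benjamini_hochberg([h.p_value for h in hits])
    kept = [(h, q) for h, q in zip(hits, q_values)
            if h.fold_change >= min_fold_change and h.p_value < max_p]
    estimated_fdr = max((q for _, q in kept), default=0.0)
    return [h.name for h, _ in kept], estimated_fdr


# Hypothetical values for illustration only (not data from the study).
hits = [
    Hit("protein_A", 4.1, 0.001),
    Hit("protein_B", 2.5, 0.020),
    Hit("protein_C", 1.3, 0.004),
    Hit("protein_D", 3.0, 0.300),
]
names, fdr = select_enriched(hits)
print(names, fdr)
```

In this toy input, `protein_C` is dropped for its low fold change and `protein_D` for its high p-value, so the two cut-offs act independently of each other.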
Figure 2. Most significant pathways and processes covered by constrained G4-interacting proteins. (A) Enriched KEGG pathways. Gene Ontology terms: (B) GO Biological Processes and (C) GO Molecular Functions for the 425 proteins found enriched on constrained G4 structures. A right-sided (enrichment) test based on the hypergeometric distribution was performed on the corresponding Entrez gene IDs for each gene list, with Bonferroni adjustment (p < 0.05).
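The right-sided hypergeometric test with Bonferroni adjustment mentioned in the Figure 2 caption can be sketched as below. This is a minimal standard-library illustration, not the enrichment tool used in the study: the function names are hypothetical, and the background size and per-term counts are invented for the example. Only the list size of 425 comes from the caption.

```python
from math import comb


def hypergeom_right_tail(k, K, n, N):
    """P(X >= k) for X ~ Hypergeometric(N, K, n).

    N: genes in the background, K: background genes annotated with the term,
    n: genes in the list, k: list genes annotated with the term.
    """
    upper = min(K, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i)
                     for i in range(k, upper + 1))
    return favourable / comb(N, n)


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical counts for illustration only (not data from the study).
background = 20000   # assumed number of annotated genes
list_size = 425      # size of the enriched protein list (from the caption)
terms = {
    "term_1": (40, 150),   # (genes from the list, genes in the background)
    "term_2": (5, 300),
}
raw = [hypergeom_right_tail(k, K, list_size, background)
       for k, K in terms.values()]
for name, p_adj in zip(terms, bonferroni(raw)):
    print(name, p_adj, p_adj < 0.05)
```

Exact integer binomials keep the tail sum free of rounding error until the final division, which is convenient for a sketch; a production analysis would more likely rely on an established statistics library.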
Manually curated functional groups from nucleic acid binding proteins enriched on constrained G4 structures.
| Mapped IDs | Gene Name | Mapped IDs | Gene Name |
|---|---|---|---|
| DDX1 | ATP-dependent RNA helicase DDX1 | SF1 | Splicing factor 1 |
| DDX20 | Probable ATP-dependent RNA helicase DDX20 | SF3A1 | Splicing factor 3A subunit 1 |
| DDX23 | Probable ATP-dependent RNA helicase DDX23 | SF3A2 | Splicing factor 3A subunit 2 |
| DDX3X | ATP-dependent RNA helicase DDX3X | SF3A3 | Splicing factor 3A subunit 3 |
| DDX41 | Probable ATP-dependent RNA helicase DDX41 | SF3B1 | Splicing factor 3B subunit 1 |
| DDX42 | ATP-dependent RNA helicase DDX42 | SF3B2 | Splicing factor 3B subunit 2 |
| DDX52 | Probable ATP-dependent RNA helicase DDX52 | SF3B3 | Splicing factor 3B subunit 3 |
| DDX6 | Probable ATP-dependent RNA helicase DDX6 | SF3B6 | Splicing factor 3B subunit 6 |
| DHX16 | Pre-mRNA-splicing factor ATP-dependent RNA helicase DHX16 | SRSF1 | Serine/arginine-rich splicing factor 1 |
| DHX29 | ATP-dependent RNA helicase DHX29 | SRSF10 | Serine/arginine-rich splicing factor 10 |
| DHX30 | Putative ATP-dependent RNA helicase DHX30 | SRSF11 | Serine/arginine-rich splicing factor 11 |
| DHX36 | ATP-dependent RNA helicase DHX36 | SRSF2 | Serine/arginine-rich splicing factor 2 |
| DHX38 | Pre-mRNA-splicing factor ATP-dependent RNA helicase DHX38 | SRSF6 | Serine/arginine-rich splicing factor 6 |
| DHX40 | Probable ATP-dependent RNA helicase DHX40 | SRSF7 | Serine/arginine-rich splicing factor 7 |
| | | SRSF9 | Serine/arginine-rich splicing factor 9 |
| hnRNP A1 | Heterogeneous nuclear ribonucleoprotein A1 | | |
| hnRNP A2B1 | Heterogeneous nuclear ribonucleoproteins A2-B1 | SNRNP200 | U5 small nuclear ribonucleoprotein 200 kDa helicase |
| hnRNP A3 | Heterogeneous nuclear ribonucleoprotein A3 | SNRNP40 | U5 small nuclear ribonucleoprotein 40 kDa protein |
| hnRNP F | Heterogeneous nuclear ribonucleoprotein F | SNRNP70 | U1 small nuclear ribonucleoprotein 70 kDa |
| hnRNP H1 | Heterogeneous nuclear ribonucleoprotein H1 | SNRPA | U1 small nuclear ribonucleoprotein A |
| hnRNP H3 | Heterogeneous nuclear ribonucleoprotein H3 | SNRPA1 | U2 small nuclear ribonucleoprotein A |
| hnRNP L | Heterogeneous nuclear ribonucleoprotein L | SNRPC | U1 small nuclear ribonucleoprotein C |
| hnRNP R | Heterogeneous nuclear ribonucleoprotein R | SNRPD1 | Small nuclear ribonucleoprotein Sm D1 |
| | | SNRPD2,SNRPD1 | Small nuclear ribonucleoprotein Sm D2 |
| CPSF1 | Cleavage and polyadenylation specificity factor subunit 1 | SNRPD3 | Small nuclear ribonucleoprotein Sm D3 |
| CPSF2 | Cleavage and polyadenylation specificity factor subunit 2 | SNRPE | Small nuclear ribonucleoprotein E |
| CPSF3 | Cleavage and polyadenylation specificity factor subunit 3 | SNRPN | Small nuclear ribonucleoprotein-associated protein N;SNRPN |
| CPSF4 | Cleavage and polyadenylation specificity factor subunit 4 | | |
| CRNKL1 | Crooked neck-like protein 1 | | |
| CSTF1 | Cleavage stimulation factor subunit 1 | | |
| CSTF3 | Cleavage stimulation factor subunit 3 | | |
Constrained G4-interacting factors reported as G4-related in the UniProtKB, Gene Ontology (GO), PubMed abstract and G4IPDB[48] databases.
| Mapped IDs | Gene Name | PUBMED ID |
|---|---|---|
| ADAR | Double-stranded RNA-specific adenosine deaminase | 24813121 23381195 |
| CNBP | Cellular nucleic acid-binding protein | 23774591 28329689 24594223 26332732 31219592 |
| DDX1 | ATP-dependent RNA helicase | 29731414 |
| DDX42 | ATP-dependent RNA helicase | 31287417 |
| DHX36 | ATP-dependent RNA helicase | 29269411 28069994 25653156 25611385 24151078 22238380 21149580 18842585 16150737 |
| DNMT1 | DNA (cytosine-5)-methyltransferase 1 | 30275516 |
| EWSR1 | RNA-binding protein | 21244633 21561087 22214309 |
| FUS | RNA-binding protein | 18776329 19749353 23521792 24251952 28575444 29434328 29800261 |
| hnRNP A1 | Heterogeneous nuclear ribonucleoprotein A1 | 9188487 19282454 20213319 24371143 24831962 26930004 28510424 29361764 30247678 31311954 |
| hnRNP A2B1 | Heterogeneous nuclear ribonucleoproteins A2/B1 | 15302914 17716999 |
| hnRNP A3 | Heterogeneous nuclear ribonucleoprotein A3 | 27623008 23381195 |
| hnRNP F | Heterogeneous nuclear ribonucleoprotein F | 29269483 |
| hnRNP H1 | Heterogeneous nuclear ribonucleoprotein H | 26930004 27623008 |
| MID1 | E3 ubiquitin-protein ligase Midline-1 | 21930711 |
| Mre11A (*) | Double-strand break repair protein MRE11 | 16116037 |
| RIF1 | Telomere-associated protein | 26436827 29348174 29357064 30510058 31197198 |
| SF3B3 | Splicing factor 3B subunit 3 | 23381195 |
| SRSF1 | Serine/arginine-rich splicing factor 1 | 24771345 |
Figure 3. Constrained G4-interacting factors are associated with RNA-G4 binding activities and with sensitisation to small molecules that stabilize G4 structures. (A) Venn diagram showing the overlap of our study (orange) with the RNA-G4-interacting proteins identified in Herdy[35] (green) and Herviou[36] (blue). 98 of the 425 proteins identified in our study were known to interact with RNA-G4 structures, with the 14 indicated factors common to all three studies. (B) Schematic representation of the constrained G4-interacting proteins identified in our study that are associated with an increased sensitivity to G4 ligands, as established by Zyner et al.[49]. From this analysis we determined that 62 of the 425 proteins were reported as G4-ligand sensitisers.
Figure 4. Differential interaction of human proteins with constrained G4 structures adopting different topologies. (A) Diagram showing the differential enrichment of human proteins on the 1a (green) and 2 (purple) constructs. Differential enrichment of proteins on structures 1a or 2 was determined by statistical analysis using a fold change ≥ 2 and a p-value < 0.05, which yields a false discovery rate (FDR) below 5%. (B) KEGG pathway covered by the 31 proteins found enriched on constrained G4 structure 1a, and functional interaction network analysis of the same 31 proteins using STRING[51]. A right-sided (enrichment) test based on the hypergeometric distribution was performed on the corresponding Entrez gene IDs for each gene list, with Bonferroni adjustment (p < 0.05). (C) MS-based quantitative proteomic analysis of the interaction of MSC-complex proteins with constrained G4 structures (extracted from Supplementary Table 1). Differentially interacting proteins were selected using a fold change ≥ 2 and a p-value < 0.05, which yields a false discovery rate (FDR) below 5% according to the Benjamini–Hochberg procedure.
Figure 5. Impact of the orientation and nucleotide composition of connecting loops on the differential enrichment of proteins on G4 structures. Western-blotting analysis and quantification of the interaction of proteins found enriched on constrained G4 structures with modified molecules (1a, 1b, 2, 3, 4) and with unconstrained G4 (c-myc and 21 T). Arrows indicate the 5′-3′ strand orientation of the single-stranded extensions or connecting loops present on the different systems. The modified nucleotide composition of the connecting loops in system 4 is indicated by the sequence TCT. Construct 8 and a scramble sequence were used as controls for pull-downs performed with constrained or free G4 structures, respectively.
Figure 6. The NELF complex interacts with G4 structures and modulates the cellular response to G4 ligands. (A) MS-based quantitative proteomic analysis of the interaction of the NELF-complex proteins with constrained G4 structures (extracted from Supplementary Table 1). Differentially interacting proteins were selected using a fold change ≥ 2 and a p-value < 0.05, which yields a false discovery rate (FDR) below 5% according to the Benjamini–Hochberg procedure. (B) Quantification of the interaction of immunoprecipitated Flag NELF-E protein with constrained G4 structures (1a, 2) relative to the cyclopeptide (CP-T23) and duplex control (8) constructs. Error bars represent SD from the means, n ≥ 3 independent experiments. p values were calculated using unpaired t-tests (without corrections for multiple comparisons). ns: non-significant difference, p > 0.05; *: p < 0.05; **: p < 0.01; ***: p < 0.001; ****: p < 0.0001. (C) Quantification and representative images of the γH2AX foci fluorescence signal (red) detected in HeLa cells transfected with control (Ctrl) siRNA or with two NELF-E siRNAs (independent sequences) and treated with PDS (20 µM) for 4 h. Error bars represent SD from the means, n ≥ 3 independent experiments. p values were calculated using an unpaired multiple Student’s t test. ns: p > 0.05; *: p < 0.05; **: p < 0.01; ***: p < 0.001; ****: p < 0.0001. Western-blotting analysis of NELF-E depletion in HeLa cells following siRNA treatment is shown.
| Target | Dilution | Species | Class | Reference | Manufacturer |
|---|---|---|---|---|---|
| KU70 | 0.2 µg/mL | Mouse | Monoclonal | MA5-13110 | Invitrogen |
| eIF4G | 1/1000 | Rabbit | Polyclonal | 2498 | Cell Signaling Technology |
| WRN | 1/2000 | Mouse | Monoclonal | W0393 | Sigma-Aldrich |
| Mre11 | 1/1000 | Mouse | Monoclonal | 611366 | BD Biosciences |
| hnRNP A1 | 1/1000 | Rabbit | Polyclonal | GTX106208 | GeneTex |
| DHX36 | 1/1000 | Rabbit | Polyclonal | HPA035399 | Sigma-Aldrich |
| CNBP | 8/1000 | Kindly provided by N. Calcaterra’s group | | | |
| NCL23 | 1/1000 | Rabbit | Polyclonal | ab50279 | Abcam |
| NELFE | 1 μg/mL | Rabbit | Monoclonal | A301-913A | Bethyl Laboratories |
| NELFB (COBRA) | 1 μg/mL | Rabbit | Monoclonal | A301-911A | Bethyl Laboratories |
| Anti-rabbit | 1:10,000 | Goat | Polyclonal | 111-035-003 | Jackson Immunoresearch |
| Anti-mouse | 1:10,000 | Goat | Polyclonal | 115-035-003 | Jackson Immunoresearch |
| Target | Name | Sequence | Manufacturer |
|---|---|---|---|
| Luciferase | siLUC | 5′-CUUACGCUGAGUACUUCGATT-3′ | Eurofins |
| NELFE | siNELFE | 5′-AAGAUGGAGUCAGCAGAUCAG-3′ | Eurofins |
| NELFE | siNELFE_2 | 5′-GACCUUCUGGAGAAGAGCUTT-3′ | Eurofins |