Matthew Lalli1,2,3, Allen Yen1,4, Urvashi Thopte3, Fengping Dong1,2, Arnav Moudgil1,2, Xuhua Chen1,2, Jeffrey Milbrandt1, Joseph D Dougherty1,4, Robi D Mitra1,2.
Abstract
Calling cards technology using self-reporting transposons enables the identification of DNA-protein interactions through RNA sequencing. Although immensely powerful, current implementations of calling cards in bulk experiments on populations of cells are technically cumbersome and require many replicates to identify independent insertions into the same genomic locus. Here, we have drastically reduced the cost and labor requirements of calling card experiments in bulk populations of cells by introducing a DNA barcode into the calling card itself. An additional barcode incorporated during reverse transcription enables simultaneous transcriptome measurement in a facile and affordable protocol. We demonstrate that barcoded self-reporting transposons recover in vitro binding sites for four basic helix-loop-helix transcription factors with important roles in cell fate specification: ASCL1, MYOD1, NEUROD2 and NGN1. Further, simultaneous calling cards and transcriptional profiling during transcription factor overexpression identified both binding sites and gene expression changes for two of these factors. Lastly, we demonstrated barcoded calling cards can record binding in vivo in the mouse brain. In sum, RNA-based identification of transcription factor binding sites and gene expression through barcoded self-reporting transposon calling cards and transcriptomes is an efficient and powerful method to infer gene regulatory networks in a population of cells.
Year: 2022 PMID: 36062164 PMCID: PMC9428926 DOI: 10.1093/nargab/lqac061
Source DB: PubMed Journal: NAR Genom Bioinform ISSN: 2631-9268
Figure 1. Barcoding the self-reporting transposon. (A) Schematic overview of the SRT construct, Calling Card method, and sequencing library preparation. Candidate sites for barcode insertions are indicated with gold stars. The TR-Genome junction, used to map transposon insertions, is circled with a dotted magenta line. (B) Barcode site 3 is within the piggyBac TR sequence, immediately adjacent to the TR-Genome junction. Underlined nucleotides in the 13-bp terminal inverted repeat region (‘CTA’, gold) were targeted for mutagenesis by mutagenic PCR. (C) Overview of calling card rapid mutagenesis scheme. Mutant amplicons were transfected into cells with piggyBac transposase and integrated calling cards were collected. Nucleotide frequencies at each mutagenized position of integrated SRTs were calculated. Nucleotide frequency at (D) position 1, (E) position 2 and (F) position 3 of integrated mutated SRTs. Wild-type sequences are outlined in red. All four possible nucleotides were well-represented at all three mutated positions. IR: internal repeat. TR: terminal repeat. EF1a: eukaryotic translation elongation factor 1 α promoter. SRT: self-reporting transposon. nt: nucleotide. kb: kilobase. PuroR: puromycin resistance cassette. WT: wild-type. Mut: mutant.
Figure 2. Multi-nucleotide mutagenesis in the piggyBac terminal repeat discovers integration-competent barcoded SRTs. (A) Normalized counts of integration events for the 64 possible combinations of three-nucleotide barcodes at the targeted region are shown (log2 counts per million (CPM)). All 64 barcoded SRTs could integrate into the genome. Black dotted lines indicate the 50th percentile of read counts. Data are plotted as mean and SEM from two independent replicates. (B) Targeted mutagenesis at a fourth position in the terminal repeat identified another site that could tolerate all 4 nucleotide substitutions while retaining integration-competence. Wild-type sequence (‘G’) is outlined in red. (C) Normalized counts (log2 CPM) of insertions for 256 combinations of 4-nt barcodes. All 256 barcodes were present at varying degrees of insertional efficiency. Wild-type sequence is colored cerulean. Error-correcting and error-detecting barcodes are colored magenta and midnight blue, respectively. (D) Sequence logo of the top 100 most abundantly inserted 4-nt barcoded SRTs reveals modest sequence preference for integration efficiency. CPM: counts per million sequencing reads.
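The barcode counts in Figure 2 (64 three-nucleotide and 256 four-nucleotide combinations) and the log2 CPM normalization can be illustrated with a minimal Python sketch. The function names, the pseudocount, and the Hamming-distance check are assumptions made for illustration; this record does not list the actual error-detecting or error-correcting barcode sets, nor the exact normalization code used by the authors.

```python
from itertools import product
from math import log2

NUCLEOTIDES = "ACGT"


def all_barcodes(length):
    """Return every possible barcode of the given length (4**length strings)."""
    return ["".join(p) for p in product(NUCLEOTIDES, repeat=length)]


def log2_cpm(counts, pseudocount=1.0):
    """Convert raw per-barcode counts to log2 counts per million (CPM).

    The pseudocount is an assumption of this sketch; it keeps barcodes
    with zero counts from producing log2(0).
    """
    total = sum(counts.values())
    return {bc: log2((n + pseudocount) / total * 1e6) for bc, n in counts.items()}


def hamming(a, b):
    """Number of positions at which two equal-length barcodes differ."""
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance between any two barcodes in a set.

    A set with minimum distance >= 2 can detect a single substitution;
    a set with minimum distance >= 3 can correct one.
    """
    return min(
        hamming(a, b)
        for i, a in enumerate(barcodes)
        for b in barcodes[i + 1:]
    )


if __name__ == "__main__":
    print(len(all_barcodes(3)), len(all_barcodes(4)))
    # Hypothetical example set, not taken from the paper.
    print(min_pairwise_distance(["AAAA", "CCCC", "GGGG", "TTTT"]))
```

Enumerating the full barcode space this way makes it straightforward to check that every barcode was observed, as the caption reports for both the 3-nt and 4-nt libraries.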
Figure 3. Calling cards using barcoded SRTs recover known binding motifs for bHLH factors near genes related to known TF functions. (A) Top binding motifs for each TF were retrieved from DNA sequences in calling card peaks. These sites are enriched for the canonical E-box motif as well as bHLH TFs including or related to each TF. (B) Venn diagram of genes proximal to called peaks for each TF indicates both shared and distinct binding of these TFs. (C) Gene Ontology enrichment analysis reveals terms related to neurogenesis and myogenesis. (D) Species mixing experiment confirms minimal barcode swapping in SRT library preparation. bHLH: basic helix-loop-helix. bZIP: basic zipper.
Figure 4. Barcoded SRT calling cards and transcriptomes enable joint measurement of TFBS and gene expression. (A) Schematic overview of barcoded sequencing library preparation. A sample-specific barcode (Sample BC) with unique molecular identifiers (UMI) is introduced during reverse transcription of poly(A) RNA including SRTs and mRNA. Reverse transcription products (cDNA) can then be pooled for second strand synthesis and amplification. Sequencing libraries are prepared for SRTs and transcriptomes in parallel. (B) Barcoded SRT experiments recover binding motifs for ASCL1 and MYOD1. (C) Venn diagram showing shared and distinct genes near ASCL1 and MYOD1 binding sites. (D) Transcriptomes profiled by bulk RNA-seq with barcodes revealed differential gene expression for ASCL1 and MYOD1, compared to cells transfected with unfused piggyBac. (E) Gene Ontology of differentially expressed genes in ASCL1 and MYOD1 cells.
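Because the sample barcode and UMI are attached before pooling, reads from a pooled library can in principle be assigned back to their sample and collapsed to unique molecules. The sketch below shows that idea only; the input format (already-parsed `(sample_bc, umi, feature)` tuples), the function name, and the exact-match UMI collapsing are all assumptions, since the read layout and the authors' pipeline are not described in this record.

```python
from collections import defaultdict


def count_unique_molecules(reads):
    """Count unique UMIs per sample barcode and feature.

    `reads` is an iterable of (sample_bc, umi, feature) tuples, where
    `feature` stands for a gene name or an insertion coordinate.
    Reads sharing the same sample barcode, feature and UMI are treated
    as duplicates of one molecule (exact matching only; no UMI error
    correction is attempted in this sketch).
    """
    umis_seen = defaultdict(set)
    for sample_bc, umi, feature in reads:
        umis_seen[(sample_bc, feature)].add(umi)

    counts = {}
    for (sample_bc, feature), umis in umis_seen.items():
        counts.setdefault(sample_bc, {})[feature] = len(umis)
    return counts


if __name__ == "__main__":
    # Hypothetical reads: two samples pooled after reverse transcription.
    example_reads = [
        ("S1", "AAAA", "GeneX"),
        ("S1", "AAAA", "GeneX"),  # PCR duplicate of the read above
        ("S1", "CCCC", "GeneX"),
        ("S2", "AAAA", "GeneX"),
    ]
    print(count_unique_molecules(example_reads))
```

The same counting applies to both the SRT library and the transcriptome library, since both derive from the same barcoded cDNA pool.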
Drastic cost and labor reduction of barcoded SRT and transcriptomes compared to the original protocol. ‘Original’ calculations use the recommended 12 replicates per TF (4). This experiment assayed 3 TFs (unfused hyper piggyBac, ASCL1 and MYOD1). Transfection costs are based on NEON or nucleofector transfection device reactions. Tagmentation costs assume a library is prepared for each of the 12 replicates for both calling cards and transcriptomes. Tapestation costs reflect core facility pricing.
| Step | Replicates (Original) | Replicates (Barcoded) | Cost, $USD (Original) | Cost, $USD (Barcoded) |
|---|---|---|---|---|
| Transfections | 36 | 12 | 720 | 240 |
| RNA isolation and reverse transcription | 36 | 12 | 180 | 60 |
| Amplification | 72 | 2 | 216 | 6 |
| Bead Cleanup, Tapestation | 72 | 2 | 1080 | 30 |
| Tagmentation | 72 | 2 | 2160 | 60 |
| Bead Cleanup, Tapestation | 72 | 2 | 1080 | 30 |
| Sequencing | | | | |
| Total | | | 5436 | 426 |
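The totals in the table can be checked with a few lines of arithmetic. The sketch below sums the per-step costs exactly as listed and derives the fold reduction; the variable names are illustrative, and sequencing is left out because its row carries no values in this record.

```python
# Per-step costs in $USD as (original, barcoded), copied from the table.
# Sequencing is omitted because its row has no values in this record.
STEP_COSTS = [
    ("Transfections", 720, 240),
    ("RNA isolation and reverse transcription", 180, 60),
    ("Amplification", 216, 6),
    ("Bead Cleanup, Tapestation", 1080, 30),
    ("Tagmentation", 2160, 60),
    ("Bead Cleanup, Tapestation", 1080, 30),
]

original_total = sum(original for _, original, _ in STEP_COSTS)
barcoded_total = sum(barcoded for _, _, barcoded in STEP_COSTS)

# Ratio of original to barcoded cost, and the fraction of cost saved.
fold_reduction = original_total / barcoded_total
fraction_saved = 1 - barcoded_total / original_total

if __name__ == "__main__":
    print(f"Original total: ${original_total}")
    print(f"Barcoded total: ${barcoded_total}")
    print(f"Fold reduction: {fold_reduction:.2f}")
    print(f"Fraction saved: {fraction_saved:.1%}")
```

The sums reproduce the stated totals of $5436 and $426, a reduction of roughly 12.8-fold, or about 92% of the pre-sequencing cost.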
Figure 5. Comparison of barcoded and non-barcoded SRT calling cards in vivo in the mouse brain. (A) Equivalent amounts of brain tissue were collected after in vivo calling card experiments using a pool of 25 barcoded (BC) or non-barcoded (non-BC) SRT donors delivered by AAV. n = 4 for BC and 3 for non-BC. (B) Number of genomic insertions recovered for each barcoded SRT. (C) Number of genomic insertions recovered at the same depth of sequencing for barcoded and non-barcoded SRTs. (D) Browser view of genomic insertions and called peaks for barcoded and non-barcoded SRTs. (E) Genomic features of peaks called by barcoded and non-barcoded experiments. (F) KEGG pathway enrichment comparison of genes near peaks called by barcoded and non-barcoded experiments.