
Efficient acquisition of tens of thousands of short tandem repeats in single-cell whole-genome-amplified DNA.

Liming Tao, Zipora Marx, Ofir Raz, Ehud Shapiro.

Abstract

Short tandem repeats (STRs) are highly abundant in the human genome, but existing approaches for accurate genotyping of STRs are limited. Here, we describe a protocol for duplex molecular inversion probes for high-throughput and cost-effective STR enrichment. We have successfully tested panels targeting as many as 50K STRs in several thousands of genomic samples (e.g., HeLa cells, Du145 cells, leukemia cells, melanoma cells). However, because the protocol is plate based, the sample size is limited to a few thousand. For complete details on the use and execution of this protocol, please refer to Tao et al. (2021).
© 2021 The Authors.

Keywords:  Genomics; High Throughput Screening; Molecular Biology; Molecular/Chemical Probes; Sequencing; Single Cell

Year:  2021        PMID: 34585162      PMCID: PMC8452885          DOI: 10.1016/j.xpro.2021.100828

Source DB:  PubMed          Journal:  STAR Protoc        ISSN: 2666-1667


Before you begin

The protocol below describes the specific steps for using whole genome amplified genomic DNA (REPLI-g Mini Kit, Qiagen) from Du145 single cells with the 12K OM6 STR panel presented in our Cell Reports Methods paper (Tao et al., 2021) (Custom Array). However, we have also used this protocol with primary cells such as melanoma cells, leukemia cells, T cells, and macrophages, and with other whole genome amplification kits such as the REPLI-g Single Cell Kit, the Ampli1 WGA kit, and the MALBAC single cell WGA kit.

Duplex MIP preparation

Timing: [2 days]

Prepare the duplex molecular inversion probes for a 12K panel of selected human STRs, OM6, to enrich these targets from the single cell WGA DNA in the following steps.

KOD Hot Start Real Time Custom PCR Mix 5× (KOD 5× Custom Mix)
Prepare SYBR 100× by mixing 10 μL of stock SYBR Green I (Lonza, 10,000×) with 990 μL dimethyl sulfoxide (DMSO) (Sigma). Prepare 2 mL KOD 5× Custom Mix according to the KOD 5× custom mix table below.

PreAmp PCR (8 reactions)
Dilute the synthesized oligo pool (Custom Array, Inc.) to 1 ng/μL to prepare the PCR template. Amplification primers designed to bind the universal adapters are used for PreAmp PCR in a LightCycler 480 (LC480, Roche):
OM4_Mly_F: GTCTATGAGTGTGGAGTCGTTGC
OM4_Mly_R: CTAGCTTCCTGATGAGTCCGATG
The SYBR in the KOD 5× Custom Mix can be used to track the amplification in real time. Set up the PreAmp PCR mix and run the PreAmp PCR program according to the corresponding tables below. Purify the PreAmp PCR product with the MinElute PCR Purification Kit (Qiagen). Measure the concentration with the Qubit dsDNA HS Assay Kit (Life Technologies).

Production PCR (48 reactions). Troubleshooting 3
Dilute the purified PreAmp PCR product to 1 ng/μL for use as template. Perform the production PCR in a 96-well plate according to the production PCR mix and program tables below. Amplification is tracked by the SYBR present in the KOD 5× Custom Mix. Pool the PCR products and purify them on MinElute columns (Qiagen). Elute with 45 μL ddH2O per column. Pool all purified products. Measure the DNA concentration of the final pool by loading 1 μL of the pool onto a NanoDrop spectrophotometer (Thermo Scientific).

Dilute the pool to ∼30 ng/μL based on the measured concentration. Retain 20 μL of sample to evaluate size distribution in Step 6. Carry the rest forward in Step 4.

Digest the diluted DNA. Troubleshooting 4
Combine the diluted DNA with MlyI following the MlyI mix table below. Incubate the mixture at 37°C overnight, deactivate at 80°C for 20 min, and store at 4°C.

Prepare final duplex MIP pool.
Purify the digested DNA on MinElute columns. Pool the elution samples into one tube. Measure the concentration using the Qubit dsDNA HS (High Sensitivity) assay kit according to the manufacturer's protocol.

Perform quality control on digested product size distribution.
Run digested and undigested samples (Step 4b) on the Tape Station (Agilent). The final duplex MIP pool should be ∼105 bp, and the undigested sample from step (4b) should be ∼150 bp (Figure 1).
Figure 1

Duplex MIPs quality control

Based on a length of 105 bp and the measured concentration, the final duplex MIP pool is diluted to an 80 nM (80 fmol/μL) stock solution, equivalent to 5.8 ng/μL. Dilute further to 8 nM as the working solution. Store both stock and working solutions at −20°C.
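The conversion between molar and mass concentration, and the stock-to-working dilution, can be sketched as follows. This is a minimal illustration rather than part of the published protocol: it assumes the common approximation of about 660 g/mol per base pair of double-stranded DNA, and the function names are hypothetical. Under that assumption, 80 nM corresponds to roughly 5.5 ng/μL for a 105 bp duplex and roughly 5.8 ng/μL for a 110 bp duplex, so the exact figure depends on the length assumed.

```python
# Sketch only: assumes ~660 g/mol per base pair of dsDNA (a common approximation,
# not a value stated in the protocol).
BP_MASS_G_PER_MOL = 660.0


def nM_to_ng_per_ul(conc_nM, length_bp, bp_mass=BP_MASS_G_PER_MOL):
    """Convert a molar concentration in nM (equivalently fmol/uL) to ng/uL."""
    return conc_nM * length_bp * bp_mass / 1e6


def ng_per_ul_to_nM(conc_ng_per_ul, length_bp, bp_mass=BP_MASS_G_PER_MOL):
    """Convert a mass concentration in ng/uL to nM for dsDNA of the given length."""
    return conc_ng_per_ul * 1e6 / (length_bp * bp_mass)


def dilution_volumes(stock_conc, target_conc, final_volume_ul):
    """Apply C1*V1 = C2*V2 and return (stock volume, diluent volume) in uL."""
    if target_conc > stock_conc:
        raise ValueError("target concentration exceeds stock concentration")
    stock_ul = final_volume_ul * target_conc / stock_conc
    return stock_ul, final_volume_ul - stock_ul


# Example: mass concentration of an 80 nM stock of a 105 bp duplex,
# and volumes for 100 uL of 8 nM working solution made from it.
stock_ng_per_ul = nM_to_ng_per_ul(80, 105)
stock_ul, water_ul = dilution_volumes(80, 8, 100)
```

The same helpers apply to any later step that converts a Qubit reading and a Tape Station size into a molar concentration.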

Whole-genome-amplified genomic DNA preparation

Timing: [15 min]

Single-cell WGA DNA is prepared in advance with the selected kit. Here we describe only the thawing of the single cell WGA DNA for the following step.

Clean the bench with 70% ethanol. Take a plate of whole genome amplified genomic DNA out of the −20°C freezer. Thaw at room temperature. Shake on a benchtop mixer, then spin down briefly (approximately 30 s) at 500 rpm.

CRITICAL: Keep the plate well sealed to avoid cross contamination.

Key resources table

Step-by-step method details

STR target enrichment

Timing: [2 days]

In this step, we enrich all the designed targets from every single cell WGA DNA in 96-well plates.

Hybridization
Make Hybridization Mix with 200–500 ng of single cell WGA DNA (∼2 μL) per reaction. Note that the single cell WGA product concentration is generally 100–200 ng/μL in our hands. For large scale experiments, prepare Hybridization Master Mix without WGA DNA according to the hybridization mix table below. Distribute 8 μL Hybridization Master Mix per well of a 96-well plate. Add 2 μL DNA or ddH2O to each well and mix by liquid handling system (Evoware, Tecan) or manually. Place the reaction plate into a PCR machine with a 100°C lid temperature. Heat at 98°C for 3 min and ramp the temperature down at 0.01°C per second to 56°C. Then incubate at 56°C for 17 h. An example program for our PCR machine is shown in the table below.

Gap filling
Prepare Gap Filling Mix half an hour before hybridization finishes (see the gap filling mix table below). Keep the mix at 56°C on a heat block. Transfer the reaction plate from the PCR machine to a 56°C heat block when the hybridization step is finished. Add 10 μL of Gap Filling Mix to each well, mix carefully by pipette, seal tightly, and quickly return the plate to the PCR machine. Incubate at 56°C for 4 h, deactivate for 20 min at 68°C, then keep at 4°C until the next step.

Pause point: After the gap filling step, the reaction plate can be stored in a 4°C fridge for up to two days.

Digestion of linear DNA
Prepare Digestion Mix 15 min before gap filling ends. Retrieve the reaction plate from the PCR machine. Note: take care when removing the cover. Add 2 μL of the Digestion Mix to each well and mix. Spin down the reaction plate and seal. Incubate at 37°C for 60 min, 80°C for 10 min, and 95°C for 5 min.

Pause point: The reactions can be stored at −20°C for at least 2 months after the digestion step.

CRITICAL: Seal the plate tightly to avoid evaporation.
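For plate-scale experiments, the per-reaction volumes of the Hybridization Master Mix (without WGA DNA) can be scaled up as sketched below. The per-reaction volumes are taken from the hybridization mix table of this protocol; the 10% pipetting overage and the function name are assumptions added for illustration and are not values from the protocol.

```python
# Per-reaction volumes (uL) of the Hybridization Master Mix without WGA DNA,
# copied from the protocol's hybridization mix table (8 uL per well in total).
HYB_MASTER_MIX_UL = {
    "Duplex MIPs (8 fmol/uL)": 1.0,
    "Ampligase Buffer (10x)": 1.0,
    "Betaine (5 M)": 1.8,
    "ddH2O": 4.2,
}


def scale_master_mix(per_reaction_ul, n_reactions, overage=0.10):
    """Scale per-reaction volumes to n_reactions plus a fractional overage.

    The default 10% overage is an assumed allowance for pipetting loss.
    """
    if n_reactions <= 0:
        raise ValueError("n_reactions must be positive")
    if overage < 0:
        raise ValueError("overage must not be negative")
    factor = n_reactions * (1.0 + overage)
    return {reagent: round(volume * factor, 2)
            for reagent, volume in per_reaction_ul.items()}


# Example: master mix for one full 96-well plate.
plate_mix = scale_master_mix(HYB_MASTER_MIX_UL, n_reactions=96)
```

The same function can be reused for the gap filling and digestion mixes by passing their per-reaction volumes instead.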

Library preparation and sequencing

Timing: [4 days]

Illumina sequencing adapters and a unique barcode per cell are added by a barcoding PCR. All samples are then pooled into one tube, first in equal volume and then in equal molecular concentration. The pools are size selected by Blue Pippin to remove dimers and by-products. Library pools that pass quality control are sequenced on MiSeq or NextSeq with the default Illumina sequencing primers.

Sample specific barcoding PCR
Note the structure of the dual-index Illumina barcoding primers used in the experiments:
i5-index-primer: AATGATACGGCGACCACCGAGATCTACAC[i5-8bp-index]ACACTCTTTCCCTACACGACGCTCTTCCG
i7-index-primer: CAAGCAGAAGACGGCATACGAGAT[i7-8bp-index]GTGACTGGAGTTCAGACGTGTGCTCTTCCG
Amplify 2 μL of product from the previous step (step 3) with a pair of unique barcoding primers for each sample, using the barcoding PCR mix and the Barcoding PCR program in the tables below.

Sample pooling and purification for diagnostic sequencing
Clean up the barcoded PCR products in a 96-well plate using 0.8× AMPure XP SPRI magnetic beads (Beckman Coulter) according to the manufacturer's manual on a Tecan liquid handling system, and elute in 40 μL ddH2O. Pool equal volumes (usually 2 μL) of purified samples manually. Concentrate the pool by MinElute according to the manufacturer's instructions and elute with 35 μL ddH2O.

Size selection for diagnostic sequencing
Retain 3 μL of the concentrated pool for quality control in step 5. Run 30 μL of the concentrated pool on a Lonza 2% V1 cassette BluePippin (Sage Science) with the range set to 240–340 bp according to the manufacturer's protocol. Agarose gel extraction in the range of 240–340 bp can serve as an alternative. Purify the size-selected elution by MinElute and elute with 15 μL ddH2O.

Measure the concentration with the Qubit dsDNA HS (High Sensitivity) assay kit. Troubleshooting 1

Inspect the size distribution of the concentrated pool before and after size selection using a Tape Station dsDNA chip. Figure 2, reused from panel 1 of Supplementary Figure 1 in our Cell Reports Methods paper (Tao et al., 2021), confirms a single peak around 300 bp. Troubleshooting 2
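The dual-index primer structure described above can be illustrated with a short sketch that inserts 8 bp indexes into the fixed primer sequences and assigns one unique i5/i7 pair per sample. The fixed sequences are copied from this protocol; the index sequences, sample names, and function names are hypothetical placeholders, not validated Illumina indexes or the authors' barcode set.

```python
from itertools import product

# Fixed primer parts, copied from the primer structures given in the protocol.
I5_PREFIX = "AATGATACGGCGACCACCGAGATCTACAC"
I5_SUFFIX = "ACACTCTTTCCCTACACGACGCTCTTCCG"
I7_PREFIX = "CAAGCAGAAGACGGCATACGAGAT"
I7_SUFFIX = "GTGACTGGAGTTCAGACGTGTGCTCTTCCG"


def build_primer(prefix, index, suffix):
    """Insert an 8 bp index between the fixed 5' and 3' primer parts."""
    index = index.upper()
    if len(index) != 8 or set(index) - set("ACGT"):
        raise ValueError(f"index must be 8 bases of A/C/G/T: {index!r}")
    return prefix + index + suffix


def assign_index_pairs(sample_names, i5_indexes, i7_indexes):
    """Give every sample a unique (i5 primer, i7 primer) combination."""
    pairs = list(product(i5_indexes, i7_indexes))
    if len(sample_names) > len(pairs):
        raise ValueError("not enough index combinations for all samples")
    return {
        name: (build_primer(I5_PREFIX, i5, I5_SUFFIX),
               build_primer(I7_PREFIX, i7, I7_SUFFIX))
        for name, (i5, i7) in zip(sample_names, pairs)
    }


# Placeholder indexes for illustration only.
example_i5 = ["AAACCCGG", "TTTGGGCC"]
example_i7 = ["ACGTACGT", "TGCATGCA", "GATCGATC"]
example_samples = [f"cell_{i}" for i in range(1, 7)]
primers = assign_index_pairs(example_samples, example_i5, example_i7)
```

With 8 i5 and 12 i7 indexes, the same combinatorial scheme would cover one 96-well plate.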
Figure 2

Quality control of sequencing library

Dilute the size-selected pool to make 12 μL of 4 nM (4 fmol/μL) library for Illumina NGS, calculated from the concentration and the average size reported by the Tape Station.

Diagnostic sequencing (∼17 h for sequencing, ∼2 h for analysis). Troubleshooting 5

Sequence at 10 pM loading concentration. We recommend running a 300-cycle MiSeq Nano flow cell in paired-end mode. Set Read1 and Read2 to 151, and both Index1 and Index2 reads to 8. The minimum read length we have tested is 125 × 2 paired-end, which allows sequencing through the repeat regions of most STRs in our design. The default sequencing primers suffice.

Following bcl2fastq demultiplexing, merge overlapping Read1 and Read2 with the following command:

>pear -v 40 -m 300 -f fastq1 -r fastq2 -o pear_files_prefix

Map the merged reads with bowtie2 against a customized STR reference of all amplicons (as shown in Figure 3), in which each amplicon appears multiple times, once with every possible STR length.
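The idea of a reference in which every amplicon appears once per possible STR length can be sketched as below. This is an illustrative assumption about how such a reference might be assembled, not the authors' implementation; the locus name, flanking sequences, repeat range, and record naming scheme are all hypothetical.

```python
def str_reference_records(locus, left_flank, unit, right_flank,
                          min_repeats, max_repeats):
    """Yield (name, sequence) pairs, one per repeat count of a single STR locus."""
    if min_repeats < 1 or max_repeats < min_repeats:
        raise ValueError("invalid repeat range")
    for n in range(min_repeats, max_repeats + 1):
        # Hypothetical naming scheme: locus, repeat unit, and repeat count.
        yield f"{locus}_{unit}x{n}", left_flank + unit * n + right_flank


def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text with wrapped sequence lines."""
    lines = []
    for name, seq in records:
        lines.append(">" + name)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


# Made-up locus: an AC repeat with 5 to 8 units between two short flanks.
records = list(str_reference_records(
    "locus1", "ACGTTGCA", "AC", "TGCATCGA", min_repeats=5, max_repeats=8))
fasta_text = to_fasta(records)
```

A FASTA file built this way for all loci would then need to be indexed (for example with bowtie2-build) before mapping, so that the best-matching record for a read indicates both its locus and its repeat length.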
Figure 3

MS reads mapping

Each read is mapped to a specific target locus according to its flanking regions.

>bowtie2 -x index_files_prefix -U merged_fastq | samtools view -bS - | samtools sort -o sorted_assignment_bam

For more details, parallel execution, and integration into the clineage analysis system, please see the code at: https://github.com/shapirolab/clineage/blob/master/sequencing/analysis/full_msv/full_msv.py

Extract the total number of reads per sample from “sorted_assignment_bam” with pysam.

Balancing reads per sample
Calculate the scaling volume for each sample based on the total number of reads extracted from the diagnostic sequencing result, to equalize the read coverage per sample. For example, if sample A got 500 reads and sample B got 1000 reads in the diagnostic sequencing, we can pool 2 μL of sample A with 1 μL of sample B to equalize the read coverage in the following production sequencing. According to the scaling volumes, pool the purified samples from step (5a) manually or by Echo550, then concentrate by MinElute and elute in 35 μL ddH2O. Prepare the production sequencing library for the pooled samples as in step (6).

Production sequencing (∼29 h for sequencing)
The minimum number of reads per sample is 1M, and the minimum read length is 125 × 2 paired-end. We recommend sequencing up to 200 samples on one NextSeq 500 high output flow cell with 151 × 2 paired-end run parameters, according to the manufacturer's manual and relying on the default sequencing primers. Set both Index1 and Index2 to 8. Load at 1.8–2.2 pM concentration. (Figure 3) If the production sequencing does not generate enough reads for some samples (i.e., at least 1M reads for samples enriched with the OM6 panel), another round of NextSeq can be run on the same library for these samples. Consider the HiSeq or NovaSeq platforms for large scale projects.
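The read-balancing calculation described above can be sketched as follows. This is a minimal illustration under stated assumptions: per-sample read counts are taken as already extracted from the diagnostic run, volumes are made inversely proportional to read counts, the sample with the most reads receives a minimum volume of 1 μL, and samples with zero reads are skipped. The function name and these choices are hypothetical, not taken from the authors' pipeline.

```python
def scaling_volumes(reads_per_sample, min_volume_ul=1.0):
    """Return pooling volumes (uL) inversely proportional to diagnostic read counts.

    The sample with the most reads gets min_volume_ul; samples with fewer reads
    get proportionally more. Samples with zero reads are left out (assumption).
    """
    usable = {name: reads for name, reads in reads_per_sample.items() if reads > 0}
    if not usable:
        raise ValueError("no sample has any reads")
    max_reads = max(usable.values())
    return {name: round(min_volume_ul * max_reads / reads, 2)
            for name, reads in usable.items()}


# The example from the text: sample A with 500 reads, sample B with 1000 reads.
volumes = scaling_volumes({"A": 500, "B": 1000})
```

In practice an upper limit on the volume per sample may also be needed, since a sample with very few diagnostic reads would otherwise require more volume than is available.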

Expected outcomes

We expect a precursor size of ∼150 bp and a probe size of ∼110 bp after digestion, as shown in Figure 1. The sequencing-ready library after size selection and purification should be ∼300 bp as detected by Tape Station, with no or minimal primer dimers at 170–240 bp; see Figure 2.

Limitations

Poor quality of whole genome amplified genomic DNA may prevent hybridization, gap fill, and full library preparation. The protocol is plate-based, so the sample size is limited to a few thousand.

Troubleshooting

Problem 1

After size selection by Blue Pippin, the DNA concentration of the sequencing library is too low to load on the Illumina sequencer. [Step 6d]

Potential solution

Increase the pooling volume per sample from 2 μL to 5 μL for the Blue Pippin loading pool. Keep the same elution volume (40 μL), so that more DNA is loaded on the Blue Pippin.

Problem 2

Primer dimers at 170–240 bp are still present at a significant ratio relative to the desired library peak around 300 bp in diagnostic libraries, as detected by Tape Station after size selection by Blue Pippin. [Step 6e]

Potential solution

Check the quality of the single cell WGA DNA by size and concentration, and make sure to use good quality WGA DNA for the majority of samples.

Problem 3

Significant by-products larger than 300 bp are detected by Tape Station in the probe production PCR. [Step 3]

Potential solution

Check the template concentration used in the production PCR and make sure it is diluted to 1 ng/μL; reduce the production PCR to 10 or 11 cycles.

Problem 4

A significant amount of undigested probe (∼150 bp) remains at the Tape Station quality control step. [Step 4]

Potential solution

Check the concentration of the input precursor again to make sure that a concentration of <30 ng/μL is used in the digestion reaction. With the same digestion settings, digest the probes again, purify by MinElute, and run quality control by Tape Station.

Problem 5

The Illumina sequencer reports low sequencing quality, including low passing-filter clusters and low Q30. [Step 7]

Potential solution

Consider the sequence complexity in both the amplicon region and the index region, especially when handling a small panel (<100 targets) or a small number of samples (<20). Spiking in 20% PhiX in such cases could help improve the overall sequencing quality.

Resource availability

Lead contact

Further information and requests for resources and reagents should be directed to and will be fulfilled by the lead contact: Ehud Shapiro: ehud.shapiro@weizmann.ac.il

Materials availability

This study did not generate new unique reagents.

Reagents | Stock conc. | Final conc. | KOD 5× custom mix (μL)
ddH2O | | | 0.27
KOD Buffer 10× (Merck) | 10× | 5× | 2.5
MgSO4 25 mM (Merck) | 25 mM | 7.5 mM | 1.5
dNTP 25 mM each (Bioline) | 25 mM | 7.5 mM | 0.2
KOD Enzyme 1 U/μL (Merck) | 1 U/μL | 0.1 U/μL | 0.5
SYBR 100× (Lonza) | 100× | 1× | 0.025
Total Volume | | | 5

Reagents | Stock conc. | Final conc. | 1× PreAmp PCR mix (μL)
Template | 1 ng/μL | 0.2 ng/μL | 1.8
OM4_Mly_F primer | 10 pmol/μL | 0.3 pmol/μL | 1.35
OM4_Mly_R primer | 10 pmol/μL | 0.3 pmol/μL | 1.35
KOD 5× Custom Mix | 5× | 1× | 9
ddH2O | | | 31.5
Total Volume | | | 45

PCR cycling conditions (PreAmp PCR)
Steps | Temperature | Time | Cycles
Initial Denaturation | 95°C | 120 s | 1
Denaturation | 95°C | 20 s | 18 cycles
Annealing | 60°C | 10 s | 18 cycles
Extension | 70°C | 5 s | 18 cycles
Final extension | 70°C | 50 s | 1
Hold | 4°C | Forever |

Reagents | Stock conc. | Final conc. | 1× production PCR (μL)
Template | 1 ng/μL | 0.2 ng/μL | 1.8
OM4_Mly_F primer | 10 pmol/μL | 0.3 pmol/μL | 1.35
OM4_Mly_R primer | 10 pmol/μL | 0.3 pmol/μL | 1.35
KOD 5× Custom Mix | 5× | 1× | 9
ddH2O | | | 31.5
Total Volume | | | 45

PCR cycling conditions (production PCR)
Steps | Temperature | Time | Cycles
Initial Denaturation | 95°C | 120 s | 1
Denaturation | 95°C | 20 s | 12 cycles
Annealing | 60°C | 10 s | 12 cycles
Extension | 70°C | 5 s | 12 cycles
Final extension | 70°C | 50 s | 1
Hold | 4°C | Forever |

Reagents | Stock conc. | Final conc. | 1× with MlyI mix (μL)
Diluted DNA (30 ng/μL) | 30 ng/μL | 25.2 ng/μL | 84
10× NEB Smarter Buffer | 10× | 1× | 10
MlyI | 10 U/μL | 0.6 U/μL | 6
Total Volume | | | 100

REAGENT or RESOURCE | SOURCE | IDENTIFIER

Chemicals, peptides, and recombinant proteins
Betaine solution | Sigma | Cat#5MB0306 1VL
KOD enzyme | Merck | Cat#71086
dNTP Set | Bioline | Cat#BIO-39049
SYBR 100× | Lonza | Cat#50513
Phusion High-Fidelity DNA Polymerase | NEB | Cat#NEB-M0530L
Ampligase 10× Reaction Buffer | Epicentre | Cat#A1905B
Ampligase DNA Ligase W/O Buffer | Epicentre | Cat#A3210K
Exonuclease I (E. coli) | NEB | Cat#M0293L
Exonuclease III (E. coli) | NEB | Cat#M0206L
RecJf | NEB | Cat#M0264L
Exonuclease T | NEB | Cat#M0265L
T7 Exonuclease | NEB | Cat#M0263L
Lambda Exonuclease | NEB | Cat#M0262L
NEBNext Ultra II Q5 MasterMix | NEB | Cat#M0544L
MinElute PCR Purification Kit | QIAGEN | Cat#28006
Qubit® dsDNA HS Assay Kit | Thermo Fisher | Cat#Q32854
Agencourt Ampure XP Beads | Beckman Coulter | Cat#A63881
2% Agarose, dye-free, BluePippin, 100–600 | Sage | Cat#BDF2010
TapeStation ScreenTape | Agilent | Cat#5067-5582
TapeStation Reagents | Agilent | Cat#5067-5583
MiSeq Reagent Kits v2 | Illumina | Cat#MS-102-2002
MiSeq Reagent Nano Kit v2 (300-cycles) | Illumina | Cat#MS-103-1001
NextSeq 500/550 High Output Kit v2.5 (300 Cycles) | Illumina | Cat#20024908

Deposited data
Sequencing data | ArrayExpress | E-MTAB-6411

Experimental models: cell lines
DU145 cell line | ATCC | DU 145 ATCC® HTB-81™

Oligonucleotides
Oligopool | GenScript | OM6 (Tao et al., 2021)

Reagents | Stock conc. | Final conc. | 1× hybridization mix (μL)
Single Cell WGA DNA | 100 ng/μL | 20 ng/μL | 2
Duplex MIPs | 8 fmol/μL | 0.8 fmol/μL | 1
Ampligase Buffer | 10× | 1× | 1
Betaine | 5 M | 0.9 M | 1.8
ddH2O | | | 4.2
Total Volume | | | 10

Hybridization program
Step | Temperature | Time | Cycles
1 | 97.9°C | 3 min |
2 | 97.9°C (decrease as slow as 0.1°C/sec; decrease by 0.1°C/sec every cycle) | 15 s | ×420
3 | 56°C | 17 h |
4 | 56°C | Pause for adding gap filling mix |

Reagents | Stock conc. | Final conc. | 1× gap filling mix (μL)
dNTP | 2 mM | 0.3 mM | 1.5
NAD | 10 mM | 2 mM | 2
Betaine | 5 M | 1.1 M | 2.2
Ampligase buffer | 10× | 1× | 1
Ampligase | 5 U/μL | 0.5 U/μL | 1
Phusion | 2 U/μL | 0.8 U/μL | 0.4
ddH2O | | | 1.9
Total Volume | | | 10

Reagents | Stock conc. (U/μL) | Final conc. (U/μL) | 1× digestion mix (μL)
exo I | 20 | 3.5 | 0.175
exo III | 100 | 18 | 0.18
exo T7 | 10 | 4 | 0.4
exo T | 5 | 0.4 | 0.08
RecJf | 30 | 3 | 0.1
lambda exo | 10 | 0.2 | 0.02
ddH2O | | | 1.045
Total Volume | | | 2

Reagents | Stock conc. | Final conc. | 1× (μL)
Template | NA | NA | 2
dual-index Illumina primers | 5 pmol/μL each | 0.5 pmol/μL each | 2
NEBNext Ultra II Q5 Master Mix | 2× | 1× | 10
SYBR 100× | 10× | 0.5× | 1
ddH2O | | | 5
Total Volume | | | 20

Barcoding PCR program

Temperature | Time | Cycles
98°C | 30 s |
98°C | 10 s | ×5 cycles
56°C | 30 s | ×5 cycles
65°C | 45 s | ×5 cycles
98°C | 10 s | ×15 cycles
65°C | 75 s | ×15 cycles
65°C | 5 min |
4°C | Hold |

1.  Retrospective cell lineage reconstruction in humans by using short tandem repeats.

Authors:  Liming Tao; Ofir Raz; Zipora Marx; Manjusha S Ghosh; Sandra Huber; Julia Greindl-Junghans; Tamir Biezuner; Shiran Amir; Lilach Milo; Rivka Adar; Ron Levy; Amos Onn; Noa Chapal-Ilani; Veronika Berman; Asaf Ben Arie; Guy Rom; Barak Oron; Ruth Halaban; Zbigniew T Czyz; Melanie Werner-Klein; Christoph A Klein; Ehud Shapiro
Journal:  Cell Rep Methods       Date:  2021-07-26