Jared P Steranka, Zuojian Tang, Mark Grivainis, Cheng Ran Lisa Huang, Lindsay M Payer, Fernanda O R Rego, Thiago Luiz Araujo Miller, Pedro A F Galante, Sitharam Ramaswami, Adriana Heguy, David Fenyö, Jef D Boeke, Kathleen H Burns.
Abstract
BACKGROUND: Transposable elements make up a significant portion of the human genome. Accurately locating these mobile DNAs is vital to understanding their role as a source of structural variation and somatic mutation. To this end, laboratories have developed strategies to selectively amplify or otherwise enrich transposable element insertion sites in genomic DNA.
Keywords: LINE-1; Next generation sequencing; Targeted PCR
Year: 2019 PMID: 30899333 PMCID: PMC6407172 DOI: 10.1186/s13100-019-0148-5
Source DB: PubMed Journal: Mob DNA
Fig. 1 Steps in the TIPseq protocol. a Steps in TIPseq are shown from top to bottom in a vertical flow chart. These include (i.) vectorette adapter annealing, (ii.) genomic DNA (gDNA) digestion, (iii.) vectorette adapter ligation, (iv.) vectorette touchdown PCR, (v.) PCR amplicon shearing, (vi.) sequencing library preparation, (vii.) Illumina sequencing, and (viii.) data analysis. The first seven of these steps are shown adjacent to schematic representations in part b, to the right. b Vectorette adapter annealing is shown first. Mismatched sequences within the hybridized vectorette oligonucleotides are illustrated in red and blue, and create a duplex structure with imperfect base pairing. The sticky end overhang on one strand of the vectorette (here, a 5′ overhang on the bottom strand) is drawn in gray. This overhang in the annealed vectorette complements sticky ends left by genomic DNA digest, and the digest and vectorette ligations are shown in the subsequent two steps. The black box within the gDNA fragment illustrates a LINE-1 element of interest (i.e., a species-specific L1Hs). Most gDNA fragments will not have a transposable element of interest, and thus cannot be amplified efficiently by the vectorette PCR. In vectorette PCR, the L1Hs primer begins first strand synthesis (1) and extends this strand through the ligated vectorette sequence. The reverse primer complements this first-strand copy of the vectorette (2) and the two primers participate in exponential amplification (3) of these fragments in subsequent cycles. c Amplicons are sheared, and conventional Illumina sequencing library preparation steps complete the protocol. Paired-end sequencing reads are required to perform data analysis with TIPseqHunter. d A diagram of read pile-ups demonstrates the deep coverage of the 3′ end of L1Hs elements.
For elements on the plus (+) strand with respect to the reference genome, the amplified sequences are downstream of the insertion site (i.e., covering genomic coordinates ascending from the transposon insertion). For minus (−) stranded insertions, sequences are recovered in the opposite direction.
Vectorette oligo and primer sequences
| Oligo | Sequence (5′ to 3′) |
|---|---|
| Enzyme Vectorette Oligo Sequences (5′ to 3′) | |
| AseI plus | TAGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG |
| BspHI plus | CATGGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG |
| BstYI plus | GATCGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG |
| HindIII plus | AGCTGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG |
| NcoI plus | CATGGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG |
| PstI minus | CTCTCCCTTCTCGGATCTTAACCGTTCGTACGAGAATCGCTGTCCTCTCCTTCTGCA |
| Common Vectorette Oligo Sequences (5′ to 3′) | |
| Vectorette minus | CTCTCCCTTCTCGGATCTTAACCGTTCGTACGAGAATCGCTGTCCTCTCCTTC |
| Vectorette plus | GAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG |
| Primer Sequences (5′ to 3′) | |
| L1 Primer | AGATATACCTAATGCTAGATGACACA |
| Vectorette Primer | CTCTCCCTTCTCGGATCTTAA |
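The relationship between the enzyme-specific and common oligos in the table can be checked mechanically. The sketch below is not part of the published protocol: the helper name `extra_bases` is hypothetical, and the sequences are simply transcribed from the table above. It assumes that each enzyme-specific oligo is a common vectorette strand plus a few extra bases, which would be consistent with the sticky-end overhang described in the Fig. 1 caption.

```python
# Sequences transcribed from the table above (5' to 3').
VECTORETTE_PLUS = "GAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG"
VECTORETTE_MINUS = "CTCTCCCTTCTCGGATCTTAACCGTTCGTACGAGAATCGCTGTCCTCTCCTTC"
VECTORETTE_PRIMER = "CTCTCCCTTCTCGGATCTTAA"

ENZYME_OLIGOS = {
    "AseI plus": "TAGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG",
    "BspHI plus": "CATGGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG",
    "BstYI plus": "GATCGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG",
    "HindIII plus": "AGCTGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG",
    "NcoI plus": "CATGGAAGGAGAGGACGCTGTCTGTCGAAGGTAAGGAACGGACGAGAGAAGGGAGAG",
    "PstI minus": "CTCTCCCTTCTCGGATCTTAACCGTTCGTACGAGAATCGCTGTCCTCTCCTTCTGCA",
}


def extra_bases(name: str, oligo: str) -> str:
    """Return the bases an enzyme-specific oligo adds to the common strand.

    Hypothetical helper: 'plus' oligos are expected to extend the common
    plus strand at the 5' end, 'minus' oligos to extend the common minus
    strand at the 3' end.
    """
    if name.endswith("plus"):
        if not oligo.endswith(VECTORETTE_PLUS):
            raise ValueError(f"{name} does not end with the common plus strand")
        return oligo[: len(oligo) - len(VECTORETTE_PLUS)]
    if not oligo.startswith(VECTORETTE_MINUS):
        raise ValueError(f"{name} does not start with the common minus strand")
    return oligo[len(VECTORETTE_MINUS):]


overhangs = {name: extra_bases(name, seq) for name, seq in ENZYME_OLIGOS.items()}

# The vectorette primer should be identical to the 5' end of the minus strand.
primer_matches_minus = VECTORETTE_MINUS.startswith(VECTORETTE_PRIMER)

for name, bases in overhangs.items():
    print(f"{name}: extra bases {bases}")
print("Primer matches 5' end of vectorette minus:", primer_matches_minus)
```

A check like this is mainly useful for catching transcription errors before ordering oligos; it says nothing about annealing efficiency or ligation performance.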
Fig. 2 Schematic of LINE-1 and read alignments. a Diagrams of example LINE-1 insertion types are shown: full length, 5′ truncated, 5′ truncated with inversion, and 5′ truncated with 3′ transduction. TIPseq is able to detect these types of insertions. The full length LINE-1 element includes 5′ and 3′ UTRs, including a 3′ polyA tail, all colored in light blue. The specific L1 primer binding site is shown as a black arrow in the 3′ UTR. The open reading frames (ORF1 and ORF2) are shown in two darker shades of blue. Flanking genomic DNA is shown as gray lines with target site duplications (TSDs) as black lines. The gold line represents a transduced region of gDNA. Arrows underneath each diagram illustrate the orientation of the sequence. b The types of reads that TIPseq generates are shown in the top of the diagram with a TranspoScope image capture below. Reads containing only LINE-1 sequence are colored blue. Junction reads, which contain both L1 and unique genomic DNA, are colored orange. Uniquely mapped genomic DNA reads are shown in gray, purple, and green. Gray reads are genome reads in genome-genome pairs. Purple reads are genome mates in genome-L1 pairs. Green reads are genome reads with an unmapped or discordant pair. TranspoScope displays the read counts and positions for specific L1 insertions detected by TIPseq. The L1 insertion site is shown as a vertical blue line, and downstream restriction enzyme cut sites used in TIPseq are shown as gray triangles with vertical red lines.
Vectorette PCR thermal cycler program
| Temperature | Time | Cycles |
|---|---|---|
| 95 °C | 5 min | |
| 95 °C | 1 min | 5 cycles |
| 72 °C | 1 min | |
| 72 °C | 5 min | |
| 95 °C | 1 min | 5 cycles |
| 68 °C | 1 min | |
| 72 °C | 5 min | |
| 95 °C | 45 s | 15 cycles |
| 64 °C | 1 min | |
| 72 °C | 5 min | |
| 95 °C | 45 s | 15 cycles |
| 60 °C | 1 min | |
| 72 °C | 5 min | |
| 72 °C | 15 min | |
| 16 °C | Hold | |
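To make the touchdown structure of this program explicit, the table can be written out as data and summed. This is a minimal sketch under one stated assumption: each "N cycles" label is taken to apply to the three rows starting at the labelled row (denature, anneal, extend). The name `PROGRAM` and the grouping are illustrative, and ramp times and the final 16 °C hold are ignored.

```python
# Each entry: (number of repeats, [(temperature in °C, duration in seconds), ...]).
# Grouping of rows into cycles is an assumption about the flattened table.
PROGRAM = [
    (1, [(95, 300)]),                         # initial denaturation
    (5, [(95, 60), (72, 60), (72, 300)]),     # touchdown stage 1
    (5, [(95, 60), (68, 60), (72, 300)]),     # touchdown stage 2
    (15, [(95, 45), (64, 60), (72, 300)]),    # touchdown stage 3
    (15, [(95, 45), (60, 60), (72, 300)]),    # touchdown stage 4
    (1, [(72, 900)]),                         # final extension
]

# Programmed time only; real run time is longer because of ramping.
total_seconds = sum(
    repeats * sum(seconds for _, seconds in steps) for repeats, steps in PROGRAM
)

# Cycling stages are the entries with three steps; the second step is annealing.
cycling_stages = [(repeats, steps) for repeats, steps in PROGRAM if len(steps) == 3]
total_cycles = sum(repeats for repeats, _ in cycling_stages)
annealing_temps = [steps[1][0] for _, steps in cycling_stages]

print(f"Programmed time: {total_seconds / 60:.1f} min")
print(f"Amplification cycles: {total_cycles}")
print(f"Annealing temperatures (°C): {annealing_temps}")
```

Read this way, the annealing temperature steps down by 4 °C per stage, and the programmed time comes to just under five hours before ramping is taken into account.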
Data analysis using TIPseqHunter (Timing: variable)
| TIPseqHunter uses genome assembly GRCh37 (hg19) and can be run with a Docker image or by using individual programs. |
Validation of insertions through spanning PCR and Sanger sequencing (Timing: variable)
| 1. Design flanking primers around L1 insertion site. |
Fig. 3 Approaches to PCR validation of insertions. a Agarose gel electrophoresis of a somatic PCR validation. Three lanes are shown: [L] 2-log ladder (NEB), [N] normal DNA, [T] tumor DNA. An upper band marked by a black arrow is present in the tumor but absent in the normal sample, which confirms that a somatic L1 insertion occurred in the tumor. b Agarose gel of two L1 3′ PCR validations. Five lanes are shown: [L] 2-log ladder (NEB), [F1] forward primer with L1 primer for insertion on 2p16.3, [R1] reverse primer with L1 primer for insertion on 2p16.3, [F2] forward primer with L1 primer for insertion on 9q21.31, [R2] reverse primer with L1 primer for insertion on 9q21.31. For both insertions, only the reverse primer produces a band when paired with the L1 primer, which suggests that both are plus strand insertions. All specific primers were designed approximately 200 bp away from the insertion site. Because the L1 primer is located 150 bp away from the 3′ end of the element, the expected product size for both reactions is approximately 350 bp, marked with a gray arrow. The PCR reaction for the 9q21.31 insertion produces a band larger than expected, marked with a black arrow. This suggests that a 3′ transduction may have taken place, which can be confirmed by Sanger sequencing of the PCR product. c The illustration shows the relative positions of primers and products for the two L1 insertions from part b. The 9q21.31 insertion in the lower diagram has a 3′ transduction shown as a gold line.
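The expected band size quoted in the caption is a simple sum, sketched below. The function name `expected_product_bp` and its parameters are hypothetical, and the estimate deliberately ignores the polyA tail length and any target site duplication, so it should be read as approximate.

```python
def expected_product_bp(
    flank_primer_distance_bp: int,
    l1_primer_to_3prime_end_bp: int,
    transduction_bp: int = 0,
) -> int:
    """Approximate L1-specific 3' PCR product size.

    Sums the distance of the genomic primer from the insertion site, the
    distance of the L1 primer from the 3' end of the element, and any
    transduced sequence between them. PolyA tail length is not modeled.
    """
    return flank_primer_distance_bp + l1_primer_to_3prime_end_bp + transduction_bp


# Values from the caption: primers ~200 bp from the insertion site,
# L1 primer 150 bp from the 3' end of the element.
expected = expected_product_bp(200, 150)
print(f"Expected product: ~{expected} bp")
```

A band noticeably larger than this estimate, as for the 9q21.31 insertion, is what prompts the follow-up Sanger sequencing to look for a 3′ transduction.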
Validation of insertions and identification of 3′ transduction events through L1-specific 3′ PCR and Sanger sequencing (Timing: variable)
| 1. Design flanking primers around L1 insertion site. |
Digestion master mix
| Digestion master mix | Volume (μL), 1x | Volume (μL), 4x |
|---|---|---|
| Molecular grade H2O | 2.25 | 9.0 |
| 10x Restriction enzyme buffer | 2.5 | 10 |
| Restriction enzyme | 1.0 | 4.0 |
| RNase cocktail enzyme mix | 0.25 | 1.0 |
Ligation master mix
| Ligation master mix | Volume (μL), 1x | Volume (μL), 8x |
|---|---|---|
| 10 mM ATP | 2.5 | 20 |
| 10x T4 Ligase buffer | 0.5 | 4.0 |
| T4 Ligase (400 U/μL) | 0.2 | 1.6 |
PCR master mix formulas
| PCR master mix | Volume (μL), 1x | Volume (μL), 8x |
|---|---|---|
| Molecular grade H2O | 32.55 | 260.4 |
| 10x Ex Taq buffer | 5.0 | 40 |
| dNTP mixture (2.5 mM each) | 4.0 | 32 |
| Specific L1 Primer (100 μM) | 0.1 | 0.8 |
| Vectorette Primer (100 μM) | 0.1 | 0.8 |
| Ex Taq HS polymerase | 0.25 | 2.0 |
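The multi-reaction columns in the three master mix tables are the 1x volumes multiplied by the number of reactions. The sketch below reproduces that scaling so that other reaction counts can be computed; `scale_mix` is a hypothetical helper, the 1x volumes are transcribed from the tables above, and no pipetting overage is added because the tables do not include one.

```python
# 1x volumes in microliters, transcribed from the master mix tables above.
MIXES = {
    "digestion": {
        "Molecular grade H2O": 2.25,
        "10x Restriction enzyme buffer": 2.5,
        "Restriction enzyme": 1.0,
        "RNase cocktail enzyme mix": 0.25,
    },
    "ligation": {
        "10 mM ATP": 2.5,
        "10x T4 Ligase buffer": 0.5,
        "T4 Ligase (400 U/uL)": 0.2,
    },
    "pcr": {
        "Molecular grade H2O": 32.55,
        "10x Ex Taq buffer": 5.0,
        "dNTP mixture (2.5 mM each)": 4.0,
        "Specific L1 Primer (100 uM)": 0.1,
        "Vectorette Primer (100 uM)": 0.1,
        "Ex Taq HS polymerase": 0.25,
    },
}


def scale_mix(mix: dict[str, float], n_reactions: int) -> dict[str, float]:
    """Scale 1x volumes to n reactions, rounded to 0.01 uL (no overage added)."""
    return {reagent: round(vol * n_reactions, 2) for reagent, vol in mix.items()}


digestion_4x = scale_mix(MIXES["digestion"], 4)
ligation_8x = scale_mix(MIXES["ligation"], 8)
pcr_8x = scale_mix(MIXES["pcr"], 8)

# Volume of master mix per PCR reaction (template is added separately).
pcr_mix_per_reaction = round(sum(MIXES["pcr"].values()), 2)

for reagent, vol in pcr_8x.items():
    print(f"{reagent}: {vol} uL")
print(f"PCR master mix per reaction: {pcr_mix_per_reaction} uL")
```

If extra volume is wanted to cover pipetting loss, one option is to pass a slightly larger reaction count (for example 9 for 8 reactions); that choice is a common lab habit, not something specified in the tables.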
Troubleshooting table
| Step | Problem | Possible reason | Solution |
|---|---|---|---|
| 20 | Low PCR yield | Poor vectorette adapter annealing or ligation | Anneal fresh vectorette adapters and repeat procedure |
| 20 | Low PCR yield | Low starting gDNA quality/quantity | Increase the initial amount of starting gDNA, or isolate fresh gDNA |
| 21 | Very high molecular weight smear | Vectorette-Primer concatemer amplification | Digest 2 μg of vectorette PCR amplicons with BstYI and run the digest on a 1.5% agarose gel. An intense band around 50 bp indicates the presence of concatemers in the PCR product (see Additional file
| 27 | Library yield too low to sequence | DNA lost during library preparation or size-selection | Restart library preparation with more sheared DNA (0.5-1 μg) |
| 28 | Uneven sequencing output distribution | Uneven library pooling | Performing qPCR on prepared libraries with KAPA Library Quantification Kit prior to pooling may result in a more balanced sequencing output. |
| 30 | High number of overlapping read pairs | Small library fragments | Add a Pippin prep selection after pooling (step 28) to remove fragments under 400 bp. |
| Table | No L1 insertion band | Large/difficult L1 insertion | Use L1-specific 3’ PCR (see Table |