Travis C Glenn1,2,3,4,5, Roger A Nilsen4,6, Troy J Kieran1, Jon G Sanders7,8, Natalia J Bayona-Vásquez1, John W Finger1,2,9, Todd W Pierson1,10, Kerin E Bentley3,11, Sandra L Hoffberg3,12, Swarnali Louha5, Francisco J Garcia-De Leon13, Miguel Angel Del Rio Portilla14, Kurt D Reed15, Jennifer L Anderson16, Jennifer K Meece16, Samuel E Aggrey5,17, Romdhane Rekaya5,18, Magdy Alabady4,19, Myriam Belanger4,20, Kevin Winker21, Brant C Faircloth22.
Abstract
Massively parallel DNA sequencing offers many benefits, but major inhibitory cost factors include: (1) start-up (i.e., purchasing initial reagents and equipment); (2) buy-in (i.e., getting the smallest possible amount of data from a run); and (3) sample preparation. Reducing sample preparation costs is commonly addressed, but start-up and buy-in costs are rarely addressed. We present dual-indexing systems to address all three of these issues. By breaking the library construction process into universal, re-usable, combinatorial components, we reduce all costs, while increasing the number of samples and the variety of library types that can be combined within runs. We accomplish this by extending the Illumina TruSeq dual-indexing approach to 768 (384 + 384) indexed primers that produce 384 unique dual-indexes or 147,456 (384 × 384) unique combinations. We maintain eight nucleotide indexes, with many that are compatible with Illumina index sequences. We synthesized these indexing primers, purifying them with only standard desalting and placing small aliquots in replicate plates. In qPCR validation tests, 206 of 208 primers tested passed (99% success). We then created hundreds of libraries in various scenarios. Our approach reduces start-up and per-sample costs by requiring only one universal adapter that works with indexed PCR primers to uniquely identify samples. Our approach reduces buy-in costs because: (1) relatively few oligonucleotides are needed to produce a large number of indexed libraries; and (2) the large number of possible primers allows researchers to use unique primer sets for different projects, which facilitates pooling of samples during sequencing. Our libraries make use of standard Illumina sequencing primers and index sequence length and are demultiplexed with standard Illumina software, thereby minimizing customization headaches. 
In subsequent Adapterama papers, we use these same primers with different adapter stubs to construct amplicon and restriction-site associated DNA libraries, but their use can be expanded to any type of library sequenced on Illumina platforms. ©2019 Glenn et al.
Keywords: Adapters; Illumina; Multiplexing; Next Generation Sequencing; NovaSeq; Pooling; Primers; Sample Preparation
Year: 2019 PMID: 31616586 PMCID: PMC6791352 DOI: 10.7717/peerj.7755
Source DB: PubMed Journal: PeerJ ISSN: 2167-8359 Impact factor: 2.984
Figure 1. iTru library preparation method overview.
Sheared DNA from the organism of interest (black) is used as input for the iTru library preparation process. The input DNA is end-repaired and a single adenosine (A) overhang (not shown) is added to each 3′ end (see Figs. 2, 3 for details). Y-yoke adapter stubs, which have annealed complementary regions (orange) of the Read 1 (R1, purple) and Read 2 (R2, red) adapters and a 3′ thymidine (T) overhang (not shown), and which are phosphorylated (indicated with a “P” at the 5′ position), are ligated to the genomic DNA. During limited-cycle PCR, iTru5 and iTru7 primers anneal to the ends of the Y-yoke adapters and are extended to produce full-length, dual-indexed molecules (see Fig. S3 for details of PCR) that are fully functional for sequencing on Illumina instruments. The P5 (maroon) and P7 (yellow-green) regions on the molecule are complementary to oligonucleotides present on Illumina flow cells, allowing for hybridization and clonal amplification. The i5 (green) and i7 (light blue) indexes can be used for multiplexing. The R1 and R2 primer-binding sites are complementary to the sequencing primers, enabling sequencing of the library molecules on the flow cell. The R1 and R2 primer-binding sites also contain regions with identical sequence (shown in orange) that are used to form the Y-yoke adapters. Thus, the full R1 and R2 sequences include the regions in orange (see Fig. 2).
Comparison of oligonucleotide numbers and costs when using varying numbers of independent tags.
Cost estimates assume 2-stage library preparations and list prices from Integrated DNA Technologies, 25 nmole synthesis scale, with oligonucleotides delivered in plates. An index length of 8 nucleotides is used with an edit distance ≥3 for iTru and an edit distance ≥2 for Illumina.
| | | | | | | Adapter cost + Primer cost (US $) |
|---|---|---|---|---|---|---|
| 96 | TruSeq | 1 | 0 | 1 + 96 | 0 [2] | $4,019 + $18 |
| 96 | TruSeq Nano HT | 2 | 0 | 8 + 12 | 0 | $4,560 |
| 96 | iTru | 1 | 2 | 0 | 1 + 96 | $45 + $1,617 |
| 96 | iTru | 2 | 2 | 0 | 8 + 12 | $45 + $344 |
| 384 | TruSeq | 1 | 0 | 1 + 384 | 0 [2] | $16,029 + $18 |
| 384 | iTru | 1 | 2 | 0 | 1 + 384 | $45 + $6,416 |
| 384 | iTru | 2 | 2 | 0 | 16 + 24 | $45 + $689 |
| 9216 | TruSeq | 1 | 0 | 1 + 9216 | 0 [2] | $392,049 + $18 |
| 9216 | iTru | 1 | 2 | 0 | 1 + 9216 | $45 + $153,539 |
| 9216 | iTru | 2 | 2 | 0 | 96 + 96 | $45 + $3,333 |
| 147,456 | iTru | 2 | 2 | 0 | 384 + 384 | $45 + $13,332 |
Notes.
Original TruSeq approach with custom adapters (cf. Faircloth & Glenn, 2012); kits are no longer available, but the method can be home-brewed (cf. Fisher et al., 2010), or the adapters can be used with reagents from TruSeq Nano kits.
P5 and P7 primers are used.
Price includes all library preparation reagents, not just adapters; P5 and P7 primers are included in kit.
Libraries contain both i5 and i7 tags, but only one iTru5 primer is used for all samples; thus, only the i7 tags are informative and are sequenced (cost efficient with older kits for the HiSeq 2500 and earlier instruments). This method is no longer recommended, but it illustrates the cost differences.
Both the i5 and i7 indexes are informative and are sequenced.
Tags of 11 nucleotides are required for 9216 tags of edit distance ≥3.
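The oligonucleotide counts in the table follow from simple combinatorics: when both indexes are informative, n iTru5 primers and m iTru7 primers require n + m oligonucleotides but distinguish n × m samples. The following is a minimal Python sketch of that arithmetic; the function names are illustrative and do not come from the paper.

```python
def combinatorial_capacity(n_i5: int, n_i7: int) -> int:
    """Number of distinct i5/i7 pairings available from two primer sets."""
    return n_i5 * n_i7


def primers_needed(n_i5: int, n_i7: int) -> int:
    """Total number of indexed primers that must be synthesized."""
    return n_i5 + n_i7


# Primer-set sizes taken from the dual-index iTru rows of the table.
for n_i5, n_i7 in [(8, 12), (16, 24), (96, 96), (384, 384)]:
    total = primers_needed(n_i5, n_i7)
    combos = combinatorial_capacity(n_i5, n_i7)
    print(f"{n_i5} + {n_i7} = {total} primers -> {combos:,} combinations")
```

By contrast, a design with a single informative index needs one indexed oligonucleotide per sample, which is why its cost grows linearly with the number of tags.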
Figure 2. iTru and iNext library preparation workflows.
Here we illustrate the major steps used for library construction. The process is identical for iTru and iNext, except: (1) which nucleoside (A vs. C) is added to blunt, 5′ phosphorylated (end-repaired) molecules, (2) which adapter is ligated to the DNA, and (3) which primers are used for limited-cycle PCR. All steps are functionally equivalent.
Figure 3. Detailed steps for iTru library construction with relevant sequences.
Starting material is sheared, double-stranded DNA (represented as X) with ragged ends. First, the DNA is made blunt and 5′ phosphates are added (phosphates not shown). Second, a single adenosine (A) is added to each 3′ end to allow for complementary hybridization of adapters. Third, stubby Y-yoke adapters with complementary ends are ligated to each end of the DNA molecule. These adapters contain both complementary and non-complementary sequences (non-complementary indicated by the gap between the top and bottom strand). The non-complementary sequences include primer-binding sites, as indicated by the colors, that are used in the next step. In the final step of library preparation, limited-cycle PCR is performed using two distinct primers complementary to the ends of the Y-yoke adapter (shown as iTru5 and iTru7). The primers contain unique indexes (i5 and i7, respectively, shown in color) as well as the P5 and P7 sequences (for color scheme and explanation of functions, see Fig. 1). The index strand in color indicates the sequence of the primer (which is the same as the index read for i5, but the reverse complement is obtained for the i7 index read; see Fig. 4). Note that iNext libraries are similar, except that cytosines are added to the template DNA (instead of adenosines), the Y-yoke adapter has single guanosine overhangs, and the Read1 and Read2 portions have different sequences (cf. Fig. S4).
Figure 4. Sequencing reads that can be obtained from the full-length, dual-indexed iTru library molecules.
The top double-stranded molecule shows an iTru-library-prepared molecule. The color scheme follows Fig. 1, except that the sequences derived from the complementary ends of the adapter molecules (i.e., the portion of the Y-yoke adapter that was annealed together and previously shown in orange) are illustrated in light violet and light red on the template to more clearly indicate their contiguity and are not shown on the primers (Fig. S6 shows these regions in orange). The horizontal arrows indicate sequencing primers (binding to the complementary strand of the library molecules). The tip of the arrowhead indicates the 3′ end of the primer and the direction of elongation for sequencing. Four sequencing reads are shown for each library-prepared molecule, with one read for each index and each strand of the genomic DNA. Reads are arranged 1 to 4 (numbered in magenta) from top to bottom, respectively. Numbering follows the order in which the reads are obtained on Illumina instruments. The arrow immediately 3′ of the primers indicates the data obtained from that primer. 3A and 3B correspond to workflow A (NovaSeq 6000, MiSeq, HiSeq 2500, and HiSeq 2000) and workflow B (iSeq 100, MiniSeq, NextSeq, HiSeq X, HiSeq 4000, and HiSeq 3000), respectively, of dual-indexed workflows on paired-end flow cells (Illumina, 2018a). Thus, a short “Dark Read”, which uses up reagents without collecting data, is needed to extend the primer to the i5 index (see text for more details). Figure S7 illustrates the reads generated from libraries lacking an i5 index but sequenced using double-indexing run settings on an Illumina platform.
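Because the index reads do not always match the orientation of the tag in the primer, it can help to compute the expected index reads before demultiplexing. The Python sketch below follows the Fig. 3 caption (i5 read identical to the primer tag, i7 read its reverse complement) for workflow A. Its treatment of workflow B, where the i5 index is assumed to be read as the reverse complement of the primer tag, is an assumption drawn from general Illumina dual-indexing behavior rather than from this text, and the function names are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def expected_index_reads(i5_primer_tag: str, i7_primer_tag: str,
                         workflow: str = "A") -> dict:
    """Predict index reads from the tag sequences written in the primers.

    Workflow A: i5 is read as written in the primer (per the Fig. 3 caption).
    Workflow B: i5 is ASSUMED to be read as its reverse complement.
    In both cases the i7 read is the reverse complement of the primer tag.
    """
    if workflow not in ("A", "B"):
        raise ValueError("workflow must be 'A' or 'B'")
    i5_read = i5_primer_tag if workflow == "A" else reverse_complement(i5_primer_tag)
    return {"i7": reverse_complement(i7_primer_tag), "i5": i5_read}


# Tags of iTru5_01_A and iTru7_01_01 as listed in the primer table.
print(expected_index_reads("ACCGACAA", "AGTGACCT", workflow="A"))
print(expected_index_reads("ACCGACAA", "AGTGACCT", workflow="B"))
```

Whether a given sample sheet expects the primer orientation or the read orientation depends on the instrument and software version, so the output should be checked against the relevant Illumina documentation.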
Comparison of Nextera, iNext, iTru, and TruSeq Nano HT library preparation methods.
| | Nextera | iNext | iTru | TruSeq Nano HT |
|---|---|---|---|---|
| Input DNA (ng) | Intact (≥50) | Sheared (≥100) | Sheared (≥100) | Sheared (≥100) |
| Repair ends | N/A | Yes | Yes | Yes |
| Add DNA overhang | N/A | C | A | A |
| Ligate adapter | Tagmentation | iNext stub | iTru stub | TruSeq |
| Limited cycle PCR primers | Nextera or iNext | Nextera or iNext | iTru | P5 and P7 |
| Advantages | Least time | Lower cost, high diversity | Lower cost, high diversity | Industry standard |
| Disadvantages | Higher cost, lower diversity, less randomness | More prep. time than Nextera | More prep. time than Nextera | Higher cost, more input DNA, more prep. time; not for sequence capture |
Notes.
Note, iNext primers are not specified as biotinylated, and thus will not work interchangeably with Nextera libraries that use streptavidin beads to capture/normalize/purify libraries unless biotins are added. Using unmodified iNext primers requires other purification and normalization procedures.
Tagmentation does not insert adapters into the genome as randomly as shearing the DNA.
Hyper Prep Plus Kits (KapaBioSciences) allow input as low as one ng of intact DNA.
iTru and iNext adapter stub oligonucleotides and tagged primer sequences.
All sequences are given in 5′ to 3′ orientation. To make it clear which portions are constant among all tagged primers, as well as to identify function, the tagged primers are given in three pieces (the invariant 5′ end, the tag sequence which varies among primers, and the invariant 3′ end), but the primers are obtained as a single contiguous fusion of these three pieces. Complete balanced sets of primers are available as Files S4 and S15. Adapter stub oligonucleotides must be hydrated and annealed prior to use (Files S7).
| iTru oligonucleotide | 5′ end (entire sequence for adapter stubs) | Tag sequence | 3′ end | Tag |
|---|---|---|---|---|
| iTru_R2_stub_RCp | /5Phos/GATCGGAAGAGCACACGTCTGAACTCCAGTCAC | | | |
| iTru_R1_stub | ACACTCTTTCCCTACACGACGCTCTTCCGATCT | | | |
| iTru5_01_A | AATGATACGGCGACCACCGAGATCTACAC | ACCGACAA | ACACTCTTTCCCTA*C | tag063 |
| iTru5_01_B | AATGATACGGCGACCACCGAGATCTACAC | AGTGGCAA | ACACTCTTTCCCTA*C | tag134 |
| iTru7_01_01 | CAAGCAGAAGACGGCATACGAGAT | AGTGACCT | GTGACTGGAGTTCA*G | tag132 |
| iTru7_01_02 | CAAGCAGAAGACGGCATACGAGAT | AACAGTCC | GTGACTGGAGTTCA*G | tag008 |
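Each tagged primer is ordered as one contiguous oligonucleotide made by joining the invariant 5′ end, the tag, and the invariant 3′ end, and the tags within a set are designed to differ by an edit distance of at least 3. The Python sketch below assembles a primer from its three pieces and checks the minimum pairwise distance among tags. It assumes that "edit distance" means Levenshtein distance and keeps the `*` exactly as written in the table (assumed here to be the vendor's modification notation); the function names are illustrative only.

```python
from itertools import combinations


def assemble_primer(five_prime: str, tag: str, three_prime: str) -> str:
    """Join the invariant 5' end, the tag, and the invariant 3' end."""
    return five_prime + tag + three_prime


def levenshtein(a: str, b: str) -> int:
    """Levenshtein distance computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def min_pairwise_distance(tags) -> int:
    """Smallest Levenshtein distance between any two tags in a set."""
    return min(levenshtein(a, b) for a, b in combinations(tags, 2))


# Invariant pieces and tags copied from the iTru5 and iTru7 rows of the table.
ITRU5_5P, ITRU5_3P = "AATGATACGGCGACCACCGAGATCTACAC", "ACACTCTTTCCCTA*C"
itru5_tags = ["ACCGACAA", "AGTGGCAA"]
itru7_tags = ["AGTGACCT", "AACAGTCC"]

print(assemble_primer(ITRU5_5P, itru5_tags[0], ITRU5_3P))
print("min iTru5 tag distance:", min_pairwise_distance(itru5_tags))
print("min iTru7 tag distance:", min_pairwise_distance(itru7_tags))
```

Only the four tags shown in the table are checked here; validating a full primer set would mean running the same check over every tag in the supplementary files.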
Results from initial iTru library preparation and sequencing tests of DNA from sharks and challenging non-model organisms.
The Illumina TruSeq HT i7 index sequences were used in these tests. Protocol 1: EZNA Tissue DNA Kit (Omega Bio-Tek, USA); Protocol 2: Aljanabi & Martínez (1997); Protocol 3: CTAB-Phenol.
| MaF 5 | white shark | Protocol 1 | 705 | 1,930,539 | 1,805,638 | mtDNA | 1,722,562 | 17,103 (46×) | – | |
| MaF 19 | white shark | Protocol 2 | 707 | 2,075,236 | 1,927,792 | mtDNA | 2,003,858 | 17,138 (31×) | – | |
| MaF 10 | silky shark | Protocol 1 | 706 | 1,438,468 | 1,358,550 | mtDNA | 1,800,534 | 17,285 (22×) | – | |
| MaF 1 | tarantula | Protocol 1 | 701 | 985,171 | 934,406 | msats | 80,790 | – | 563 | |
| MaF 16 | cannonball jellyfish | Protocol 3 | 703 | 959,516 | 909,401 | msats | 591,608 | – | 92,668 | |
| MaF 9 | coral | Protocol 1 | 702 | 3,449,711 | 3,298,155 | msats | 1,549,718 | 18,628 (50×) | 7,322 | |
| Total | 10,838,641 | 10,233,942 |
Notes.
Includes only high-quality reads with inserts of 250 bases; reads were generally excluded because degraded input DNA yielded short inserts.
Identified using default parameters in PAL-finder (Castoe et al., 2012).
Díaz-Jaimes et al. (2016).
Galván-Tirado et al. (2016).
Del Rio Portilla et al. (2016).