Laura E MacConaill, Robert T Burns, Anwesha Nag, Haley A Coleman, Michael K Slevin, Kristina Giorda, Madelyn Light, Kevin Lai, Mirna Jarosz, Matthew S McNeill, Matthew D Ducar, Matthew Meyerson, Aaron R Thorner.
Abstract
BACKGROUND: Sample index cross-talk can result in false positive calls when massively parallel sequencing (MPS) is used for sensitive applications such as low-frequency somatic variant discovery, ancient DNA investigations, microbial detection in human samples, or circulating cell-free tumor DNA (ctDNA) variant detection. Therefore, the limit of detection of an MPS assay is directly related to the degree of index cross-talk.
Keywords: Adapter; Barcode cross-talk; Illumina; Index; Massively parallel sequencing; Molecular barcode; Multiplexing; Next generation sequencing; UMI
Year: 2018 PMID: 29310587 PMCID: PMC5759201 DOI: 10.1186/s12864-017-4428-5
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Fig. 1. Unique dual-matched UMI adapters are compatible with shear ligation library construction methods and may be sequenced in three different modes depending on the sensitivity of the application. (a) For genotyping applications, the i7 sample index may be used for demultiplexing. (b) More sensitive applications, such as somatic variant calling, should use both the i7 and i5 indices. (c) When a unique molecular identifier (UMI) is required for greater sensitivity, the read length for the i7 index may be increased to sequence the sample index and UMI in addition to the i5 sample index.
Fig. 2. Unique dual-matched sample indices reduce read misassignment caused by contamination. (a) Example contamination of the A1 adapter with 1% of the A2 adapter. Because only the i7 index is used to discriminate between the samples, 1% of sample A1 reads would be misassigned to sample A2. (b) The same 1% level of contamination with unique dual-matched indexed adapters results in only 0.01% read misassignment to sample A2.
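The arithmetic behind the Fig. 2 example can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes that a fragment is misassigned under dual-matched indexing only when both of its ends independently carry the contaminating adapter, so the 1% rate is squared. The function name `misassignment_rate` is hypothetical.

```python
def misassignment_rate(contamination: float, dual_matched: bool) -> float:
    """Expected fraction of a sample's reads assigned to the contaminating sample.

    Assumption (illustrative): each fragment end picks up the contaminating
    adapter independently with probability `contamination`.
    - i7-only demultiplexing: one contaminated index is enough to misassign.
    - Unique dual-matched demultiplexing: both i7 and i5 must come from the
      contaminant, so the probabilities multiply.
    """
    if not 0.0 <= contamination <= 1.0:
        raise ValueError("contamination must be a fraction between 0 and 1")
    return contamination ** 2 if dual_matched else contamination


single_index = misassignment_rate(0.01, dual_matched=False)
dual_index = misassignment_rate(0.01, dual_matched=True)

print(f"i7 only:      {single_index:.2%}")
print(f"dual-matched: {dual_index:.2%}")
```

Under this assumption, a 1% adapter contamination yields 1% misassignment with the i7 index alone and 0.01% with dual-matched indices, matching the values in the caption.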
Fig. 3. Level of cross-talk using combinatorial indices on the Illumina MiSeq and HiSeq 2500 platforms. The 96-well plate layout represents the actual adapter plate arrangement. Numbers in each well represent the number of fragments that passed standard Illumina filters and were demultiplexed using only perfect index matches on all TS-96 indices. (a) Four cell-line libraries were prepared using Illumina-synthesized TS-96 adapters (green); the libraries were pooled, hybrid captured using a custom bait set, and then sequenced on an Illumina MiSeq v2 flow cell. (b) Fifteen patient-derived xenograft (PDX) libraries were prepared using IDT-synthesized TS-96 adapters (green); the libraries were pooled, hybrid captured using a custom bait set, and then sequenced on a single lane of an Illumina HiSeq 2500 flow cell.
Fig. 4. Level of cross-talk using unique, dual-matched indexed adapters on the Illumina HiSeq 2500. The 96-well plate layout represents the adapter plate. A total of 35 adapters were synthesized. Seventeen human cell line libraries were prepared using IDT-synthesized unique, dual-matched indexed adapters (green); the libraries were pooled, hybrid captured using a custom bait set, and then sequenced on a single lane of an Illumina HiSeq 2500 flow cell. Numbers in each well represent the number of fragments that passed standard Illumina filters and were demultiplexed using only perfect sequence matches on all TS-96 indices.
Fig. 5. Unique dual-matched indices accurately identify contamination and index hopping events. (a) Heatmap displaying the percentage of reads for all i5 and i7 sample index combinations in 1-, 4-, 8-, or 16-plex captures. The contamination level of library adapters was 0.09% for single-library captures but increased up to 0.39% for 16-plex captures. (b) Percentage of reads filtered out per multiplex experiment when using dual-matched indexed adapters that would have been misassigned using combinatorial indices.
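The filtering idea in Fig. 5 can be illustrated with a short sketch: a read is kept only if its (i7, i5) pair exactly matches one sample's expected pair, and any other combination is counted as filtered. This is an assumption-laden illustration rather than the pipeline used in the study; the sample names, index sequences, read counts, and function names below are all hypothetical.

```python
from collections import Counter

# Hypothetical sample sheet: each sample owns one (i7, i5) pair, and no index
# sequence is shared between samples (unique dual-matched design).
EXPECTED = {
    "sample_A1": ("ATCACGAT", "TTGGCCAA"),
    "sample_A2": ("CGATGTCC", "GGAATTCC"),
}


def assign_reads(index_pairs, expected=EXPECTED):
    """Count reads per sample using perfect matches on both indices.

    Any pair that is not an expected (i7, i5) combination, for example an i7
    from one sample with an i5 from another, is counted under "filtered".
    """
    pair_to_sample = {pair: name for name, pair in expected.items()}
    counts = Counter()
    for pair in index_pairs:
        counts[pair_to_sample.get(pair, "filtered")] += 1
    return counts


def percent_filtered(counts):
    """Percentage of all reads that carried an unexpected index combination."""
    total = sum(counts.values())
    return 100.0 * counts["filtered"] / total if total else 0.0


# Made-up reads: mostly correct pairs plus a few mixed (hopped) combinations.
reads = (
    [("ATCACGAT", "TTGGCCAA")] * 996
    + [("CGATGTCC", "GGAATTCC")] * 1000
    + [("ATCACGAT", "GGAATTCC")] * 3   # i7 of A1 paired with i5 of A2
    + [("CGATGTCC", "TTGGCCAA")] * 1   # i7 of A2 paired with i5 of A1
)
counts = assign_reads(reads)
print(counts)
print(f"{percent_filtered(counts):.2f}% of reads filtered")
```

With combinatorial indexing, the mixed pairs in this toy input could correspond to a valid sample and would be silently misassigned; with unique dual-matched pairs they are identifiable and can be discarded.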
Fig. 6. UMI consensus calling improves variant detection accuracy, allows for the detection of rare variants, and corrects 8-oxoguanine errors. (a) Sensitivity and positive predictive value (PPV) using either no UMI or consensus calling for expected allele frequencies from 0.5–99.5% across 291 SNPs. Variant calling was performed using VarDict with a threshold of 0.2%. (b) Variant calling thresholds (AF) with no UMI or consensus calling further improved sensitivity and PPV for low-frequency variants (N = 54). There were 10 and 44 sites expected at 1% and 0.5% allele frequencies, respectively. (c) Sensitivity and PPV for low-frequency variants expected at AF 0.5–1% when using no UMI versus consensus calling, using a variant calling threshold of 0.2% with VarDict. (d) Number of false positive calls with or without UMI consensus using a variant calling threshold of 0.2%.
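For readers unfamiliar with the metrics in Fig. 6, the sketch below shows how sensitivity and PPV are conventionally computed after applying an allele-frequency (AF) calling threshold. It is a generic illustration, not the study's analysis: the variant identifiers, AF values, and function names are invented, and only the 0.2% threshold is taken from the caption.

```python
def apply_af_threshold(calls, threshold=0.002):
    """Keep variant calls whose allele frequency is at or above the threshold.

    `calls` maps a variant identifier to its observed allele frequency.
    The default of 0.002 corresponds to the 0.2% threshold in the caption.
    """
    return {variant for variant, af in calls.items() if af >= threshold}


def sensitivity_ppv(expected, called):
    """Return (sensitivity, PPV) for a set of expected and called variants.

    sensitivity = TP / (TP + FN), PPV = TP / (TP + FP).
    """
    tp = len(expected & called)
    fn = len(expected - called)
    fp = len(called - expected)
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    ppv = tp / (tp + fp) if (tp + fp) else float("nan")
    return sensitivity, ppv


# Hypothetical truth set and raw calls (identifiers and AFs are made up).
expected = {"chr1:100A>T", "chr1:200C>G", "chr2:50G>A", "chr3:10T>C"}
raw_calls = {
    "chr1:100A>T": 0.006,  # true variant, above threshold
    "chr1:200C>G": 0.004,  # true variant, above threshold
    "chr2:50G>A": 0.001,   # true variant, below threshold (missed)
    "chr5:5A>G": 0.003,    # not in the truth set (false positive)
}

called = apply_af_threshold(raw_calls)
sens, ppv = sensitivity_ppv(expected, called)
print(f"sensitivity={sens:.2f}, PPV={ppv:.2f}")
```

In this toy example two of four expected variants are recovered and one of three calls is a false positive; removing false positives, as UMI consensus calling is reported to do, raises PPV without changing the formula.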