| Literature DB >> 30588329 |
Hindrek Teder1,2, Mariann Koel1,3, Priit Paluoja1,4, Tatjana Jatsenko1, Kadri Rekker1,5, Triin Laisk-Podar1,5,6, Viktorija Kukuškina6, Agne Velthut-Meikas1,7, Olga Fjodorova3, Maire Peters1,5, Juha Kere8,9,10, Andres Salumets1,5,11,12, Priit Palta6,13, Kaarel Krjutškov1,8,9.
Abstract
Targeted next-generation sequencing (NGS) methods have become essential in medical research and diagnostics. In addition to NGS sensitivity and high-throughput capacity, precise biomolecule counting based on unique molecular identifier (UMI) has potential to increase biomolecule detection accuracy. Although UMIs are widely used in basic research its introduction to clinical assays is still in progress. Here, we present a robust and cost-effective TAC-seq (Targeted Allele Counting by sequencing) method that uses UMIs to estimate the original molecule counts of mRNAs, microRNAs, and cell-free DNA. We applied TAC-seq in three different clinical applications and compared the results with standard NGS. RNA samples extracted from human endometrial biopsies were analyzed using previously described 57 mRNA-based receptivity biomarkers and 49 selected microRNAs at different expression levels. Cell-free DNA aneuploidy testing was based on cell line (47,XX, +21) genomic DNA. TAC-seq mRNA profiling showed identical clustering results to transcriptome RNA sequencing, and microRNA detection demonstrated significant reduction in amplification bias, allowing to determine minor expression changes between different samples that remained undetermined by standard NGS. The mimicking experiment for cell-free DNA fetal aneuploidy analysis showed that TAC-seq can be applied to count highly fragmented DNA, detecting significant (p = 7.6 × 10-4) excess of chromosome 21 molecules at 10% fetal fraction level. Based on three proof-of-principle applications we demonstrate that TAC-seq is an accurate and highly potential biomarker profiling method for advanced medical research and diagnostics.Entities:
Year: 2018 PMID: 30588329 PMCID: PMC6299075 DOI: 10.1038/s41525-018-0072-5
Source DB: PubMed Journal: NPJ Genom Med ISSN: 2056-7944 Impact factor: 8.617
Fig. 1Principle and technical parameters of TAC-seq. a Schematic diagram of the assay to detect specific mRNA or cell-free DNA. Target-specific DNA oligonucleotide detector probes hybridize under stringent conditions to the studied cDNA or cfDNA. Both detector oligonucleotides consist of a specific 27-bp region (green), 4-bp unique molecular identifier (UMI) motif (NNNN), and universal sequences (purple and orange). The right detector oligonucleotide is 5′ phosphorylated. After rigorous hybridization, the pair of detector probes is ligated using a thermostable ligase under stringent conditions. Next, the ligated detectors complexed with the target region are captured with magnetic beads and PCR amplified to introduce sample-specific barcodes and other common motifs that are required for single-read NGS. b Spearman correlation analysis of the input and detected ERCC synthetic spike-in mRNA molecules at UMI threshold 4 (UMI = 4). UMI threshold is defined as the number of detected unique UMI sequences. For example, UMI = 4 indicates that a certain UMI motif is detected at least four times. UMIs are valuable only if the number of UMI combinations (8-bp UMI provides 65,536 variants, for example) is substantially larger than the sum of the target molecules in the studied sample. c Bar plot of Spearman’s correlation analysis of the ERCC input and detected molecules at different UMI thresholds. d Reproducibility of seven technical ERCC replicates (seven different icons on plot) of 22 spike-in molecules at UMI = 4
Fig. 2Comparison of the overall predictions for mRNA TAC-seq assay. a Principal component analysis of the full transcriptome RNA-seq, high-coverage TAC-seq and low-coverage TAC-seq of ten endometrial samples. The first principal component (PC1) describes most of the sample variability and correlates most with the receptivity status. Blue dots represent pre-receptive and red dots receptive human endometrial samples. One separate pre-receptive sample (indicated with an asterisk) represents the same sample that clusters differently in the heatmap analysis (below) and is, therefore, a potential biological outlier. b Heatmaps of the full transcriptome RNA-seq, high-coverage-, and low-coverage TAC-seq show the sensitivity to distinguish different endometrial samples according to their receptivity. One pre-receptive sample (indicated with an asterisk) shares the expression profile and clusters together with receptive samples in all three comparisons. Pre-receptive samples are labeled blue and receptive red. Detailed heatmaps are presented in Supplementary Fig. 3 together with housekeeping genes that demonstrate a lack of fluctuation of the pre-receptive and receptive biopsies. High-coverage TAC-seq data are presented at UMI = 2 and low-coverage data at UMI = 1 on PCA and heatmaps. Higher UMI thresholds in both high- and low-coverage approaches left low-expressed biomarker genes, like APOD, EDN3 etc without reads, according to Supplementary Fig. 4. The data are plotted as row-wise scaled log-transformed counts per million (CPM) values. The samples are hierarchically clustered column-wise using Pearson correlation. The genes are ordered row-wise according to the RNA-seq clustering results using Euclidean distance. Fewer genes are found expressed with a low-coverage compared to RNA-seq and high-coverage TAC-seq
Fig. 3TAC-seq miRNA assay performance. Correlation plots of four miRNA sample technical replicates using TAC-seq assay at UMI = 4. miRNA sample 1 is on the left hand and has two replicates, one plotted on the x-axis and the other on the y-axis. The same with miRNA sample 2 on the right hand
Fig. 4Trisomy detection under in vitro conditions. Boxplots over applied UMI thresholds of normalized molecule counts (y-axis) of trisomy TAC-seq experiments indicates a positive correlation between the trisomy factor (x-axis, trisomic cell proportion) and chr21 counts. Experiment 1, upper four plots, involved 114 loci along chr2 and chr21. One biological replica is depicted. Experiment 2, lower four plots at various UMI thresholds, involved extended TAC-seq probe set (in total 224 probes) along chr2, chr3, and chr21. The red asterisks indicate significant reference chromosome(s) and chr21 read-count-based differences between studied samples (p < 0.05, one-tailed Welch’s t-test)