Kei Mizuno1,2, Shusuke Akamatsu1, Takayuki Sumiyoshi1, Jing Hao Wong2,3, Masashi Fujita4, Kazuaki Maejima4, Kaoru Nakano4, Atushi Ono5, Hiroshi Aikata5, Masaki Ueno6, Shinya Hayami6, Hiroki Yamaue6, Kazuaki Chayama5, Takahiro Inoue1, Osamu Ogawa1, Hidewaki Nakagawa7, Akihiro Fujimoto8,9.
Abstract
Plasma cell-free DNA (cfDNA) testing plays an increasingly important role in precision medicine for cancer. However, circulating cell-free tumor DNA (ctDNA) is highly diluted by cfDNA from non-cancer cells, complicating ctDNA detection and analysis. To identify low-frequency variants, we developed a program, eVIDENCE, which is a workflow for filtering candidate variants detected by using the ThruPLEX tag-seq (Takara Bio), a commercially available molecular barcoding kit. We analyzed 27 cfDNA samples from hepatocellular carcinoma (HCC) patients. Sequencing libraries were constructed and hybridized to our custom panel targeting about 80 genes. An initial variant calling identified 36,500 single nucleotide variants (SNVs) and 9,300 insertions and deletions (indels) across the 27 samples, but these numbers were much greater than expected when compared with previous cancer genome studies. After eVIDENCE was applied to the candidate variants, 70 SNVs and 7 indels remained. Of the 77 variants, 49 (63.6%) showed a VAF of <1% (0.20-0.98%). Twenty-five variants were selected in an unbiased manner and all were successfully validated, suggesting that eVIDENCE can identify variants with a VAF of ≥0.2%. Additionally, this study is the first to detect hepatitis B virus integration sites and genomic rearrangements in the TERT region from cfDNA of HCC patients. We consider that our method can be applied in the examination of cfDNA from other types of malignancies using specific custom gene panels and will contribute to comprehensive ctDNA analysis.
Year: 2019 PMID: 31641155 PMCID: PMC6805874 DOI: 10.1038/s41598-019-51459-4
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1 (a) Summary of the eVIDENCE pipeline. An input BAM file is converted to a BAM file with consensus alignment pairs using Connor. Candidate variants are called using the converted BAM file. eVIDENCE removes unique molecular tag (UMT) and stem sequences from the raw BAM file and creates new FASTQ files while retaining the UMT information. These FASTQ files are converted into a new BAM file and, for each candidate variant, eVIDENCE performs filtering using the new BAM file. (b) Number of detected single nucleotide variants (SNVs) and insertions and deletions (indels) from cell-free DNA (cfDNA) sequencing data processed by Connor (left) and after applying eVIDENCE (right). The expected numbers of SNVs and indels are indicated by the blue and red dotted lines (70 and 5, respectively). (c) Number of detected variants among 6 cfDNA samples for which matched tumor sequencing data were available. An initial variant calling using the data processed by Connor detected 11,806 variants containing 14 tumor variants (left). After applying eVIDENCE, a large number of candidate variants were discarded, but 12 tumor variants remained (right), showing that eVIDENCE efficiently filtered candidate variants.
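The UMT handling described in panel (a) can be sketched as follows. This is a minimal illustration and not the eVIDENCE source code: the function name `strip_umt_and_stem`, the fixed lengths `umt_len=6` and `stem_len=8`, and the convention of appending the UMT to the read name are all assumptions made for the example. The real tag and stem lengths should be taken from the kit documentation.

```python
from typing import NamedTuple


class FastqRecord(NamedTuple):
    name: str
    seq: str
    qual: str


def strip_umt_and_stem(rec: FastqRecord, umt_len: int = 6, stem_len: int = 8) -> FastqRecord:
    """Remove the leading UMT and stem bases, keeping the UMT in the read name.

    umt_len and stem_len are illustrative defaults, not values taken from the paper.
    """
    prefix = umt_len + stem_len
    if len(rec.seq) != len(rec.qual):
        raise ValueError("sequence and quality strings differ in length")
    if len(rec.seq) <= prefix:
        raise ValueError("read is too short to contain a UMT, a stem and an insert")
    umt = rec.seq[:umt_len]
    # Hypothetical naming convention: carry the UMT in the read name so that
    # it is still available after the trimmed read is re-aligned.
    return FastqRecord(f"{rec.name}_UMT:{umt}", rec.seq[prefix:], rec.qual[prefix:])


# Toy read: 6-base UMT + 8-base stem + 10-base insert.
read = FastqRecord("read1", "ACGTAC" + "GGTTCCAA" + "TTGACCTGAA", "I" * 24)
trimmed = strip_umt_and_stem(read)
print(trimmed.name, trimmed.seq)
```

Keeping the UMT in the read name is one simple way to let a later filtering step group reads by their molecule of origin without the tag bases interfering with alignment.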
Figure 2 The landscape of genomic alterations in 27 cell-free DNA samples. Each column represents a sample and each row represents a gene. The color legend indicates the type of aberration: missense, nonsense, synonymous, splice site, frameshift and promoter variants.
The variant allele frequency (VAF) distribution of the detected variants.
| Type | VAF <0.5% | VAF 0.5–1.0% | VAF 1.0–5.0% | VAF >5.0% | Total |
|---|---|---|---|---|---|
| nonsynonymous | 22 | 15 | 8 | 5 | 50 |
| synonymous | 4 | 3 | 5 | 1 | 13 |
| splice-site | 0 | 2 | 0 | 1 | 3 |
| indels | 1 | 1 | 1 | 4 | 7 |
| | 0 | 1 | 3 | 0 | 4 |
| total | 27 | 22 | 17 | 11 | 77 |
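As a consistency check, the column totals and the share of variants with VAF below 1% can be recomputed from the table rows. The short sketch below only uses the counts printed in the table. The variable names are illustrative, and the rows are listed in table order without their labels.

```python
# Counts per VAF bin (<0.5, 0.5-1.0, 1.0-5.0, >5.0 %), one list per table row,
# in the order the rows appear in the table (the "total" row is excluded).
rows = [
    [22, 15, 8, 5],
    [4, 3, 5, 1],
    [0, 2, 0, 1],
    [1, 1, 1, 4],
    [0, 1, 3, 0],
]

bin_totals = [sum(col) for col in zip(*rows)]        # per-bin column sums
total = sum(bin_totals)                              # all detected variants
below_one_percent = bin_totals[0] + bin_totals[1]    # bins <0.5% and 0.5-1.0%
share = 100 * below_one_percent / total

print(bin_totals, total, below_one_percent, round(share, 1))
```

The recomputed values agree with the "total" row of the table and with the 49 of 77 variants (63.6%) reported in the abstract.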
The variant allele frequency (VAF) distribution of driver genes of hepatocellular carcinoma.
| Gene | VAF <0.5% | VAF 0.5–1.0% | VAF 1.0–5.0% | VAF >5.0% | Total |
|---|---|---|---|---|---|
| | 0 | 1 | 3 | 0 | 4 |
| | 2 | 4 | 1 | 2 | 9 |
| | 2 | 0 | 1 | 0 | 3 |
| | 1 | 0 | 2 | 0 | 3 |
| | 0 | 0 | 1 | 0 | 1 |
| | 1 | 0 | 1 | 0 | 2 |
| | 0 | 0 | 1 | 1 | 2 |
| | 0 | 2 | 0 | 0 | 2 |
| | 1 | 0 | 0 | 0 | 1 |
| | 0 | 0 | 0 | 1 | 1 |
| | 0 | 0 | 1 | 1 | 2 |
| total | 7 | 7 | 11 | 5 | 30 |
Summary of 25 single nucleotide variants subjected to validation experiments.
| Gene | Sample | Chr | Genomic position | Reference | Variant | AA change | Total number of consensus reads | Number of variant reads | VAF (%) | Validation with tumor DNA by amplicon sequencing | Validation with cfDNA by digital PCR |
|---|---|---|---|---|---|---|---|---|---|---|---|
| | RK436 | 5 | 1295228 | G | A | — | 395 | 6 | 1.52 | y | — |
| | RK432 | 17 | 7577120 | C | T | R273H | 884 | 7 | 0.79 | n | N/A |
| | RK451 | 17 | 7577133 | T | C | S269G | 1,179 | 3 | 0.25 | n | y |
| | RK258 | 17 | 7578503 | C | T | V143M | 1,216 | 106 | 8.72 | y | — |
| | RK436 | 17 | 7578535 | T | G | K132T | 620 | 61 | 9.84 | y | — |
| | RK451 | 3 | 41274886 | A | G | Q379R | 1,376 | 22 | 1.60 | y | — |
| | RK445 | 1 | 27106316 | C | T | S1976F | 1,035 | 3 | 0.29 | n | y |
| | RK439 | 12 | 46231342 | T | G | Y394X | 816 | 17 | 2.08 | y | — |
| | RK441 | 2 | 148657079 | G | T | E106X | 696 | 8 | 1.15 | y | — |
| | RK444 | 2 | 178098809 | T | C | E79G | 622 | 5 | 0.80 | n | y |
| | RK441 | 2 | 178098956 | A | C | L30R | 751 | 5 | 0.67 | y | — |
| | RK439 | X | 20193353 | T | A | S386C | 397 | 10 | 2.52 | y | — |
| | RK456 | 1 | 103405977 | G | A | P1097L | 1,264 | 4 | 0.32 | n | N/A |
| | RK451 | 1 | 103488365 | G | T | P393Q | 1,384 | 10 | 0.72 | n | N/A |
| | RK445 | 2 | 80097000 | G | A | R175H | 1,079 | 4 | 0.37 | y | — |
| | RK432 | 3 | 77571995 | G | T | M292I | 685 | 3 | 0.44 | y | — |
| | RK438 | 5 | 26902700 | G | T | P380T | 1,453 | 3 | 0.21 | y | — |
| | RK456 | 5 | 112175232 | G | A | R1314K | 1,174 | 5 | 0.43 | n | y |
| | RK433 | 6 | 93956601 | C | T | E879K | 615 | 4 | 0.65 | n | N/A |
| | RK456 | 7 | 42004860 | C | A | A1271S | 1,263 | 7 | 0.55 | n | N/A |
| | RK442 | 8 | 69104007 | C | T | A1466V | 857 | 27 | 3.15 | y | — |
| | RK442 | 11 | 108121480 | T | G | C430G | 677 | 5 | 0.74 | n | y |
| | RK445 | 14 | 42356455 | G | C | K209N | 996 | 4 | 0.40 | n | y |
| | RK451 | 15 | 99251289 | A | G | N198S | 1,275 | 8 | 0.63 | n | y |
| | RK439 | 20 | 9561315 | A | C | L156R | 788 | 6 | 0.76 | n | y |
Note: AA, amino acid; VAF, variant allele frequency; cfDNA, cell-free DNA; y, successfully validated; n, not validated; N/A, not assessed due to a lack of sample volume for the experiment.
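The VAF column appears to be the number of variant reads divided by the total number of consensus reads, expressed as a percentage. That reading is inferred from the table values and is not a formula stated in the source. The sketch below checks it against a few rows; the helper name `vaf_percent` is hypothetical.

```python
def vaf_percent(variant_reads: int, consensus_reads: int) -> float:
    """Variant allele frequency in percent, rounded to two decimals."""
    if consensus_reads <= 0:
        raise ValueError("consensus_reads must be positive")
    return round(100 * variant_reads / consensus_reads, 2)


# (variant reads, total consensus reads, VAF reported in the table)
examples = [
    (6, 395, 1.52),
    (3, 1179, 0.25),
    (106, 1216, 8.72),
    (3, 1453, 0.21),
]

for variant_reads, consensus_reads, reported in examples:
    computed = vaf_percent(variant_reads, consensus_reads)
    print(variant_reads, consensus_reads, computed, reported)
```

The lowest VAF in the table, 0.21%, corresponds to 3 variant reads among 1,453 consensus reads.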
Figure 3 Concordance of genomic alterations in tissue and cell-free DNA (cfDNA) among 6 samples. Twelve out of 16 variants in tumor DNA were detected in cfDNA. Of 26 variants in cfDNA, 14 were detected only in cfDNA, but one TP53 variant was validated by targeted amplicon sequencing of the tumor. The thirteen variants detected only in cfDNA and their variant allele frequencies are shown. Driver gene variants such as ARID1A.S1976F, NFE2L2.E79G and PIK3CA.P449A were also observed. ATM.C430G was detected in matched lymphocyte DNA by digital PCR.
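The counts in this caption can be reconciled with a few lines of arithmetic. The sketch assumes that the 26 cfDNA variants consist of the 12 variants shared with tumor DNA plus the variants found only in cfDNA. The variable names are illustrative.

```python
tumor_variants = 16       # variants called in tumor DNA
shared = 12               # tumor variants also detected in cfDNA
cfdna_variants = 26       # all variants detected in cfDNA
validated_in_tumor = 1    # TP53 variant confirmed by amplicon sequencing of the tumor

cfdna_only = cfdna_variants - shared                   # not seen in the initial tumor calls
still_cfdna_only = cfdna_only - validated_in_tumor     # remain cfDNA-only after validation
detection_rate = 100 * shared / tumor_variants         # share of tumor variants seen in cfDNA

print(cfdna_only, still_cfdna_only, detection_rate)
```

Under this assumption, 14 variants were initially cfDNA-only, 13 remain so after the TP53 validation, and 75% of the tumor variants were recovered from cfDNA.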