Wenjuan Yang1, Ying Liu2, Ruyi Dong1, Jia Liu1, Jidong Lang1, Jialiang Yang1, Weiwei Wang1, Jingjing Li3, Bo Meng1, Geng Tian1,4.
Abstract
During the carcinogenesis of cervical cancer, the DNA of human papillomavirus (HPV) is frequently integrated into the human genome, and this integration might serve as a biomarker for the early diagnosis of cervical cancer. Although the Illumina sequencing platform has significantly increased the sensitivity of detecting virus infection status, limitations remain, including detection accuracy and the identification of complex integrated genome structures. Nanopore sequencing has proven to be a fast yet accurate technique for detecting pathogens in clinical samples, with significantly longer read lengths. However, virus integration sites, and HPV integration sites in particular, have seldom been identified using the nanopore platform. In this study, we evaluated the feasibility of identifying HPV integration sites with a nanopore sequencer. Specifically, we re-sequenced the integration sites of a previously published sample by both nanopore and Illumina sequencing. The analysis led to three conclusions: first, 13 out of 19 previously published integration sites were found in all three datasets (i.e., nanopore, Illumina, and the published data), indicating a high overlap rate and comparability among the three datasets; second, our analysis of the nanopore and Illumina data identified 66 unique integration sites not reported in the previous publication, 13 of which were verified by Sanger sequencing, indicating higher integration site detection sensitivity than the published data; third, we established a pipeline that can detect HPV integration sites from nanopore sequencing data without error correction. In summary, a new nanopore data analysis method was tested and proved reliable for integration site detection when compared with existing Illumina data analysis pipelines, while requiring less sequencing data. It provides solid evidence and a tool supporting the potential application of nanopore sequencing to virus status identification.
Keywords: HPV; cervical cancer; integration; nanopore; next generation (deep) sequencing (NGS); third generation sequencing
Year: 2020 PMID: 32714374 PMCID: PMC7344299 DOI: 10.3389/fgene.2020.00660
Source DB: PubMed Journal: Front Genet ISSN: 1664-8021 Impact factor: 4.599
FIGURE 2. Flow chart of HPV integration site analysis pipeline. (A) The workflow of HPV nanopore sequencing analysis. (B) The library structure of nanopore sequencing reads.
FIGURE 1. Distribution diagram of nanopore sequencing quality score and read length. (A) Quality score. (B) Length.
Summary of the sequencing results of HPV integration sites by nanopore and Illumina platforms.
| Platform | Raw reads | Quality filtered reads (≥Q20) | Pass 1D reads | Length min (bp) | Length mean (bp) | Length max (bp) | Quality score min | Quality score mean | Quality score max |
| Illumina | 11,093,630 | 11,093,535 (99.9%) | / | / | 151 | / | / | / | / |
| Nanopore | 381,475 | / | 333,028 (87.3%) | 67 | 502.8 | 7,404 | 7 | 12.8 | 17 |
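To put the quality columns in context, the sketch below converts quality scores into expected per-base error rates and recomputes the nanopore pass fraction from the read counts in the table. It assumes the reported scores are on the standard Phred scale (Q = -10·log10 p), which this record does not state explicitly; the function names are illustrative only. A mean score is not the same as a mean error rate, so the figures are rough orientation, not a reanalysis.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q into a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability back into a Phred score."""
    return -10 * math.log10(p)


# Scores taken from the summary table; Phred scaling is an assumption.
scores = {
    "Illumina filter threshold": 20,
    "Nanopore min": 7,
    "Nanopore mean": 12.8,
    "Nanopore max": 17,
}
for label, q in scores.items():
    print(f"{label}: Q{q} -> {phred_to_error_prob(q):.2%} expected error rate")

# Fraction of nanopore raw reads that passed 1D filtering (counts from the table).
pass_fraction = 333_028 / 381_475
print(f"Nanopore pass 1D fraction: {pass_fraction:.1%}")
```

Under this assumption, the Q20 filter applied to the Illumina reads corresponds to an expected error rate of 1%, whereas the nanopore scores of 7 to 17 imply noticeably higher per-base error.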
FIGURE 3. Distribution of integration sites identified by Nanopore and Illumina platforms. (A) HPV integration breakpoints distribution in human chromosomes. (B) HPV integration breakpoints distribution in the HPV genome. (C) HPV integration sites functional locations in the human genome.
Summary of the integration sites identified in the three datasets (Nanopore/Illumina/Liu et al.) after read-number filtering.
| Breakpoint* | Reads number (N/I/Liu) | HPV16 gene | Chromosome | Func_refGene | Gene_refGene | GeneDetail_refGene (bp) | Map | Sanger Verified |
| HPV16:2804,chr20:32516985 | 406/1439/940 | E1/E2 | Chr20 | Intergenic | 74812; 63309 | 20q11.22 | ||
| HPV16:3327,chr6:7328094 | 51/0/131 | E2 | Chr6 | Intronic | . | 6p24.3 | ||
| HPV16:2623,chr6:8814271 | 37/64/46 | E1 | Chr6 | Intergenic | 28593; 1582645 | 6p24.3 | ||
| HPV16:5534,chrX:20464412 | 132/168/43 | L2 | ChrX | Intergenic | 179662; 928004 | Xp22.12 | ||
| HPV16:4680,chr6:7326060 | 43/6/22 | L2 | Chr6 | Downstream | . | 6p24.3 | ||
| HPV16:3163,chrX:20462930 | 50/0/21 | E2 | ChrX | Intergenic | 178180; 929486 | Xp22.12 | ||
| HPV16:7619,chr2:99404946 | 5/1/0 | LCR | Chr2 | Intergenic | 16585; 5363 | 2q11.2 | S | |
| HPV16:3873,chr2:197526852 | 5/4/0 | E5 | Chr2 | Intronic | . | 2q33.1 | ||
| HPV16:2891,chr6:12645768 | 13/25/12 | E2 | Chr6 | Intergenic | 348341; 71120 | 6p24.1 | ||
| HPV16:3182,chr20:32516399 | 23/0/8 | E2 | Chr20 | Intergenic | 74226; 63895 | 20q11.22 | ||
| HPV16:850,chr20:32515476 | 13/2/8 | E7 | Chr20 | Intergenic | 73303; 64818 | 20q11.22 | ||
| HPV16:4453,chr20:32472390 | 25/19/4 | L2 | Chr20 | Intergenic | 30217; 107904 | 20q11.22 | ||
| HPV16:5744,chr11:103893587 | 37/62/2 | L1 | Chr11 | Intronic | . | 11q22.3 | ||
| HPV16:7519,chr2:197519914 | 19/56/0 | LCR | Chr2 | Intronic | . | 2q33.1 | S | |
| HPV16:4489,chr11:31749949 | 17/32/0 | L2 | Chr11 | Intronic | . | 11p13 | S | |
| HPV16:2110,chr2:99396966 | 17/25/0 | E1 | Chr2 | Intergenic | 8605; 13343 | 2q11.2 | S | |
| HPV16:322,chr20:32486709 | 7/21/0 | E6 | Chr20 | Intergenic | 44536; 93585 | 20q11.22 | S | |
| HPV16:1276,chr18:50054 | 5/20/14 | E1 | Chr18 | Intergenic | 34124; 59011 | 18p11.32 | ||
| HPV16:4394,chr12:17224129 | 3/15/2 | L2 | Chr12 | Intergenic | 80567; 510628 | 12p12.3 | ||
| HPV16:4834,chrX:20461879 | 41/12/0 | L2 | ChrX | Intergenic | 177129; 930537 | Xp22.12 | S | |
| HPV16:1037,chrX:20445300 | 2/11/0 | E1 | ChrX | Intergenic | 160550; 947116 | Xp22.12 | ||
| HPV16:6618,chr20:32506419 | 3/9/12 | L1 | Chr20 | Intergenic | 64246; 73875 | 20q11.22 | ||
| HPV16:2638,chr20:32502300 | 4/8/0 | E1 | Chr20 | Intergenic | 60127; 77994 | 20q11.22 | ||
| HPV16:1155,chrX:20447276 | 27/8/0 | E1 | ChrX | Intergenic | 162526; 945140 | Xp22.12 | S | |
| HPV16:214,chr6:7242229 | 6/7/0 | E6 | Chr6 | Intronic | . | 6p24.3 | ||
| HPV16:4307,chr13:74240744 | 2/6/0 | L2 | Chr13 | Intergenic | 78728; 19405 | 13q22.1 | ||
| HPV16:883,chr20:32487936 | 4/5/5 | E1 | Chr20 | Intergenic | 45763; 92358 | 20q11.22 | ||
| HPV16:4250,chr2:197550764 | 2/4/0 | L2 | Chr2 | Intronic | . | 2q33.1 | ||
| HPV16:7362,chr18:12717 | 8/3/0 | LCR | Chr18 | ncRNA_intronic | . | 18p11.32 | ||
| HPV16:4405,chr2:99438640 | 1/3/2 | L2 | Chr2 | Exonic | . | 2q11.2 | ||
| HPV16:462,chr20:30942164 | 8/2/0 | E6 | Chr20 | Intergenic | 19353; 3983 | 20q11.21 | S | |
| HPV16:6558,chr2:99431140 | 3/1/0 | L1 | Chr2 | Intronic | . | 2q11.2 | ||
| HPV16:7139,chr20:32478733 | 169/0/0 | L1 | Chr20 | Intergenic | 36560; 101561 | 20q11.22 | S | |
| HPV16:4276,chr20:32502143 | 51/0/0 | L2 | Chr20 | Intergenic | 59970; 78151 | 20q11.22 | S | |
| HPV16:4029,chr11:103891933 | 21/0/0 | E5 | Chr11 | Intronic | . | 11q22.3 | S | |
| HPV16:3312,chr6:17081322 | 8/0/0 | E2 | Chr6 | Intergenic | 314208; 21167 | 6p22.3 | S | |
| HPV16:3295,chr11:31761709 | 6/0/0 | E2 | Chr11 | Intronic | . | 11p13 | S | |
| HPV16:5311,chr1:455824 | 6/0/0 | L2 | Chr1 | Intergenic | 87227; 106936 | 1p36.33 | ||
| HPV16:4027,chr11:103891932 | 4/0/0 | E5 | Chr11 | Intronic | . | 11q22.3 | ||
| HPV16:6681,chr10:80475 | 4/0/0 | L1 | Chr10 | Intergenic | NONE; 12353 | 10p15.3 | ||
| HPV16:1570,chrX:20385416 | 4/0/0 | E1 | ChrX | Intergenic | 100666; 1007000 | Xp22.12 | ||
| HPV16:4508,chr1:143415731 | 3/0/0 | L2 | Chr1 | Intergenic | 213492; 257190 | 1q21.1 | ||
| HPV16:6840,chr6:171021105 | 3/0/0 | L1 | Chr6 | Intergenic | 127325; NONE | 6q27 | ||
| HPV16:6840,chr1:547221 | 3/0/0 | L1 | Chr1 | Intergenic | 178624; 15539 | 1p36.33 | ||
| HPV16:6840,chr1:547739 | 3/0/0 | L1 | Chr1 | Intergenic | 179142; 15021 | 1p36.33 | ||
| HPV16:6305,chr2:197525082 | 3/3/0 | L1 | Chr2 | Intronic | . | 2q33.1 | ||
| HPV16:4730,chr1:10270 | 3/0/0 | L2 | Chr1 | Intergenic | NONE; 1604 | 1p36.33 | ||
| HPV16:6929,chr2:99443220 | 2/0/0 | L1 | Chr2 | Intronic | . | 2q11.2 | ||
| HPV16:1469,chr3:189605104 | 2/0/0 | E1 | Chr3 | Intronic | . | 3q28 | ||
| HPV16:4730,chrX:155259784 | 2/0/0 | L2 | ChrX | Intergenic | 1936; NONE | Xq28 | ||
| HPV16:4732,chr2:114360505 | 2/0/0 | L2 | Chr2 | ncRNA_intronic | . | 2q13 | ||
| HPV16:2621,chr20:3560423 | 2/0/0 | E1 | Chr20 | Intronic | . | 20p13 | ||
| HPV16:4732,chr18:10186 | 2/0/0 | L2 | Chr18 | Intergenic | NONE, | NONE; 1889 | 18p11.32 | |
| HPV16:3324,chr6:7328088 | 2/0/0 | E2 | Chr6 | Intronic | . | 6p24.3 | ||
| HPV16:6840,chr1:547835 | 2/0/0 | L1 | Chr1 | Intergenic | 179238; 14925 | 1p36.33 | ||
| HPV16:4793,chr12:80918599 | 2/0/0 | L2 | Chr12 | Intronic | . | 12q21.31 | ||
| HPV16:3340,chr15:40890192 | 2/0/0 | E2/E4 | Chr15 | Intronic | . | 15q15.1 | ||
| HPV16:4732,chr1:10164 | 2/0/0 | L2 | Chr1 | Intergenic | NONE; 1710 | 1p36.33 | ||
| HPV16:6899,chr2:72209094 | 2/0/0 | L1 | Chr2 | Intergenic | 295201; 147273 | 2p13.2 | ||
| HPV16:3340,chr20:35719229 | 0/35/0 | E2/E4 | Chr20 | Intronic | . | 20q11.23 | ||
| HPV16:3340,chr12:81091809 | 0/96/0 | E2/E4 | Chr12 | Intergenic | 17841; 9599 | 12q21.31 | ||
| HPV16:1355,chr2:99429402 | 0/6/0 | E1 | Chr2 | Intronic | . | 2q11.2 | ||
| HPV16:4782,chr2:99435835 | 0/6/0 | L2 | Chr2 | Intronic | . | 2q11.2 | ||
| HPV16:1030,chr9:123681030 | 0/4/0 | E1 | Chr9 | Intronic | . | 9q33.2 | ||
| HPV16:3378,chr17:69506855 | 0/4/0 | E2/E4 | Chr17 | Intergenic | 308535; 511137 | 17q24.3 | ||
| HPV16:1515,chr21:39554078 | 0/3/0 | E1 | Chr21 | Intergenic | 25473; 24172 | 21q22.13 | ||
| HPV16:1916,chr16:79637361 | 0/3/0 | E1 | Chr16 | Intergenic | 2739; 117848 | 16q23.2 | ||
| HPV16:2117,chr1:240419064 | 0/3/0 | E1 | Chr1 | Intronic | . | 1q43 | ||
| HPV16:3050,chr17:65126150 | 0/3/0 | E2 | Chr17 | Intronic | . | 17q24.2 | ||
| HPV16:3337,chr14:46921971 | 0/3/0 | E2/E4 | Chr14 | ncRNA_intronic | . | 14q21.2 | ||
| HPV16:4399,chr2:127164444 | 0/3/0 | L2 | Chr2 | Intergenic | 288882; 249067 | 2q14.3 | ||
| HPV16:4409,chr3:45985168 | 0/3/0 | L2 | Chr3 | Intronic | . | 3p21.31 | ||
| HPV16:5685,chr3:56156825 | 0/3/0 | L1 | Chr3 | Intronic | . | 3p14.3 | ||
| HPV16:7086,chr2:99457820 | 0/3/0 | L1 | Chr2 | Intronic | . | 2q11.2 | ||
| HPV16:2724,chr20:30948621 | 0/0/8 | E1 | Chr20 | Intronic | | 20q11.21 | ||
| HPV16:3883,chr3:156990726 | 0/0/4 | E5 | Chr3 | Intronic | | 3q25.31 | ||
| HPV16:3344,chr6:7327990 | 0/89/43 | E2 | Chr6 | Intronic | . | 6p24.3 | ||
| HPV16:1381,chr11:31187618 | 0/3/0 | E1 | Chr11 | Intronic | . | 11p13 | ||
| HPV16:1891,chr20:60384079 | 0/3/0 | E1 | Chr20 | Intronic | . | 20q13.33 | ||
| HPV16:2075,chr12:39215471 | 0/3/0 | E1 | Chr12 | Intronic | . | 12q12 | ||
| HPV16:2237,chr4:28262779 | 0/3/0 | E1 | Chr4 | Intergenic | 978932; 558425 | 4p15.1 | ||
| HPV16:2716,chr13:74250432 | 1/3/0 | E1 | Chr13 | Intergenic | 88416; 9717 | 13q22.1 | ||
| HPV16:3340,chr12:81091909 | 0/58/0 | E2/E4 | Chr12 | Intergenic | 17941; 9499 | 12q21.31 | ||
| HPV16:2724,chr20:30948899 | 0/11/0 | E1 | Chr20 | Intronic | . | 20q11.21 | ||
| HPV16:3340,chr1:212166417 | 0/27/0 | E2/E4 | Chr1 | Intronic | . | 1q32.3 |
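The breakpoint and read-number columns above can be tallied programmatically, for example to see which datasets support each site or how sites are distributed across chromosomes. The following is a minimal sketch, not the authors' pipeline: the helper names are hypothetical, only six rows from the table are included for illustration, and treating any non-zero read count as "supported" is an assumption, since the filtering thresholds applied in the study are not given here.

```python
from collections import Counter

# Order of the read counts in the "Reads number (N/I/Liu)" column.
DATASETS = ("Nanopore", "Illumina", "Liu et al.")


def parse_breakpoint(text):
    """Split 'HPV16:2804,chr20:32516985' into (hpv_pos, chrom, human_pos)."""
    hpv_part, human_part = text.split(",")
    hpv_pos = int(hpv_part.split(":")[1])
    chrom, human_pos = human_part.split(":")
    return hpv_pos, chrom, int(human_pos)


def supporting_datasets(reads):
    """Return the datasets with a non-zero count in an 'N/I/Liu' triple."""
    counts = [int(x) for x in reads.split("/")]
    return tuple(name for name, c in zip(DATASETS, counts) if c > 0)


# A few rows copied from the table (breakpoint, reads number).
rows = [
    ("HPV16:2804,chr20:32516985", "406/1439/940"),
    ("HPV16:3327,chr6:7328094", "51/0/131"),
    ("HPV16:7519,chr2:197519914", "19/56/0"),
    ("HPV16:7139,chr20:32478733", "169/0/0"),
    ("HPV16:3340,chr20:35719229", "0/35/0"),
    ("HPV16:2724,chr20:30948621", "0/0/8"),
]

# Count sites per combination of supporting datasets and per chromosome.
support = Counter(supporting_datasets(reads) for _, reads in rows)
per_chrom = Counter(parse_breakpoint(bp)[1] for bp, _ in rows)

for combo, n in support.items():
    print(" + ".join(combo), "->", n, "site(s)")
print(dict(per_chrom))
```

Extending `rows` to the full table would give the kind of per-dataset and per-chromosome counts summarized in the figures, subject to the support assumption stated above.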
FIGURE 4. Summary of unique integration sites. (A) Venn diagram of overlapping integration sites of two identified methods. (B) Chromosome localization of unique integration sites from three datasets.
FIGURE 5. PCR gel of verified integration sites and sequencing result of integration site D. (A) Gel image of amplified integration sites DNA fragments. (B) Sanger sequencing of integration site D. (C) The sequence and BLAST result image of the integration site D.