Stefanus Bernard1, Hendra Wibawa2, Mohamad Saifudin Hakim3, Arli Aditya Parikesit1, Chandra Kusuma Dewa4, Yasubumi Sakakibara5.
Abstract
Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) is a newly emerged virus and the cause of the worldwide Coronavirus Disease 2019 (COVID-19) pandemic. The first full-length SARS-CoV-2 genome was released on 10 January 2020, and major advances in Next Generation Sequencing (NGS) followed, with the hope of turning the tables on the worsening pandemic. Previous studies of respiratory virus characterization required mapping raw sequences to the human genome in the downstream bioinformatics pipeline, in line with metagenomic principles. Illumina, the major player in the NGS arena, released guidelines for an improved enrichment kit, the Respiratory Virus Oligo Panel (RVOP). The kit uses hybridization capture to target respiratory viruses, including SARS-CoV-2, and therefore allows raw sequence data to be mapped directly to the SARS-CoV-2 genome in the downstream bioinformatics pipeline. Consequently, two bioinformatics pipelines emerged, and no previous study has benchmarked them. This study seeks insight into the Illumina target enrichment workflow by applying two bioinformatics pipelines, named 'Fast Pipeline' and 'Normal Pipeline', to SARS-CoV-2 strains isolated from Yogyakarta and Central Java, Indonesia. Overall, both pipelines worked well in characterizing the SARS-CoV-2 samples, including identifying the major studied nucleotide substitutions and amino acid mutations. More reads mapped to the SARS-CoV-2 genome in the Fast Pipeline, and this was identified as a contributing factor to its higher coverage depth and larger number of identified variations (SNPs, insertions, and deletions). The Fast Pipeline works well when time is a critical factor. The Normal Pipeline, on the other hand, requires more time because it also maps reads to the human genome. Certain limitations were identified in the pipeline algorithm, and future studies are strongly recommended to design the pipeline within an integrated framework, for instance Nextflow, a workflow framework that combines all scripts into one fully integrated pipeline.
Keywords: Illumina; Next Generation Sequencing; SARS-CoV-2; bioinformatics pipeline; enrichment
Year: 2022 PMID: 35893066 PMCID: PMC9394340 DOI: 10.3390/genes13081330
Source DB: PubMed Journal: Genes (Basel) ISSN: 2073-4425 Impact factor: 4.141
Data of patients with COVID-19 from Yogyakarta and Central Java that were involved in this study.
| No | NGS Sample Code | NGS Batch | Sample ID | Sex | Age (Years) | Collection Date | Comorbid |
|---|---|---|---|---|---|---|---|
| 1 | B6 | 1 | DIY-C25.2-02449 | Male | 77 | 22 June 2020 | Yes |
| 2 | C5 | 1 | DIY-C78.01481 | Female | 83 | 10 August 2020 | Yes |
| 3 | F2 | 1 | DIY-C25.2-00927 | Male | 30 | 16 May 2020 | No |
| 4 | F4 | 1 | KLN-C25.2-02538 | Female | 55 | 26 June 2020 | Yes |
| 5 | S3 | 2 | RSS-10001 | Male | 88 | 18 August 2020 | Yes |
| 6 | S9 | 2 | BBTKLPP-47964 | Male | 48 | 31 August 2020 | Yes |
| 7 | S10 | 2 | BBTKLPP-48651 | Male | 41 | 9 September 2020 | No |
| 8 | S15 | 2 | DIY-C78.00061 | Female | 49 | 16 June 2020 | No |
| 9 | S3-1 | 3 | DIY 1-58634 | Male | 65 | 18 September 2020 | Yes |
| 10 | S3-4 | 3 | DIY 1-24778 | Male | 34 | 23 December 2020 | No |
| 11 | S3-5 | 3 | DIY 1-10279 | Male | 77 | 7 September 2020 | No |
| 12 | S3-7 | 3 | DIY 1-10282 | Female | 42 | 7 September 2020 | No |
| 13 | S3-8 | 3 | DIY 1-24762 | Female | 48 | 23 December 2020 | No |
| 14 | S3-9 | 3 | RSS-10008 | Male | 58 | 27 December 2020 | Yes |
| 15 | S3-11 | 3 | DIY 1-24776 | Female | 34 | 23 December 2020 | No |
| 16 | S3-14 | 3 | 53311 | Female | 81 | 9 September 2020 | Yes |
Figure 1. Fast Pipeline scheme; blue shapes represent the method; green shapes represent the tools used in each phase.
Figure 2. Normal Pipeline scheme; blue shapes represent the method; green shapes represent the tools used in each phase.
The overview of WGS data that were involved in the study.
| NGS Sample Code | Batch | CT Value | Total Sequences (Paired-End Reads) | Sequence Length (bp) | % GC |
|---|---|---|---|---|---|
| B6 | 1 | 19.70 | 11,268,022 | 35–74 | 41 |
| C5 | 1 | 16.90 | 2,707,228 | 35–74 | 42 |
| F2 | 1 | 27.92 | 2,461,478 | 35–74 | 50 |
| F4 | 1 | 24.68 | 1,366,538 | 35–74 | 45 |
| S3 | 2 | 18.10 | 18,807,934 | 35–74 | 38 |
| S9 | 2 | 19.64 | 7,827,098 | 35–74 | 46 |
| S10 | 2 | 21.24 | 2,698,396 | 35–74 | 42 |
| S15 | 2 | 22.31 | 6,111,408 | 35–74 | 46 |
| S3-1 | 3 | 19.53 | 3,566,896 | 35–74 | 40 |
| S3-4 | 3 | 13.27 | 1,167,562 | 35–74 | 38 |
| S3-5 | 3 | 21.00 | 9,941,746 | 35–74 | 38 |
| S3-7 | 3 | 21.55 | 1,669,316 | 35–74 | 39 |
| S3-8 | 3 | 15.67 | 2,731,486 | 35–74 | 39 |
| S3-9 | 3 | 22.27 | 4,748,810 | 35–74 | 45 |
| S3-11 | 3 | 16.89 | 5,895,626 | 35–74 | 39 |
| S3-14 | 3 | 17.73 | 376,514 | 35–74 | 44 |
Total sequences before and after quality control using Trimmomatic.
| NGS Sample Code | Total Sequences Before Trimming | Total Sequences Post-Trimming (QC) | Trimmed (%) |
|---|---|---|---|
| B6 | 11,268,022 | 11,184,784 | 0.74 |
| C5 | 2,707,228 | 2,683,232 | 0.89 |
| F2 | 2,461,478 | 2,440,518 | 0.85 |
| F4 | 1,366,538 | 1,345,416 | 1.55 |
| S3 | 18,807,934 | 18,387,180 | 2.24 |
| S9 | 7,827,098 | 7,587,506 | 3.06 |
| S10 | 2,698,396 | 2,590,256 | 4.01 |
| S15 | 6,111,408 | 5,942,890 | 2.76 |
| S3-1 | 3,566,896 | 3,502,824 | 1.80 |
| S3-4 | 1,167,562 | 1,155,934 | 1.00 |
| S3-5 | 9,941,746 | 9,807,834 | 1.35 |
| S3-7 | 1,669,316 | 1,640,458 | 1.73 |
| S3-8 | 2,731,486 | 2,696,200 | 1.29 |
| S3-9 | 4,748,810 | 4,670,496 | 1.65 |
| S3-11 | 5,895,626 | 5,816,070 | 1.35 |
| S3-14 | 376,514 | 372,662 | 1.02 |
| Average | | | 1.70 |
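
The trimmed column can be reproduced from the two read counts. The sketch below assumes the column is the share of reads removed by trimming, expressed as a percentage of the pre-trimming count; the function name `trimmed_percent` is hypothetical and not part of the study's scripts.

```python
def trimmed_percent(before: int, after: int) -> float:
    """Share of reads removed by trimming, as a percentage of the input reads."""
    return (before - after) / before * 100


# Read counts for sample B6, taken from the table above.
b6_trimmed = trimmed_percent(11_268_022, 11_184_784)
print(f"B6 trimmed: {b6_trimmed:.2f}%")
```

Under this assumption the B6 row evaluates to the tabulated 0.74.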
Alignment statistics of unmapped and fully mapped reads in each sample after read mapping to the SARS-CoV-2 genome (NC_045512.2).
| NGS Sample Code | Unmapped to SARS-CoV-2: Number of Reads | Unmapped to SARS-CoV-2: Percentage (%) | Fully Mapped to SARS-CoV-2: Number of Reads | Fully Mapped to SARS-CoV-2: Percentage (%) |
|---|---|---|---|---|
| B6 | 2,028,393 | 18.14 | 9,156,391 | 81.86 |
| C5 | 1,108,537 | 41.31 | 1,574,695 | 58.69 |
| F2 | 2,391,210 | 97.98 | 49,308 | 2.02 |
| F4 | 1,203,004 | 89.42 | 142,412 | 10.58 |
| S3 | 548,965 | 2.99 | 17,838,215 | 97.01 |
| S9 | 4,969,736 | 65.50 | 2,617,770 | 34.50 |
| S10 | 1,452,990 | 56.09 | 1,137,266 | 43.91 |
| S15 | 4,071,831 | 68.52 | 1,871,059 | 31.48 |
| S3-1 | 830,959 | 23.72 | 2,671,865 | 76.28 |
| S3-4 | 19,722 | 1.71 | 1,136,212 | 98.29 |
| S3-5 | 205,582 | 2.10 | 9,602,252 | 97.90 |
| S3-7 | 275,337 | 16.78 | 1,365,121 | 83.22 |
| S3-8 | 175,137 | 6.50 | 2,521,063 | 93.50 |
| S3-9 | 3,427,102 | 73.38 | 1,243,394 | 26.62 |
| S3-11 | 576,452 | 9.91 | 5,239,618 | 90.09 |
| S3-14 | 290,681 | 78.00 | 81,981 | 22.00 |
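
The percentages in this table follow from the two read counts of each sample. A minimal sketch, assuming the percentage is taken over the sum of unmapped and fully mapped reads; the helper `mapping_summary` is a hypothetical name used only for illustration.

```python
def mapping_summary(unmapped: int, mapped: int) -> tuple[int, float]:
    """Return (total reads, percentage of reads fully mapped) for one sample."""
    total = unmapped + mapped
    return total, mapped / total * 100


# Sample B6: read counts taken from the table above.
total_reads, pct_mapped = mapping_summary(2_028_393, 9_156_391)
print(f"B6: {total_reads:,} reads, {pct_mapped:.2f}% fully mapped")
```

For B6 the total equals the post-trimming read count reported in the Trimmomatic table (11,184,784), which is a quick consistency check between the two tables.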
Distribution of reads in each sample post-read mapping to the human genome (GRCh38) and SARS-CoV-2 genome (NC_045512.2).
| NGS Sample Code | Fully Mapped to SARS-CoV-2: Number of Reads | Fully Mapped to SARS-CoV-2: Percentage (%) | Fully Mapped to Human Genome: Number of Reads | Fully Mapped to Human Genome: Percentage (%) | Mapped to Neither: Number of Reads | Mapped to Neither: Percentage (%) | Skipped during BAM to FASTQ Conversion: Number of Reads | Skipped during BAM to FASTQ Conversion: Percentage (%) |
|---|---|---|---|---|---|---|---|---|
| B6 | 8,743,980 | 78.18 | 2,435,133 | 21.77 | 3444 | 0.02 | 2227 | 0.02 |
| C5 | 1,467,402 | 54.69 | 1,125,429 | 41.94 | 89,534 | 3.34 | 867 | 0.03 |
| F2 | 38,668 | 1.58 | 2,399,180 | 98.31 | 2272 | 0.09 | 398 | 0.02 |
| F4 | 134,956 | 10.03 | 1,210,132 | 89.94 | 108 | 0.01 | 220 | 0.02 |
| S3 | 15,158,756 | 82.44 | 3,216,116 | 17.49 | 9700 | 0.05 | 2608 | 0.01 |
| S9 | 2,363,094 | 31.14 | 5,214,314 | 68.72 | 7000 | 0.09 | 3098 | 0.04 |
| S10 | 1,009,265 | 38.96 | 1,546,174 | 59.69 | 34,167 | 1.32 | 650 | 0.03 |
| S15 | 1,676,134 | 28.20 | 4,235,721 | 71.27 | 28,482 | 0.48 | 2553 | 0.04 |
| S3-1 | 2,321,562 | 66.28 | 1,180,022 | 33.69 | 502 | 0.01 | 738 | 0.02 |
| S3-4 | 988,416 | 85.51 | 165,770 | 14.34 | 1706 | 0.15 | 42 | 0.00 |
| S3-5 | 8,996,852 | 91.73 | 807,254 | 8.23 | 3020 | 0.03 | 708 | 0.01 |
| S3-7 | 1,249,452 | 76.16 | 389,999 | 23.77 | 512 | 0.03 | 495 | 0.03 |
| S3-8 | 2,332,361 | 86.51 | 345,223 | 12.80 | 18,489 | 0.69 | 127 | 0.00 |
| S3-9 | 1,156,467 | 24.76 | 3,230,410 | 69.17 | 282,417 | 6.05 | 1202 | 0.03 |
| S3-11 | 4,628,769 | 79.59 | 1,178,473 | 20.26 | 8509 | 0.15 | 319 | 0.01 |
| S3-14 | 76,805 | 20.61 | 295,584 | 79.32 | 207 | 0.06 | 66 | 0.02 |
| Average | | 53.52 | | 45.67 | | 0.78 | | 0.02 |
Read mapping coverage results.
| NGS Sample Code | Fast Pipeline Read Mapping Coverage (Times) | Normal Pipeline Read Mapping Coverage (Times) | Difference (%) |
|---|---|---|---|
| B6 | 22,352.5 | 21,357.2 | 4.70 |
| C5 | 3833.9 | 3576.5 | 7.20 |
| F2 | 115 | 94.6 | 21.6 |
| F4 | 347.7 | 329.8 | 5.40 |
| S3 | 43,244.4 | 36,843 | 17.4 |
| S9 | 6350.02 | 5744.18 | 10.5 |
| S10 | 2756.31 | 2457.99 | 12.1 |
| S15 | 4545.72 | 4077.32 | 11.5 |
| S3-1 | 6494.44 | 5653.07 | 14.9 |
| S3-4 | 2764.01 | 2410.1 | 14.7 |
| S3-5 | 23,481.7 | 22,016.8 | 6.60 |
| S3-7 | 3333.82 | 3054.74 | 9.10 |
| S3-8 | 6163.56 | 5706.29 | 8.00 |
| S3-9 | 3033.52 | 2824.06 | 7.40 |
| S3-11 | 12,794.7 | 11,323.3 | 13.00 |
| S3-14 | 199.371 | 186.96 | 6.60 |
| Average | | | 10.66 |
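
The difference column is consistent with a relative difference taken against the Normal Pipeline value. The sketch below makes that assumption explicit; the formula is inferred from the tabulated numbers rather than stated in the source, and `relative_difference` is a hypothetical helper name.

```python
def relative_difference(fast: float, normal: float) -> float:
    """Percentage by which the Fast Pipeline value exceeds the Normal Pipeline value."""
    return (fast - normal) / normal * 100


# Mean coverage for sample F2, taken from the table above.
f2_diff = relative_difference(115.0, 94.6)
print(f"F2 coverage difference: {f2_diff:.1f}%")
```

With the F2 values this reproduces the tabulated 21.6; other rows agree to within the rounding used in the table.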
Figure 3. Variation (SNP, Insertion, and Deletion) Detected in All Samples Implemented in Both Pipelines.
Statistics Summary of Variation (SNP, Insertion, and Deletion) Detected in All Samples Implemented in Both Pipelines.
| NGS Sample Code | Fast Pipeline #SNP | Fast Pipeline #Insertion | Fast Pipeline #Deletion | Fast Pipeline All Variation | Normal Pipeline #SNP | Normal Pipeline #Insertion | Normal Pipeline #Deletion | Normal Pipeline All Variation | Difference (%) #SNP | Difference (%) #Insertion | Difference (%) #Deletion | Difference (%) All Variation |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| B6 | 390 | 128 | 331 | 849 | 349 | 120 | 306 | 775 | 11.75 | 6.67 | 8.17 | 9.55 |
| C5 | 304 | 228 | 33 | 565 | 275 | 206 | 31 | 512 | 10.55 | 10.68 | 6.45 | 10.35 |
| F2 | 20 | 22 | 14 | 56 | 15 | 12 | 11 | 38 | 33.33 | 83.33 | 27.27 | 47.37 |
| F4 | 92 | 38 | 8 | 138 | 82 | 38 | 8 | 128 | 12.20 | 0.00 | 0.00 | 7.81 |
| S3 | 624 | 156 | 70 | 850 | 434 | 152 | 66 | 652 | 43.78 | 2.63 | 6.06 | 30.37 |
| S9 | 285 | 128 | 39 | 452 | 211 | 126 | 36 | 373 | 35.07 | 1.59 | 8.33 | 21.18 |
| S10 | 219 | 63 | 65 | 347 | 179 | 55 | 55 | 289 | 22.35 | 14.55 | 18.18 | 20.07 |
| S15 | 353 | 86 | 129 | 568 | 319 | 85 | 121 | 525 | 10.66 | 1.18 | 6.61 | 8.19 |
| S3-1 | 340 | 142 | 58 | 540 | 276 | 129 | 54 | 459 | 23.19 | 10.08 | 7.41 | 17.65 |
| S3-4 | 121 | 114 | 24 | 259 | 108 | 106 | 16 | 230 | 12.04 | 7.55 | 50.00 | 12.61 |
| S3-5 | 245 | 145 | 44 | 434 | 228 | 146 | 46 | 420 | 7.46 | −0.68 | −4.35 | 3.33 |
| S3-7 | 160 | 114 | 27 | 301 | 145 | 101 | 21 | 267 | 10.34 | 12.87 | 28.57 | 12.73 |
| S3-8 | 153 | 127 | 27 | 307 | 131 | 117 | 24 | 272 | 16.79 | 8.55 | 12.50 | 12.87 |
| S3-9 | 173 | 36 | 765 | 974 | 148 | 33 | 709 | 890 | 16.89 | 9.09 | 7.90 | 9.44 |
| S3-11 | 213 | 115 | 38 | 366 | 182 | 107 | 35 | 324 | 17.03 | 7.48 | 8.57 | 12.96 |
| S3-14 | 83 | 31 | 9 | 123 | 78 | 27 | 11 | 116 | 6.41 | 14.81 | −18.18 | 6.03 |
| Average | | | | | | | | | 18.11 | 11.90 | 10.84 | 15.16 |
Identified SNPs in All Batch 1 Samples Running All Pipelines.
| REGION | 5’UTR | NSP3-ORF1AB | NSP5-ORF1AB | NSP12-ORF1AB | NSP13-ORF1AB | NSP14-ORF1AB | SPIKE-S | NS3-ORF3A | MATRIX-M | NS7A-ORF7A | NP-N | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| POSITION | 241 | 3037 | 3529 | 4754 | 5184 | 10201 | 10507 | 14055 | 14292 | 14408 | 14694 | 15406 | 17964 | 18744 | 18877 | 23403 | 25553 | 25563 | 25687 | 26735 | 26867 | 27610 | 28735 | 28752 | 29209 |
| REFERENCE (NC_045512.2) | C | C | T | C | C | G | C | G | C | C | C | G | G | C | C | A | C | G | G | C | A | C | T | A | A |
| B6 FAST PIPELINE | | | T | C | | G | | G | C | | C | G | G | | | | C | | G | | | C | T | A | A |
| B6 NORMAL PIPELINE | | | T | C | | G | | G | C | | C | G | G | | | | C | | G | | | C | T | A | A |
| C5 FAST PIPELINE | | | | | C | G | C | G | | | | | | C | | | | | | | A | C | | | A |
| C5 NORMAL PIPELINE | | | | | C | G | C | G | | | | | | C | | | | | | | A | C | | | A |
| F2 FAST PIPELINE | C | C | T | C | C | | C | G | C | C | C | G | G | C | C | A | C | G | G | C | A | C | T | A | |
| F2 NORMAL PIPELINE | C | C | T | C | C | | C | G | C | C | C | G | G | C | C | A | C | G | G | C | A | C | T | A | |
| F4 FAST PIPELINE | | | T | C | | G | | | C | | C | G | G | | | | C | | G | | | | T | A | A |
| F4 NORMAL PIPELINE | | | T | C | | G | | | C | | C | G | G | | | | C | | G | | | | T | A | A |
(a) Identified SNPs in All Batch 2 Samples Running All Pipelines (Part 1).
| REGION | 5’UTR | NSP1-ORF1AB | NSP3-ORF1AB | NSP5-ORF1AB | NSP6-ORF1AB | NSP8-ORF1AB | NSP9-ORF1AB | NSP12-ORF1AB | NSP13-ORF1AB | |||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| POSITION | 241 | 1545 | 2263 | 2512 | 3037 | 4084 | 5184 | 5784 | 6312 | 7639 | 10089 | 10507 | 11083 | 12152 | 12439 | 12809 | 13730 | 14120 | 14183 | 14408 | 15543 | 15765 | 16156 | 16395 | 16647 | 16694 |
| REFERENCE (NC_045512.2) | C | C | C | A | C | C | C | C | C | C | A | C | G | G | C | C | C | C | C | C | G | A | A | A | G | C |
| S3 FAST PIPELINE | | | C | A | | | | C | C | C | | | G | G | C | C | C | C | | | G | A | A | | | C |
| S3 NORMAL PIPELINE | | | C | A | | | | C | C | C | | | G | G | C | C | C | C | | | G | A | A | | | C |
| S9 FAST PIPELINE | C | C | C | | C | C | C | C | | C | A | C | | | | | | C | C | C | G | A | G | A | G | |
| S9 NORMAL PIPELINE | C | C | C | | C | C | C | C | | C | A | C | | | | | | C | C | C | G | A | | A | G | |
| S10 FAST PIPELINE | | C | C | A | | C | C | C | C | C | A | C | G | G | C | C | C | | C | | G | | A | A | G | C |
| S10 NORMAL PIPELINE | | C | C | A | | C | C | C | C | C | A | C | G | G | C | C | C | | C | | G | | A | A | G | C |
| S15 FAST PIPELINE | | C | | A | | C | | | C | | A | | G | G | C | C | C | C | C | | | A | A | A | G | C |
| S15 NORMAL PIPELINE | | C | | A | | C | | | C | | A | | G | G | C | C | C | C | C | | | A | A | A | G | C |
(b) Identified SNPs in All Batch 2 Samples Running All Pipelines (Part 2).
| POSITION | 18744 | 18877 | 19002 | 20124 | 21652 | 21742 | 21748 | 21809 | 22200 | 22334 | 23403 | 23593 | 23929 | 25563 | 26056 | 26735 | 26867 | 28073 | 28311 | 28628 | 28851 | 28975 | 29642 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| REFERENCE (NC_045512.2) | C | C | A | T | T | C | T | G | T | T | A | G | C | G | G | C | A | G | C | G | G | G | C |
| S3 FAST PIPELINE | | | A | T | T | | T | G | | T | | G | C | | G | | | G | C | | | G | C |
| S3 NORMAL PIPELINE | | | A | T | T | | T | G | | T | | G | C | | G | | G | G | C | | | G | C |
| S9 FAST PIPELINE | C | C | G | | C | C | | G | T | | | G | | G | G | C | A | | | G | G | G | C |
| S9 NORMAL PIPELINE | C | C | | | | C | | G | T | | | G | | G | G | C | A | | | G | G | G | C |
| S10 FAST PIPELINE | C | | A | T | T | C | T | | T | T | | | C | | | | A | G | C | G | G | | |
| S10 NORMAL PIPELINE | C | | A | T | T | C | T | C | T | T | | | C | | | | A | G | C | G | G | | |
| S15 FAST PIPELINE | | | A | T | T | C | T | G | T | T | | G | C | | G | | A | G | C | G | G | G | C |
| S15 NORMAL PIPELINE | | | A | T | T | C | T | G | T | T | | G | C | | G | | A | G | C | G | G | G | C |
Identified SNPs in All Batch 3 Samples Running All Pipelines.
| REGION | NSP3-ORF1AB | NSP4-ORF1AB | NSP5A-ORF1AB | NSP6-ORF1AB | NSP7-ORF1AB | NSP12-ORF1AB | NSP15-ORF1AB | SPIKE GLYCOPROTEIN | ORF3A | ORF8 | NP-N | ||||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| POSITION | 3305 | 5184 | 5554 | 6309 | 6906 | 9701 | 9710 | 9711 | 10313 | 10904 | 10995 | 11219 | 11991 | 14120 | 14408 | 14741 | 15848 | 19793 | 19794 | 20443 | 20611 | 21575 | 22200 | 23042 | 23270 | 23403 | 23599 | 23629 | 25337 | 25563 | 25590 | 25904 | 28020 | 28628 | 28655 | 28724 | 28851 | 28881 | 28883 | 28975 | 28977 |
| REFERENCE (NC_045512.2) | A | C | G | G | C | A | T | C | C | A | A | A | A | C | C | C | C | G | G | G | C | C | T | T | G | A | T | T | G | G | A | C | T | G | G | C | G | G | G | G | C |
| S3-1 FAST PIPELINE | A | | G | G | C | A | T | C | C | A | A | A | A | C | | | C | G | G | | | C | | T | G | | T | T | G | | A | C | T | | G | C | | G | G | G | C |
| S3-1 NORMAL PIPELINE | A | | G | G | C | A | T | C | C | A | A | A | A | C | | | C | G | G | | | C | | T | G | | T | T | G | | A | C | T | | G | C | | G | G | G | C |
| S3-4 FAST PIPELINE | A | C | G | G | C | A | | | | A | A | A | A | | T | C | | G | G | G | C | C | T | T | | | | T | | | A | C | T | G | G | C | G | G | G | G | |
| S3-4 NORMAL PIPELINE | A | C | G | G | C | A | | | | A | A | A | A | | T | C | | G | G | G | C | C | T | T | | | | T | | | A | C | T | G | G | C | G | G | G | G | |
| S3-5 FAST PIPELINE | | C | G | G | C | | T | C | C | A | A | A | A | | | C | | G | G | G | C | | T | T | G | | T | T | G | | | | T | G | | C | G | G | G | G | |
| S3-5 NORMAL PIPELINE | | C | G | G | C | | T | C | C | A | A | A | A | | | C | | G | G | G | C | | T | T | G | | T | T | G | | | | T | G | | C | G | G | G | G | |
| S3-7 FAST PIPELINE | | C | G | G | C | | T | C | C | A | A | A | A | | | C | | G | G | G | C | | T | T | G | | T | T | G | | | | T | G | | C | G | G | G | G | |
| S3-7 NORMAL PIPELINE | | C | G | G | C | | T | C | C | A | A | A | A | | | C | | G | G | G | C | | T | T | G | | T | T | G | | | | T | G | | C | G | G | G | G | |
| S3-8 FAST PIPELINE | A | | G | | C | A | T | C | C | A | A | | A | C | | C | C | G | G | G | C | C | T | | G | | T | | G | | A | C | T | | G | C | G | G | G | | C |
| S3-8 NORMAL PIPELINE | A | | G | | C | A | T | C | C | A | A | | A | C | | C | C | G | G | G | C | C | T | | G | | T | | G | | A | C | T | | G | C | G | G | G | | C |
| S3-9 FAST PIPELINE | A | C | | G | | A | T | C | C | | | A | | C | | C | C | | | G | C | C | T | T | G | | T | T | G | G | A | C | | G | G | | G | | | G | C |
| S3-9 NORMAL PIPELINE | A | C | | G | | A | T | C | C | | | A | | C | | C | C | | | G | C | C | T | T | G | | T | T | G | G | A | C | | G | G | | G | | | G | C |
| S3-11 FAST PIPELINE | A | C | G | G | C | A | | | | A | A | A | A | | | C | | G | G | G | C | C | T | T | | | | T | | | A | C | T | G | G | C | G | G | G | G | |
| S3-11 NORMAL PIPELINE | A | C | G | G | C | A | | | | A | A | A | A | | | C | | G | G | G | C | C | T | T | | | | T | | | A | C | T | G | G | C | G | G | G | G | |
| S3-14 FAST PIPELINE | A | | G | G | C | A | T | C | C | A | A | A | A | C | C | C | C | G | G | G | C | | T | T | G | | T | T | G | | A | C | T | | G | C | | G | G | G | C |
| S3-14 NORMAL PIPELINE | A | | G | G | C | A | T | C | C | A | A | A | A | C | C | C | C | G | G | G | C | | T | T | G | | T | T | G | | A | C | T | | G | C | | G | G | G | C |
Result of consensus sequences constructed by using a combination of SAMtools and BEDtools.
| NGS Sample Code | Fast Pipeline Consensus Sequence Length (bp) | Normal Pipeline Consensus Sequence Length (bp) |
|---|---|---|
| B6 | 29,903 | 29,894 |
| C5 | 29,903 | 29,892 |
| F2 | 29,903 | 29,853 |
| F4 | 29,903 | 29,877 |
| S3 | 29,903 | 29,890 |
| S9 | 29,903 | 29,892 |
| S10 | 29,903 | 29,870 |
| S15 | 29,903 | 29,879 |
| S3-1 | 29,903 | 29,892 |
| S3-4 | 29,903 | 29,870 |
| S3-5 | 29,903 | 29,877 |
| S3-7 | 29,903 | 29,867 |
| S3-8 | 29,903 | 29,870 |
| S3-9 | 29,903 | 29,892 |
| S3-11 | 29,903 | 29,870 |
| S3-14 | 29,903 | 29,869 |
Identified Amino Acids Mutations in Batch 1 Samples Running All Pipelines.
| REGION | 5’UTR | NSP3-ORF1AB | NSP5-ORF1AB | NSP12-ORF1AB | NSP13-ORF1AB | SPIKE-S | NS3-ORF3A | NS7A-ORF7A | NP-N | |||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| POSITION | 81 | 679 | 822 | 49 | 314 | 646 | 769 | 576 | 614 | 54 | 57 | 99 | 73 | 160 |
| REFERENCE (NC_045512.2) | R | P | P | M | P | A | S | M | D | A | Q | A | H | Q |
| B6 FAST PIPELINE | | P | | M | | A | S | M | | A | | A | H | Q |
| B6 NORMAL PIPELINE | | P | | M | | A | S | M | | A | | A | H | Q |
| C5 FAST PIPELINE | | | P | M | | | S | | | | | | H | |
| C5 NORMAL PIPELINE | | | P | M | | | S | | | | | | H | |
| F2 FAST PIPELINE | R | P | P | | P | A | | M | D | A | Q | A | H | Q |
| F2 NORMAL PIPELINE | R | P | P | | P | A | | M | D | A | Q | A | H | Q |
| F4 FAST PIPELINE | R | P | | M | | A | S | M | | A | | A | | Q |
| F4 NORMAL PIPELINE | R | P | | M | | A | S | M | | A | | A | | Q |
Identified Amino Acids Mutations in Batch 2 Samples Running All Pipelines.
| REGION | 5’UTR | NSP3-ORF1AB | NSP5-ORF1AB | NSP6 | NSP8-ORF1AB | NSP9 | NSP12-ORF1AB | NSP13-ORF1AB | SPIKE-S | NS3-ORF3A | NP-N | ORF8 | ||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| POSITION | 81 | 822 | 1022 | 1198 | 12 | 37 | 21 | 42 | 88 | 218 | 239 | 314 | 897 | 153 | 83 | 213 | 258 | 614 | 677 | 57 | 222 | 13 | 119 | 193 | 234 | 29 |
| REFERENCE (NC_045512.2) | R | P | T | T | K | L | A | L | A | P | T | P | M | T | V | V | W | D | Q | Q | D | P | A | S | M | Q |
| S3 FAST PIPELINE | | | T | T | | L | A | L | A | P | | | M | T | V | | W | | Q | | D | P | | | M | Q |
| S3 NORMAL PIPELINE | | | T | T | | L | A | L | A | P | | | M | T | V | | W | | Q | | D | P | | | M | Q |
| S9 FAST PIPELINE | R | P | T | | K | | | | | P | T | P | | | V | V | | | Q | Q | D | | A | S | M | Q |
| S9 NORMAL PIPELINE | R | P | T | | K | | | | | P | T | P | | | V | V | | | Q | Q | D | | A | S | M | Q |
| S10 FAST PIPELINE | | P | T | T | K | L | A | L | A | | T | | M | T | | V | W | | | | | P | A | S | | |
| S10 NORMAL PIPELINE | | P | T | T | K | L | A | L | A | | T | | M | T | | V | W | | | | | P | A | S | | |
| S15 FAST PIPELINE | | | | T | K | L | A | L | A | P | T | | M | T | V | V | W | | Q | | D | P | A | S | M | Q |
| S15 NORMAL PIPELINE | | | | T | K | L | A | L | A | P | T | | M | T | V | V | W | | Q | | D | P | A | S | M | Q |
Identified Amino Acids Mutations in Batch 3 Samples Running All Pipelines.
| REGION | NSP3-ORF1AB | NSP4-ORF1AB | NSP5A-ORF1AB | NSP6-ORF1AB | NSP7-ORF1AB | NSP12-ORF1AB | NSP15-ORF1AB | SPIKE GLYCOPROTEIN | ORF3A | ORF8 | NP-N | ||||||||||||||||||||||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| POSITION | 196 | 822 | 945 | 1197 | 1369 | 383 | 386 | 87 | 284 | 8 | 83 | 50 | 227 | 323 | 434 | 803 | 58 | 275 | 331 | 5 | 213 | 494 | 570 | 614 | 679 | 689 | 1259 | 57 | 66 | 171 | 43 | 119 | 128 | 151 | 193 | 203 | 204 | 234 | 235 |
| REFERENCE (NC_045512.2) | M | P | K | S | S | I | S | L | S | K | M | E | P | P | S | T | W | V | L | L | V | S | A | D | N | S | D | Q | K | S | S | A | D | P | S | R | G | M | S |
| S3-1 FAST PIPELINE | M | | K | S | S | I | S | L | S | K | M | E | P | | | T | W | | | L | | S | A | | N | S | D | | K | S | S | | D | P | | R | G | M | S |
| S3-1 NORMAL PIPELINE | M | | K | S | S | I | S | L | S | K | M | E | P | | | T | W | | | L | | S | A | | N | S | D | | K | S | S | | D | P | | R | G | M | S |
| S3-4 FAST PIPELINE | M | P | K | S | S | I | | | S | K | M | E | | | S | | W | V | L | L | V | S | | | | S | | | | S | S | A | D | P | S | R | G | M | |
| S3-4 NORMAL PIPELINE | M | P | K | S | S | I | | | S | K | M | E | | | S | | W | V | L | L | V | S | | | | S | | | | S | S | A | D | P | S | R | G | M | |
| S3-5 FAST PIPELINE | | P | K | S | S | | S | L | S | K | M | E | | | S | | W | V | L | | V | S | A | | N | S | D | | | | S | A | | P | S | R | G | M | |
| S3-5 NORMAL PIPELINE | | P | K | S | S | | S | L | S | K | M | E | | | S | | W | V | L | | V | S | A | | N | S | D | | | | S | A | | P | S | R | G | M | |
| S3-7 FAST PIPELINE | | P | K | S | S | | S | L | S | K | M | E | | | S | | W | V | L | | V | S | A | | N | S | D | | | | S | A | | P | S | R | G | M | |
| S3-7 NORMAL PIPELINE | | P | K | S | S | | S | L | S | K | M | E | | | S | | W | V | L | | V | S | A | | N | S | D | | | | S | A | | P | S | R | G | M | |
| S3-8 FAST PIPELINE | M | | K | | S | I | S | L | S | K | | E | P | | S | T | W | V | L | L | V | | A | | N | | D | | K | S | S | | D | P | S | R | G | | S |
| S3-8 NORMAL PIPELINE | M | | K | | S | I | S | L | S | K | | E | P | | S | T | W | V | L | L | V | | A | | N | | D | | K | S | S | | D | P | S | R | G | | S |
| S3-9 FAST PIPELINE | M | P | | S | | I | S | L | | | M | | P | | S | T | | V | L | L | V | S | A | | N | S | D | Q | K | S | | A | D | | S | | | M | S |
| S3-9 NORMAL PIPELINE | M | P | | S | | I | S | L | | | M | | P | | S | T | | V | L | L | V | S | A | | N | S | D | Q | K | S | | A | D | | S | | | M | S |
| S3-11 FAST PIPELINE | M | P | K | S | S | I | | | S | K | M | E | | | S | | W | V | L | L | V | S | | | | S | | | K | S | S | A | D | P | S | R | G | M | |
| S3-11 NORMAL PIPELINE | M | P | K | S | S | I | | | S | K | M | E | | | S | | W | V | L | L | V | S | | | | S | | | K | S | S | A | D | P | S | R | G | M | |
| S3-14 FAST PIPELINE | M | | K | S | S | I | S | L | S | K | M | E | P | P | S | T | W | V | L | | V | S | A | | N | S | D | | K | S | S | | D | P | | R | G | M | S |
| S3-14 NORMAL PIPELINE | M | | K | S | S | I | S | L | S | K | M | E | P | P | S | T | W | V | L | | V | S | A | | N | S | D | | K | S | S | | D | P | | R | G | M | S |
Total time required to complete each pipeline for detecting nucleotide substitutions and amino acid mutations.
| NGS Sample Code | Fast Pipeline Running Time (s) | Normal Pipeline Running Time (s) |
|---|---|---|
| B6 | 1778.0 | 5991.3 |
| C5 | 574.3 | 3980.5 |
| F2 | 324.5 | 3924.1 |
| F4 | 286.4 | 3539.5 |
| S3 | 3060.1 | 6521.0 |
| S9 | 1036.8 | 4755.5 |
| S10 | 537.5 | 3747.3 |
| S15 | 848.9 | 4356.8 |
| S3-1 | 552.2 | 4864.7 |
| S3-4 | 256.3 | 4461.3 |
| S3-5 | 1427.8 | 6377.2 |
| S3-7 | 330.6 | 4416.8 |
| S3-8 | 489.6 | 4824.5 |
| S3-9 | 486.5 | 4752.9 |
| S3-11 | 879.6 | 5394.5 |
| S3-14 | 57.6 | 4190.2 |
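
One way to read these timings is as a per-sample speedup of the Fast Pipeline over the Normal Pipeline. The sketch below is illustrative only: the speedup ratio is not a metric reported in the source, the names `running_time_s` and `speedup` are hypothetical, and only three rows from the table above are included.

```python
# (Fast Pipeline seconds, Normal Pipeline seconds) for a few samples from the table above.
running_time_s = {
    "B6": (1778.0, 5991.3),
    "S3": (3060.1, 6521.0),
    "S3-14": (57.6, 4190.2),
}


def speedup(fast_s: float, normal_s: float) -> float:
    """How many times faster the Fast Pipeline finished than the Normal Pipeline."""
    return normal_s / fast_s


for sample, (fast_s, normal_s) in running_time_s.items():
    print(f"{sample}: {speedup(fast_s, normal_s):.1f}x faster")
```

The ratio is largest for the smallest sample (S3-14), where the Normal Pipeline's running time stays above 4000 s while the Fast Pipeline finishes in under a minute.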
Figure 4. Illustration depicting the possible translation result of ambiguous amino acid X detected in Sample C5 running both Normal Pipeline and Fast Pipeline. An X amino acid was detected at position 54 region NS3-ORF3A.
Figure 5. Illustration depicting the possible translation result of ambiguous amino acid X detected in Sample F2 running in Normal Pipeline. An X amino acid was detected at position 769 region NSP12-ORF1AB.