Guoqiang Zhang, Jianfeng Wang, Jin Yang, Wenjie Li, Yutian Deng, Jing Li, Jun Huang, Songnian Hu, Bing Zhang.
Abstract
BACKGROUND: To promote the clinical application of next-generation sequencing, it is important to obtain accurate and consistent variant calls for target genomic regions at low cost. Ion Proton, the latest semiconductor-based sequencing instrument from Life Technologies, is designed to provide investigators with an inexpensive platform for human whole-exome sequencing with a rapid turnaround time. However, few studies have comprehensively compared the accuracy of variant calling between Ion Proton and Illumina sequencing platforms such as HiSeq 2000, which is the most popular sequencing platform for the human genome. TargetSeq-Proton denotes the Ion Proton sequencer combined with the Ion TargetSeq Exome Enrichment Kit, whereas SureSelect-HiSeq denotes the Agilent SureSelect Human All Exon v4 Kit combined with the HiSeq 2000 sequencer.
Year: 2015 PMID: 26242175 PMCID: PMC4524363 DOI: 10.1186/s12864-015-1796-6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Statistics of reads and alignment to the reference genome for exome sequencing of four samples on the TargetSeq-Proton and SureSelect-HiSeq platforms
| Sample | Sequencing platform | Total reads (M) | Total bases (Gb) | Total mapped reads (M) | Average read length (bp) | Average coverage depth | Coverage at 1× (%) | Coverage at 5× (%) | Coverage at 10× (%) | Coverage at 20× (%) |
|---|---|---|---|---|---|---|---|---|---|---|
| S1 | TargetSeq-Proton | 48.2 | 4.5 | 42.9 | 94.1 | 39.8 | 97 | 94 | 90 | 79 |
| S1 | SureSelect-HiSeq | 78.3 | 7.8 | 77.4 | 2×100 | 40.0 | 100 | 97 | 90 | 71 |
| S2 | TargetSeq-Proton | 62.4 | 6.1 | 55.1 | 97.4 | 51.9 | 98 | 95 | 93 | 86 |
| S2 | SureSelect-HiSeq | 79.9 | 7.9 | 79.1 | 2×100 | 45.3 | 100 | 98 | 91 | 74 |
| S3 | TargetSeq-Proton | 61.3 | 5.5 | 55.3 | 90.9 | 50.3 | 97 | 94 | 91 | 84 |
| S3 | SureSelect-HiSeq | 56.8 | 5.6 | 56.2 | 2×100 | 33.4 | 100 | 96 | 86 | 63 |
| S4 | TargetSeq-Proton | 53.9 | 5.2 | 49.4 | 96.9 | 49.8 | 97 | 94 | 91 | 84 |
| S4 | SureSelect-HiSeq | 59.5 | 5.9 | 58.9 | 2×100 | 36.0 | 100 | 96 | 87 | 66 |
Variant loci detected by TargetSeq-Proton and SureSelect-HiSeq sequencing
| Sample | Total loci^a | Co-detected loci (%)^b | Concordant loci^c | Discordant^d Hom/Hom | Discordant^d Hom/Het | Discordant^d Het/Hom | Discordant^d Het/Het |
|---|---|---|---|---|---|---|---|
| S1 | 25466 | 17314 (68.0) | 17202 | 1 | 15 | 93 | 3 |
| S2 | 25413 | 19148 (75.3) | 19039 | 2 | 20 | 84 | 3 |
| S3 | 25429 | 18222 (71.7) | 18087 | 1 | 26 | 104 | 4 |
| S4 | 25080 | 17937 (71.5) | 17808 | 1 | 16 | 111 | 1 |
^a Total loci: all variant loci in the overlapping regions detected by HiSeq 2000 or Ion Proton sequencing, comprising the concordant, discordant, TargetSeq-Proton-specific and SureSelect-HiSeq-specific loci.
^b Co-detected loci: the variant loci detected by both TargetSeq-Proton and SureSelect-HiSeq sequencing, comprising concordant and discordant loci. The number in parentheses is the percentage of total loci.
^c Concordant loci: the variant loci for which TargetSeq-Proton and SureSelect-HiSeq sequencing report the same genotype.
^d Discordant loci: the loci for which TargetSeq-Proton and SureSelect-HiSeq report different variant genotypes, written as TargetSeq-Proton/SureSelect-HiSeq. Hom/Het refers to loci called homozygous by TargetSeq-Proton but heterozygous by SureSelect-HiSeq. Hom/Hom, Het/Hom and Het/Het are defined analogously.
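The footnote definitions amount to simple accounting: co-detected loci are the concordant plus the discordant loci, and total loci add the two platform-specific sets. A minimal sketch of that bookkeeping follows, using the S1 counts from this table and the S1 platform-specific totals from the pairwise comparison table in this record. The class and field names are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class LociCounts:
    """Per-sample locus counts (hypothetical container, not from the paper)."""
    concordant: int
    discordant: Tuple[int, int, int, int]  # Hom/Hom, Hom/Het, Het/Hom, Het/Het
    proton_specific: int
    hiseq_specific: int

    @property
    def co_detected(self) -> int:
        # Loci called on both platforms, whether or not the genotypes agree.
        return self.concordant + sum(self.discordant)

    @property
    def total(self) -> int:
        # Union of loci called on either platform.
        return self.co_detected + self.proton_specific + self.hiseq_specific

    @property
    def co_detected_pct(self) -> float:
        return round(100 * self.co_detected / self.total, 1)


# S1 counts as reported in the tables of this record.
s1 = LociCounts(concordant=17202, discordant=(1, 15, 93, 3),
                proton_specific=1470, hiseq_specific=6682)
print(s1.co_detected, s1.total, s1.co_detected_pct)
```

For S1 this reproduces the reported 17314 co-detected loci, 25466 total loci and 68.0 %.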
Pairwise comparison of variants called for four samples by TargetSeq-Proton and SureSelect-HiSeq
| Sample | TargetSeq-Proton-specific: Total (dbSNP/novel)^a | TargetSeq-Proton-specific: SNPs | TargetSeq-Proton-specific: InDels | Concordant: Total (dbSNP/novel) | Concordant: SNPs | Concordant: InDels | SureSelect-HiSeq-specific: Total (dbSNP/novel) | SureSelect-HiSeq-specific: SNPs | SureSelect-HiSeq-specific: InDels |
|---|---|---|---|---|---|---|---|---|---|
| S1 | 1470 (1021/449) | 1274 (998/276) | 196 (23/173) | 17202 (16288/914) | 16833 (15943/890) | 369 (345/24) | 6682 (6348/334) | 5851 (5606/245) | 831 (742/89) |
| S2 | 1432 (1018/414) | 1229 (996/233) | 203 (22/181) | 19039 (18038/1001) | 18655 (17683/972) | 384 (355/29) | 4833 (4533/300) | 4044 (3839/205) | 789 (694/95) |
| S3 | 1518 (1073/445) | 1305 (1046/259) | 213 (27/186) | 18087 (17149/938) | 17720 (16810/910) | 367 (339/28) | 5689 (5390/299) | 4897 (4671/226) | 792 (719/73) |
| S4 | 1462 (1069/393) | 1326 (1051/275) | 136 (18/118) | 17808 (16891/917) | 17409 (16521/888) | 399 (370/29) | 5681 (5353/328) | 4922 (4674/248) | 759 (679/80) |
^a The two numbers in parentheses are the counts of variant loci that are known (present in dbSNP) and novel (absent from dbSNP), in that order.
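The known/novel split in parentheses can be turned into a share of novel calls per category, which makes the categories easier to compare. The sketch below does this for the S1 InDel counts in the table; the helper name `novel_fraction` is an illustrative assumption, and the percentages it prints are derived here rather than reported in the source.

```python
def novel_fraction(known: int, novel: int) -> float:
    """Share of calls absent from dbSNP among all calls in a category."""
    return novel / (known + novel)


# S1 InDel counts as (known in dbSNP, novel), copied from the table above.
s1_indels = {
    "TargetSeq-Proton-specific": (23, 173),
    "Concordant": (345, 24),
    "SureSelect-HiSeq-specific": (742, 89),
}

for category, (known, novel) in s1_indels.items():
    share = 100 * novel_fraction(known, novel)
    print(f"{category}: {share:.1f} % novel")
```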
Fig. 1. Size distribution and classification of small InDels called by exome sequencing on SureSelect-HiSeq and TargetSeq-Proton in sample S3. Four classes of small InDel were defined: concordant novel, concordant known, specific novel and specific known. Novel refers to InDels not reported in dbSNP build 137. Known refers to InDels previously reported in dbSNP build 137. Panel a shows the size and classification of small InDels called by TargetSeq-Proton. Panel b shows the size and classification of small InDels called by SureSelect-HiSeq.
Sanger sequencing validation of variant subsets called from TargetSeq-Proton and SureSelect-HiSeq data
| Result | SureSelect-HiSeq-specific: 1-bp InDels | SureSelect-HiSeq-specific: SNPs | TargetSeq-Proton-specific: 1-bp InDels | TargetSeq-Proton-specific: SNPs | Concordant: 1-bp InDels | Concordant: SNPs |
|---|---|---|---|---|---|---|
| Validated true | 89.6 % (60) | 88.3 % (53) | 15.8 % (6) | 60.0 % (21) | 100.0 % (47) | 91.5 % (65) |
| Validated false | 10.4 % (7) | 11.7 % (7) | 84.2 % (32) | 40.0 % (14) | 0.0 % (0) | 8.5 % (6) |
A total of 240 SNPs and 240 1-bp InDels from the four samples were randomly selected for Sanger sequencing validation. For each variant type, 80 loci came from the TargetSeq-Proton-specific set, 80 from the SureSelect-HiSeq-specific set, and 80 from the set concordant between the two platforms. Numbers in parentheses are locus counts.
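Each percentage in the validation table is the number of Sanger-confirmed loci divided by the number of loci with a Sanger result in that column. A minimal sketch of that calculation, using the counts in parentheses from the table, is given below; the function name and dictionary layout are illustrative, not taken from the paper.

```python
def validation_rate(validated_true: int, validated_false: int) -> float:
    """Percentage of Sanger-assessed loci confirmed as true variants."""
    assessed = validated_true + validated_false
    return round(100 * validated_true / assessed, 1)


# (validated true, validated false) counts copied from the table above.
sanger_counts = {
    ("SureSelect-HiSeq-specific", "1-bp InDels"): (60, 7),
    ("SureSelect-HiSeq-specific", "SNPs"): (53, 7),
    ("TargetSeq-Proton-specific", "1-bp InDels"): (6, 32),
    ("TargetSeq-Proton-specific", "SNPs"): (21, 14),
    ("Concordant", "1-bp InDels"): (47, 0),
    ("Concordant", "SNPs"): (65, 6),
}

rates = {key: validation_rate(*counts) for key, counts in sanger_counts.items()}
for (subset, variant_type), rate in rates.items():
    print(f"{subset}, {variant_type}: {rate} % validated true")
```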
Comparison of the validation rates of variants called by different pipelines for SureSelect-HiSeq data
| Result | bwa_pe^a variants: InDels | bwa_pe^a variants: SNPs | bwa_se^b variants: InDels | bwa_se^b variants: SNPs | stampy_se^c variants: InDels | stampy_se^c variants: SNPs |
|---|---|---|---|---|---|---|
| Validated true | 93.1 % (108) | 90.0 % (117) | 93.1 % (108) | 86.6 % (110) | 92.5 % (99) | 88.6 % (101) |
| Validated false | 6.9 % (8) | 10.0 % (13) | 6.9 % (8) | 13.4 % (17) | 7.5 % (8) | 11.4 % (13) |
Note:
^a bwa_pe: bwa mapping in paired-end reads mode
^b bwa_se: bwa mapping in single-end reads mode
^c stampy_se: stampy-1.0.22 mapping in single-end reads mode