Monika Zavodna, Andrew Bagshaw, Rudiger Brauning, Neil J Gemmell.
Abstract
To date, we have little knowledge of how accurate next-generation sequencing (NGS) technologies are in sequencing repetitive sequences, beyond the known limitations in accurately sequencing homopolymers. Only a handful of previous reports have evaluated the potential of NGS for sequencing short tandem repeats (microsatellites), and no empirical study has compared and evaluated the performance of more than one NGS platform with the same dataset. Here we examined yeast microsatellite variants from both long-read (454-sequencing) and short-read (Illumina) NGS platforms and compared these to data derived through Sanger sequencing. In addition, we investigated any locus-specific biases and differences that might have resulted from variability in microsatellite repeat number, repeat motif or type of mutation. Out of 112 insertion/deletion variants identified among 45 microsatellite amplicons in our study, we found 87.5% agreement between the 454-platform and Sanger sequencing in frequency of variant detection after Benjamini-Hochberg correction for multiple tests. For a subset of 21 microsatellite amplicons derived from Illumina sequencing, the results of the short-read platform were highly consistent with the other two platforms, with 100% agreement with 454-sequencing and 93.6% agreement with the Sanger method after Benjamini-Hochberg correction. We found that the microsatellite attributes copy number, repeat motif and type of mutation did not have a significant effect on differences seen between the sequencing platforms. We show that both long-read and short-read NGS platforms can be used to sequence short tandem repeats accurately, which makes it feasible to consider the use of these platforms in high-throughput genotyping. It appears that the major requirement for achieving both high accuracy and rare variant detection in microsatellite genotyping is sufficient read depth coverage. This might be a challenge because each platform generates a consistent pattern of non-uniform sequence coverage, which, as our study suggests, may affect some types of tandem repeats more than others.
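The agreement figures above depend on a Benjamini-Hochberg correction applied across the per-variant tests. The abstract does not say which statistical test produced the per-variant p-values, so the following is only a minimal sketch of the correction step itself. The function names, the 0.05 threshold, the example p-values, and the definition of "agreement" as a non-significant difference after correction are all assumptions made for illustration, not details taken from the study.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def percent_agreement(p_values, alpha=0.05):
    """Percentage of variants with no significant platform difference.

    Assumption: a variant counts as 'in agreement' when its adjusted
    p-value is at or above alpha (hypothetical definition).
    """
    adjusted = benjamini_hochberg(p_values)
    agree = sum(1 for q in adjusted if q >= alpha)
    return 100.0 * agree / len(adjusted)


if __name__ == "__main__":
    # Hypothetical per-variant p-values comparing two platforms.
    example = [0.5, 0.001, 0.9, 0.2]
    print(benjamini_hochberg(example))
    print(percent_agreement(example))
```

Under this reading, 87.5% agreement over 112 variants corresponds to 98 variants showing no significant difference between 454-sequencing and Sanger sequencing.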
Year: 2014 PMID: 25436869 PMCID: PMC4250034 DOI: 10.1371/journal.pone.0113862
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Summary of detected microsatellite InDel mutations from three sequencing platforms.
Plot of the microsatellite InDel variants per amplicon (m) detected by Illumina- (white bars), 454-sequencing (grey bars) and Sanger sequencing (black bars). Insertions are denoted by bars above the X-axis and deletions are denoted by bars below the X-axis. The length of each bar segment is proportional to the length of the InDel in repeat units inserted or deleted. The amplicons' characteristics are summarized in Table S1.