Erik Borgström, David Redin, Sverker Lundin, Emelie Berglund, Anders F Andersson, Afshin Ahmadian.
Abstract
High-throughput sequencing platforms mainly produce short-read data, resulting in a loss of phasing information for many of the genetic variants analysed. For certain applications, it is vital to know which variant alleles are connected to each individual DNA molecule. Here we demonstrate a method for massively parallel barcoding and phasing of single DNA molecules. First, a primer library with millions of uniquely barcoded beads is generated. When compartmentalized with single DNA molecules, the beads can be used to amplify and tag any target sequences of interest, enabling coupling of the biological information from multiple loci. We apply the assay to bacterial 16S sequencing and up to 94% of the hypothesized phasing events are shown to originate from single molecules. The method enables use of widely available short-read-sequencing platforms to study long single molecules within a complex sample, without losing phase information.
Year: 2015 PMID: 26055759 PMCID: PMC4468844 DOI: 10.1038/ncomms8173
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Figure 1. Method overview.
(I) An emulsion is generated with reaction compartments and smaller stabilizing droplets. (II) An active compartment consists of amplification reagents and one molecule from a population of 4^15 degenerate barcode oligonucleotides. By utilizing a subset of these molecules, we ensure unique barcoding of each bead. (III and IV) A library with millions of uniquely barcoded primer sets is generated and enriched. (V) The bead-bound primer library is mixed with a diluted sample of genomic fragments and then introduced into a second emulsion. (VI and VII) Each bead is paired with one genomic fragment and the target-specific amplicons are coupled with the barcode through amplification. (VIII) Barcoded amplicons are enriched and sequenced.
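The claim in step II, that drawing only a subset of the barcode pool keeps beads uniquely barcoded, can be checked with a rough calculation. The sketch below is an illustration rather than the authors' analysis. It assumes the pool size in the caption is 4^15 (a 15-base fully degenerate barcode) and that each bead receives one barcode drawn uniformly at random. The function name and the bead counts are hypothetical.

```python
import math

BARCODE_LENGTH = 15              # assumed: 15 degenerate positions (4^15 pool)
POOL_SIZE = 4 ** BARCODE_LENGTH  # 1,073,741,824 possible barcodes


def shared_barcode_fraction(n_beads: int, pool_size: int = POOL_SIZE) -> float:
    """Expected fraction of beads whose barcode is also carried by another bead.

    Assumes every bead draws one barcode uniformly at random from the pool.
    """
    if n_beads < 2:
        return 0.0
    # P(none of the other n-1 beads drew the same barcode) = (1 - 1/N)^(n-1).
    # log1p/expm1 keep the result accurate when 1/N is tiny.
    return -math.expm1((n_beads - 1) * math.log1p(-1.0 / pool_size))


# Hypothetical bead counts, chosen only to show the trend.
for n in (1_000_000, 10_000_000, 100_000_000):
    print(f"{n:>11,} beads: {shared_barcode_fraction(n):.4%} share a barcode")
```

Under these assumptions the shared fraction grows roughly as n/N, so the fewer pool molecules are used relative to 4^15, the closer the library comes to one barcode per bead.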
Figure 2. Results illustrated.
Monoclonality and phasing results for the sorting experiment, the model system, the biological sample and the biological sample after removal of the most abundant species (biological sample*). (a) The rate of enriched beads and monoclonal amplification observed for each sample (see Supplementary Fig. 3 for corresponding theoretical values). (b) Proportion of amplicon-carrying beads displaying 16S.1, 16S.2 or both targeted regions. (c) The observed rate of matching BLAST-based classifications of bacterial origins before (green) and after (red) random shuffling of the data set, for all data sets. (d) The average number of alignment hits for the two target regions and the overlap between them.
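The shuffling control in panel (c) can be illustrated with a small sketch. It is not the authors' pipeline. It assumes each barcode has already been reduced to one classification label per target region, and it breaks the pairing by permuting the 16S.2 labels across barcodes, which is one plausible reading of "random shuffling". The data and function names are hypothetical.

```python
import random


def match_rate(pairs):
    """Fraction of barcodes whose two region classifications agree."""
    if not pairs:
        return 0.0
    return sum(a == b for a, b in pairs) / len(pairs)


def shuffled_match_rate(pairs, rng):
    """Match rate after permuting the second region's labels across barcodes."""
    first = [a for a, _ in pairs]
    second = [b for _, b in pairs]
    rng.shuffle(second)  # destroys the barcode-level pairing
    return match_rate(list(zip(first, second)))


# Hypothetical toy data: one (16S.1 label, 16S.2 label) pair per barcode.
toy_pairs = (
    [("Escherichia", "Escherichia")] * 40
    + [("Bacillus", "Bacillus")] * 30
    + [("Pseudomonas", "Pseudomonas")] * 25
    + [("Escherichia", "Bacillus")] * 5  # discordant barcodes
)

rng = random.Random(0)  # fixed seed so the sketch is repeatable
observed = match_rate(toy_pairs)
n_shuffles = 200
shuffled = sum(shuffled_match_rate(toy_pairs, rng) for _ in range(n_shuffles)) / n_shuffles
print(f"observed: {observed:.2f}, shuffled mean: {shuffled:.2f}")
```

A large gap between the observed and shuffled rates indicates that the two regions linked by a barcode agree far more often than chance pairing would produce, which is the pattern expected when both amplicons come from the same molecule.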