Richard M Leggett, Dan MacLean.
Abstract
Reference-free SNP detection, that is, identifying SNPs between samples by comparing primary sequencing data directly with other primary sequencing data rather than with a pre-assembled reference genome, is an emergent and potentially disruptive technology. It is beginning to open up new vistas in variant identification and to reveal new applications in non-model organisms and metagenomics. The modern, efficient data structures these tools use enable researchers with a reference sequence to sample many more individuals with lower computing storage and processing overhead. In this article we discuss the technologies and tools implementing reference-free SNP detection and their potential impact on studies of genetic variation in model and non-model organisms, metagenomics, and personal genomics and medicine.
Year: 2014 PMID: 25056481 PMCID: PMC4083407 DOI: 10.1186/1471-2164-15-S4-S10
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Figure 1. De Bruijn graphs constructed from overlapping k-mers. De Bruijn graphs are networks of short overlapping sub-sequences of reads, each of length k. Typically, the k-mers are the nodes of the graph, and links are drawn between k-mers that overlap by k - 1 nucleotides, that is, they overhang each other by just one nucleotide.
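The construction described in the Figure 1 caption can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used by any of the tools discussed: the function name `build_de_bruijn` and the example reads are hypothetical, and the sketch ignores reverse complements, sequencing errors and k-mer coverage, all of which real tools must handle.

```python
def build_de_bruijn(reads, k):
    """Return a dict mapping each k-mer (node) to the set of k-mers it links to.

    Hypothetical helper for illustration only. Two k-mers are linked when the
    last k - 1 bases of the first equal the first k - 1 bases of the second.
    """
    # Collect every k-mer seen in any read; these are the graph nodes.
    nodes = set()
    for read in reads:
        for i in range(len(read) - k + 1):
            nodes.add(read[i:i + k])

    # Draw a link from each k-mer to every observed k-mer that extends it
    # by one nucleotide (overlap of length k - 1).
    graph = {}
    for kmer in nodes:
        suffix = kmer[1:]
        graph[kmer] = {suffix + base for base in "ACGT" if suffix + base in nodes}
    return graph


# Illustrative reads (made up for this example).
example_graph = build_de_bruijn(["ACGTA", "GTAC"], k=3)
for node in sorted(example_graph):
    print(node, "->", sorted(example_graph[node]))
```

Because nodes are shared between reads, k-mers from different reads are joined automatically wherever they overlap, which is what allows the graph to represent a sample without any reference sequence.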
Figure 2. Bubble structures formed in De Bruijn graphs by SNPs. A bubble forms when two sequences diverge by a single nucleotide: the difference first appears at the end of a k-mer and then moves back one position at each successive node, until the two paths close again at the end. Colouring the edges in the graph according to sample provenance helps identify inter-sample SNPs.
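The bubble structure in the Figure 2 caption can be illustrated with a small coloured-graph sketch. It is an assumption-laden toy rather than the algorithm of Cortex, Bubbleparse or any other listed tool: the function names, sample names and reads are hypothetical, colours are stored on k-mers rather than edges for brevity, and reverse complements, sequencing errors and coverage thresholds are ignored. A SNP is reported when a node has exactly two successors whose non-branching paths each run for k nodes and then rejoin at the same node.

```python
from collections import defaultdict


def coloured_kmers(samples, k):
    """Map each k-mer to the set of sample names (colours) that contain it."""
    colours = defaultdict(set)
    for name, reads in samples.items():
        for read in reads:
            for i in range(len(read) - k + 1):
                colours[read[i:i + k]].add(name)
    return colours


def successors(kmer, colours):
    """Observed k-mers that overlap the given k-mer by k - 1 bases."""
    return [kmer[1:] + b for b in "ACGT" if kmer[1:] + b in colours]


def walk(node, colours, max_nodes):
    """Follow a non-branching path from node, visiting at most max_nodes nodes."""
    path = [node]
    while len(path) < max_nodes:
        nxt = successors(path[-1], colours)
        if len(nxt) != 1:
            break
        path.append(nxt[0])
    return path


def find_snp_bubbles(samples, k):
    """Return simple SNP bubbles: two k-node paths that diverge and rejoin."""
    colours = coloured_kmers(samples, k)
    bubbles = []
    for node in sorted(colours):
        branches = successors(node, colours)
        if len(branches) != 2:
            continue
        p1, p2 = (walk(b, colours, k) for b in branches)
        if len(p1) != k or len(p2) != k:
            continue
        end1 = successors(p1[-1], colours)
        end2 = successors(p2[-1], colours)
        if len(end1) == 1 and end1 == end2:
            bubbles.append({
                "branch": node,
                "join": end1[0],
                # The variant base is the last base of the first node on each path.
                "alleles": (p1[0][-1], p2[0][-1]),
                # Samples that support every node on each path.
                "colours": (
                    set.intersection(*(colours[n] for n in p1)),
                    set.intersection(*(colours[n] for n in p2)),
                ),
            })
    return bubbles


# Two made-up samples differing by one nucleotide (A versus G).
samples = {"sampleA": ["GATCATGCC"], "sampleB": ["GATCGTGCC"]}
for bubble in find_snp_bubbles(samples, k=4):
    print(bubble)
```

Each path of the bubble is supported by a different sample here, which is the signature of an inter-sample SNP; a bubble whose two paths carry the same colour would instead suggest heterozygosity or error within one sample.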
de novo reference-free analysis software and availability.
| Cortex | |
| Bubbleparse | |
| Bubbleparse accessories | |
| 2 | |
| NIKS | |
| discoSnp | |
| Stacks | |
| MaryGold | |