Kendall W Cradic, Stephen J Murphy, Travis M Drucker, Robert A Sikkink, Norman L Eberhardt, Claudia Neuhauser, George Vasmatzis, Stefan K G Grebe.
Abstract
BACKGROUND: Recessive genes cause disease when both copies are affected by mutant loci. Resolving the cis/trans relationship of variations has been an important problem both for researchers and, increasingly, for clinicians. Of particular concern are patients who have two heterozygous disease-causing mutations and could be diagnosed as affected (one mutation on each allele) or as phenotypically normal (both mutations on the same allele). Several methods are currently used to phase genes; however, owing to cost, complexity and/or low sensitivity, they are not suitable for clinical purposes.
Year: 2014 PMID: 24502676 PMCID: PMC3930533 DOI: 10.1186/1471-2350-15-19
Source DB: PubMed Journal: BMC Med Genet ISSN: 1471-2350 Impact factor: 2.103
Figure 1. Mate pair library preparation. The mate pair (MP) protocol allows sequence information to be linked across greater distances than paired-end (PE) reads. Fragments of sheared DNA from a pool (500–5000 bp, with an average of 2000 bp) are end-repaired using biotinylated nucleotides. Fragments are then self-ligated, and all remaining linear DNA is removed by exonuclease treatment. Circularized DNA is fragmented again (black bars) to an average size of 500 bp, and segments containing biotinylated junction points are isolated on streptavidin beads. In addition to fragments containing junction points, a portion of non-biotinylated DNA is co-purified and appears in the MP library as a subpopulation of PE reads. All fragments are end-repaired and indexed using TruSeq adapters, followed by sequencing.
Figure 2. Clean 10 kb amplicons. A single, clean band at 10 kb shows the specificity of our long-range amplification.
Figure 3. Average linked coverage by PE and MP reads from four libraries. Linked read coverage is shown from PE and MP reads as a function of distance between linked positions for each of the enriched libraries (10, 100, 500 and 1,000 ng spiked lrPCR amplicon in WGA DNA).
Figure 4. Confidence in phasing calls is dependent on coverage. (a) Association matrices from each spiked library show the relationship between the two disease-causing heterozygous positions in this specimen. Highlighting of the bases at each locus indicates wild-type (green) and mutant (red). (b) The probabilities and 99% confidence intervals for all possible IVS2-13 base calls associated with each c.60G/A allele are shown for all four amplicon-spiked libraries. Coverage increases linearly as spike concentration increases. (c) Average confidence interval (CI) width was calculated to evaluate the level of coverage required for confident heterozygote association. Both a simulation run at varying coverage and the observed data points indicate that coverage beyond 500× provides diminishing returns in CI width.
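The relationship between linked coverage and CI width described for Figure 4 can be illustrated with a minimal sketch. The record does not state how the authors computed their intervals, so the Wilson score interval used here is an assumption, as are the function names, the toy read counts and the 1% error rate. The sketch evaluates the interval analytically rather than by simulation.

```python
import math
from collections import Counter

Z_99 = 2.5758293035489004  # two-sided 99% quantile of the standard normal

def association_matrix(linked_reads):
    """Count (base at site A, base at site B) pairs observed on the same read pair."""
    return Counter(linked_reads)

def wilson_interval(successes, n, z=Z_99):
    """Wilson score interval for a binomial proportion (assumed method, not the paper's)."""
    if n == 0:
        return (0.0, 1.0)
    p = successes / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

def conditional_calls(matrix, base_a):
    """For read pairs carrying base_a at site A, estimate P(base at site B) with a 99% CI."""
    total = sum(count for (a, _), count in matrix.items() if a == base_a)
    calls = {}
    for b in "ACGT":
        k = matrix.get((base_a, b), 0)
        low, high = wilson_interval(k, total)
        calls[b] = (k / total if total else 0.0, low, high)
    return calls

def ci_width(coverage, error_rate=0.01):
    """CI width for the dominant base call at a given linked coverage (hypothetical error rate)."""
    k = round(coverage * (1 - error_rate))
    low, high = wilson_interval(k, coverage)
    return high - low

if __name__ == "__main__":
    # Toy data: site A is c.60 (G or A); site B bases are illustrative only.
    reads = [("G", "C")] * 48 + [("G", "T")] * 2 + [("A", "T")] * 50
    for base, (p, low, high) in conditional_calls(association_matrix(reads), "G").items():
        print(f"G -> {base}: p={p:.3f}  99% CI=({low:.3f}, {high:.3f})")
    for n in (10, 100, 500, 1000, 5000):
        print(f"coverage {n:>5}x  CI width {ci_width(n):.4f}")
```

Under these assumptions the interval narrows roughly with the inverse square root of coverage, which is why each further doubling of coverage buys progressively less.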
Figure 5. One phase of the entire 10 kb amplicon. By beginning with the first heterozygote in the amplicon and sequentially moving through all downstream heterozygous positions, the phase of the entire 10 kb amplicon can be determined. Confidence intervals in the columns show the relationship of each base to the highest probability base call from the previous column. Lines showing the cumulative probability and confidence interval relate each downstream position to the very first in the chain. Cumulative probability diminishes in proportion to the quality of each association matrix in the chain (i.e. sufficient coverage and few errors). However, one can be sure of the accuracy of phasing so long as there is no overlap between the cumulative confidence interval and that of any rejected base.
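The chained procedure described for Figure 5 can be sketched as follows. The record does not give the formula for the cumulative probability or its interval, so this sketch assumes that both are obtained by multiplying the per-link values along the chain. The function name, the data layout and the numbers in the usage example are hypothetical.

```python
from typing import Dict, List, Tuple

# (probability, ci_low, ci_high) for one candidate base at one heterozygous position,
# conditional on the base called at the previous position in the chain.
Estimate = Tuple[float, float, float]

def phase_chain(links: List[Dict[str, Estimate]]) -> List[dict]:
    """Walk the heterozygous positions in order, keeping the most probable base at each.

    Assumption: cumulative probability and interval bounds are simple products
    of the per-link values; the source does not specify the exact calculation.
    """
    cum_p, cum_low, cum_high = 1.0, 1.0, 1.0
    results = []
    for calls in links:
        best = max(calls, key=lambda base: calls[base][0])
        p, low, high = calls[best]
        cum_p *= p
        cum_low *= low
        cum_high *= high
        # Highest upper bound among the bases that were not chosen at this position.
        rejected_high = max((calls[b][2] for b in calls if b != best), default=0.0)
        results.append({
            "base": best,
            "cumulative": (cum_p, cum_low, cum_high),
            # Confident only if the cumulative interval clears every rejected base.
            "confident": cum_low > rejected_high,
        })
    return results

if __name__ == "__main__":
    example_links = [
        {"C": (0.96, 0.82, 0.99), "T": (0.04, 0.01, 0.18)},
        {"A": (0.98, 0.90, 0.995), "G": (0.02, 0.005, 0.10)},
    ]
    for step in phase_chain(example_links):
        print(step)
```

Because every link multiplies in a value below 1, long chains lose confidence even when each individual association is strong, which matches the caption's remark that cumulative probability depends on the quality of every matrix in the chain.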