Ron Ammar1, Tara A Paton2, Dax Torti1, Adam Shlien3, Gary D Bader1,4,5.
Abstract
Haplotypes are often critical for the interpretation of genetic laboratory observations into medically actionable findings. Current massively parallel DNA sequencing technologies produce short sequence reads that are often unable to resolve haplotype information. Phasing short read data typically requires supplemental statistical phasing based on known haplotype structure in the population or parental genotypic data. Here we demonstrate that the MinION nanopore sequencer is capable of producing very long reads to resolve both variants and haplotypes of HLA-A, HLA-B and CYP2D6 genes important in determining patient drug response in sample NA12878 of CEPH/UTAH pedigree 1463, without the need for statistical phasing. Long read data from a single 24-hour nanopore sequencing run was used to reconstruct haplotypes, which were confirmed by HapMap data and statistically phased Complete Genomics and Sequenom genotypes. Our results demonstrate that nanopore sequencing is an emerging standalone technology with potential utility in a clinical environment to aid in medical decision-making.
Keywords: DNA; haplotype; nanopore; pharmacogenomics; sequencing
Year: 2015 PMID: 25901276 PMCID: PMC4392832 DOI: 10.12688/f1000research.6037.2
Source DB: PubMed Journal: F1000Res ISSN: 2046-1402
Primer sequences and amplicon lengths for HLA-A, HLA-B and CYP2D6.
Oxford Nanopore PGx primer list
| Gene | Primer | Primer seq | Chromosomal Coordinates | Amplicon Size |
|---|---|---|---|---|
| CYP2D6 | CYP2D6-2F | TAGCTCCCTGACGCCATGATTTGTCTT | chr22:42,522,077-42,527,144 | 5,067 bp |
| CYP2D6 | CYP2D6-2R | CCTGGTTATCCCAGAAGGCTTTGCAG | | |
| HLA-A | HLAA-2F | AGAAGAGTCCAGGTGGACAGGTAAGGAGTG | chr6:29,909,854-29,913,805 | 3,951 bp |
| HLA-A | HLAA-2R | TTCTACTGAAGGGCCAAGGACAATGGAG | | |
| HLA-B | HLAB-2F | TGGATTCAGCACCAAGATCACTAGAACCAG | chr6:31,321,279-31,325,303 | 4,024 bp |
| HLA-B | HLAB-2R | GTCTCTCCCTGGTTTCCACAGACAGATCCT | | |
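As a quick consistency check on the primer table, each reported amplicon size equals the difference between the end and start coordinates of its target region. The following is a minimal Python sketch; the coordinates and sizes are copied from the table, while the dictionary layout and the `span` helper are illustrative names, not part of the original work.

```python
# Coordinates and reported sizes copied from the primer table.
# Tuple layout (illustrative): (chromosome, start, end, reported size in bp).
amplicons = {
    "CYP2D6": ("chr22", 42_522_077, 42_527_144, 5_067),
    "HLA-A": ("chr6", 29_909_854, 29_913_805, 3_951),
    "HLA-B": ("chr6", 31_321_279, 31_325_303, 4_024),
}


def span(start, end):
    """Return the difference between end and start coordinates, in bp."""
    return end - start


for gene, (chrom, start, end, reported) in amplicons.items():
    print(
        f"{gene}: {chrom}:{start:,}-{end:,} spans {span(start, end):,} bp "
        f"(reported {reported:,} bp)"
    )
```

For all three genes the computed span agrees with the amplicon size listed in the table.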
Basecall and read mapping statistics.
| Read Type | Number | Mean | Number | % of | Mean Length, | Mean | Mean | Mean | Mean mapping |
|---|---|---|---|---|---|---|---|---|---|
| 1D template | 19655 | 2693.7 | 3793 | 19.3% | 872.8 | 8.9% | 13.9% | 5.7% | 71.5% |
| 1D complement | 9584 | 2705.7 | 2717 | 28.3% | 292.7 | 7.3% | 15.4% | 4.1% | 73.2% |
| 2D consensus | 7540 | 3486.3 | 4761 | 63.1% | 2952.3 | 7.0% | 13.3% | 5.3% | 74.3% |
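Several column headers in this table are truncated, so their exact meaning cannot be read off directly. One plausible reading, stated here as an assumption, is that the second "Number" column counts the reads that mapped and the "% of" column gives that count as a share of all reads of the same type. The short Python sketch below tests that reading against the values in the table; the names `stats` and `percent` are illustrative.

```python
# Values copied from the table: (first "Number", second "Number", reported "% of").
# Assumption: the second count is the number of mapped reads of that type.
stats = {
    "1D template": (19655, 3793, 19.3),
    "1D complement": (9584, 2717, 28.3),
    "2D consensus": (7540, 4761, 63.1),
}


def percent(part, whole):
    """Return part as a percentage of whole, rounded to one decimal place."""
    return round(100 * part / whole, 1)


for read_type, (total, second_count, reported) in stats.items():
    print(f"{read_type}: computed {percent(second_count, total)}%, reported {reported}%")
```

Under this assumption the computed percentages match the reported ones for all three read types, which supports the reading but does not confirm the original column labels.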
Figure 1. Integrative Genomics Viewer (IGV) diagram of MinION reads aligned to the CYP2D locus on chromosome 22 from 42,521,411 to 42,552,401.
The majority of reads aligned across the entire length of CYP2D6, as expected from selective PCR amplification. Downstream, a negligible number of read fragments aligned to CYP2D7 and CYP2D8 (CYP2D8 is located from 42,545,874 to 42,551,097; its exon-intron diagram is not shown in the gene annotation track). Due to the extremely high coverage at CYP2D6, not all reads are shown in this pileup diagram.
Haplotype translation table for CYP2D6.
| Haplotype Id | CYP2D6 | rs1065852 | rs28371706 | rs5030655 | rs3892097 | rs35742686 | rs5030656 | rs16947 | rs28371725 | rs1135840 |
|---|---|---|---|---|---|---|---|---|---|---|
| PA165816576 | *1 | G | G | A | C | T | CTT | | C | |
| PA165816577 | *2 | G | G | A | C | T | CTT | A | C | G |
| PA165816578 | *3 | G | G | A | C | | CTT | | C | |
| PA165816579 | *4 | | G | A | | T | CTT | | C | G |
| PA165948092 | *5 | | | | | | | | | |
| PA165816581 | *6 | G | G | | C | T | CTT | | C | |
| PA165948317 | *9 | G | G | A | C | T | | | C | |
| PA165816582 | *10 | | G | A | C | T | CTT | | C | G |
| PA165816583 | *17 | G | | A | C | T | CTT | A | C | G |
| PA165816584 | *41 | G | G | A | C | T | CTT | A | | G |
Figure S1 (Updated). CopyCaller software analysis of data from TaqMan copy number assays Hs04502391_cn and Hs04083572_cn.
CYP2D6 in sample NA12878 was observed to be diploid. Samples known to carry one copy or more than two copies of CYP2D6 (GM17235 and GM17232, respectively [28]) were also run on the two TaqMan copy number assays.
Figure 2. A. Length distribution of aligned reads. Reads of 4–5 kb represent full-length PCR amplicons. Slightly smaller fragments were likely byproducts of shearing during DNA handling in the experimental protocol. B. With a depth of coverage of ~1000× for each of the genes, many chromosomal positions were called with 70–90% consensus. Shown is a short window of aligned reads for the *4 locus of CYP2D6 with over 1200× coverage. The heterozygous *4 allele rs1065852 is indicated with the arrow. C. Proportions of haplotypes of CYP2D6, HLA-A and HLA-B when directly measured from individual reads spanning all haplotype markers.
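The idea behind panel C, measuring haplotype proportions directly from reads that span every marker, can be sketched in a few lines of Python. This is an illustration under stated assumptions and not the authors' pipeline: the function name `haplotype_proportions`, the read representation (a dict from marker ID to observed allele) and the toy reads are all hypothetical.

```python
from collections import Counter


def haplotype_proportions(reads, markers):
    """Tally haplotypes from reads that carry a call at every marker.

    reads: list of dicts mapping marker ID -> observed allele.
    Reads missing any marker are skipped, because they cannot phase all sites.
    Returns a dict mapping each haplotype (tuple of alleles) to its proportion.
    """
    counts = Counter(
        tuple(read[m] for m in markers)
        for read in reads
        if all(m in read for m in markers)
    )
    total = sum(counts.values())
    return {hap: n / total for hap, n in counts.items()} if total else {}


# Hypothetical toy reads, not data from the study.
markers = ["rs1065852", "rs3892097"]
reads = [
    {"rs1065852": "G", "rs3892097": "C"},
    {"rs1065852": "A", "rs3892097": "T"},
    {"rs1065852": "G", "rs3892097": "C"},
    {"rs1065852": "A"},  # does not span both markers, so it is ignored
]
props = haplotype_proportions(reads, markers)
for hap, share in sorted(props.items(), key=lambda kv: -kv[1]):
    print("-".join(hap), f"{share:.2f}")
```

Because each haplotype is read off a single molecule, no statistical phasing step is involved. With real, error-prone reads, additional low-frequency haplotypes would be expected alongside the dominant ones.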
HLA alleles called with 4-digit resolution using the GATK HLACaller.
| Locus | HapMap A1 | HapMap A2 | Nanopore A1 | Nanopore A2 |
|---|---|---|---|---|
| HLA-A | 0101 | 1101 | 0132 | 0312 |
| HLA-B | 0801 | 5601 | 0765 | 5510 |