Maaike van der Lee, William J Rowell, Roberta Menafra, Henk-Jan Guchelaar, Jesse J Swen, Seyed Yahya Anvar.
Abstract
The use of pharmacogenomics in clinical practice is becoming standard of care. However, due to the complex genetic makeup of pharmacogenes, not all genetic variation is currently accounted for. Here, we show the utility of long-read sequencing to resolve complex pharmacogenes by analyzing a well-characterised sample. These data consist of long reads that were processed to resolve phased haploblocks. 73% of pharmacogenes were fully covered in one phased haploblock, including 9/15 genes that are 100% complex. Variant calling accuracy in the pharmacogenes was high, with 99.8% recall and 100% precision for SNVs and 98.7% precision and 98.0% recall for Indels. For the majority of gene-drug interactions in the DPWG and CPIC guidelines, the associated genes could be fully resolved (62% and 63%, respectively). Together, these findings suggest that long-read sequencing data offer promising opportunities for elucidating complex pharmacogenes and haplotype phasing while maintaining accurate variant calling.
Year: 2022 PMID: 34741133 PMCID: PMC8794781 DOI: 10.1038/s41397-021-00259-z
Source DB: PubMed Journal: Pharmacogenomics J ISSN: 1470-269X Impact factor: 3.550
Fig. 1 Read length distribution.
Distribution of read lengths for the Genome in a Bottle sample HG002 after sequencing on the Pacific Biosciences Sequel platform and construction of circular consensus sequences.
Variant calling performance for pharmacogenes.
| Variant caller | SNV precision (%) | SNV recall (%) | SNV F1 (%) | Indel precision (%) | Indel recall (%) | Indel F1 (%) |
|---|---|---|---|---|---|---|
| GATK HaplotypeCaller | 99.88 | 99.96 | 99.92 | 94.47 | 86.12 | 90.10 |
| DeepVariant (CCS model) | 99.84 | 100.0 | 99.92 | 98.74 | 98.00 | 98.37 |
Measured against the Genome in a Bottle benchmark v3.3.2 using both GATK HaplotypeCaller and DeepVariant. SNV, single nucleotide variant; Indels, insertions and deletions; GATK, Genome Analysis Toolkit; CCS, circular consensus sequence.
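The F1 columns in the table can be reproduced from the reported precision and recall, assuming F1 is the usual harmonic mean of the two. The short sketch below is illustrative only and not part of the published analysis; the function and dictionary names are hypothetical.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (same unit as the inputs)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall (%) copied from the table above.
reported = {
    ("GATK HaplotypeCaller", "SNV"): (99.88, 99.96),
    ("GATK HaplotypeCaller", "Indel"): (94.47, 86.12),
    ("DeepVariant (CCS model)", "SNV"): (99.84, 100.0),
    ("DeepVariant (CCS model)", "Indel"): (98.74, 98.00),
}

# Round to two decimals, matching the precision used in the table.
f1_values = {
    key: round(f1_score(p, r), 2) for key, (p, r) in reported.items()
}

for (caller, variant_type), f1 in f1_values.items():
    print(f"{caller:24s} {variant_type:5s} F1 = {f1:.2f}%")
```

Under that assumption the computed values agree with the F1 columns as printed, which also shows that the lower indel F1 for GATK is driven mainly by its lower recall.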
Fig. 2 Haploblock resolution of GENCODE features.
A Haploblock length distribution stratified by GENCODE features and intergenic regions; overlap with pharmacogenes is highlighted in red. B For each protein-coding feature, the percentage resolved into haploblocks compared with the feature length. The red line marks the mean read length. The majority of haploblocks are longer than the mean read length, indicating that the number of heterozygous variants, rather than read length, determines the length of a haploblock.
Fig. 3 Complexity of pharmacogenes and proportion solved in haploblocks.
A The pharmacogenes and their complexity in relation to the percentage covered in haploblocks. Genes included in the Ubiquitous Pharmacogenomics (U-PGx) passport are shown in bold. B For genes included in the CPIC or DPWG guidelines, the number of available actionable guidelines is mapped to the percentage of each gene that is phased into haploblocks. Actionable guidelines are those that recommend a dose change or drug switch. For each gene, the percentage resolved in haploblocks is included in the panel headers. CPIC, Clinical Pharmacogenetics Implementation Consortium; DPWG, Dutch Pharmacogenetics Working Group.
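As an illustration of what "percentage of a gene covered in haploblocks" could mean in practice, the sketch below computes the share of a gene interval overlapped by phased blocks and checks whether one block spans the whole gene. The source does not describe how these values were computed, so this is one plausible definition; the function names, the half-open coordinate convention, and the example coordinates are all hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def percent_phased(gene: Interval, blocks: List[Interval]) -> float:
    """Percentage of the gene's bases that fall inside any haploblock."""
    gene_start, gene_end = gene
    # Keep only blocks that overlap the gene, clipped to the gene boundaries.
    clipped = sorted(
        (max(start, gene_start), min(end, gene_end))
        for start, end in blocks
        if start < gene_end and end > gene_start
    )
    covered = 0
    cursor = gene_start  # rightmost position already counted
    for start, end in clipped:
        start = max(start, cursor)  # avoid double counting overlapping blocks
        if end > start:
            covered += end - start
            cursor = end
    return 100.0 * covered / (gene_end - gene_start)


def in_single_block(gene: Interval, blocks: List[Interval]) -> bool:
    """True if at least one haploblock spans the entire gene."""
    gene_start, gene_end = gene
    return any(start <= gene_start and end >= gene_end for start, end in blocks)


# Made-up coordinates, for illustration only.
example_gene = (1000, 2000)
fragmented = [(500, 1400), (1300, 1600), (1900, 2500)]
spanning = [(900, 2100)]

print(percent_phased(example_gene, fragmented), in_single_block(example_gene, fragmented))
print(percent_phased(example_gene, spanning), in_single_block(example_gene, spanning))
```

Separating the two measures matters because a gene can be 100% covered by several adjacent blocks without being phased end to end, whereas "fully covered in one phased haploblock" requires a single spanning block.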