Hongzhi Cao, Jinghua Wu, Yu Wang, Hui Jiang, Tao Zhang, Xiao Liu, Yinyin Xu, Dequan Liang, Peng Gao, Yepeng Sun, Benjamin Gifford, Mark D'Ascenzo, Xiaomin Liu, Laurent C A M Tellier, Fang Yang, Xin Tong, Dan Chen, Jing Zheng, Weiyang Li, Todd Richmond, Xun Xu, Jun Wang, Yingrui Li.
Abstract
The major histocompatibility complex (MHC) is one of the most variable and gene-dense regions of the human genome. Most studies of the MHC, and associated regions, focus on minor variants and HLA typing, many of which have been demonstrated to be associated with human disease susceptibility and metabolic pathways. However, the detection of variants in the MHC region, and diagnostic HLA typing, still lacks a coherent, standardized, cost-effective and high-coverage protocol of clinical quality and reliability. In this paper, we present such a method for the accurate detection of minor variants and HLA types in the human MHC region, using high-throughput, high-coverage sequencing of target regions. A probe set was designed to template upon the 8 annotated human MHC haplotypes, and to encompass the 5 megabases (Mb) of the extended MHC region. We deployed our probes upon three genetically diverse human samples for probe set evaluation, and sequencing data show that ∼97% of the MHC region, and over 99% of the genes in the MHC region, are covered with sufficient depth and good evenness. 98% of genotypes called by this capture sequencing prove consistent with established HapMap genotypes. We have concurrently developed a one-step pipeline for calling any HLA type referenced in the IMGT/HLA database from this target capture sequencing data, which shows over 96% typing accuracy when deployed at 4-digit resolution. This cost-effective and highly accurate approach for variant detection and HLA typing in the MHC region may lend further insight into immune-mediated disease studies, and may find clinical utility in transplantation medicine research. This one-step pipeline is released for general evaluation and use by the scientific community.
Year: 2013 PMID: 23894464 PMCID: PMC3722289 DOI: 10.1371/journal.pone.0069388
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Data production and mapping results for the three samples used.
| Samples | YH | NA18532 | NA18555 |
| Target region (bp) | 4970558 | 4970558 | 4970558 |
| Raw reads | 7988210 | 5377408 | 5457998 |
| Raw data (Mb) | 719 | 484 | 491 |
| Mapped reads | 7766430 | 4875514 | 4991960 |
| Uniquely mapped reads | 7317777 | 4559340 | 4672521 |
| Reads uniquely mapped to target | 4794113 | 3023307 | 3100770 |
| Capture specificity | 65.51% | 66.31% | 66.36% |
| Mean fold coverage depth (x) | 87.32 | 55.09 | 56.50 |
| Percent bases with coverage ≥1x | 97.29 | 96.95 | 97.20 |
| Percent bases with coverage ≥4x | 96.52 | 95.66 | 95.95 |
| Percent bases with coverage ≥10x | 95.50 | 93.84 | 94.13 |
Target regions here refer to contiguous MHC regions, rather than to the region actually covered by the designed probes. Capture specificity is defined as the percentage of uniquely mapped reads aligning to the target region.
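The capture specificity definition above can be checked directly against the read counts in the table. The following is a minimal sketch; the dictionary layout and the function name `capture_specificity` are illustrative choices, not part of the published pipeline.

```python
# Read counts copied from the table above (uniquely mapped reads and
# reads uniquely mapped to the target region).
samples = {
    "YH": {"uniquely_mapped": 7317777, "on_target": 4794113},
    "NA18532": {"uniquely_mapped": 4559340, "on_target": 3023307},
    "NA18555": {"uniquely_mapped": 4672521, "on_target": 3100770},
}


def capture_specificity(on_target, uniquely_mapped):
    """Percentage of uniquely mapped reads that align to the target region."""
    return 100.0 * on_target / uniquely_mapped


# Round to two decimals to match the precision reported in the table.
specificity = {
    name: round(capture_specificity(c["on_target"], c["uniquely_mapped"]), 2)
    for name, c in samples.items()
}

for name, value in specificity.items():
    print(f"{name}: {value:.2f}%")
```

Rounded to two decimals, the computed values agree with the reported 65.51%, 66.31% and 66.36%.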
Figure 1. Distribution of per-base coverage depths in the MHC region for three samples.
X-axis denotes coverage depth, Y-axis indicates percentage of total target region with a given sequencing depth. The fraction of target bases with zero coverage is not shown in the figure.
Figure 2. Single nucleotide variation distribution across the whole MHC region for three samples.
The MHC region was split into 4971 parts, with 1000 bp in each part. X-axis denotes the 4971 parts and Y-axis indicates the number of SNPs in each part.
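The windowing described in this caption can be sketched as follows. The 4,970,558 bp target length comes from the table and the 1000 bp window from the caption; the function name `snps_per_window` and the example SNP offsets are hypothetical, and offsets are assumed to be 0-based positions relative to the start of the target region.

```python
import math

TARGET_LENGTH = 4970558  # bp, target region size from the table
WINDOW = 1000            # bp per part, from the figure caption

# The final window is shorter than 1000 bp, so round up.
N_WINDOWS = math.ceil(TARGET_LENGTH / WINDOW)


def snps_per_window(offsets, window=WINDOW, n_windows=N_WINDOWS):
    """Count SNPs in consecutive fixed-size windows.

    `offsets` are 0-based positions relative to the region start
    (an assumption made for this sketch).
    """
    counts = [0] * n_windows
    for offset in offsets:
        if 0 <= offset < TARGET_LENGTH:
            counts[offset // window] += 1
    return counts


# Hypothetical SNP offsets, for illustration only.
example_offsets = [12, 850, 1000, 1999, 4970557]
counts = snps_per_window(example_offsets)
print(N_WINDOWS, counts[0], counts[1], counts[-1])
```

Rounding up gives 4971 windows, which matches the number of parts stated in the caption.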
Figure 3. Capture bias evaluation using heterozygous genotypes from the Beadchip for three samples.
X-axis denotes the number of reads supporting the reference allele and Y-axis denotes the number of reads supporting the non-reference allele.
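One simple way to summarize the comparison plotted in this figure is the fraction of reads supporting the reference allele at each heterozygous site, where a value near 0.5 suggests little capture bias toward either allele. This is a minimal sketch under that assumption and is not necessarily the authors' evaluation; the function name and the read counts are hypothetical.

```python
def reference_fraction(ref_reads, alt_reads):
    """Fraction of reads supporting the reference allele at one site.

    Returns None for sites with no supporting reads.
    """
    total = ref_reads + alt_reads
    if total == 0:
        return None
    return ref_reads / total


# Hypothetical (reference, non-reference) read counts at heterozygous sites.
sites = [(30, 28), (41, 45), (22, 19), (0, 0)]

# Per-site fractions, skipping uncovered sites.
fractions = [
    f for f in (reference_fraction(r, a) for r, a in sites) if f is not None
]

# Pooled fraction across all sites.
total_ref = sum(r for r, _ in sites)
total_reads = sum(r + a for r, a in sites)
pooled = total_ref / total_reads

print([round(f, 3) for f in fractions], round(pooled, 3))
```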
Figure 4. The workflow of the HLA typing method.