Ping Liu, Minya Yao, Yu Gong, Yunjie Song, Yanan Chen, Yizhou Ye, Xiao Liu, Fugen Li, Hua Dong, Rui Meng, Hao Chen, Aiwen Zheng.
Abstract
With recent progress in next-generation sequencing (NGS) technology, sequencing accuracy and throughput have increased while the cost of data has decreased. Various human leukocyte antigen (HLA) typing algorithms and assays have been developed and are beginning to be used in clinical practice. In this study, we compared the HLA typing performance of three HLA assays and seven NGS-based HLA algorithms and assessed the impact of sequencing depth and read length on HLA typing accuracy using 24 benchmarked samples. HISAT-genotype and HLA-HD showed the highest accuracy at both first field and second field resolution, followed by HLAscan. Our internal capture-based HLA assay performed comparably to whole exome sequencing (WES). The minimal depth at which HISAT-genotype and HLA-HD reached more than 90% accuracy at the third field level was 100X. The top three algorithms were robust to changes in read length. We therefore recommend HISAT-genotype and HLA-HD for NGS-based HLA genotyping because of their higher accuracy and robustness to read length, and we propose 100X as the minimal sequencing depth for obtaining more than 90% HLA typing accuracy at the third field level. In addition, targeted capture-based NGS HLA typing may be more suitable than WES in clinical practice because of its lower sequencing cost and higher HLA sequencing depth.
Keywords: accuracy; algorithms; benchmark; human leukocyte antigen; next-generation sequencing
Year: 2021 PMID: 33868290 PMCID: PMC8045758 DOI: 10.3389/fimmu.2021.652258
Source DB: PubMed Journal: Front Immunol ISSN: 1664-3224 Impact factor: 7.561
HLA-typing software used in this study.
| Software | Resolution (digits) | Programming language | Year | Journal | Citations |
|---|---|---|---|---|---|
| HLAminer | 4 | Perl | 2012 | Genome Medicine | 83 |
| seq2HLA | 4 | Python, R | 2012 | Genome Medicine | 93 |
| HLAforest | 8 | Perl | 2013 | PLOS ONE | 28 |
| HLA-VBSeq | 8 | Java | 2015 | BMC Genomics | 36 |
| HLA-HD | 6 | Shell | 2017 | Human Mutation | 15 |
| HLAscan | 8 | Python | 2017 | BMC Bioinformatics | 22 |
| HISAT-genotype | 8 | C++, Python | 2019 | Nature Biotechnology | 81 |
Figure 1. Workflow of HLA typing using benchmarked data sets. All HLA typing algorithms were run with default parameters.
Figure 2. Performance of the HLA typing algorithms and the three HLA assays. Accuracy of HLA alleles typed at (A) the first field level, (B) the second field level, and (C) the third field level, for the seven algorithms and three capture assays. Accuracy was calculated as the fraction of all alleles that were correctly typed.
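The accuracy measure described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' evaluation script: the function names `truncate_allele` and `field_accuracy`, the nested-dictionary input layout, and the example alleles are all assumptions made for this sketch. It truncates each allele name to the requested number of colon-separated fields and counts a truth allele as correct when an unused call for the same gene matches it.

```python
def truncate_allele(allele: str, fields: int) -> str:
    """Keep the gene name and the first `fields` colon-separated fields."""
    gene, _, rest = allele.partition("*")
    return gene + "*" + ":".join(rest.split(":")[:fields])


def field_accuracy(truth: dict, calls: dict, fields: int) -> float:
    """Fraction of truth alleles that are correctly typed at a field level.

    `truth` and `calls` map sample -> gene -> list of allele strings
    (a hypothetical layout chosen for this sketch). Missing calls, and
    calls reported at a lower resolution than `fields`, count as errors.
    """
    correct = 0
    total = 0
    for sample, genes in truth.items():
        for gene, true_alleles in genes.items():
            called = [
                truncate_allele(a, fields)
                for a in calls.get(sample, {}).get(gene, [])
            ]
            for allele in true_alleles:
                total += 1
                target = truncate_allele(allele, fields)
                if target in called:
                    # Each call may match only one truth allele.
                    called.remove(target)
                    correct += 1
    return correct / total if total else float("nan")


# Made-up example: one sample, one gene, one allele miscalled at the second field.
truth = {"S1": {"A": ["A*02:01:01", "A*24:02:01"]}}
calls = {"S1": {"A": ["A*02:01:01", "A*24:03:01"]}}

print(field_accuracy(truth, calls, 1))  # 1.0
print(field_accuracy(truth, calls, 2))  # 0.5
```

Matching is order-independent within a gene, so swapping the two alleles of a heterozygous call does not change the score.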
Figure 3. Running time of the different HLA typing software. The y-axis is plotted on a log10 scale.
Figure 4. Distribution of the pattern of genotyping errors in HLA-A genes. (A) The number of alleles miscalled by each algorithm, grouped by HLA gene. (B) The pattern of discordant HLA-A alleles at the second field level. None: not determined by the algorithm.
Figure 5. Accuracy of the three tools for HLA typing at second field or third field resolution for different depths and read lengths. Depth evaluation at (A) the second field level and (B) the third field level. For the sequencing depth evaluation, alignment files of the 24 Bofuri samples were down-sampled from 700X to 10X based on the raw depths of the HLA genes. (C, D) Overall HLA typing accuracy at the second field and third field level, respectively, as read length decreased from 150 bp to 76 bp.
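The down-sampling and read-shortening steps mentioned in the Figure 5 caption can be illustrated with a short sketch. The caption does not say which tool or procedure was used, so everything here is an assumption: `subsample_fraction` and `trim_read` are hypothetical helper names, and the depths in the example are made up. The idea shown is only the arithmetic: to bring a sample from its raw HLA depth down to a target depth, keep a fraction of reads equal to target divided by raw, and to mimic shorter reads, keep the first N bases of each read along with its quality string.

```python
from typing import Optional, Tuple


def subsample_fraction(raw_depth: float, target_depth: float) -> Optional[float]:
    """Fraction of reads to keep so that depth drops from raw to target.

    Returns None when the sample is already shallower than the target,
    because down-sampling cannot add depth.
    """
    if raw_depth <= 0 or target_depth <= 0:
        raise ValueError("depths must be positive")
    if target_depth > raw_depth:
        return None
    return target_depth / raw_depth


def trim_read(sequence: str, qualities: str, length: int) -> Tuple[str, str]:
    """Keep the first `length` bases of a read and the matching quality values."""
    if len(sequence) != len(qualities):
        raise ValueError("sequence and quality strings differ in length")
    return sequence[:length], qualities[:length]


# Made-up raw HLA depth and an illustrative series of target depths.
raw = 700.0
for target in (500, 100, 10):
    print(target, subsample_fraction(raw, target))

# Shorten a 150 bp read to 76 bp.
seq, qual = trim_read("A" * 150, "I" * 150, 76)
print(len(seq), len(qual))  # 76 76
```

A fraction computed this way is the kind of value that read-subsampling options in common alignment toolkits expect. Whether the study used such an option is not stated here.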
Figure 6. Positive prediction agreement of the seven algorithms for HLA typing at different resolutions and genes. (A) Agreement among the seven algorithms at the first field or second field level. (B) Agreement among the seven algorithms for different HLA genes at the second field level.