Neema P Mayor1, James Robinson1, Alasdair J M McWhinnie2, Swati Ranade3, Kevin Eng3, William Midwinter2, Will P Bultitude2, Chen-Shan Chin3, Brett Bowman3, Patrick Marks3, Henny Braund2, J Alejandro Madrigal1, Katy Latham2, Steven G E Marsh1.
Abstract
Allele-level resolution data at primary HLA typing is the ideal for most histocompatibility testing laboratories. Many high-throughput molecular HLA typing approaches are unable to determine the phase of observed DNA sequence polymorphisms, leading to ambiguous results. The use of higher resolution methods is often restricted due to cost and time limitations. Here we report on the feasibility of using Pacific Biosciences' Single Molecule Real-Time (SMRT) DNA sequencing technology for high-resolution and high-throughput HLA typing. Seven DNA samples were typed for HLA-A, -B and -C. The results showed that SMRT DNA sequencing technology was able to generate sequences that spanned entire HLA class I genes and allowed for accurate allele calling. Eight novel genomic HLA class I sequences were identified: four were novel alleles, three were confirmed as genomic sequence extensions and one corrected an existing genomic reference sequence. This method has the potential to revolutionize the field of HLA typing. The clinical impact of achieving this level of resolution in HLA typing data is likely to be considerable, particularly in applications such as organ and blood stem cell transplantation, where matching donors and recipients for their HLA is of utmost importance.
Year: 2015 PMID: 26018555 PMCID: PMC4446346 DOI: 10.1371/journal.pone.0127153
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
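The phase ambiguity mentioned in the abstract can be illustrated with a small sketch. When a method reports only the unphased genotype at each polymorphic position, several pairs of haplotypes are consistent with the same observation. The function name `possible_haplotype_pairs` and the three-base toy genotypes are hypothetical and are not taken from the study.

```python
from itertools import product


def possible_haplotype_pairs(genotypes):
    """List every haplotype pair consistent with unphased per-site genotypes.

    genotypes: list of (base, base) tuples, one per position.
    """
    # Only heterozygous sites offer a choice of which base goes on which strand.
    options = [(0, 1) if a != b else (0,) for a, b in genotypes]
    pairs = set()
    for choice in product(*options):
        hap1 = "".join(site[c] for site, c in zip(genotypes, choice))
        hap2 = "".join(site[1 - c] for site, c in zip(genotypes, choice))
        pairs.add(tuple(sorted((hap1, hap2))))  # order of the two strands is irrelevant
    return sorted(pairs)


# Toy genotype: heterozygous at positions 1 and 3, homozygous at position 2.
unphased = [("A", "G"), ("C", "C"), ("T", "C")]
print(possible_haplotype_pairs(unphased))
# [('ACC', 'GCT'), ('ACT', 'GCC')]
```

Two heterozygous positions already give two indistinguishable pairs. A read that spans both positions on a single molecule resolves which pair is present, which is the motivation for full-length sequencing given above.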
Fig 1. Basic stages of the Single Molecule Real-Time (SMRT) DNA sequencing method.
SMRTbell adaptors are ligated onto the ends of a blunt-ended PCR amplicon to facilitate continuous sequencing of both strands of the amplicon. The entire sequence generated may include multiple copies of the sense and anti-sense strands of the PCR amplicon in a single read, known as the Continuous Long Read (CLR). Post-sequencing bioinformatic processing breaks the CLR down into shorter sub-reads, each of which encompasses the sequence of one strand of the amplicon. These sub-reads can then be compared and used to create a consensus sequence.
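The stages in the caption can be sketched in a few lines of Python. This is a toy illustration under strong assumptions, not the vendor's pipeline: the adaptor sequence `ADAPTOR` is a made-up placeholder, the first pass is assumed to read the sense strand, sub-reads are assumed to be of equal length, and the consensus is a simple column-wise majority vote rather than a real alignment-based caller.

```python
from collections import Counter

# Hypothetical adaptor sequence, chosen only so it cannot occur in the toy amplicon.
ADAPTOR = "ATCTCTCTC"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def split_subreads(clr, adaptor=ADAPTOR):
    """Cut the long read at every adaptor occurrence and drop empty pieces."""
    return [piece for piece in clr.split(adaptor) if piece]


def orient_subreads(subreads):
    """Flip every second sub-read so that all passes are on the sense strand."""
    return [s if i % 2 == 0 else reverse_complement(s)
            for i, s in enumerate(subreads)]


def majority_consensus(subreads):
    """Column-wise majority vote; assumes equal-length, pre-aligned sub-reads."""
    if len({len(s) for s in subreads}) != 1:
        raise ValueError("toy consensus requires equal-length sub-reads")
    return "".join(Counter(column).most_common(1)[0][0]
                   for column in zip(*subreads))


# Toy CLR: sense pass, anti-sense pass, then a sense pass with one miscall (A>T).
amplicon = "GATTACAGGC"
clr = ADAPTOR.join([amplicon, reverse_complement(amplicon), "GATTTCAGGC"])
passes = orient_subreads(split_subreads(clr))
print(majority_consensus(passes))  # GATTACAGGC
```

The single miscall in the third pass is outvoted by the two other passes, which is the intuition behind building a consensus from many sub-reads. Real data also contain insertions and deletions, so an actual consensus step has to align the sub-reads first.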
Depth of coverage achieved using SMRT sequencing.
| ID | HLA-A allele | HLA-A sub-reads | HLA-B allele | HLA-B sub-reads | HLA-C allele | HLA-C sub-reads |
|---|---|---|---|---|---|---|
| AN1 | A*03:01 | 303 | B*07:02 | 841 | C*05:01 | 569 |
| AN1 | A*11:01 | 385 | B*44:02 | 780 | C*07:02 | 726 |
| AN2 | A*25:01/02 | 353 | B*15:01 | 817 | C*03:03 | 514 |
| AN2 | A*68:01:02 | 184 | B*18:01 | 583 | C*12:03 | 422 |
| AN3 | A*26:01 | 238 | B*14:01 | 1263 | C*02:02 | 282 |
| AN3 | A*31:01:02 | 371 | B*27:05:02 | 498 | C*08:02 | 162 |
| AN4 | A*03:01 | 300 | B*27:05:18 | 197 | C*01:02 | 213 |
| AN4 | A*32:01 | 799 | B*35:01 | 836 | C*04:01 | 584 |
| AN5 | A*01:01 | 1477 | B*08:01 | 2134 | C*07:01 | 2931 |
| AN6 | A*02:01 | 516 | B*52:01 | 247 | C*07:01 | 278 |
| AN6 | | | B*73:01 | 427 | C*15:05 | 156 |
| AN7 | A*23:01 | 349 | B*42:01 | 327 | C*06:02 | 1080 |
| AN7 | A*24:02 | 313 | B*50:01 | 1390 | C*17:01 | 1840 |
A comparison of expected HLA types, as typed by Anthony Nolan, with those generated by the Single Molecule Real-Time (SMRT) DNA Sequencing method.
| ID | DNA source | Results group | HLA-A allele 1 | HLA-A allele 2 | HLA-B allele 1 | HLA-B allele 2 | HLA-C allele 1 | HLA-C allele 2 |
|---|---|---|---|---|---|---|---|---|
| AN1 | Saliva | AN | | | | | | |
| AN1 | Saliva | SMRT | | | | | | |
| AN2 | Blood | AN | | | | | | |
| AN2 | Blood | SMRT | | | | | | |
| AN3 | Saliva | AN | | | | | | |
| AN3 | Saliva | SMRT | | | | | | |
| AN4 | Blood | AN | | | | | | |
| AN4 | Blood | SMRT | | | | | | |
| AN5 | B-LCL | AN | | | | | | |
| AN5 | B-LCL | SMRT | | | | | | |
| AN6 | B-LCL | AN | | | | | | |
| AN6 | B-LCL | SMRT | | | | | | |
| AN7 | B-LCL | AN | | | | | | |
| AN7 | B-LCL | SMRT | | | | | | |
HLA alleles in bold highlight novel alleles, genomic sequence corrections or genomic sequence extensions.
* AN: Anthony Nolan typing data as generated by Luminex LABType SSO typing kits (One Lambda, CA, USA), sequencing-based typing and/or PCR using sequence-specific primers (PCR-SSP).
SMRT: Single Molecule Real-Time DNA sequencing method from Pacific Biosciences.
Extensions and corrections to known HLA alleles identified in PacBio results.
| Sample | Expected allele | Allele used for genomic sequence comparisons | Novel/differentiating variants | Sequence type | EMBL accession number |
|---|---|---|---|---|---|
| AN3 | B*14:01:01 | B*14:02:01 | Intron 2 G>T gDNA 665 | Extension | HG794368 |
| AN3 | B*27:05:02 | B*27:05:02 | Intron 5 C>T gDNA 2086 | Correction | HG794364 |
| AN4 | B*27:05:18 | B*27:05:02 | Exon 2 C>T gDNA 269; | Extension | HG530757 |
| AN6 | C*15:05:01 | C*15:05:02 | None | Extension | HG794367 |
Anomalies observed in the PacBio SMRT consensus sequences as compared to the expected allele.
| Sample | Expected Allele | Variants | Sequence Confirmed | New allele name | EMBL Accession number |
|---|---|---|---|---|---|
| AN2 | A*68:01:02 | Intron 7 G>A gDNA 2770 | Confirmed, new variant | A*68:01:02:02 | HG794362 |
| AN3 | C*02:02:02 | Intron 5 T>C gDNA 2487 | Confirmed, new variant | C*02:02:02:02 | HG794365 |
| AN3 | C*08:02:01 | Intron 3 A>G gDNA 1338 | Confirmed, new variant | C*08:02:01:02 | HG794366 |
| AN6 | B*52:01:01:02 | 5' UTR C>A gDNA -180 | Confirmed, new variant | B*52:01:01:03 | HG794363 |
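The new allele names in this table all extend or change the fourth colon-separated field. In HLA nomenclature the first field is the allele group, the second the specific protein, the third synonymous changes in the coding region, and the fourth differences in non-coding regions. This matches the intronic and 5' UTR variants listed. The sketch below parses such names and reports the first field at which two names differ. The helper names `parse_allele` and `first_differing_field` are illustrative, and the pattern deliberately ignores expression suffixes (such as N or L) and ambiguity strings such as `A*25:01/02`.

```python
import re
from itertools import zip_longest
from typing import NamedTuple, Optional, Tuple


class HlaAllele(NamedTuple):
    gene: str
    fields: Tuple[str, ...]


# Gene, asterisk, then one to four colon-separated numeric fields.
_ALLELE = re.compile(r"^(?:HLA-)?([A-Z0-9]+)\*(\d+(?::\d+){0,3})$")


def parse_allele(name: str) -> HlaAllele:
    """Split an allele name such as 'A*68:01:02:02' into gene and fields."""
    match = _ALLELE.match(name.strip())
    if match is None:
        raise ValueError(f"not a plain HLA allele name: {name!r}")
    return HlaAllele(match.group(1), tuple(match.group(2).split(":")))


def first_differing_field(name_a: str, name_b: str) -> Optional[int]:
    """Return the 1-based index of the first differing field, or None if equal."""
    a, b = parse_allele(name_a), parse_allele(name_b)
    if a.gene != b.gene:
        raise ValueError("alleles belong to different genes")
    for index, (fa, fb) in enumerate(zip_longest(a.fields, b.fields), start=1):
        if fa != fb:  # a missing field (None) also counts as a difference
            return index
    return None


# Expected allele versus new allele name, as listed in the table above.
for expected, new in [("A*68:01:02", "A*68:01:02:02"),
                      ("C*02:02:02", "C*02:02:02:02"),
                      ("C*08:02:01", "C*08:02:01:02"),
                      ("B*52:01:01:02", "B*52:01:01:03")]:
    print(expected, "->", new, "differs at field", first_differing_field(expected, new))
```

For all four rows the first difference is at field 4, so the protein-level and coding-level typing is unchanged and only the non-coding sequence distinguishes the new variants.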
Homopolymer count in 38 PacBio sequences (total length: 130117 bp).
| Homopolymer length | A | C | G | T |
|---|---|---|---|---|
| 5-mers | 12 | 160 | 209 | 25 |
| 6-mers | 0 | 29 | 16 | 0 |
| 7-mers | 0 | 7 | 2 | 10 |
| 8-mers | 0 | 0 | 2 | 2 |
| 9-mers | 0 | 0 | 0 | 13 |
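A table of this kind can be produced by scanning each consensus sequence for runs of a single repeated base. The sketch below is one way to do it and assumes that a homopolymer is counted as a maximal run (a run of six counts once as a 6-mer, not also as two 5-mers); the source does not state which counting convention was used. The function name `count_homopolymers` and the two toy sequences are hypothetical.

```python
from collections import defaultdict
from itertools import groupby


def count_homopolymers(sequences, min_length=5):
    """Count maximal single-base runs of at least min_length, by length and base."""
    counts = defaultdict(lambda: defaultdict(int))  # run length -> base -> count
    for seq in sequences:
        # groupby yields one group per maximal run of identical characters.
        for base, run in groupby(seq.upper()):
            run_length = sum(1 for _ in run)
            if run_length >= min_length:
                counts[run_length][base] += 1
    return {length: dict(by_base) for length, by_base in sorted(counts.items())}


# Toy input: one A 5-mer, one G 6-mer and one T 9-mer.
toy_sequences = ["AAAAACGGGGGGT", "TTTTTTTTTC"]
print(count_homopolymers(toy_sequences))
# {5: {'A': 1}, 6: {'G': 1}, 9: {'T': 1}}
```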