S Fuselli, R P Baptista, A Panziera, A Magi, S Guglielmi, R Tonin, A Benazzo, L G Bauzer, C J Mazzoni, G Bertorelle.
Abstract
The major histocompatibility complex (MHC) acts as an interface between the immune system and infectious diseases. Accurate characterization and genotyping of the extremely variable MHC loci are challenging, especially without a reference sequence. We combined long-range PCR, Illumina short reads, and Oxford Nanopore MinION long reads to capture the genetic variation of the MHC II DRB locus in an Italian population of the Alpine chamois (Rupicapra rupicapra). We used long-range PCR to generate a 9 kb fragment of the DRB locus. Amplicons from six different individuals were fragmented, tagged, and simultaneously sequenced with Illumina MiSeq. One of these amplicons was also sequenced with the MinION device, which produced long reads covering the entire amplified fragment. A pipeline that combines short and long reads resolved several short tandem repeats and homopolymers and produced a de novo reference, which was then used to map and genotype the short reads from all individuals. The assembled DRB locus showed a high level of polymorphism and the presence of a recombination breakpoint. Our results suggest that an amplicon-based NGS approach coupled with single-molecule MinION nanopore sequencing can efficiently achieve both the assembly and the genotyping of complex genomic regions in multiple individuals in the absence of a reference sequence.
Year: 2018 PMID: 29572469 PMCID: PMC6133961 DOI: 10.1038/s41437-018-0070-5
Source DB: PubMed Journal: Heredity (Edinb) ISSN: 0018-067X Impact factor: 3.821
Fig. 1 Laboratory procedure used to obtain MHC II DRB short and long amplicons, and assembly pipelines. The short amplicon was sequenced by standard Sanger sequencing (step 1), while the long amplicon was sequenced with Illumina MiSeq and nanopore MinION after gel extraction (step 2). The gene structure is based on Bos taurus (Ensembl: ENSBTAG00000013919) and Ovis aries (GenBank AM884914) structures. Boxes: coding regions; lines: introns; horizontal arrows: PCR primers
Sanger and Illumina sequencing in six different chamois
| Code | Exon 2 genotype | Filtered paired-end Illumina reads | Coverage |
|---|---|---|---|
| PDB44b | *1/*19 | 81,284 | ×3910 |
| PDB47 | *1/*19 | 147,388 | ×6294 |
| PDB60b | *1/*1 | 78,425 | ×2154 |
| PDB61 | *19/*19 | 102,110 | ×4341 |
| PDB66 | *19/*19 | 67,599 | ×2797 |
| PDB70ᵃ | *1/*1 | 140,721 | ×6220 |
ᵃ Sample analyzed with MinION
Assembly pipelines and descriptive statistics of PDB70 amplicon reads obtained with Illumina and MinION nanopore
| Pipeline | Type | Input | Error corrector | Contig assembler | Consensus size (bp) | GC% | Coverageᵃ | Identity between chamois and |
|---|---|---|---|---|---|---|---|---|
| CANU | De novo | MinION | Pilon | CANU | 9001 | 40.86 | ×7139.92 | 95.81% |
| Geneious | De novo | MinION | Pilon | Geneious algorithm | 9048 | 40.86 | ×7120.34 | 95.81% |
| DBG2OLC | Hybrid | Illumina, MinION | Pilon | Velvet (short reads) | 8931 | 40.87 | ×7065.76 | 91.75% |
ᵃ Coverage was calculated using Samtools (Li et al. 2009) from Illumina reads mapped to the Pilon-polished CANU, Geneious, and DBG2OLC assemblies, respectively
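As a rough illustration of the coverage calculation in the footnote above, the following Python sketch averages per-base depth from the three-column text output of `samtools depth` (contig, position, depth). The record does not state which Samtools subcommand or options the authors used, so this is only an assumed workflow; the function name `mean_depth` and the example values are hypothetical.

```python
from statistics import mean


def mean_depth(depth_lines):
    """Return the mean per-base depth from `samtools depth` text output.

    Each non-empty line is expected to hold tab-separated fields:
    contig name, 1-based position, depth. This layout is the standard
    `samtools depth` format; its use here is an assumption, not the
    authors' documented procedure.
    """
    depths = []
    for line in depth_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        _contig, _pos, depth = line.split("\t")[:3]
        depths.append(int(depth))
    if not depths:
        raise ValueError("no depth records found")
    return mean(depths)


# Hypothetical three-position example, not data from the study.
example = ["amplicon\t1\t7000", "amplicon\t2\t7100", "amplicon\t3\t7200"]
print(mean_depth(example))
```

Note that `samtools depth` omits zero-coverage positions unless run with `-a`, so a mean computed without that flag is taken over covered positions only.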
Fig. 2 Distribution of 2D reads between 1 and 12 kb in length produced by the MinION run
Fig. 3 Graphical representation of individual inferred haplotypes. Vertical numbers represent the position on the MHC amplicon. Colors indicate the reference allele (blue) and the alternative allele (yellow). Variable positions of undefined genotype are shown in white. The shaded triangle shows the two alleles of the STR identified in intron 1
Fig. 4 MinION sequencing error rate estimation using three mapping algorithms. The error rate for single nucleotide variants (SNV, average error: 4.8%), insertions (INS, average error: 4.1%), and deletions (DEL, average error: 3.0%) was estimated as a function of sequence position
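To make the three error classes in Fig. 4 concrete, the sketch below derives SNV, insertion, and deletion rates for a single read from an extended CIGAR string (one that uses `=` for matches and `X` for mismatches). It is not the authors' method: the three mapping algorithms are not named in this record, the choice of denominator (all alignment columns) is an assumption, and the function reports one aggregate rate per read rather than a rate per sequence position. The name `error_rates` and the example CIGAR are hypothetical.

```python
import re

# One CIGAR operation: a length followed by an operation code.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def error_rates(cigar):
    """Return SNV/INS/DEL rates for one alignment given an extended CIGAR.

    Assumed denominator: matches + mismatches + inserted + deleted bases.
    Clipping, padding, and skipped-region operations are ignored.
    """
    counts = {"=": 0, "X": 0, "I": 0, "D": 0}
    for length, op in CIGAR_OP.findall(cigar):
        if op == "M":
            # 'M' does not distinguish matches from mismatches.
            raise ValueError("extended CIGAR with =/X operations is required")
        if op in counts:
            counts[op] += int(length)
    aligned = sum(counts.values())
    if aligned == 0:
        raise ValueError("no aligned columns in CIGAR string")
    return {
        "SNV": counts["X"] / aligned,
        "INS": counts["I"] / aligned,
        "DEL": counts["D"] / aligned,
    }


# Hypothetical read: 90 matches, 5 mismatches, 3 inserted and 2 deleted bases.
rates = error_rates("90=5X3I2D")
print(rates)
```

Averaging such per-read values over many reads, or binning the operations by read position, would be needed to approach a position-wise profile like the one the figure describes.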