Miten Jain, Hugh E Olsen, Daniel J Turner, David Stoddart, Kira V Bulazel, Benedict Paten, David Haussler, Huntington F Willard, Mark Akeson, Karen H Miga.
Abstract
The human genome reference sequence remains incomplete owing to the challenge of assembling long tracts of near-identical tandem repeats in centromeres. We implemented a nanopore sequencing strategy to generate high-quality reads that span hundreds of kilobases of highly repetitive DNA in a human Y chromosome centromere. Combining these data with short-read variant validation, we assembled and characterized the centromeric region of a human Y chromosome.
Year: 2018 PMID: 29553574 PMCID: PMC5886786 DOI: 10.1038/nbt.4109
Source DB: PubMed Journal: Nat Biotechnol ISSN: 1087-0156 Impact factor: 54.908
Figure 1. BAC-based longboard nanopore sequencing strategy on the MinION.
(a) An optimized strategy that cuts each circular BAC once with transposase yields a complete, linear DNA fragment of the BAC for nanopore sequencing. (b) Yield plot of BAC DNA (RP11-648J18). (c) High-quality BAC consensus sequences were generated by multiple alignment of 60 full-length 1D reads (shown in blue and yellow for the two orientations), sampled at random with ten iterations, followed by polishing steps (green) with the entire nanopore long-read data and Illumina data. (d) Circos representation[20] of the polished RP11-718M18 BAC consensus sequence. Blue arrowheads indicate the position and orientation of HORs. Purple tiles on a yellow background mark the positions of the Illumina-validated variants. Additional purple highlights extending from select Illumina-validated variants identify single-nucleotide sequence variants and mark the sites of the DYZ3 repeat structural variants (6 kb) in tandem.
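The random sampling described in panel (c) can be illustrated with a minimal Python sketch. It covers only the subsampling step: the alignment and polishing tools are not named in this record, so they are left out. The function name, the read identifiers, the fixed seed, and the assumption that each of the ten iterations draws its 60 reads independently (without replacement within a draw) are all hypothetical choices made for illustration, not the authors' implementation.

```python
import random


def sample_read_subsets(read_ids, subset_size=60, iterations=10, seed=0):
    """Draw `iterations` random subsets of `subset_size` reads.

    Reads are sampled without replacement within a subset; separate
    subsets are drawn independently, so a read may recur across them.
    """
    if len(read_ids) < subset_size:
        raise ValueError("fewer full-length reads than the requested subset size")
    rng = random.Random(seed)  # fixed seed keeps the illustration reproducible
    return [rng.sample(read_ids, subset_size) for _ in range(iterations)]


# Hypothetical identifiers standing in for the full-length 1D reads of one BAC.
reads = [f"read_{i:04d}" for i in range(500)]
subsets = sample_read_subsets(reads)

# Each subset would then go to a multiple aligner (not shown) to build
# one candidate consensus per iteration.
print(len(subsets), len(subsets[0]))
```

Under these assumptions, each of the ten subsets would feed one multiple alignment, and the resulting consensus would then be polished with the full read set.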
Figure 2. Linear assembly of the RP11 Y centromere.
(a) Ordering of nine DYZ3-containing BACs spanning from proximal p-arm to proximal q-arm. The majority of the centromeric locus is defined by the DYZ3 canonical 5.8-kb HOR (light blue). Highly divergent monomeric alpha satellite is indicated in dark blue. HOR variants (6.0 kb) are indicated in purple. (b) The genomic location of the functional Y centromere is defined by the enrichment of centromere protein A (CENP-A), where enrichment (∼5–6×) is attributed predominantly to the DYZ3 HOR array.