Sarah B Kingan, Haynes Heaton, Juliana Cudini, Christine C Lambert, Primo Baybayan, Brendan D Galvin, Richard Durbin, Jonas Korlach, Mara K N Lawniczak.
Abstract
A high-quality reference genome is a fundamental resource for functional genetics, comparative genomics, and population genomics, and is increasingly important for conservation biology. PacBio Single Molecule, Real-Time (SMRT) sequencing generates long reads with uniform coverage and high consensus accuracy, making it a powerful technology for de novo genome assembly. Improvements in throughput and concomitant reductions in cost have made PacBio an attractive core technology for many large genome initiatives; however, relatively high DNA input requirements (~5 µg for the standard library protocol) have placed PacBio out of reach for many projects on small organisms that have lower DNA content, or on projects with limited input DNA for other reasons. Here we present a high-quality de novo genome assembly from a single Anopheles coluzzii mosquito. A modified SMRTbell library construction protocol without DNA shearing and size selection was used to generate a SMRTbell library from just 100 ng of starting genomic DNA. The sample was run on the Sequel System with chemistry 3.0 and software v6.0, generating, on average, 25 Gb of sequence per SMRT Cell with 20 h movies, followed by diploid de novo genome assembly with FALCON-Unzip. The resulting curated assembly had high contiguity (contig N50 3.5 Mb) and completeness (more than 98% of conserved genes were present and full-length). In addition, this single-insect assembly now places 667 (>90%) of formerly unplaced genes into their appropriate chromosomal contexts in the AgamP4 PEST reference. We were also able to resolve maternal and paternal haplotypes for over 1/3 of the genome. Because the material sequenced and assembled came from a single diploid individual, only two haplotypes were present, simplifying the assembly process compared to samples from multiple pooled individuals. The method presented here can be applied to samples with starting DNA amounts as low as 100 ng per 1 Gb genome size.
This new low-input approach puts PacBio-based assemblies in reach for small highly heterozygous organisms that comprise much of the diversity of life.
Keywords: de novo genome assembly; long-read SMRT sequencing; low-input DNA; mosquito
Year: 2019; PMID: 30669388; PMCID: PMC6357164; DOI: 10.3390/genes10010062
Source DB: PubMed; Journal: Genes (Basel); ISSN: 2073-4425; Impact factor: 4.096
Figure 1. Anopheles coluzzii input DNA and resulting library. FEMTO Pulse traces and ‘gel’ images (inset) of the genomic DNA input (black) and the final library (blue) before sequencing.
Assembly statistics of the raw and curated PacBio Anopheles coluzzii de novo assemblies, compared with the previous Sanger sequence-based assembly for this species from [17] (GCA_000150765.1).
| | PacBio Raw | PacBio Curated | Sanger Assembly |
|---|---|---|---|
| | 266 | 251 | 224 |
| | 372 | 206 | 27,063 |
| Contig N50 (Mb) | 3.52 | 3.47 | 0.025 |
| | 78.5 | 89.2 | unresolved |
| | 665 | 830 | N/A |
| | 0.22 | 0.199 | N/A |
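For readers unfamiliar with the contig N50 statistic quoted for this assembly, the following is a minimal sketch of how it is conventionally computed: sort contig lengths from longest to shortest and report the length of the contig at which the running total first reaches half of the assembly size. The function name `n50` and the toy lengths are illustrative assumptions, not values or code from the paper.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the length L such that contigs of length >= L together
    cover at least half of the total assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy contig lengths (not taken from the paper).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # → 70
```

In the toy example the total is 300, so the running sum reaches the halfway point (150) at the second-longest contig, giving an N50 of 70.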
Figure 2. Alignment of the curated PacBio contigs to the AgamP4 PEST reference [21]. Alignments are colored by the primary PEST reference chromosome to which they align but are placed in the panel and at the Y offset to which the contig as a whole aligns best. Contig ends are denoted by horizontal lines in the assembly and vertical lines in PEST. However, there are many Ns in PEST not annotated as contig breaks, so the percent Ns per megabase of PEST is overlaid (scale on the right Y axis). There are no Ns in the PacBio assembly.
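The "percent Ns per megabase" track described in this caption can be approximated with a simple windowed count. The sketch below is an assumption about how such a track might be produced, not the authors' plotting code; the function name `percent_n_per_window` and the toy sequence are hypothetical, and a small window is used only so the example stays readable.

```python
def percent_n_per_window(sequence, window=1_000_000):
    """Return (window_start, percent_N) for consecutive windows.

    The last window may be shorter than `window`; its percentage is
    computed over its actual length.
    """
    results = []
    for start in range(0, len(sequence), window):
        chunk = sequence[start:start + window]
        # Count both upper- and lower-case (soft-masked) N characters.
        n_count = chunk.count("N") + chunk.count("n")
        results.append((start, 100.0 * n_count / len(chunk)))
    return results


# Hypothetical 20-base toy sequence, scanned in 10-base windows.
toy = "ACGTACGTNNNNACGTNNAC"
print(percent_n_per_window(toy, window=10))  # → [(0, 20.0), (10, 40.0)]
```

For a real chromosome the default 1 Mb window would be used, and the sequence would be read from a FASTA file rather than a literal string.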
Figure 3. Example of a compressed repeat in PEST that has been expanded by the PacBio assembly. Dotted vertical lines represent a gap in the PEST assembly (10,000 Ns) between scaffolds, which is now spanned by the single PacBio contig. The coverage plot of the PacBio subreads aligned to PEST (bottom) highlights the region where excess coverage indicates a collapsed repeat in PEST; in contrast, the coverage of PacBio subreads aligned to the PacBio contig (left) is more uniform.
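The reasoning in this caption, that read depth well above the typical level suggests a collapsed repeat, can be illustrated with a small thresholding sketch. This is not the authors' method: the function name `flag_excess_coverage`, the use of the median as the baseline, and the two-fold cutoff are all assumptions chosen for illustration, and the depths are made-up toy values.

```python
from statistics import median


def flag_excess_coverage(window_depths, fold=2.0):
    """Return indices of windows whose depth is at least `fold` times the median.

    `window_depths` holds mean read depth per fixed-size window along a
    reference sequence. The fold cutoff is an illustrative assumption.
    """
    if not window_depths:
        return []
    baseline = median(window_depths)
    if baseline == 0:
        return []
    return [i for i, depth in enumerate(window_depths) if depth >= fold * baseline]


# Hypothetical per-window depths; windows 4 and 5 mimic a collapsed repeat.
toy_depths = [48, 52, 50, 49, 155, 160, 51, 47]
print(flag_excess_coverage(toy_depths))  # → [4, 5]
```

In practice the per-window depths would come from an alignment of the subreads to the reference, and flagged windows would still need inspection, since high depth can have causes other than a collapsed repeat.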
Figure 4. Alignment of X pericentromeric contigs to PEST, highlighting likely order and orientation issues in the PEST assembly that are resolved by a single PacBio contig (22F).