Shengjun Bai, Hainan Wu, Jinpeng Zhang, Zhiliang Pan, Wei Zhao, Zhiting Li, Chunfa Tong.
Abstract
Populus deltoides has important ecological and economic value and is widely used in poplar breeding programs because of superior characteristics such as rapid growth and disease resistance. Although the genome sequence of P. deltoides WV94 is available, the assembly is fragmented. Here, we report an improved chromosome-level assembly of the P. deltoides cultivar I-69, produced by combining Nanopore sequencing and chromosome conformation capture (Hi-C) technologies. The assembly was 429.3 Mb in size and contained 657 contigs with a contig N50 length of 2.62 Mb. Hi-C scaffolding of the contigs generated 19 chromosome-level sequences, which covered 97.4% (418 Mb) of the total assembly size. Moreover, annotation of repetitive sequences showed that 39.28% of the P. deltoides genome was composed of interspersed elements, including retroelements (23.66%), DNA transposons (6.83%), and unclassified elements (8.79%). We also identified a total of 44 362 protein-coding genes in the current P. deltoides assembly. Compared with the previous genome assembly of P. deltoides WV94, the current assembly was substantially improved in quality: the contig N50 increased 3.5-fold and the proportion of gaps decreased from 3.2% to 0.08%. This high-quality, well-annotated genome assembly provides a reliable genomic resource for identifying genome variants among individuals, mining candidate genes that control growth and wood quality traits, and facilitating further application of genomics-assisted breeding in populations related to P. deltoides. © The American Genetic Association. 2021.
Keywords: Populus deltoides; Nanopore sequencing; chromosome conformation capture technology; genome assembly
Year: 2021 PMID: 33730157 PMCID: PMC8141683 DOI: 10.1093/jhered/esab010
Source DB: PubMed Journal: J Hered ISSN: 0022-1503 Impact factor: 2.645
Software used for genome assembly and annotation pipeline in the current study
| Step | Software | Version |
|---|---|---|
| Genome assembly pipeline | | |
| De novo assembly | Canu | v1.9 |
| Contig polishing | Racon | v1.4.3 |
| | Medaka | v1.4.3 |
| | NextPolish | v1.1.0 |
| Remove heterozygous sequences | Minimap2 | v2.12 |
| | Purge Haplotigs | v1.1.1 |
| Hi-C contact map generation | Juicer | v1.7.6 |
| Hi-C scaffolding | 3D-DNA | v180114 |
| Manual assembly inspection | Juicebox Assembly Tools | v1.11.8 |
| Genome evaluation | | |
| Assembly completeness | BUSCO | v4.0.1 |
| Read mapping | BWA-MEM | v0.7.17 |
| SNP calling | Sambamba | v0.7.1 |
| | GATK | v4.1.8 |
| Repeat annotation | | |
| Construct a de novo repeat library | RepeatModeler | v2.0.1 |
| Repeat assessment | RepeatMasker | v4.1.0 |
| Gene prediction | | |
| De novo transcript assembly | Trinity | v2.1.1 |
| Transcript evidence | PASA | v2.3.3 |
| Homology prediction | tBlastn | v2.2.31 |
| | Exonerate | v2.4.0 |
| Ab initio prediction | SNAP | v2006-07-28 |
| | GeneMark-ES | v4.59 |
| | AUGUSTUS | v3.3.3 |
| Integrate all gene structures | EVM | v1.1.1 |
| Update final gene sets | PASA | v2.3.3 |
Summary of sequencing data for the genome assembly and annotation of Populus deltoides I-69
| Data type | Number of reads | Total bases (bp) | Coverage |
|---|---|---|---|
| HQ Illumina sequencing | 235 343 502 | 23 769 512 714 | 55× |
| Nanopore sequencing | 2 752 090 | 44 352 852 004 | 100× |
| Hi-C sequencing | 340 036 044 | 51 005 406 600 | 118× |
| HQ RNA-seq of stem | 49 765 508 | 7 464 826 200 | 17× |
| HQ RNA-seq of leaf | 47 295 142 | 4 256 562 780 | 10× |
HQ, high-quality.
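The coverage column can be related to the total-bases column by dividing by a genome size. The record does not state which genome size the authors used, so the minimal sketch below assumes the reported I-69 assembly size of 429.3 Mb; the names `TOTAL_BASES`, `ASSUMED_GENOME_SIZE_BP`, and `fold_coverage` are illustrative only. Under that assumption the results land close to, but not exactly on, the tabulated values, which suggests a slightly different genome size estimate or rounding in the original.

```python
# Total bases (bp) copied from the sequencing summary table above.
TOTAL_BASES = {
    "HQ Illumina sequencing": 23_769_512_714,
    "Nanopore sequencing": 44_352_852_004,
    "Hi-C sequencing": 51_005_406_600,
    "HQ RNA-seq of stem": 7_464_826_200,
    "HQ RNA-seq of leaf": 4_256_562_780,
}

# Assumption: the reported I-69 assembly size (429.3 Mb) stands in for the
# genome size; the source does not say which size was actually used.
ASSUMED_GENOME_SIZE_BP = 429_300_000


def fold_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Return sequencing depth as total bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome size must be positive")
    return total_bases / genome_size_bp


# Depth for every data type under the assumed genome size.
coverage = {
    name: fold_coverage(bases, ASSUMED_GENOME_SIZE_BP)
    for name, bases in TOTAL_BASES.items()
}

for name, depth in coverage.items():
    print(f"{name}: {depth:.1f}x")
```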
Comparison between the assemblies of Populus deltoides WV94 and P. deltoides I-69 genomes
| Assembly statistics | P. deltoides WV94 | P. deltoides I-69 |
|---|---|---|
| Assembly size (Mb) | 446.8 | 429.3 |
| Total number of contigs | NA | 657 |
| Longest contig (Mb) | NA | 16.8 |
| Contig N50 (Mb) | 0.59 | 2.62 |
| Scaffold N50 (Mb) | 21.7 | 21.5 |
| Total number of scaffolds | 1375 | 934 |
| Assembly covered by the 19 chromosomes (%) | 90.2 | 97.4 |
| Gaps (%) | 3.2 | 0.08 |
| GC content (%) | 32.32 | 33.38 |
| Complete BUSCOs (%) | 97.7 | 98.2 |
NA, not applicable.
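Two of the statistics compared above, N50 and gap percentage, follow standard definitions that are easy to state in code. The sketch below is a generic illustration, not the authors' evaluation script: `n50` and `gap_percent` are hypothetical helper names, and `toy_lengths` is made-up data. The last lines compare the two contig N50 values from the table. Using the rounded table values, the ratio is about 4.4 and the relative increase about 3.4, so the abstract's "3.5-fold" figure may rest on unrounded numbers.

```python
from typing import Sequence


def n50(lengths: Sequence[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise AssertionError("unreachable for non-empty input")


def gap_percent(sequence: str) -> float:
    """Return the percentage of gap characters (N or n) in a sequence."""
    if not sequence:
        raise ValueError("sequence must be non-empty")
    gaps = sequence.count("N") + sequence.count("n")
    return 100 * gaps / len(sequence)


# Hypothetical toy contig lengths, not data from the study.
toy_lengths = [10, 9, 8, 7, 6, 5, 4, 3, 2]
toy_n50 = n50(toy_lengths)

# Contig N50 values (Mb) as reported in the comparison table.
wv94_n50, i69_n50 = 0.59, 2.62
ratio = i69_n50 / wv94_n50                           # new value / old value
relative_increase = (i69_n50 - wv94_n50) / wv94_n50  # change relative to old

print(f"toy N50: {toy_n50}")
print(f"ratio: {ratio:.2f}, relative increase: {relative_increase:.2f}")
```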
Figure 1. The genome-wide Hi-C interaction heatmap of the 19 chromosomes in Populus deltoides. Each chromosome is framed by a larger block, and each contig by a smaller block. The heatmap shows Hi-C interactions at a resolution of 500 kb. Darker color indicates higher contact frequency.
Figure 2. Comparison of the BUSCO analysis between genome assemblies and gene annotations of Populus deltoides I-69 and P. deltoides WV94.