Robert VanBuren1,2, Ching Man Wai1, Marivi Colle1, Jie Wang3, Shawn Sullivan4, Jill M Bushakra5, Ivan Liachko4, Kelly J Vining6, Michael Dossett6, Chad E Finn7, Rubina Jibran8, David Chagné8, Kevin Childs3, Patrick P Edger1, Todd C Mockler9, Nahla V Bassil5.
Abstract
Background: The fragmented nature of most draft plant genomes has hindered downstream gene discovery, trait mapping for breeding, and other functional genomics applications. There is a pressing need to improve or finish draft plant genome assemblies.

Findings: Here, we present a chromosome-scale assembly of the black raspberry genome using single-molecule real-time Pacific Biosciences sequencing and high-throughput chromatin conformation capture (Hi-C) genome scaffolding. The updated V3 assembly has a contig N50 of 5.1 Mb, representing an ∼200-fold improvement over the previous Illumina-based version. Each of the 235 contigs was anchored and oriented into seven chromosomes, correcting several major misassemblies. Black raspberry V3 contains 47 Mb of new sequences, including large pericentromeric regions and thousands of previously unannotated protein-coding genes. Among the new genes are hundreds of expanded tandem gene arrays that were collapsed in the Illumina-based assembly. Detailed comparative genomics with the high-quality V4 woodland strawberry genome (Fragaria vesca) revealed near-perfect 1:1 synteny with dramatic divergence in tandem gene array composition. Lineage-specific tandem gene arrays in black raspberry are related to agronomic traits such as disease resistance and secondary metabolite biosynthesis.

Conclusions: The improved resolution of tandem gene arrays highlights the need to reassemble these highly complex and biologically important regions in draft plant genomes. The updated, high-quality black raspberry reference genome will be useful for comparative genomics across the horticulturally important Rosaceae family and enable the development of marker-assisted breeding in Rubus.
Year: 2018 PMID: 30107523 PMCID: PMC6131213 DOI: 10.1093/gigascience/giy094
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Comparison of the black raspberry V1 and V3 assemblies
| Metric | V1 | V3 |
|---|---|---|
| Total assembly size, Mb | 243 | 290 |
| Number of contigs | 11,936 | 235 |
| Number of scaffolds | 2,226 | 7 |
| Contig N50, kb | 33.1 | 5,100 |
| Scaffold N50, Mb | 0.35 | 41.1 |
| LTR composition, % | 16.20 | 32.60 |
| Number of genes | 28,005 | 34,545 |
Figure 1: Updated chromosome-scale assembly of black raspberry. (A) Syntenic dot plot of the black raspberry V1 and V3 assemblies. Each blue point denotes a collinear genomic region. (B) Assembly graph of the V3 reference. Each line (node) represents a contig in the Canu assembly, and connections (edges) between contigs represent ambiguities in the graph structure. The color of contigs is randomly assigned. (C) Post-clustering heat map showing density of Hi-C interactions between contigs from the Proximity-Guided Assembly.
Summary of chromosome anchoring using the Hi-C genome map
| Chromosome | Anchored contigs | Total size (bp) |
|---|---|---|
| Ro01 | 19 | 34,302,027 |
| Ro02 | 19 | 40,757,823 |
| Ro03 | 30 | 43,767,452 |
| Ro04 | 30 | 38,746,748 |
| Ro05 | 25 | 41,095,993 |
| Ro06 | 37 | 50,854,034 |
| Ro07 | 75 | 41,277,220 |
| Total | 235 | 290,801,297 |
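The totals in the anchoring table, and the scaffold N50 reported in the assembly comparison, can be cross-checked with a few lines of arithmetic. The sketch below uses the standard N50 definition (the length of the sequence at which the cumulative length, counted from the longest sequence down, first reaches half of the total). Treating the seven anchored chromosomes as the complete scaffold set is an assumption made here for illustration; the function and variable names are hypothetical.

```python
# Chromosome sizes (bp) and anchored contig counts copied from the table above.
CHROM_SIZES = {
    "Ro01": 34_302_027,
    "Ro02": 40_757_823,
    "Ro03": 43_767_452,
    "Ro04": 38_746_748,
    "Ro05": 41_095_993,
    "Ro06": 50_854_034,
    "Ro07": 41_277_220,
}
ANCHORED_CONTIGS = {
    "Ro01": 19, "Ro02": 19, "Ro03": 30, "Ro04": 30,
    "Ro05": 25, "Ro06": 37, "Ro07": 75,
}


def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    lengths = sorted(lengths, reverse=True)
    if not lengths:
        raise ValueError("n50() requires at least one length")
    total = sum(lengths)
    running = 0
    for length in lengths:
        running += length
        # Stop at the first sequence where the running sum reaches half the total.
        if running * 2 >= total:
            return length


total_bp = sum(CHROM_SIZES.values())
total_contigs = sum(ANCHORED_CONTIGS.values())
scaffold_n50 = n50(CHROM_SIZES.values())

print(f"Total size: {total_bp:,} bp")
print(f"Anchored contigs: {total_contigs}")
print(f"Scaffold N50: {scaffold_n50 / 1e6:.1f} Mb")
```

Under this assumption, the computed values agree with the table's Total row and with the 41.1 Mb scaffold N50 listed for V3.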
Figure 2: Genome landscape of the black raspberry V3 genome. The composition of long terminal repeat (LTR) retrotransposons, centromeric repeat arrays (Cent. DNA), gene models carried over from the V1 assembly, and new gene models in V3 are plotted in 50-kb bins with a 25-kb sliding window. Terminal telomeric repeats are denoted by purple dots.
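The binning described in the Figure 2 caption can be sketched as follows. This is a minimal illustration, not the authors' plotting code: it assumes that "50-kb bins with a 25-kb sliding window" means 50-kb windows advanced in 25-kb steps, that feature intervals are half-open `(start, end)` coordinates, and that they do not overlap one another. The function name and the example intervals are hypothetical.

```python
def sliding_window_coverage(features, chrom_length, window=50_000, step=25_000):
    """Return (start, end, fraction_covered) for each window along a chromosome.

    features: iterable of (start, end) half-open intervals, assumed non-overlapping.
    """
    features = list(features)
    results = []
    start = 0
    while start < chrom_length:
        end = min(start + window, chrom_length)  # the last window may be shorter
        covered = 0
        for f_start, f_end in features:
            # Length of the overlap between this feature and the window, if any.
            covered += max(0, min(f_end, end) - max(f_start, start))
        results.append((start, end, covered / (end - start)))
        start += step
    return results


# Hypothetical example: two features on a 100-kb sequence.
example = sliding_window_coverage([(0, 25_000), (60_000, 70_000)], 100_000)
for win_start, win_end, fraction in example:
    print(win_start, win_end, round(fraction, 3))
```

Overlapping windows smooth the density track, so a single feature contributes to two adjacent windows.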
Figure 3: Expression patterns of new genes in the V3 black raspberry assembly. (A) Heat map of expression patterns of all 6,070 new genes with detectable expression. (B) Expression patterns of the top 100 genes with highest expression. Blue indicates low expression and red indicates high expression. Expression values are plotted as log2-transformed FPKM.
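A log2 transformation of FPKM values, as used for the Figure 3 heat maps, might look like the sketch below. The caption does not state how zero values were handled, so the pseudocount of 1 is an assumption, as are the function names and the toy gene identifiers.

```python
import math


def log2_fpkm(fpkm, pseudocount=1.0):
    """Log2-transform an FPKM value; the pseudocount (assumed) avoids log2(0)."""
    if fpkm < 0:
        raise ValueError("FPKM cannot be negative")
    return math.log2(fpkm + pseudocount)


def top_expressed(fpkm_by_gene, n=100):
    """Return the n gene IDs with the highest FPKM, highest first."""
    ranked = sorted(fpkm_by_gene, key=fpkm_by_gene.get, reverse=True)
    return ranked[:n]


# Hypothetical FPKM values for three placeholder genes.
toy = {"geneA": 0.0, "geneB": 3.0, "geneC": 7.0}
transformed = {gene: log2_fpkm(value) for gene, value in toy.items()}
print(transformed)
print(top_expressed(toy, n=2))
```

With a pseudocount of 1, an unexpressed gene maps to 0 on the log2 scale, which keeps the color scale anchored at zero.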
Figure 4: Comparison of tandem gene array sizes in the V1 and V3 black raspberry assemblies. (A) Venn diagram of gene models specific to V1 (blue), specific to V3 (orange), and shared. (B) Comparison of total tandem gene duplicates in V1 and V3. (C) The number of genes found in both the V1 and V3 assemblies (blue) or only V3 (orange) is plotted for tandem arrays ranging in size from 5 to 26 copies. Array size is based on the V3 annotation.
Figure 5: Comparative genomics of the black raspberry V3 and woodland strawberry (Fragaria vesca) V4 genomes. (A) Macrosyntenic dot plot between the black raspberry and F. vesca genomes. Each black dot represents a syntenic region between the two genomes. The inlaid bar graph shows syntenic depth of each black raspberry and F. vesca syntenic block. (B) Chromosome-scale collinearity between black raspberry and F. vesca. Collinear regions between chromosomes Ro01 and Fvb1 and between chromosomes Ro06 and Fvb6 are highlighted in red and blue, respectively, and shown in more detail in (C). (C) Microsynteny of two regions showing lineage-specific expansion in Fvb1 (top comparison) and Ro06 (bottom). Genes are shown in red or blue (top and bottom, respectively), with shade indicating gene orientation (light for forward, dark for reverse). Syntenic gene pairs are connected by gray lines.