Annarita Marrano, Monica Britton, Paulo A Zaini, Aleksey V Zimin, Rachael E Workman, Daniela Puiu, Luca Bianco, Erica Adele Di Pierro, Brian J Allen, Sandeep Chakraborty, Michela Troggio, Charles A Leslie, Winston Timp, Abhaya Dandekar, Steven L Salzberg, David B Neale.
Abstract
BACKGROUND: The release of the first reference genome of walnut (Juglans regia L.) enabled many achievements in the characterization of walnut genetic and functional variation. However, that assembly is highly fragmented, preventing the integration of genetic, transcriptomic, and proteomic information to fully elucidate walnut biological processes.
Keywords: Hi-C; Iso-Seq; Nanopore; allergens; gene prediction; genetic diversity; proteome
Year: 2020 PMID: 32432329 PMCID: PMC7238675 DOI: 10.1093/gigascience/giaa050
Source DB: PubMed Journal: Gigascience ISSN: 2047-217X Impact factor: 6.524
Comparison among the 4 assemblies of Chandler
| Statistic | Chandler v1.0 | Chandler v1.5 | Chandler hybrid | Chandler HiRise | Chandler v2.0 | JrSerr_v1.0 |
|---|---|---|---|---|---|---|
| No. of scaffolds | 27,032 | 4,401 | 3,497 | 2,656 | 2,643 | 73 |
| N50 length (scaffolds) (bp) | 304,423 | 637,984 | 1,640,935 | 32,655,472 | 37,114,715 | 35,197,335 |
| L50 (scaffolds) | 344 | 272 | 89 | 8 | 7 | 7 |
| Total length of assembled scaffolds (bp) | 667,299,356 | 650,478,320 | 567,378,842 | 567,480,142 | 567,796,851 | 534,671,929 |
| No. of contigs | 53,156 | 7,411 | 3,592 | 3,700 | 3,684 | 127 |
| N50 length (contigs) (bp) | 42,417 | 317,751 | 1,512,354 | 1,083,883 | 1,083,883 | 15,066,219 |
| L50 (contigs) | 3,630 | 482 | 97 | 144 | 144 | 13 |
| Total size of assembled contigs (bp) | 641,521,787 | 617,088,256 | 567,276,004 | 567,276,244 | 567,192,099 | 530,618,363 |
Scaffolds shorter than 1,000 bp are not included in these totals.
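The N50 and L50 rows above follow the usual assembly-statistics convention: N50 is the length of the sequence at which the running total, taken from longest to shortest, first reaches half of the assembly size, and L50 is the number of sequences needed to get there. The following is a minimal Python sketch of that convention, not the tool used to build the table; the function name `n50_l50` and the toy lengths are illustrative assumptions.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of sequence lengths in bp."""
    if not lengths:
        raise ValueError("need at least one sequence length")
    # Walk from the longest sequence to the shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is N50; 'count' sequences were needed, so it is L50.
            return length, count


# Hypothetical toy scaffold lengths (bp), not taken from any assembly.
toy_scaffolds = [50, 40, 30, 20, 10]
print(n50_l50(toy_scaffolds))  # (40, 2)
```

A higher N50 together with a lower L50, as seen when moving from Chandler v1.0 to Chandler v2.0, indicates that fewer and longer scaffolds cover half of the assembly.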
Figure 1: Collinearity between the high-density “Chandler” genetic map of Marrano et al. [16] and the 16 chromosomal pseudomolecules of Chandler v2.0.
Figure 2: Summary of gene distribution and genetic diversity across the 16 chromosomes of Chandler v2.0. Tracks from outside to inside: (i) gene density of Chandler v2.0 in 1-Mb windows; (ii) Chandler heterozygosity in 1-Mb windows (white = low heterozygosity; blue = high heterozygosity); (iii) recombination rate for sliding windows of 10 Mb (average = 2.63 cM/Mb); (iv) FST in 500-kb windows, with windows in the 95th percentile of the FST distribution highlighted in red; (v) ROD values for 500-kb windows.
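Highlighting windows in the upper tail of a per-window statistic, as done for FST in Figure 2, amounts to computing a percentile threshold and keeping the windows above it. The sketch below shows one way to do this with a nearest-rank percentile. The percentile definition, the function names, and the toy values are assumptions for illustration, since the record does not state how the threshold was computed.

```python
import math


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def flag_outlier_windows(window_values, pct=95):
    """Return the set of window keys whose value exceeds the pct-th percentile."""
    threshold = percentile_nearest_rank(window_values.values(), pct)
    return {window for window, value in window_values.items() if value > threshold}


# Hypothetical FST values for twenty 500-kb windows on one chromosome,
# keyed by (chromosome, window start in bp).
toy_fst = {("Chr1", i * 500_000): i / 100 for i in range(1, 21)}
print(flag_outlier_windows(toy_fst))  # {('Chr1', 10000000)}
```

With 20 windows, only the single highest window lies above the 95th percentile, which matches the expected 5% tail.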
Statistics on the gene annotation of Chandler v2.0 compared with the previous gene annotations of the Chandler genome
| Statistics | Chandler v2.0 | Chandler v1.0 | Chandler RefSeq v1.0 |
|---|---|---|---|
| No. of genes | 37,554 | 32,496 | 41,188 |
| Mean gene length (bp) | 5,319 | 4,358 | 4,641 |
| Single-exon transcripts | 6,613 | 6,247 | 6,749 |
| Mean CDS length (bp) | 1,335 | 1,222 | 1,336 |
| No. of exons | 242,208 | 172,273 | 230,261 |
| Mean exon length (bp) | 257.8 | 229.5 | 314 |
| No. of introns | 201,290 | 139,775 | 181,419 |
| Mean intron length (bp) | 853.9 | 730 | 835 |
| Mean number of introns per gene | 5.9 | 5.3 | 4.4 |
Statistics of the completeness of Chandler v2.0 assessed with BUSCO and compared with other Fagales genomes
| Genome | BUSCO complete (%) | BUSCO duplicated (%) | BUSCO fragmented (%) | BUSCO missing (%) | Reference |
|---|---|---|---|---|---|
| Chandler v2.0 | 95.1 | 12.6 | 1.3 | 3.6 | This genome |
| | 94.8 | 13.8 | 1.2 | 4.0 | [ |
| | 94.5 | 11.1 | 1.5 | 4.0 | [ |
| | 94 | 19 | 1.7 | 3.6 | [ |
| | 96.7 | 7.7 | 1.4 | 1.9 | [ |
| | 94 | 23 | 1.4 | 3.6 | [ |
| | 96 | 6 | 1 | 3 | [ |
| | 90 | 52 | 4 | 6 | [ |
| | 93 | 49 | 3 | 4 | [ |
Figure 3: Clustering of the samples used in the proteomic analysis. (A) Hierarchical clustering based on Euclidean distances of normalized abundances of detected proteins. Samples are represented in columns and proteins in rows. (B) Principal component analysis of the 12 samples analyzed, clustering according to tissue type.
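To make panel (A) concrete, the sketch below clusters samples by the Euclidean distance between their abundance profiles using a small agglomerative procedure. The caption does not name the linkage method, so average linkage is an assumption here, as are the function names, the sample labels, and the toy abundances. Panel (B), the principal component analysis, is not covered.

```python
import math
from itertools import combinations


def euclidean(a, b):
    """Euclidean distance between two equal-length abundance profiles."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def average_linkage(profiles):
    """Agglomerative clustering; returns merge steps as (cluster_a, cluster_b, distance)."""

    def dist(c1, c2):
        # Average pairwise distance between the members of two clusters.
        pairs = [(p, q) for p in c1 for q in c2]
        return sum(euclidean(profiles[p], profiles[q]) for p, q in pairs) / len(pairs)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        # Merge the two closest clusters and record the step.
        a, b = min(combinations(clusters, 2), key=lambda pair: dist(*pair))
        merges.append((a, b, dist(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return merges


# Hypothetical normalized abundances of two proteins in four samples.
toy_profiles = {
    "tissueA_rep1": [0.0, 0.0],
    "tissueA_rep2": [0.0, 1.0],
    "tissueB_rep1": [10.0, 10.0],
    "tissueB_rep2": [10.0, 11.0],
}
for step in average_linkage(toy_profiles):
    print(step)
```

In this toy input, replicates of the same hypothetical tissue merge first and the two tissue groups merge last, which is the kind of pattern a clustering by tissue type would show.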
Figure 4: Graphical visualization of haplotype block (HB) inheritance on Chr15 along with the Chandler pedigree. (A) The inner circle highlights in grey 2 regions of heterozygosity (5 HB the first and 7 HB the second), and in light green 2 regions of homozygosity (3 HB the first and 4 HB the second). The circle in the middle shows maternally inherited HBs, while the HBs inherited through the paternal line are visualized in the outer circle. Payne's haplotypes are clearly present in both parental lines. White spaces represent segments of missing haplotype information. (B) Chandler pedigree, where Pedro is the maternal line and 56–224 the paternal line.
Figure 5: Graphical visualization of haplotype block (HB) inheritance across the Chandler pedigree in the 16 chromosomes. The inner circle highlights in grey the regions of heterozygosity and in light green the regions of homozygosity for each chromosome. The circle in the middle shows the maternally inherited HBs, while the HBs inherited from the paternal line are visualized in the outer circle. In both parental line circles, missing data are highlighted in grey. Payne haplotypes are inherited along both parental lines in all chromosomes except Chr5, Chr9, Chr10, Chr14, and Chr16. The Chandler pedigree is represented on the side, where Pedro is the maternal line and 56–224 the paternal line.