Fabien Dutreux1,2, Corinne Da Silva1, Léo d'Agata1, Arnaud Couloux1, Elise J Gay2, Benjamin Istace1, Nicolas Lapalu2, Arnaud Lemainque1, Juliette Linglin2, Benjamin Noel1, Patrick Wincker3, Corinne Cruaud1, Thierry Rouxel2, Marie-Hélène Balesdent2, Jean-Marc Aury1.
Abstract
Leptosphaeria maculans and Leptosphaeria biglobosa are ascomycete phytopathogens of Brassica napus (oilseed rape, canola). Here we report the complete sequence of three Leptosphaeria genomes (L. maculans JN3, L. maculans Nz-T4 and L. biglobosa G12-14). The Nz-T4 and G12-14 genome assemblies were generated de novo, and the reference JN3 genome assembly was improved using Oxford Nanopore MinION reads. The new assembly of L. biglobosa revealed AT-rich regions and pointed to a genome compartmentalization that had not been suspected from earlier Illumina sequencing. Moreover, nanopore sequencing allowed us to generate a chromosome-level assembly for the L. maculans reference isolate, JN3. The genome annotation was supported by integrating conserved proteins and RNA sequencing from Leptosphaeria-infected samples. The newly produced high-quality assemblies and annotations of these three Leptosphaeria genomes will allow further studies, notably of the tripartite interaction between L. maculans, L. biglobosa and oilseed rape. The discovery of as yet unknown effectors should in particular help progress in B. napus breeding for resistance to L. maculans.
Year: 2018 PMID: 30398473 PMCID: PMC6219404 DOI: 10.1038/sdata.2018.235
Source DB: PubMed Journal: Sci Data ISSN: 2052-4463 Impact factor: 6.444
Figure 1. General description of the assembly workflow.
Improvement of the existing JN3 assembly and de novo assembly of the Nz-T4 and G12-14 isolates.
Metrics of raw nanopore datasets.
| Read set | Metric | L. maculans JN3 | L. maculans Nz-T4 | L. biglobosa G12-14 |
|---|---|---|---|---|
| All | Number of reads | 1,736,075 (23 flowcells) | 883,625 (30 flowcells) | 931,376 (19 flowcells) |
| All | Cumulative size (bp) | 6,044,436,091 | 4,813,632,244 | 3,362,956,224 |
| All | Estimated coverage | 134X | 107X | 96X |
| All | Average size (bp) | 3,482 | 5,448 | 3,611 |
| All | Longest read (bp) | 1,612,597 | 1,769,119 | 949,732 |
| All | N50 (bp) | 6,441 | 7,197 | 7,115 |
| All | # reads > 10 kb | 72,482 | 76,176 | 61,856 |
| 1D | Number of reads | 1,278,253 | 443,063 | 482,374 |
| 1D | Cumulative size (bp) | 4,402,377,873 | 2,272,788,773 | 1,693,737,429 |
| 1D | Estimated coverage | 98X | 51X | 48X |
| 1D | Average size (bp) | 3,444 | 5,130 | 3,511 |
| 1D | Longest read (bp) | 1,612,597 | 1,769,119 | 949,732 |
| 1D | N50 (bp) | 6,431 | 7,497 | 7,250 |
| 1D | # reads > 10 kb | 52,999 | 42,132 | 33,148 |
| 2D | Number of reads | 457,822 | 440,562 | 449,002 |
| 2D | Cumulative size (bp) | 1,642,058,218 | 2,540,843,471 | 1,669,218,795 |
| 2D | Estimated coverage | 36X | 56X | 48X |
| 2D | Average size (bp) | 3,587 | 5,767 | 3,718 |
| 2D | Longest read (bp) | 237,404 | 301,948 | 84,061 |
| 2D | N50 (bp) | 6,469 | 7,008 | 7,005 |
| 2D | # reads > 10 kb | 19,483 | 34,044 | 28,708 |
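
The estimated coverage values reported for the read sets can be checked as cumulative size divided by genome size. The sketch below is illustrative only: the genome sizes (45 Mb for L. maculans, 35 Mb for L. biglobosa) are rounded assumptions that are not stated in this record, and the column-to-isolate mapping is assumed from the order in which the abstract lists the isolates. With those assumptions the rounded results match the "All" rows of the nanopore table.

```python
def estimated_coverage(total_bases: int, genome_size_bp: int) -> float:
    """Sequencing depth as total sequenced bases divided by genome size."""
    if genome_size_bp <= 0:
        raise ValueError("genome_size_bp must be positive")
    return total_bases / genome_size_bp


# ASSUMPTION: rounded genome sizes, not taken from the source record.
ASSUMED_GENOME_SIZE_BP = {
    "L. maculans": 45_000_000,
    "L. biglobosa": 35_000_000,
}

# (assumed isolate, species, cumulative size in bp) from the "All" rows.
ALL_NANOPORE_READS = [
    ("JN3", "L. maculans", 6_044_436_091),
    ("Nz-T4", "L. maculans", 4_813_632_244),
    ("G12-14", "L. biglobosa", 3_362_956_224),
]

if __name__ == "__main__":
    for isolate, species, total_bases in ALL_NANOPORE_READS:
        depth = estimated_coverage(total_bases, ASSUMED_GENOME_SIZE_BP[species])
        print(f"{isolate}: about {round(depth)}X")
```

The same division applied to the 1D, 2D and Illumina cumulative sizes gives values that agree with the reported coverages after rounding, which suggests (but does not prove) that genome sizes close to these were used.
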
Metrics of Illumina datasets.
| Metric | L. maculans JN3 | L. maculans Nz-T4 | L. biglobosa G12-14 |
|---|---|---|---|
| Number of reads | 36,421,608 | 19,106,872 | 22,108,636 |
| Cumulative size (bp) | 8,089,956,155 | 4,399,409,082 | 4,198,089,452 |
| Estimated coverage | 180X | 98X | 120X |
| Read size (bp), instrument | 2×251, HiSeq2500 | 2×251, MiSeq | 2×251, MiSeq |
Metrics of nanopore datasets used for genome assemblies.
| Metric | L. maculans JN3 | L. maculans Nz-T4 | L. biglobosa G12-14 |
|---|---|---|---|
| Status | complete dataset | 2D dataset | 2D dataset |
| Number of reads | 1,736,075 | 440,562 | 449,002 |
| Cumulative size (bp) | 6,044,436,091 | 2,540,843,471 | 1,669,218,795 |
| Estimated coverage | 134X | 56X | 48X |
| Average size (bp) | 3,482 | 5,767 | 3,718 |
| Longest read (bp) | 1,612,597 | 301,948 | 84,061 |
| N50 (bp) | 6,441 | 7,008 | 7,005 |
| # reads > 10 kb | 72,482 | 34,044 | 28,708 |
Metrics of the existing and new assemblies.
| Metric | L. maculans JN3 | L. maculans JN3 | L. maculans Nz-T4 | L. biglobosa | L. biglobosa G12-14 |
|---|---|---|---|---|---|
| Reference | Rouxel | This study | This study | Grandaubert | This study |
| # sequences | 41 | 33 | 288 | 606 | 156 |
| Cumulative size (bp) | 44,892,605 | 45,986,477 | 43,426,637 | 31,788,051 | 34,950,111 |
| N50 (bp) | 1,769,547 | 2,437,616 | 383,462 | 779,070 | 462,395 |
| N90 (bp) | 1,020,521 | 1,391,278 | 64,152 | 125,916 | 123,107 |
| L50 | 10 | 8 | 38 | 14 | 22 |
| L90 | 22 | 18 | 150 | 49 | 77 |
| # of N’s | 1,128,152 (2.51%) | 489,445 (1.06%) | 0 (0%) | 2,343,201 (7.37%) | 0 (0%) |
| % GC | 45.24% | 45.22% | 45.69% | 51.39% | 49.13% |
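
For readers less familiar with the contiguity metrics in this table, the following sketch uses the usual definitions: Nx is the length of the shortest sequence in the smallest set of longest sequences that together cover at least x% of the assembly, and Lx is the number of sequences in that set. The function name `nx_lx` and the toy lengths are hypothetical, and the record does not say which tool produced the reported values.

```python
from typing import Iterable, Tuple


def nx_lx(lengths: Iterable[int], x: int = 50) -> Tuple[int, int]:
    """Return (Nx, Lx) for a collection of sequence lengths.

    Sequences are visited from longest to shortest; the walk stops at the
    first sequence where the running total reaches x% of the summed length.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("lengths must not be empty")
    total = sum(ordered)
    running = 0
    for rank, length in enumerate(ordered, start=1):
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * x:
            return length, rank
    raise ValueError("x must be between 0 and 100")


if __name__ == "__main__":
    toy_contigs = [100, 80, 60, 40, 20]  # hypothetical lengths, total 300
    print(nx_lx(toy_contigs, 50))  # N50 and L50
    print(nx_lx(toy_contigs, 90))  # N90 and L90
```

A higher N50 together with a lower L50, as in the improved JN3 assembly compared with the earlier one, means that fewer and longer sequences carry half of the assembly.
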
Figure 2. General description of the gene prediction workflow.
Metrics of the existing and new gene predictions.
| Metric | L. maculans JN3 | L. maculans JN3 | L. maculans Nz-T4 | L. biglobosa | L. biglobosa G12-14 |
|---|---|---|---|---|---|
| Reference | Rouxel | This study | This study | Grandaubert | This study |
| # genes | 12,611 | 13,047 | 14,026 | 11,390 | 12,678 |
| # mono-exonic genes | 2,931 | 5,204 | 5,898 | 2,726 | 4,630 |
| Gene length in bp (avg:med) | 1,592:1,278 | 1,652:1,341 | 1,507:1,211 | 1,501:1,217 | 1,679:1,337 |
| # exons per gene (avg:med) | 2.94:2 | 2.28:2 | 2.20:2 | 2.67:2 | 2.35:2 |
| CDS length in bp (avg:med) | 1,392:1,110 | 1,177:891 | 1,107:795 | 1,307:1,065 | 1,208:954 |
| # introns | 24,475 | 16,700 | 16,887 | 18,998 | 17,076 |
| Intron length in bp (avg:med) | 103:63 | 127:57 | 98:57 | 116:57 | 151:56 |
| Coding fraction | 38.9% | 33.4% | 35.8% | 46.8% | 43.8% |
| BUSCO (euk) | 89% | 98% | 96% | 93% | 96% |
| BUSCO (fungi) | 84% | 96% | 93% | 90% | 95% |
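
The record does not state how the coding fraction was computed. One plausible reading, sketched below as an assumption, is total CDS length divided by assembly size, approximated here as number of genes times average CDS length over the cumulative assembly size from the assembly metrics table. The function name is hypothetical, and the column-to-isolate mapping is assumed from the order in which the abstract lists the isolates.

```python
def approx_coding_fraction(n_genes: int, mean_cds_bp: float, assembly_bp: int) -> float:
    """Approximate coding fraction (%) as genes x mean CDS length / assembly size.

    ASSUMPTION: this formula is a guess at the definition, not the authors'
    stated method; the mean CDS length is itself a rounded table value.
    """
    if assembly_bp <= 0:
        raise ValueError("assembly_bp must be positive")
    return 100.0 * n_genes * mean_cds_bp / assembly_bp


# (label, # genes, average CDS length, cumulative assembly size), "This study" columns.
NEW_ANNOTATIONS = [
    ("JN3", 13_047, 1_177, 45_986_477),
    ("Nz-T4", 14_026, 1_107, 43_426_637),
    ("G12-14", 12_678, 1_208, 34_950_111),
]

if __name__ == "__main__":
    for label, n_genes, mean_cds, assembly_bp in NEW_ANNOTATIONS:
        fraction = approx_coding_fraction(n_genes, mean_cds, assembly_bp)
        print(f"{label}: {fraction:.1f}% coding")
```

Under this reading the three new annotations come out within about 0.1 percentage point of the reported 33.4%, 35.8% and 43.8%, so the lower coding fraction in L. maculans is consistent with its larger assembly rather than with a smaller gene set.
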
Figure 3. Genome browser database for the three Leptosphaeria isolates.
The genome browser is available at http://www.genoscope.cns.fr/leptolife and contains repeats (green track), coverage of genomic reads (black wiggle), gene prediction (blue track), RNA contigs (dark green track) and protein homologies (salmon track).
Genome compartmentalization of old and new assemblies.
| Block type | Metric | L. maculans JN3 | L. maculans JN3 | L. maculans Nz-T4 | L. biglobosa | L. biglobosa G12-14 |
|---|---|---|---|---|---|---|
| | Reference | Rouxel | This study | This study | Grandaubert | This study |
| GC blocks | # blocks | 399 | 534 | 587 | 389 | 318 |
| GC blocks | % assembly | 64 | 63.1 | 66.6 | 95.1 | 84.5 |
| GC blocks | mean size (kb) | 70.4 | 54.4 | 49.2 | 77.7 | 92.9 |
| GC blocks | stdev (kb) | — | 63.1 | 61.9 | 226 | 157 |
| GC blocks | min (kb) | 1 | 1 | 1 | 1 | 1 |
| GC blocks | max (kb) | 500 | 431 | 349 | 1708 | 1180 |
| GC blocks | # genes | — | 12,892 | 13,790 | 11,361 | 12,603 |
| AT blocks | # blocks | 413 | 564 | 779 | 324 | 335 |
| AT blocks | % assembly | 36 | 36.9 | 33.4 | 4.87 | 15.5 |
| AT blocks | mean size (kb) | 38.6 | 30.1 | 18.6 | 4.8 | 16.2 |
| AT blocks | stdev (kb) | — | 48.1 | 23.5 | 10.7 | 17.7 |
| AT blocks | min (kb) | 1 | 1 | 1 | 1 | 1 |
| AT blocks | max (kb) | 320 | 319 | 245 | 120 | 142 |
| AT blocks | # genes | — | 233 | 236 | 29 | 75 |

∗Results from Grandaubert