Adriaan Vanheule, Kris Audenaert, Sven Warris, Henri van de Geest, Elio Schijlen, Monica Höfte, Sarah De Saeger, Geert Haesaert, Cees Waalwijk, Theo van der Lee.
Abstract
BACKGROUND: Eukaryotes display remarkable genome plasticity, which can include supernumerary chromosomes that differ markedly from the core chromosomes. Despite the widespread occurrence of supernumerary chromosomes in fungi, their origin, relation to the core genome and the reason for their divergent characteristics are still largely unknown. The complexity of genome assembly due to the presence of repetitive DNA partially accounts for this.
Keywords: Fusarium; Gene duplications; Repeat-induced point mutation; Single-molecule real-time sequencing; Supernumerary chromosomes; Translocation; Transposable elements
Year: 2016 PMID: 27552804 PMCID: PMC4994206 DOI: 10.1186/s12864-016-2941-6
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Comparison of the SMRT and HiSeq assemblies of isolate 2516. The statistics for the SMRT assembly were extracted from the final version of the assembly: four core chromosomes and 172 supernumerary contigs
| | SMRT assembly | HiSeq assembly |
|---|---|---|
| Number of contigs | 176 | 1253 |
| Average coverage | 20.2 | 111.5 |
| Total sequence length (bp) | 46309701 | 39020932 |
| Average sequence length (bp) | 263123 | 31142 |
| Minimum sequence length (bp) | 10816 | 1004 |
| Maximum sequence length (bp) | 11790407 | 701709 |
| N50 sequence index (# of contigs) | 2 | 62 |
| N50 sequence length (bp) | 8783590 | 170721 |
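The N50 rows in the table above follow the usual definition: sort contigs from longest to shortest and walk down the list until the running total reaches half of the assembly size. The length of the contig reached is the N50 length and its rank is the N50 index. A minimal sketch of that calculation, using hypothetical toy contig lengths (the paper's individual contig lengths are not listed here), together with a check that the reported average sequence lengths equal total length divided by contig count:

```python
def n50(lengths):
    """Return (N50 length, N50 index) for a list of contig lengths."""
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for index, length in enumerate(ordered, start=1):
        running += length
        if running >= half:
            return length, index
    raise ValueError("no contigs given")


# Toy contig lengths, purely illustrative (not taken from the paper).
toy_contigs = [50, 40, 30, 20, 10]
n50_length, n50_index = n50(toy_contigs)

# Average sequence length = total length / number of contigs,
# using the totals and contig counts reported in the table.
smrt_mean = round(46309701 / 176)
hiseq_mean = round(39020932 / 1253)
```

With the table's figures, `smrt_mean` and `hiseq_mean` reproduce the reported average sequence lengths of 263123 bp and 31142 bp.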
Fig. 1 Chromosome alignments between F. graminearum (x axis) and F. poae (y axis). The best 1:1 alignment is shown between the four chromosomes of F. graminearum and the four core chromosomes of F. poae. Red indicates best hits in the same orientation while blue indicates inversions. The short arm of F. graminearum chromosome four ends in ~1.4 Mb of rDNA repeats that are not assembled in F. poae. All F. graminearum telomeres except the telomere of the short arm of chromosome 4 are assembled. For F. poae, the same telomere is lacking as well as the one on the long arm of chromosome 1. Telomeres that are assembled are shown with green bars on the arms of the chromosomes. Two insertions into F. poae chromosome 3 are denoted with black arrows. Approximate locations of the centromeres are shown with black circles (see Additional file 18)
General features of the machine annotation of F. poae 2516. These were compared to the features of the published annotation of F. graminearum PH-1 [35]
| Feature | F. poae 2516, total | F. poae 2516, core | F. poae 2516, supernumerary | F. graminearum PH-1 |
|---|---|---|---|---|
| Genome size (bp) | 46 309 701 | 38 129 297 | 8 180 404 | 37 958 956 |
| GC% | 46.30 % | 46.00 % | 47.60 % | 48.20 % |
| # of genes | 14 817 | 12 097 | 2 720 | 14 160 |
| Mean gene density (per Mb) | 320 | 317 | 332 | 373 |
| Median gene length (bp) | 1 391 | 1 406 | 1 309 | 1 257 |
| Avg introns/gene | 1.82 | 1.88 | 1.57 | 1.72 |
| Median intron length (bp) | 54 | 54 | 57 | 55 |
BUSCO analyses of F. poae and F. graminearum
| Organism | Complete | Fragmented | Missing | Total |
|---|---|---|---|---|
| | 1431 | 7 | 0 | 1438 |
| | 1432 | 6 | 0 | 1438 |
Classification and key characteristics of TE families in the genome of F. poae 2516. Elements below the length threshold for RIP are not included (MITE, ZIT1). Repetitive elements such as the rDNA tandem and two families of telomere linked RecQ helicases are not included. Nomenclature of TEs is as recommended in literature [73]. R retrotransposon, D DNA transposon, L long terminal repeat (LTR), T terminal inverted repeat (TIR), G Gypsy, C Copia, F Fot1/Pogo, T Tc1/mariner, M Mutator, A hAT, x unknown. n/a designates instances where a TIR/LTR could not be detected for a specific element
| Element | Core, intact | Core, RIP | Supernumerary, intact | Supernumerary, RIP | Size (bp) | LTR/TIR (bp) | Family |
|---|---|---|---|---|---|---|---|
| Retrotransposons | |||||||
| RLG_ | 27 | 25 | 11 | - | 5684 | 240 | Gypsy/Ty3 like |
| RLG_ | 5 | 7 | 13 | - | 6561 | 379 | Gypsy/Ty3 like |
| RLC_ | - | 1 | 14 | - | 4900 | 195 | Copia/Ty1 like |
| Rxx_ | - | - | 30 | - | 2234 | n/a | unknown |
| DNA transposons | |||||||
| DTF_ | 1 | - | - | - | 1852 | 48 | Pogo |
| DTF_ | - | - | 1 | - | 2133 | 43 | Pogo |
| DTF_ | - | 2 | 41 | - | 2220 | 90 | Pogo |
| DTF_ | - | 1 | 7 | - | 2212 | 75 | Pogo |
| DTF_ | - | 1 | 20 | - | 2200 | 73 | Pogo |
| DTF_ | - | - | 9 | - | 2203 | 73 | Pogo |
| DTF_ | 40 | 10 | 9 | - | 1865 | 51 | Pogo |
| DTF_ | - | 15 | 7 | - | 1865 | 51 | Pogo |
| DTF_ | - | - | 21 | - | 2909 | 98 | Pogo |
| DTF_ | 12 | 11 | 24 | - | 2868 | 90 | Pogo |
| DTF_ | 33 | - | 24 | - | 1934 | 51 | Pogo |
| DTF_ | 8 | 6 | 8 | - | 2885 | 84 | Pogo |
| DTF_ | - | 1 | 10 | - | 1854 | 36 | Pogo |
| DTF_ | - | - | 12 | - | 2749 | 79 | Pogo |
| DTA_ | - | - | 17 | - | 2912 | 27 | hAT-like |
| DTA_ | - | - | 11 | - | 2975 | 22 | hAT-like |
| DTA_ | - | - | 12 | - | 2954 | n/a | hAT-like |
| DTA_ | - | 1 | 10 | - | 2613 | n/a | hAT-like |
| DTA_ | - | - | 22 | - | 2739 | n/a | hAT-like |
| DTA_ | - | - | 11 | - | 2965 | n/a | hAT-like |
| DTA_ | - | - | 20 | - | 2852 | 28 | hAT-like |
| DTA_ | - | - | 15 | - | 2838 | 26 | hAT-like |
| DTA_ | - | - | 23 | - | 2779 | n/a | hAT-like |
| DTA_ | - | - | 13 | - | 3867 | 19 | hAT-like |
| DTA_ | - | - | 36 | - | 2850 | 30 | hAT-like |
| DTA_ | - | - | 5 | - | 2480 | 29 | hAT-like |
| DTA_ | - | - | 12 | - | 4236 | n/a | hAT-like |
| DTM_ | 8 | 10 | 2 | - | 3449 | 81 | Mutator |
| DTM_ | 1 | - | 7 | - | 2825 | 97 | Mutator |
Fig. 2 Circos plot showing differences between the core and supernumerary parts of the genome. Outer circle: blue lines denote the distribution of a MITE, red triangles denote ZIT1 copies. Second circle: core chromosomes and supernumerary contigs are colored, blue blocks on the chromosomes indicate the centromeres, black blocks show the two insertions of supernumerary sequence into the core chromosomes. Third circle: black lines represent intact (not RIPped) copies of TEs. Fourth circle: red lines represent RIPped copies of TEs. At the center of the plot, black lines connect gene duplications between the core genome and the supernumerary genome. Only protein hits larger than 266 amino acids are shown as their corresponding genes are supposed to be above the length threshold for RIP. Duplications within the supernumerary genome are not mapped
Fig. 3 Estimation of TE numbers in the different F. poae isolates used in the study, as determined by a coverage-based method. Repeat families are ordered by decreasing incidence in the genome of F. poae 2516; only class I and II transposable elements that are intact in F. poae 2516 are included, so elements such as the rDNA tandem and two families of telomere-linked RecQ helicases are not shown. X denotes families for which RIP was detected. Note that average read coverage does not account for possible truncations, so the numbers shown here should be considered estimates
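The caption above does not spell out the coverage-based method. One common way to do it, assumed here for illustration only, is to divide the mean read depth over a TE prototype by the mean read depth over the genome as a whole. The function name and the example depth of 2230 are hypothetical; 111.5 is the average HiSeq coverage reported for isolate 2516.

```python
def estimate_copy_number(te_mean_coverage, genome_mean_coverage):
    """Estimate TE copy number as a ratio of mean read depths.

    Assumption: reads from all copies pile up on one prototype sequence,
    so depth scales linearly with copy number. Truncated copies inflate
    or deflate the estimate, as the caption warns.
    """
    if genome_mean_coverage <= 0:
        raise ValueError("genome coverage must be positive")
    return te_mean_coverage / genome_mean_coverage


# Hypothetical TE depth of 2230x against the reported 111.5x genome average.
copies = estimate_copy_number(2230, 111.5)
```

Because the ratio averages over the whole element, a family made of many truncated copies and a family made of fewer full-length copies can give the same number, which is why the result is only an estimate.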
Fig. 4 Integration of intact TEs on supernumerary contig 308. The graph shows, in a sliding 1 kb window, the fraction of bases from the reference contig that is covered by HiSeq reads of every isolate (value between 0 and 1). The upper track shows, with yellow dots, all TEs on contig 308 of isolate 2516 that are >1 kb and have >90 % identity to the element prototype (Additional file 5). This TE landscape was used for comparison with isolates 2548, 7555 and bfb0173. Dots for these three isolates indicate elements for which read mapping shows that an element has integrated in the exact same location as the element in isolate 2516 (and is therefore ancestral). Dots that align vertically are conserved in multiple isolates
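The windowed coverage fraction described in the caption can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: per-base read depth is taken as a plain list of integers, a base counts as covered when its depth is above zero, and the step between windows (not given in the caption) is a hypothetical parameter defaulting to the window size.

```python
def covered_fraction(depth, window=1000, step=1000):
    """Fraction of bases with depth > 0 in each window along a contig.

    depth  : per-base read depth for one isolate mapped to the reference
    window : window size in bases (1 kb in the caption)
    step   : distance between window starts (assumed; not stated)
    """
    fractions = []
    for start in range(0, len(depth) - window + 1, step):
        chunk = depth[start:start + window]
        covered = sum(1 for d in chunk if d > 0)
        fractions.append(covered / window)
    return fractions


# Toy depth profile: 1 kb fully covered, then 500 uncovered and 500 covered bases.
toy_depth = [5] * 1000 + [0] * 500 + [3] * 500
profile = covered_fraction(toy_depth)
```

A window that drops towards 0 in one isolate while staying near 1 in isolate 2516 points to sequence that is absent from that isolate at that position.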
Fig. 5 Divergence estimation of intact (not RIPped) TE copies on the core (left) and supernumerary (right) genomes. Copies were aligned and branch lengths extracted from a maximum-likelihood phylogenetic tree. Branch lengths were used to calculate divergence times with a fixed substitution rate (1.05 × 10⁻⁹ substitutions per site per year [66]). The Y axis scale was cut off at 25 Mya, but for the supernumerary genome many outliers lie above this value. Additional file 19 shows the boxplot with outliers for the supernumerary genome. The box for every TE shows the lower and upper quartiles of the divergence estimates and the median (thick line within the box). The whiskers represent the minimum and maximum values. Circles and asterisks are outliers and extreme values, which fall more than one-and-a-half and more than three box lengths, respectively, beyond the upper quartile
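Converting a branch length to a divergence time with a fixed rate is a single division. The sketch below assumes branch lengths are expressed in substitutions per site and that time is simply branch length divided by rate; the caption does not say whether any further correction was applied. The example branch length of 0.0105 is hypothetical.

```python
SUBSTITUTION_RATE = 1.05e-9  # substitutions per site per year, from the caption


def divergence_time_years(branch_length, rate=SUBSTITUTION_RATE):
    """Convert a branch length (substitutions per site) to years."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return branch_length / rate


# Hypothetical branch length of 0.0105 substitutions per site.
years = divergence_time_years(0.0105)
million_years = years / 1e6
```

Under these assumptions a branch of 0.0105 substitutions per site corresponds to roughly 10 million years, and the 25 Mya axis cutoff corresponds to a branch length of about 0.026.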
Isolates used for whole genome sequencing
| ID | Location | Year | Host | Reference |
|---|---|---|---|---|
| bfb0173 | China | 2005 | barley | [ |
| 2516 | Belgium | 2011 | wheat | this study |
| 2548 | Belgium | 2011 | wheat | this study |
| 7555 | Belgium | 1965 | wheat | MUCL |
MUCL Mycothèque de l’Université catholique de Louvain (Louvain-la-Neuve, Belgium)