Mareike Busche, Boas Pucker, Prisca Viehöver, Bernd Weisshaar, Ralf Stracke.
Abstract
Different Musa species, subspecies, and cultivars are currently being investigated to reveal their genomic diversity. Here, we compare the genome sequence of one of the most commercially important cultivars, Musa acuminata Dwarf Cavendish, against the Pahang reference genome assembly. Numerous small sequence variants were detected, and the ploidy of the cultivar presented here was determined as triploid based on sequence variant frequencies. Illumina sequence data also revealed a duplication of a large segment on the long arm of chromosome 2 in the Dwarf Cavendish genome. Comparison against previously sequenced cultivars provided evidence that this duplication is unique to Dwarf Cavendish. Although no functional relevance of this duplication was identified, this example shows the potential of plants to tolerate such aneuploidies.
Keywords: banana; crop genome assembly; pan-genomics; small sequence variants
Year: 2020 PMID: 31712258 PMCID: PMC6945009 DOI: 10.1534/g3.119.400847
Source DB: PubMed Journal: G3 (Bethesda) ISSN: 2160-1836 Impact factor: 3.154
Figure 1. M. acuminata Dwarf Cavendish plant, nine months old.
Figure 2. Coverage distribution. Chromosomes are ordered by increasing number with the north end on the left-hand side. Centromere positions (D’Hont ) are indicated by thin vertical gray lines. Mapping of M. acuminata Dwarf Cavendish reads against the DH Pahang v2 reference sequence assembly revealed a 6.2 Mbp tetraploid region on the long arm of chromosome 2 in Dwarf Cavendish (see enlarged box in the upper right). Apparent large-scale deletions, indicated by regions with almost zero coverage, are technical artifacts caused by large stretches of ambiguous bases (Ns) in the Pahang assembly that cannot be covered by reads; these artifacts are marked with horizontal gray lines. Plots with higher per-chromosome resolution are presented in Supplementary File S1.
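The reasoning behind the Figure 2 caption (a segment present in four copies in an otherwise triploid genome shows roughly 4/3 of the typical read coverage) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name, the window values, and the choice of the median of non-zero windows as the baseline are all assumptions made for illustration.

```python
from statistics import median

def estimate_copy_number(window_coverage, base_ploidy=3):
    """Estimate the copy number of each window from its mean read coverage.

    Assumption: most of the genome is present in `base_ploidy` copies, so the
    median coverage of covered windows corresponds to that copy number.
    """
    # Zero-coverage windows (e.g. stretches of Ns in the reference) are
    # excluded from the baseline so they do not pull the median down.
    covered = [c for c in window_coverage if c > 0]
    baseline = median(covered)
    return [round(base_ploidy * c / baseline) for c in window_coverage]

# Hypothetical per-window mean coverage values, not data from the study.
coverage = [60, 62, 59, 80, 81, 0, 61]
print(estimate_copy_number(coverage))
```

In this toy input, the two windows with elevated coverage are assigned four copies, and the zero-coverage window is reported as zero copies, which mirrors the artifact caused by ambiguous bases in the reference.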
Figure 3. Allele frequency histogram. Visualization of mapping results of Dwarf Cavendish Illumina reads against the Pahang v2 reference sequence, used for the determination of SNV frequencies. The frequencies of the reference allele at SNV positions are displayed here, excluding those positions at which the Pahang reference sequence deviates from an invariant sequence position of Dwarf Cavendish. Black vertical lines indicate allele frequencies of 0.33, 0.5, and 0.66, respectively. SNVs in the duplicated segment on the long arm of chromosome 2 (magenta) are distinguished from all other variants (lime). Within the duplicated segment on chromosome 2, the frequency of the reference allele is often 0.75 or 0.25, indicating the presence of three similar alleles and one diverged allele.
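The frequencies named in the Figure 3 caption follow from simple counting: at a site where k of n allele copies match the reference, the expected reference-allele frequency is k/n. The sketch below enumerates these values and assigns an observed frequency to the nearest expected class. The function names are hypothetical, and this is an illustration of the arithmetic rather than the method used in the study.

```python
from fractions import Fraction

def expected_ref_frequencies(copies):
    """Reference-allele frequencies possible at a heterozygous site
    when `copies` allele copies are present (k/copies for 0 < k < copies)."""
    return [Fraction(k, copies) for k in range(1, copies)]

def nearest_class(observed, copies):
    """Return the expected frequency closest to an observed frequency."""
    return min(expected_ref_frequencies(copies),
               key=lambda f: abs(float(f) - observed))

print(expected_ref_frequencies(3))  # triploid: 1/3 and 2/3
print(expected_ref_frequencies(4))  # four copies: 1/4, 1/2 and 3/4
print(nearest_class(0.74, 4))       # closest class for a hypothetical observation
```

With three copies, the only heterozygous classes are 1/3 and 2/3; with four copies, 1/4 and 3/4 become possible, which is consistent with the peaks described for the duplicated segment.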
Figure 4. Genome-wide distribution of small sequence variants. SNVs (green) and InDels (magenta) distinguish Dwarf Cavendish from Pahang. Variants were counted in 100 kb windows and are displayed on two different y-axes to allow maximal resolution (Pucker ).
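Counting variants in 100 kb windows, as described in the Figure 4 caption, amounts to binning each variant position by integer division. The following is a minimal sketch under stated assumptions: variants are given as (chromosome, 1-based position) pairs, windows are non-overlapping, and the function name and example positions are hypothetical.

```python
from collections import Counter

WINDOW_SIZE = 100_000  # 100 kb, the window size named in the caption

def count_variants_per_window(variants, window_size=WINDOW_SIZE):
    """Count variants per non-overlapping window.

    `variants` is an iterable of (chromosome, position) pairs with 1-based
    positions; keys of the result are (chromosome, window_index) pairs.
    """
    counts = Counter()
    for chrom, pos in variants:
        # Subtract 1 so that positions 1..window_size fall into window 0.
        counts[(chrom, (pos - 1) // window_size)] += 1
    return counts

# Hypothetical variant positions, not data from the study.
variants = [("chr01", 1), ("chr01", 100_000), ("chr01", 100_001), ("chr02", 250_000)]
for key, n in sorted(count_variants_per_window(variants).items()):
    print(key, n)
```

Running the same function separately on SNVs and InDels would give the two series that a plot with two y-axes could display.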
M. acuminata Dwarf Cavendish de novo genome assembly statistics
| Parameter | Value |
|---|---|
| Number of scaffolds | 256,523 |
| Maximal scaffold length | 240,314 bp |
| Assembly size | 963,409,601 bp (0.96 Gbp) |
| GC content | 38.78% |
| N50 | 5,432 bp |
| N90 | 1,592 bp |
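For readers unfamiliar with the N50 and N90 values in the table, the sketch below shows the usual definition: the length of the scaffold at which the cumulative length, taken from the longest scaffold downwards, first reaches 50% (or 90%) of the assembly size. The function name and the toy lengths are assumptions for illustration; the table values themselves come from the authors' own assembly evaluation.

```python
def nx(lengths, fraction=0.5):
    """Return the Nx statistic (N50 for fraction=0.5, N90 for fraction=0.9).

    Scaffolds are sorted from longest to shortest; the result is the length
    of the scaffold at which the running total first reaches `fraction` of
    the summed length of all scaffolds.
    """
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= fraction * total:
            return length
    raise ValueError("no scaffold lengths given")

# Toy scaffold lengths, not the Dwarf Cavendish assembly.
scaffolds = [10, 8, 6, 4, 2]
print(nx(scaffolds, 0.5), nx(scaffolds, 0.9))
```

Because N90 accumulates more of the assembly than N50, it can never exceed N50, which matches the ordering of the two values in the table.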