Tyler Alioto, Konstantinos G Alexiou, Amélie Bardil, Fabio Barteri, Raúl Castanera, Fernando Cruz, Amit Dhingra, Henri Duval, Ángel Fernández I Martí, Leonor Frias, Beatriz Galán, José L García, Werner Howad, Jèssica Gómez-Garrido, Marta Gut, Irene Julca, Jordi Morata, Pere Puigdomènech, Paolo Ribeca, María J Rubio Cabetas, Anna Vlasova, Michelle Wirthensohn, Jordi Garcia-Mas, Toni Gabaldón, Josep M Casacuberta, Pere Arús.
Abstract
We sequenced the genome of the highly heterozygous almond Prunus dulcis cv. Texas combining short- and long-read sequencing. We obtained a genome assembly totaling 227.6 Mb of the estimated almond genome size of 238 Mb, of which 91% is anchored to eight pseudomolecules corresponding to its haploid chromosome complement, and annotated 27 969 protein-coding genes and 6747 non-coding transcripts. By phylogenomic comparison with the genomes of 16 additional close and distant species we estimated that almond and peach (Prunus persica) diverged around 5.88 million years ago. These two genomes are highly syntenic and show a high degree of sequence conservation (20 nucleotide substitutions per kb). However, they also exhibit a high number of presence/absence variants, many attributable to the movement of transposable elements (TEs). Transposable elements have generated an important number of presence/absence variants between almond and peach, and we show that the recent history of TE movement seems markedly different between them. Transposable elements may also be at the origin of important phenotypic differences between both species, and in particular for the sweet kernel phenotype, a key agronomic and domestication character for almond. Here we show that in sweet almond cultivars, highly methylated TE insertions surround a gene involved in the biosynthesis of amygdalin, whose reduced expression has been correlated with the sweet almond phenotype. Altogether, our results suggest a key role of TEs in the recent history and diversification of almond and its close relative peach.
Keywords: Prunus dulcis; Prunus persica; crop evolution; divergence; genome sequence; indels; seed bitterness; transposable elements; variability
Year: 2019 PMID: 31529539 PMCID: PMC7004133 DOI: 10.1111/tpj.14538
Source DB: PubMed Journal: Plant J ISSN: 0960-7412 Impact factor: 6.417
Texas genome assembly and annotation statistics
| Assembly length | 227.6 Mb |
| Contig N50 | 103.9 kb |
| Scaffold N50 | 381.5 kb |
| Pseudomolecule N50 | 24.8 Mb |
| Per cent anchored to pseudomolecules | 91.47% |
| BUSCO complete genes | 95.4% |
| BUSCO fragmented genes | 1.0% |
| BUSCO missing genes | 3.6% |
| Genomic GC content | 37.65% |
| Number of protein‐coding genes | 27 969 |
| Median gene length (bp) | 2288 |
| Number of transcripts | 34 039 |
| Number of unique protein products | 32 559 |
| Number of exons | 184 149 |
| Number of unique exons | 148 374 |
| Number of coding exons | 140 538 |
| Coding GC content | 44.12% |
| Median intron length (bp) | 171 |
| Exons/transcript | 5.41 |
| Transcripts/gene | 1.22 |
| Multi‐exonic transcripts | 81% |
| Gene density (genes Mb⁻¹) | 123 |
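The ratio rows in the table can be cross-checked against the raw counts. The following is a minimal sketch, not the authors' pipeline: it assumes that transcripts/gene, exons/transcript and gene density are simple quotients of the tabulated totals, and it includes a hypothetical `n50` helper showing the usual definition of N50, run on made-up toy lengths because the per-contig lengths are not given here.

```python
# Values copied from the table and abstract; variable names are illustrative.
ASSEMBLY_MB = 227.6          # assembly length
ESTIMATED_GENOME_MB = 238.0  # estimated genome size (abstract)
GENES = 27_969               # protein-coding genes
TRANSCRIPTS = 34_039
EXONS = 184_149


def n50(lengths):
    """Return the length L such that sequences of length >= L together
    cover at least half of the summed length (the usual N50 definition)."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Assumption: each ratio is a plain quotient of the tabulated totals.
transcripts_per_gene = TRANSCRIPTS / GENES
exons_per_transcript = EXONS / TRANSCRIPTS
gene_density = GENES / ASSEMBLY_MB            # genes per Mb
fraction_assembled = ASSEMBLY_MB / ESTIMATED_GENOME_MB

print(f"Transcripts/gene:  {transcripts_per_gene:.2f}")
print(f"Exons/transcript:  {exons_per_transcript:.2f}")
print(f"Gene density:      {gene_density:.0f} genes/Mb")
print(f"Assembled share:   {fraction_assembled:.1%} of estimated genome")

# Toy contig lengths (hypothetical, not from the Texas assembly).
toy_lengths = [10, 8, 6, 4, 2]
print(f"Toy N50:           {n50(toy_lengths)}")
```

Under these assumptions the quotients round to the tabulated 1.22, 5.41 and 123, which suggests the exons/transcript figure is computed from the total exon count rather than the unique or coding exon counts.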
Figure 1. Species tree obtained from the concatenation of 262 widespread single‐gene families.
(a) Full species tree. All Prunus species are highlighted in pink. All bootstrap values that are not maximal (bootstrap 100%) are indicated in red. Green numbers correspond to the nodes in Table S4. Bars at the nodes indicate the uncertainty around mean age estimates based on 95% credibility intervals. Scale at the bottom shows the divergence time in Mya (million years ago). Green dots represent selected calibration points.
(b) Zoom‐in of the Prunus group. Numbers indicate the duplication ratio for each branch calculated with the phylome of almond (red) and peach (blue).
Figure 2. Distribution of gene and transposable element (TE) abundance.
Distribution of gene and TE abundance along Prunus dulcis (a) and Prunus persica (b) chromosomes. Outer to inner tracks represent the coverage per 100 kb of genes, TEs, Copia long terminal repeat (LTR) retrotransposons, Gypsy LTR retrotransposons, and miniature inverted‐repeat transposable elements. The chromosome scale is in Mbp.
Figure 3. Dynamics of long terminal repeat (LTR) retrotransposons in peach and almond.
(a) Insertion time of complete LTR retrotransposons in Prunus dulcis and Prunus persica.
(b) Insertion time (Mya, million years ago) of polymorphic and fixed orthologous LTR retrotransposons in almond (left) and peach (right).
Figure 4. Analysis of the locus of the CYP71AN24 gene in almond varieties and related Prunus species.
(a) Nucleotide conservation of the CYP71AN24 region between Prunus avium, Prunus mume, Prunus persica and Prunus dulcis based on a Mauve multiple alignment (physical distance scale is in bp). White boxes represent inserted regions in P. dulcis. The Integrative Genomics Viewer (IGV) tracks of the gene and transposable element (TE) annotations of P. dulcis and P. persica along with their DNA methylation levels in the three different contexts (CG, CHG and CHH) are shown below.
(b) IGV snapshot of the region containing the CYP71AN24 gene and the polymorphic TE insertions, showing the coverage of mapped DNA‐seq reads from re‐sequencing data of sweet‐ and bitter‐kernel P. dulcis varieties, as well as from the closely related Prunus webbii.