Nathalie Raharimalala, Stephane Rombauts, Andrew McCarthy, Andréa Garavito, Simon Orozco-Arias, Laurence Bellanger, Alexa Yadira Morales-Correa, Solène Froger, Stéphane Michaux, Victoria Berry, Sylviane Metairon, Coralie Fournier, Maud Lepelley, Lukas Mueller, Emmanuel Couturon, Perla Hamon, Jean-Jacques Rakotomalala, Patrick Descombes, Romain Guyot, Dominique Crouzillat.
Abstract
Caffeine is the most widely consumed alkaloid stimulant in the world. It is synthesized through the activity of three known N-methyltransferase proteins. Here we report the 422-Mb chromosome-level assembly of the Coffea humblotiana genome, a wild, endangered and naturally caffeine-free species from the Comoro archipelago. We predicted 32,874 genes and anchored 88.7% of the sequence onto the 11 chromosomes. Comparative analyses with the African Robusta coffee genome (C. canephora) revealed extensive genome conservation, despite an estimated 11 million years of divergence and a broad diversity of genome sizes within the Coffea genus. In this genome, the absence of caffeine is likely due to the loss, through an illegitimate recombination mechanism, of the caffeine synthase gene, which converts theobromine into caffeine. These findings pave the way for further characterization of caffeine-free species in the Coffea genus and will guide research towards naturally decaffeinated coffee drinks for consumers.
Year: 2021 PMID: 33854089 PMCID: PMC8046976 DOI: 10.1038/s41598-021-87419-0
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Figure 1. Representation of Coffea humblotiana. (A) Full tree of the C. humblotiana accession RM-CF-00679. (B) and (C) Inflorescences and collected fruits. Photographs were taken by Emmanuel Couturon (IRD). (D) Location of Mayotte island. The map was drawn using Inkscape V.1.
Statistics for the C. humblotiana genome and gene annotation.
| Statistic | Value |
|---|---|
| Number of scaffolds [#] | 390 |
| Total size of scaffolds [Mb] | 420.72 |
| Longest scaffold [bp] | 57,522,413 |
| N50 scaffold length [bp] | 29,629,744 |
| L50 scaffold count [#] | 6 |
| Number of genes | 32,874 |
| Average overall gene size [bp] | 2,733 |
| Average overall CDS size [bp] | 1,000 |
| Average overall exon size [bp] | 214 |
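The N50 and L50 values in the table follow the usual assembly definitions: scaffolds are sorted from longest to shortest, and L50 is the smallest number of scaffolds whose cumulative length reaches at least half of the total assembly size, while N50 is the length of the last scaffold added to reach that point. A minimal sketch of the calculation is given below. The function name `n50_l50` and the toy lengths are hypothetical and are not the C. humblotiana scaffold lengths, which are not listed in this record.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths in bp."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)   # longest scaffold first
    half_total = sum(ordered) / 2             # 50% of the assembly size
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is the N50, 'count' is the L50
            return length, count


# Hypothetical toy assembly, for illustration only.
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
n50, l50 = n50_l50(toy_lengths)
print(n50, l50)  # prints "70 2"
```

Read this way, the reported L50 of 6 means that the six longest scaffolds together cover at least half of the 420.72 Mb assembly, and the shortest of those six is 29,629,744 bp long.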
Figure 2. Features of the C. humblotiana genome. The densities of the following features were calculated with a window length of 100 kb: predicted genes in black; all predicted transposable elements (TE) in green; Gypsy LTR retrotransposons in red; Del lineage, CRM lineage in orange; and Copia LTR retrotransposons in blue. Pseudochromosomes are oriented in a similar way to the available C. canephora genome[15].
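A windowed density of the kind plotted in Figure 2 can be sketched as follows. The caption does not say whether density means a feature count or a fraction of covered bases, so this example assumes the simplest reading: the number of features whose start position falls in each 100 kb window. The function name `window_counts` and the toy coordinates are hypothetical.

```python
def window_counts(starts, chrom_length, window=100_000):
    """Count features per fixed-size window along one pseudochromosome.

    starts: 0-based start positions of the features (assumed input format).
    Returns one count per window; the last window may be shorter.
    """
    n_windows = -(-chrom_length // window)    # ceiling division
    counts = [0] * n_windows
    for start in starts:
        if not 0 <= start < chrom_length:
            raise ValueError(f"position {start} is outside the chromosome")
        counts[start // window] += 1
    return counts


# Hypothetical gene start positions on a 320 kb toy sequence.
toy_starts = [5_000, 99_999, 100_000, 250_000, 260_000]
print(window_counts(toy_starts, chrom_length=320_000))  # prints "[2, 1, 2, 0]"
```

Running the same function separately on each feature class (genes, all TEs, Gypsy, Copia) would give one track per class, which is the layout the caption describes.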
Figure 3. Comparative structural genomics between C. humblotiana and C. canephora. (A) Coding-region-based synteny between C. canephora (red) and C. humblotiana (green). For graphical purposes only, non-anchored contigs/scaffolds were merged into a single chromosome zero (interspersed with 1000 Ns). (B) Whole-genome dot plot between the C. humblotiana pseudochromosomes (horizontal sequence) and the published C. canephora pseudochromosomes (vertical sequence). (C) Dot plot between pseudochromosome 3 of C. humblotiana and of C. canephora. (D) Graphical representation of the 727 kb region from C. humblotiana (upper line) and the 820 kb region from C. canephora (lower line) at the SH3 locus, showing their annotated genes. Red boxes correspond to genes with one orthologous gene found in the compared segment, while black boxes represent unpaired or duplicated genes. White boxes represent a stretch of Ns found in the C. canephora genome. Colored lines linking the two genomes represent the percentage of protein identity found in a pairwise comparison between genes.
List of annotated and classified NMT genes.
| Gene ID in C. canephora | Chromosome (C. canephora) | Putative orthologous gene ID in C. humblotiana | Chromosome (C. humblotiana) |
|---|---|---|---|
| cc02_g09350 | Chr2 | Cohum02g11490 | Chr2 |
| cc01_g00720 | Chr1 | – | – |
| cc09_g06970 | Chr9 | Cohum09g08800 | Chr9 |
| cc00_g24720 | Chr9 | Cohum09g08760 | Chr9 |
| cc09_g06960 | Chr9 | Cohum09g08830; Cohum09g08820 | Chr9 |
| cc09_g06950 | Chr9 | Cohum09g08730 | Chr9 |
Figure 4. Evolution of NMT genes in C. humblotiana and C. canephora. (A) Representation of the methylation steps of caffeine biosynthesis in coffee. (B) Phylogenetic analysis of complete NMT proteins in C. humblotiana, C. canephora and Gardenia jasminoides. Reference proteins for XMT (A4GE69), MXMT (jx978517) and DXMT (jx978516) are from C. canephora. IDs of C. humblotiana proteins are in red, and IDs of C. canephora proteins (in black) are named as in the genome annotation release. The protein Cc09_g07000 (NMT4) is used as the outgroup. Numbers indicate the aLRT branch support. (C) Multiple sequence alignment of the C. canephora and C. humblotiana NMT proteins. A secondary structure plot is given for CcDXMT (above) and CcXMT (below). The SAM binding motifs (A, B’ and C) and the conserved YFFF region are marked by boxes, and green circles identify residues crucial for substrate recognition and catalysis.
Figure 5. Microsynteny between C. canephora (CC) and C. humblotiana (CH) at the DXMT locus on chromosome 1 (A) and at the MXMT/XMT locus on chromosome 9 (B). (A) Representation of the microsynteny between C. canephora (CC; chromosome 1 position 1.1–1.3 Mb) and C. humblotiana (CH; chromosome 1 position 0.4–0.6 Mb). The DXMT gene, present only in C. canephora, is indicated in red. The shaded boxes indicate the region duplicated in CC versus CH. (B) Representation of the microsynteny between C. canephora (CC; chromosome 9 position 8.1–8.35 Mb) and C. humblotiana (CH; chromosome 9 position 6.5–6.9 Mb). MTL genes are indicated in red. Asterisks indicate pseudogenes. Grey marks the positions of gaps in the assembly of the C. canephora genome. Triangles indicate the position and orientation of coding regions. Lines indicate the conservation of genes, which are named by their IDs.
Caffeine, theobromine and total chlorogenic acids contents on young leaves of C. arabica, C. canephora and C. humblotiana.
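| Statistic | Caffeine: C. arabica | Caffeine: C. canephora | Caffeine: C. humblotiana | Theobromine: C. arabica | Theobromine: C. canephora | Theobromine: C. humblotiana | Total chlorogenic acids: C. arabica | Total chlorogenic acids: C. canephora | Total chlorogenic acids: C. humblotiana |
|---|---|---|---|---|---|---|---|---|---|
| Minimum | 0.91 | 2.16 | 0.00 | 0.01 | 0.00 | 0.04 | 4.14 | 5.04 | 0.40 |
| Maximum | 2.09 | 4.36 | 0.01 | 0.04 | 1.25 | 0.23 | 5.47 | 7.50 | 2.23 |
| Mean | 1.50a | 3.43b | 0.003c | 0.02a | 0.42ab | 0.13b | 4.60a | 5.89a | 1.40b |
| Std dev | 0.51 | 1.01 | 0.00 | 0.01 | 0.64 | 0.09 | 0.62 | 1.22 | 0.82 |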
Values for each biochemical compound are averages of three genotypes per Coffea species (Supplementary Table S3). Data are expressed as percent on a dry matter basis (% dmb). Letters indicate class membership according to the Kruskal–Wallis test; means sharing a letter are not significantly different. All tests were significant (P < 0.001).
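For readers unfamiliar with the Kruskal–Wallis test used in the table, the sketch below computes the tie-corrected H statistic for three groups and an approximate P value. Several things here are assumptions: the function name `kruskal_wallis_3` is hypothetical, the input values are made-up toy numbers (the per-sample measurements are in Supplementary Table S3 and are not reproduced here), and the P value uses the chi-square approximation with 2 degrees of freedom, which is rough for small samples. The software the authors used is not stated in this record.

```python
import math


def kruskal_wallis_3(a, b, c):
    """Kruskal-Wallis H test for exactly three groups.

    Returns (H, p). H is tie-corrected; p uses the chi-square
    approximation with 2 degrees of freedom, i.e. p = exp(-H / 2).
    """
    groups = [list(a), list(b), list(c)]
    if any(len(g) == 0 for g in groups):
        raise ValueError("every group needs at least one observation")
    pooled = sorted(v for g in groups for v in g)
    n_total = len(pooled)

    # Assign average ranks to tied values and accumulate the tie term.
    rank_of = {}
    tie_term = 0
    i = 0
    while i < n_total:
        j = i
        while j < n_total and pooled[j] == pooled[i]:
            j += 1
        t = j - i                               # size of this tie block
        rank_of[pooled[i]] = (i + 1 + j) / 2    # mean of ranks i+1 .. j
        tie_term += t**3 - t
        i = j
    if tie_term == n_total**3 - n_total:
        raise ValueError("all observations are identical")

    rank_part = sum(sum(rank_of[v] for v in g) ** 2 / len(g) for g in groups)
    h = 12 / (n_total * (n_total + 1)) * rank_part - 3 * (n_total + 1)
    h /= 1 - tie_term / (n_total**3 - n_total)  # tie correction
    p = math.exp(-h / 2)                        # chi-square survival, df = 2
    return h, p


# Made-up toy values for three groups, NOT the measurements from the paper.
h, p = kruskal_wallis_3([1, 2, 3], [4, 5, 6], [7, 8, 9])
print(round(h, 3), round(p, 4))  # prints "7.2 0.0273"
```

The test ranks all observations together and asks whether the rank sums differ more between groups than chance would allow, which makes it suitable for small samples that cannot be assumed to be normally distributed.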