Francesco Mercati1, Ignazio Fontana2, Alessandro Silvestre Gristina2, Adriana Martorana2, Mahran El Nagar3, Roberto De Michele2, Silvio Fici4, Francesco Carimi5.
Abstract
Caper (Capparis spinosa L.) is a xerophytic shrub cultivated for its flower buds and fruits, which are used as food and for their medicinal properties. Breeding programs and even proper taxonomic classification of the genus Capparis have so far been hampered by the lack of reliable genetic information and molecular markers. Here, we present the first genomic resource for C. spinosa, generated by a transcriptomic approach and de novo assembly. The sequencing effort produced nearly 80 million clean reads, assembled into 124,723 unitranscripts. Careful annotation and comparison with public databases revealed homologs of genes with key roles in metabolic pathways linked to abiotic stress tolerance and bio-compound production, such as purine, thiamine and phenylpropanoid biosynthesis, and α-linolenic acid and lipid metabolism. Additionally, a panel of genes involved in stomatal development/distribution and genes encoding Stress Associated Proteins (SAPs) was identified. We also used the transcriptomic data to uncover novel molecular markers for caper. Out of 50 SSRs tested, 14 proved polymorphic and represent the first set of SSR markers for the genus Capparis. This transcriptome will be an important contribution to future studies and breeding programs for this orphan crop, aiding the development of improved varieties to sustain agriculture in arid conditions.
Year: 2019 PMID: 31320697 PMCID: PMC6639398 DOI: 10.1038/s41598-019-46613-x
Source DB: PubMed Journal: Sci Rep ISSN: 2045-2322 Impact factor: 4.379
Overview of sequencing outputs and assembly of Capparis spinosa leaf transcriptome.
| Items | Trinity transcriptome assembly | Assembly after CD-HIT-EST clustering |
|---|---|---|
| Number of transcripts | 208,677 | 124,723 |
| Total size of transcripts (bp) | 311,614,381 | 176,788,523 |
| Longest contig (bp) | 17,493 | 17,493 |
| Shortest contig (bp) | 201 | 201 |
| Mean contig size (bp) | 1,493 | 1,417 |
| Median contig size (bp) | 1,114 | 999 |
| N50 contig length (bp) | 2,431 | 2,380 |
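The summary statistics in this table follow standard definitions, which can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used by the authors: the function names `n50` and `mean_length` and the toy contig lengths are hypothetical, and only the totals passed to `mean_length` come from the table.

```python
def n50(lengths):
    """Return the N50: the contig length L such that contigs of
    length >= L together cover at least half of the assembly."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must not be empty")


def mean_length(total_bp, n_contigs):
    """Mean contig size in bp."""
    return total_bp / n_contigs


# Toy contig lengths (hypothetical, not taken from the assembly).
toy_lengths = [200, 300, 400, 500, 600]
print(n50(toy_lengths))

# Totals from the table: the clustered assembly has 124,723 transcripts
# spanning 176,788,523 bp, which reproduces the reported mean of 1,417 bp.
print(round(mean_length(176_788_523, 124_723)))
```

The same check on the unclustered Trinity assembly (311,614,381 bp over 208,677 transcripts) gives the reported mean of 1,493 bp.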
Figure 1. Species-based distribution of blastx matches for each clustered unitranscript of the Capparis spinosa leaf transcriptome. Species with <1% of matches were grouped in the ‘Other’ category.
Overview of functional annotation by homology of Capparis spinosa leaf transcriptome.
| Category | N° of unitranscripts |
|---|---|
| Predicted ORFs | 104,505 |
| Predicted proteins | 64,541 |
| Uniref90_Top_BLASTX_hit | 85,294 |
| sprot_Top_BLASTX_hit | 66,902 |
| Uniref90_Top_BLASTP_hit | 62,339 |
| sprot_Top_BLASTP_hit | 51,048 |
| Pfam | 46,099 |
| SignalP | 3,341 |
ORFs, open reading frames;
Uniref90_Top_BLASTX_hit, top blastx hits against UniRef90 database;
sprot_Top_BLASTX_hit, top blastx hits against UniProtKB/Swiss-Prot database;
Uniref90_Top_BLASTP_hit, top blastp hits against UniRef90 database;
sprot_Top_BLASTP_hit, top blastp hits against UniProtKB/Swiss-Prot database;
Pfam, protein domain analysis was performed using http://www.sanger.ac.uk/software/pfam/;
SignalP, the presence of signal peptides was detected using http://www.cbs.dtu.dk/services/SignalP/.
Figure 2. EuKaryotic Orthologous Groups (KOG) in the Capparis spinosa leaf transcriptome. The unigenes with significant homologies in the KOG database were grouped into 24 categories. The number of unigenes in each category is reported on the y-axis, and the subgroups of the KOG classification are shown on the x-axis.
Figure 3. KEGG analysis of the purine (A) and thiamine (B) metabolism pathways, showing the enzymes identified in the Capparis spinosa leaf transcriptome (identified Enzyme Codes, EC, are in green).
Figure 4. KEGG analysis showing genes involved in α-linolenic acid metabolism in the Capparis spinosa leaf transcriptome (identified Enzyme Codes, EC, are in green).
Figure 5. KEGG analysis showing genes involved in phenylpropanoid biosynthesis (A) and glycerolipid metabolism (B) in the Capparis spinosa leaf transcriptome (identified Enzyme Codes, EC, are in green).
List of Capparis spinosa leaf transcripts homologous to genes encoding SAPs.
| Gene code | Gene name | Species | Homology (%)* | Transcript ID |
|---|---|---|---|---|
| | | | 69.23 | TRINITY_DN23049_c0_g1_i2 |
| | | | 68.64 | TRINITY_DN23049_c0_g3_i1 |
| | | | 69.23 | TRINITY_DN23049_c0_g1_i3 |
| | | | 69.23 | TRINITY_DN23049_c0_g1_i4 |
| | | | 68.64 | TRINITY_DN23049_c0_g3_i3 |
| | | | 69.23 | TRINITY_DN23049_c0_g1_i7 |
| | | | 69.23 | TRINITY_DN23049_c0_g1_i8 |
| | | | 58.24 | TRINITY_DN21298_c0_g1_i2 |
| | | | 65.03 | TRINITY_DN23901_c0_g2_i2 |
| | | | 62.18 | TRINITY_DN23049_c0_g2_i4 |
| | | | 62.18 | TRINITY_DN23049_c0_g2_i12 |
| | | | 62.18 | TRINITY_DN23049_c0_g2_i14 |
| | | | 58.58 | TRINITY_DN20182_c1_g1_i1 |
| | | | 58.58 | TRINITY_DN20182_c1_g1_i2 |
| | | | 58.58 | TRINITY_DN20182_c1_g1_i3 |
| | | | 58.58 | TRINITY_DN20182_c1_g1_i4 |
| | | | 87.5 | TRINITY_DN26805_c2_g4_i1 |
| | | | 80.85 | TRINITY_DN21640_c0_g3_i4 |
| | | | 80.85 | TRINITY_DN21640_c0_g3_i6 |
| | | | 80.85 | TRINITY_DN21640_c0_g3_i8 |
| | | | 57.87 | TRINITY_DN18210_c0_g1_i2 |
| | | | 55.91 | TRINITY_DN18210_c0_g2_i1 |
| | | | 66.48 | TRINITY_DN23049_c0_g5_i1 |
| | | | 57.25 | TRINITY_DN5273_c0_g1_i1 |
| | | | 60.92 | TRINITY_DN21298_c0_g1_i1 |
| | | | 60.69 | TRINITY_DN21298_c0_g1_i11 |
| | | | 60.32 | TRINITY_DN33098_c0_g1_i1 |
| | | | 77.95 | TRINITY_DN19581_c0_g1_i1 |
| | | | 79.86 | TRINITY_DN19581_c0_g1_i2 |
| | | | 65.52 | TRINITY_DN14592_c0_g1_i1 |
| | | | 70.76 | TRINITY_DN14592_c0_g1_i2 |
| | | | 66.19 | TRINITY_DN14592_c0_g1_i3 |
*The homology (best hit) was obtained by blasting C. spinosa transcripts against the NR database.
Summary of EST-SSRs and their repeat motif isolated from Capparis spinosa leaf transcriptome.
| Repeat motif | 4 | 5 | 6 | 7 | 8 | 9 | 10 | >10 | Total | % |
|---|---|---|---|---|---|---|---|---|---|---|
| Dinucleotide | — | — | — | — | — | — | — | 362 | | |
| Trinucleotide | — | — | — | 1,070 | 810 | 227 | 239 | 410 | | |
| Tetranucleotide | — | 343 | 152 | 17 | 15 | 19 | 2 | 18 | | |
| Pentanucleotide | — | 131 | 48 | 19 | 8 | 4 | — | — | 210 | |
| Hexanucleotide | 828 | 120 | 82 | 40 | 21 | 15 | 5 | 4 | | |
| % | | | | | | | | | | |
Columns 4 to >10 give the number of repeat units.
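How perfect SSRs of the kind tallied above can be located in a transcript sequence is sketched below with a regular-expression scan. This is an illustration only: the SSR detection tool is not named in this section, the function `find_ssrs` is hypothetical, and the minimum repeat counts in `MIN_REPEATS` are assumed values rather than the thresholds used in the study.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative only,
# not taken from the paper).
MIN_REPEATS = {2: 6, 3: 5, 4: 5, 5: 5, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # One motif of motif_len bases, followed by at least
        # min_rep - 1 further copies of the same motif.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit ("AA", "ATAT"),
            # so each SSR is reported once under its shortest motif.
            if any(motif == motif[:k] * (motif_len // k)
                   for k in range(1, motif_len) if motif_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits


# Toy sequence (hypothetical): one (AT)7 and one (AAG)5 repeat.
toy_seq = "GGC" + "AT" * 7 + "CCT" + "AAG" * 5 + "T"
print(find_ssrs(toy_seq))
```

Grouping the returned hits by motif length and repeat count would give a tally with the same layout as the table.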
Main genetic parameters from the 14 polymorphic EST-SSR loci of the population under investigation (sample size 75).
| Marker | No. of alleles | Size range (bp) | He | Ho | PIC | Fis | Fst |
|---|---|---|---|---|---|---|---|
| ESTcapp5 | 11 | 95–125 | 0.750 | 0.587 | 0.721 | −0.003 | 0.495 |
| ESTcapp8 | 6 | 122–137 | 0.700 | 0.118 | 0.647 | 0.830 | 0.515 |
| ESTcapp10 | 5 | 134–142 | 0.529 | 0.187 | 0.498 | 0.602 | 0.279 |
| ESTcapp11 | 2 | 147–153 | 0.420 | 0.440 | 0.332 | −0.057 | 0.010 |
| ESTcapp14 | 4 | 136–148 | 0.531 | 0.107 | 0.476 | 0.768 | 0.382 |
| ESTcapp18 | 5 | 102–112 | 0.462 | 0.440 | 0.441 | −0.055 | 0.210 |
| ESTcapp20 | 5 | 120–128 | 0.631 | 0.254 | 0.561 | 0.471 | 0.695 |
| ESTcapp21 | 6 | 130–145 | 0.585 | 0.480 | 0.520 | −0.026 | 0.325 |
| ESTcapp32 | 10 | 159–181 | 0.735 | 0.227 | 0.699 | 0.519 | 0.657 |
| ESTcapp33 | 9 | 136–163 | 0.843 | 0.548 | 0.826 | 0.245 | 0.594 |
| ESTcapp35 | 4 | 101–113 | 0.472 | 0.480 | 0.417 | −0.058 | 0.096 |
| ESTcapp37 | 6 | 86–106 | 0.759 | 0.533 | 0.718 | 0.062 | 0.494 |
| ESTcapp46 | 6 | 129–156 | 0.687 | 0.347 | 0.636 | 0.310 | 0.574 |
| ESTcapp49 | 8 | 131–161 | 0.711 | 0.440 | 0.677 | −0.014 | 0.610 |
He: Genetic diversity; Ho: Observed heterozygosity; PIC: Polymorphism Information Content; Fis: Inbreeding coefficient; Fst: Fixation index.
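For readers unfamiliar with these parameters, the sketch below computes the number of alleles, Ho, He and PIC for a single locus from genotype calls, using the standard textbook formulas (He = 1 − Σp², and PIC as defined by Botstein et al.). It is not the authors' software: the function `locus_stats` is hypothetical, genotypes are assumed to be diploid, He is computed without a small-sample correction, and the toy genotypes are invented. Fis and Fst are left out because they depend on how samples are assigned to populations.

```python
from collections import Counter


def locus_stats(genotypes):
    """Per-locus diversity statistics.

    genotypes: list of (allele_a, allele_b) tuples, one per individual
    (diploid genotypes are an assumption of this sketch).
    """
    n = len(genotypes)
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    freqs = [count / (2 * n) for count in counts.values()]

    # Observed heterozygosity: fraction of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    # Expected heterozygosity (gene diversity), uncorrected for sample size.
    he = 1 - sum(p * p for p in freqs)
    # PIC = 1 - sum(p_i^2) - sum over i<j of 2 * p_i^2 * p_j^2
    cross = sum(2 * freqs[i] ** 2 * freqs[j] ** 2
                for i in range(len(freqs))
                for j in range(i + 1, len(freqs)))
    return {"alleles": len(counts), "Ho": ho, "He": he, "PIC": he - cross}


# Toy genotypes (hypothetical allele sizes in bp).
toy = [(100, 100), (100, 104), (104, 104), (100, 104)]
print(locus_stats(toy))
```

For a locus with two alleles the PIC formula reduces to He − He²/2. With He = 0.420 for ESTcapp11 this gives approximately 0.332, in line with the tabulated PIC for that marker.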
Figure 6. Genetic relationships among genotypes of the Capparis spinosa collection sampled across the distribution area of the species. (A) Dendrogram generated from the 14 polymorphic EST-SSRs developed in the present study, using the UPGMA method and Bruvo’s distance. (B) DAPC clustering of the eight populations studied, using the first two principal components (Y-axis and X-axis, respectively). CC: C. spinosa subsp. spinosa; CR: C. spinosa subsp. rupestris. The samples used for EST-SSR validation were gathered into 8 main groups: CC Sicily, CC world, CR Favignana, CR Italy, CR Pantelleria, CR Salina, CR Sicily and CR Ustica.