Naganeeswaran Sudalaimuthuasari, Rashid Ali, Martin Kottackal, Mohammed Rafi, Mariam Al Nuaimi, Biduth Kundu, Raja Saeed Al-Maskari, Xuewen Wang, Ajay Kumar Mishra, Jithin Balan, Srinivasa R Chaluvadi, Fatima Al Ansari, Jeffrey L Bennetzen, Michael D Purugganan, Khaled M Hazzouri, Khaled M A Amiri.
Abstract
The mimosoid legumes are a clade of ~40 genera in the Caesalpinioideae subfamily of the Fabaceae that grow in tropical and subtropical regions. In contrast to the better-studied Papilionoideae, few genomic resources exist for this legume group. The tree Prosopis cineraria is native to the Near East and the Indian subcontinent, where it thrives in very hot desert environments. To develop a tool for better understanding desert plant adaptation mechanisms, we sequenced the P. cineraria genome to a near-chromosome-level assembly with a total sequence length of ~691 Mb. We predicted 77,579 gene models (76,554 CDS, 361 rRNAs and 664 tRNAs) from the assembled genome, of which 55,325 (~72%) protein-coding genes were functionally annotated. Over 58% of the genome consists of repeat sequences, primarily long terminal repeat (LTR) retrotransposons. We find an expansion of terpenoid metabolism genes in P. cineraria and its relative Prosopis alba, but not in other legumes. We also observed an amplification of NBS-LRR disease-resistance genes correlated with LTR-associated retrotransposition, and identified 410 retrogenes, with an active burst of chimeric retrogene creation that occurred at approximately the same time as the divergence of P. cineraria from a common lineage with P. alba, ~23 Mya. These retrogenes include many genes involved in biotic defense responses and responses to abiotic stress stimuli, as well as the early Nodulin 93 gene. Nodulin 93 gene amplification is consistent with an adaptive response of the species to the low nitrogen content of arid desert soil. Consistent with these results, our differential expression analysis shows tissue-specific expression of isoprenoid pathway genes in shoots, but not in roots, as well as expression of important genes involved in salt stress in both tissues. Overall, the genome sequence of P. cineraria enriches our understanding of the genomic mechanisms of its disease resistance and abiotic stress tolerance.
Thus, it is an important step toward crop and legume improvement.
Keywords: NBS-LRR gene amplification; abiotic stress response genes; mesquites; retrogenes; terpenoid synthesis genes
Year: 2022 PMID: 35955640 PMCID: PMC9369113 DOI: 10.3390/ijms23158503
Source DB: PubMed Journal: Int J Mol Sci ISSN: 1422-0067 Impact factor: 6.208
Figure 1. Distribution, genome assembly and orthology analysis of P. cineraria. (A) Geographic distribution of Prosopis species around the world. P. cineraria is native to the Near East and the Indian subcontinent, while other species are native to North and South America and the Caribbean. (B) BlobToolKit snail plot describing assembly statistics. From inside to outside, the cumulative scaffold count on a log scale is depicted as light-gray spirals, with changes in order of magnitude marked by white scale lines. The dark-gray segments show the distribution of scaffold lengths, and the longest scaffold, depicted in red, was used to scale the plot radius. N50 and N90 scaffold lengths are highlighted by orange and light-orange rings, respectively. Blue and light-blue rings represent the percentages of GC, AT, and N in the genome assembly. (C) Orthologous group analysis of P. cineraria and P. alba from the mimosoid clade compared with other legumes, represented as an UpSetR plot. Green bars represent groups shared with other legumes, while red bars are orthologous gene groups shared only by P. cineraria and P. alba. (D) GO enrichment analysis of the orthogroups shown as red bars in (C), plotted using REVIGO and displaying biological processes. Each sphere represents a GO term colored by p-value on a −log10 scale. The semantic similarity of these GO terms is represented by their positions and the distances among them. The log size is the logarithm of the number of terms present in each sphere.
P. cineraria genome assembly statistics.
| Features | Values |
|---|---|
| Total scaffolds | 2265 |
| Total genome size | 691,392,202 bp |
| Pseudochromosome | 14 |
| Pseudochromosome coverage | ~86% |
| (A + T) percentage | 67.8% |
| (G + C) percentage | 32.1% |
| N percentage | 2.44% |
| Min sequence length | 4999 bp |
| Max sequence length | 59,799,197 bp |
| Average sequence length | 305,250.42 bp |
| N50 length | 41,482,946 bp |
| L50 number | 8 |
| Repeat % | 58% |
| Number of genes | 76,554 |
| Number of exons | 344,680 |
| Number of rRNA genes | 361 |
| Number of tRNA genes | 664 |
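
The N50 and L50 entries in the table follow the usual contiguity definitions: with scaffolds sorted from longest to shortest, N50 is the length of the scaffold at which the running total first reaches half of the assembly size, and L50 is the number of scaffolds needed to get there. The sketch below is a minimal illustration of that calculation in Python. The function name `n50_l50` and the toy scaffold lengths are hypothetical; only the total size and scaffold count used for the average-length check come from the table.

```python
def n50_l50(lengths):
    """Return (N50, L50) for a list of scaffold lengths in bp."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for count, length in enumerate(ordered, start=1):
        running += length
        if running >= half_total:
            # 'length' is N50; 'count' scaffolds were needed, which is L50.
            return length, count
    raise ValueError("at least one scaffold length is required")


# Hypothetical toy assembly, not the P. cineraria scaffolds.
toy_lengths = [50, 40, 30, 20, 10]
toy_n50, toy_l50 = n50_l50(toy_lengths)

# Consistency check using values reported in the table:
# average length = total genome size / total scaffolds.
total_bp = 691_392_202
total_scaffolds = 2265
average_bp = round(total_bp / total_scaffolds, 2)
```

For the toy input the cumulative sum reaches half of 150 bp at the second scaffold, so the function returns an N50 of 40 and an L50 of 2. The average computed from the reported totals agrees with the 305,250.42 bp listed in the table.
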
Figure 2. Genome evolution of P. cineraria. (A) Ultrametric tree of 18 legumes, including the mimosoid clade (highlighted by a red box), and the two outgroups A. thaliana and O. sativa. CAFE analysis depicts the total number of expanded and contracted gene families as well as rapidly evolving genes. The bubbles on the nodes and leaves of the tree show the average expansion or contraction for each species, where a positive number indicates more expansion. (B) Venn diagram of the top significant gene families (p < 10^−2) that are expanded or contracted in P. cineraria and under positive selection. (C) GO term enrichment of expanded gene families that are under selection (left) and contracted gene families that are under selection (right).
Figure 3. Comparative genome analysis of repeats, including disease-resistance genes (NBS-LRR). (A) Pie charts of the percentages of DNA transposons, LTR-retrotransposons, and unclassified repeats in each genome, mapped onto the ultrametric tree, together with Z-scores of the number of NBS-LRR genes and of copia-like and gypsy-like LTR-retrotransposons across the phylogenetic tree. Positive values show an increase in number, while negative values depict a decrease. (B) Circos plot of the co-localization of repeats and disease-resistance genes (NBS-LRRs), with layers from outside to inside (black arrow direction) showing gene density, GC content, gypsy-like/copia-like repeats, nucleotide diversity in 20 kb windows, and NBS-LRR distribution. Connected bands on the inside represent parental gene and retrogene distributions across the 14 longest scaffolds (pseudochromosomes) of P. cineraria. (C) Spearman correlation of disease-resistance genes (NBS-LRR) with total repeats as well as LTR-retrotransposons (top). Spearman correlation of genome size with total repeats as well as disease-resistance genes.
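
Panel (C) relies on Spearman correlation, which is the Pearson correlation computed on ranks rather than raw values and therefore measures monotonic rather than strictly linear association. The following is a minimal pure-Python sketch of that idea, assuming tied values receive their average rank. The function names and the per-genome counts are hypothetical illustrations, not data from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical per-genome counts (illustration only).
ltr_counts = [1200, 3400, 5600, 9100, 15000]
nbs_lrr_counts = [150, 310, 290, 520, 800]
rho = spearman(ltr_counts, nbs_lrr_counts)
```

Because only ranks enter the calculation, a single genome with an unusually large repeat content influences the coefficient far less than it would influence a Pearson correlation on the raw counts.
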
Figure 4. Retrogene identification and selection. (A) Genomic distribution of parental genes and retrogenes across 15 P. cineraria scaffolds (14 pseudochromosomes and one other scaffold). (B) The ratio of nonsynonymous to synonymous substitutions (Ka/Ks) between parental and retrocopied genes for different classes of retrogenes. For Ka/Ks > 1, the Early nodulin 93 and NB-ARC genes are highlighted. (C) GO enrichment analysis of retrogenes, displaying biological processes. Each sphere represents a GO term whose degree of enrichment is reflected in its color on a −log10(P) scale. The semantic similarity of these GO terms is represented by their positions and the distances among them. The log size is the logarithm of the number of terms present in each sphere.
Figure 5. The timing of chimeric retrogene generation in P. cineraria. (A) Ks distribution of the different classes of retrogenes. The red dotted lines around Ks = 0.01–0.05 highlight the initial amplification. (B) Ultrametric tree highlighting the time of divergence of P. cineraria and P. alba, which overlaps with the major amplification of chimeric retrogenes, estimated with a mutation rate of 10^−9 and the formula T = k/(2u).
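
The dating formula in the caption can be worked through numerically. The sketch below assumes that k is the synonymous divergence Ks between a retrogene and its parental gene, and that u is a mutation rate of 10^−9 substitutions per site per year; the per-site, per-year unit is an assumption, since the caption gives only the number. The function name is hypothetical.

```python
def divergence_time_years(ks, mutation_rate=1e-9):
    """Estimate divergence time as T = k / (2u).

    ks            -- synonymous substitutions per synonymous site (k)
    mutation_rate -- assumed substitutions per site per year (u)
    """
    return ks / (2 * mutation_rate)


# Bounds of the Ks window highlighted in panel (A).
low_years = divergence_time_years(0.01)
high_years = divergence_time_years(0.05)

# Ks value that corresponds to a ~23 Mya split under the same assumptions
# (the formula rearranged to k = 2uT).
ks_at_23_mya = 2 * 1e-9 * 23e6
```

Under these assumptions the Ks window of 0.01–0.05 corresponds to roughly 5 to 25 million years, and a divergence ~23 Mya corresponds to Ks of about 0.046, which falls inside that window.
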
Figure 6. Differential gene expression in shoots and roots of P. cineraria under 250 mM salt stress. (A) Volcano plot of genes up-regulated (red) and down-regulated (blue) in shoots, at a log2 fold change of 2, with significance on a −log10 scale. Gray dots represent neutral (relatively unchanged) genes. The names of the top up-regulated and down-regulated genes are indicated. Pathway enrichment analyses for the up-regulated (red) and down-regulated (blue) genes are depicted outside the volcano plot. (B) Volcano plot of genes up-regulated (red) and down-regulated (blue) in roots, at a log2 fold change of 2, with significance on a −log10 scale. Gray dots represent neutral (relatively unchanged) genes. The names of the top up-regulated and down-regulated genes are indicated. Pathway enrichment analyses for the up-regulated (red) and down-regulated (blue) genes are depicted outside the volcano plot.
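
The coloring of a volcano plot like those in panels (A) and (B) comes from a simple rule applied to each gene. The following sketch is one plausible reading of the caption, not the authors' pipeline: it treats a log2 fold change of 2 as the cutoff in both directions, and the significance cutoff of p ≤ 0.05 is an assumption because the caption does not state one. The gene names and values are hypothetical.

```python
import math


def volcano_class(log2_fc, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """Classify a gene as 'up', 'down', or 'neutral' for a volcano plot."""
    significant = p_value <= p_cutoff  # p_cutoff is an assumed threshold
    if significant and log2_fc >= fc_cutoff:
        return "up"       # drawn in red
    if significant and log2_fc <= -fc_cutoff:
        return "down"     # drawn in blue
    return "neutral"      # drawn in gray


def y_coordinate(p_value):
    """Vertical position on the plot: -log10 of the p-value."""
    return -math.log10(p_value)


# Hypothetical genes: (name, log2 fold change, p-value).
genes = [
    ("geneA", 3.1, 1e-6),
    ("geneB", -2.5, 1e-3),
    ("geneC", 0.4, 0.2),
    ("geneD", 3.0, 0.5),
]
classes = {name: volcano_class(fc, p) for name, fc, p in genes}
```

Note that `geneD` has a large fold change but is classified as neutral because it does not pass the significance cutoff, which is why such genes sit low on the plot and remain gray.
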