Rui Huang, Wayne A. Snedden, George C. diCenzo.
Abstract
Host/symbiont compatibility is a hallmark of the symbiotic nitrogen-fixing interaction between rhizobia and legumes, mediated in part by plant-produced nodule-specific cysteine-rich (NCR) peptides and the bacterial BacA membrane protein, which can act as an NCR peptide transporter. In addition, the genetic and metabolic properties supporting symbiotic nitrogen fixation often differ between compatible partners, including those sharing a common partner, highlighting the need for multiple study systems. Here, we report high-quality nodule transcriptome assemblies for Medicago sativa cv. Algonquin and Melilotus officinalis, two legumes able to form compatible symbioses with Sinorhizobium meliloti. The compressed M. sativa and M. officinalis assemblies consisted of 79,978 and 64,593 contigs, respectively, of which 33,341 and 28,278 were assigned putative annotations. As expected, the two transcriptomes showed broad similarity at a global level. We were particularly interested in the NCR peptide profiles of these plants, as these peptides drive bacterial differentiation during the symbiosis. A total of 412 and 308 NCR peptides were predicted from the M. sativa and M. officinalis transcriptomes, respectively, with approximately 9% of the transcriptome of both species consisting of NCR transcripts. Notably, transcripts encoding highly cationic NCR peptides (isoelectric point > 9.5), which are known to have antimicrobial properties, were ∼2-fold more abundant in M. sativa than in M. officinalis, and ∼27-fold more abundant when considering only NCR peptides in the six-cysteine class. We hypothesize that the difference in abundance of highly cationic NCR peptides explains our previous observation that some rhizobial bacA alleles that can support symbiosis with M. officinalis are unable to support symbiosis with M. sativa.
Keywords: NCR peptides; legumes; rhizobia; symbiotic nitrogen fixation; transcriptomics
Year: 2022 PMID: 35774624 PMCID: PMC9219011 DOI: 10.1002/pld3.408
Source DB: PubMed Journal: Plant Direct ISSN: 2475-4455
Summary statistics from the de novo Trinity and compressed (SuperTranscripts) nodule transcriptome assemblies
| Statistic | M. sativa Trinity assembly | M. sativa SuperTranscripts | M. officinalis Trinity assembly | M. officinalis SuperTranscripts |
|---|---|---|---|---|
| Total number of contigs | 253,871 | 79,978 | 192,165 | 64,593 |
| Total number of base pairs (bp) | 239,421,306 | 96,211,060 | 191,200,637 | 81,371,534 |
| Average contig length (bp) | 943 | 1,203 | 995 | 1,260 |
| Median contig length (bp) | 618 | 734 | 646 | 780 |
| Contig N50 (bp) | 1,466 | 1,912 | 1,584 | 1,968 |
| Minimum contig length (bp) | 176 | 193 | 183 | 198 |
| Maximum contig length (bp) | 12,655 | 29,784 | 14,584 | 23,941 |
| Overall alignment rate (%) | 98.99 | 92.14 | 99.15 | 93.76 |
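The contig N50 in the table above is the contig length L at which contigs of length L or longer together contain at least half of all assembled bases. The following is a minimal sketch of that calculation, not the code used in the study; the function name `n50` and the toy lengths are illustrative assumptions. It also checks one table entry arithmetically: 239,421,306 bp spread over 253,871 contigs gives the reported average of 943 bp.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    N50 is the largest length L such that contigs of length >= L
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest contig down until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Toy example (hypothetical lengths, not data from the study).
toy_lengths = [2, 2, 2, 3, 3, 4, 8, 8]
toy_n50 = n50(toy_lengths)

# Sanity check on the table: average length = total bases / contig count.
mean_len = round(239_421_306 / 253_871)
```

Because N50 weights long contigs more heavily than the mean or median does, it is expected to exceed both, as it does in every column of the table.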
FIGURE 1 Estimates of nodule transcriptome completeness. Completeness of the Medicago and Melilotus nodule transcriptome assemblies was assessed using BUSCO with the (a) Viridiplantae and (b) Fabales single‐copy marker gene datasets. The fraction of BUSCO genes identified as complete and single‐copy (light blue), complete but duplicated (dark blue), fragmented (yellow), and missing (red) is shown.
FIGURE 2 Summary of the slim GO biological process annotations for the nodule transcriptomes. Transcripts were annotated with slim GO terms, and the annotations for the biological processes were summarized as pie charts for (a) Medicago and (b) Melilotus.
The 50 most highly abundant transcripts in the Medicago nodule transcriptome, with the average expression level in transcripts per million (TPM) and the functional annotation
| Gene ID | TPM | Functional prediction |
|---|---|---|
| Cluster‐2.43518 | 30,662 | Hypothetical protein (hypothetical leghaemoglobin) |
| Cluster‐2.26078 | 17,769 | Putative albumin I |
| Cluster‐2.23447 | 12,515 | Hypothetical protein (hypothetical leghaemoglobin) |
| Cluster‐2.23176 | 8,065 | Nodulin‐25 |
| Cluster‐2.22272 | 7,992 | Putative late nodulin |
| Cluster‐2.23033 | 6,778 | None |
| Cluster‐2.21873 | 5,919 | Hypothetical protein |
| Cluster‐2.22935 | 5,253 | Belongs to the globin family |
| Cluster‐2.33070 | 5,232 | Putative ribonuclease H‐like domain‐containing protein |
| Cluster‐2.24197 | 4,245 | Component of the replication protein A complex (RPA) |
| Cluster‐2.22983 | 4,196 | Belongs to the globin family |
| Cluster‐2.24546 | 3,886 | Predicted NCR peptide (crp1450_Cluster‐2.24546_0M_1) |
| Cluster‐2.19511 | 3,344 | None |
| Cluster‐2.26207 | 3,296 | Extensin‐like_protein_repeat |
| Cluster‐2.49512 | 3,173 | Putative blue (type 1) copper binding protein |
| Cluster‐2.21810 | 3,050 | Predicted NCR peptide (crp1160_Cluster‐2.21810_0M_1) |
| Cluster‐2.22936 | 3,014 | Belongs to the globin family |
| Cluster‐2.29430 | 2,789 | Putative late nodulin |
| Cluster‐2.22881 | 2,788 | Predicted NCR peptide (crp1430_Cluster‐2.22881_0M_1) |
| Cluster‐2.22836 | 2,509 | None |
| Cluster‐2.23729 | 2,271 | Hypothetical protein |
| Cluster‐2.22458 | 2,245 | Belongs to the globin family |
| Cluster‐2.24829 | 2,227 | Hypothetical protein |
| Cluster‐2.25168 | 2,209 | None |
| Cluster‐2.24278 | 2,093 | Nodule‐specific_GRP_repeat |
| Cluster‐2.23928 | 2,038 | Late_nodulin_protein |
| Cluster‐2.22042 | 2,029 | None |
| Cluster‐2.23245 | 2,019 | Putative translationally controlled tumor protein |
| Cluster‐2.25794 | 2,018 | Putative protein‐synthesizing GTPase |
| Cluster‐2.31376 | 1,957 | Predicted NCR peptide (crp1190_Cluster‐2.31376_0M_1) |
| Cluster‐2.21809 | 1,939 | Predicted NCR peptide (crp1160_Cluster‐2.21809_0M_1) |
| Cluster‐2.28083 | 1,854 | Predicted NCR peptide (crp1210_Cluster‐2.28083_0M_1) |
| Cluster‐2.23524 | 1,844 | Early nodulin‐16 |
| Cluster‐2.34649 | 1,839 | Putative late nodulin |
| Cluster‐2.16993 | 1,808 | Hypothetical protein |
| Cluster‐2.21813 | 1,731 | None |
| Cluster‐2.22457 | 1,705 | Belongs to the globin family |
| Cluster‐2.28283 | 1,674 | None |
| Cluster‐2.26205 | 1,621 | Predicted NCR peptide (crp1240_Cluster‐2.26205_0M_1) |
| Cluster‐2.23310 | 1,607 | Asparagine synthetase |
| Cluster‐2.21870 | 1,596 | Nodule‐specific_GRP_repeat |
| Cluster‐2.19604 | 1,583 | Predicted NCR peptide (crp1420_Cluster‐2.19604_0M_1) |
| Cluster‐2.18485 | 1,509 | Hypothetical protein |
| Cluster‐2.26876 | 1,458 | Predicted NCR peptide (crp1410_Cluster‐2.26876_0M_1) |
| Cluster‐2.21536 | 1,441 | Ubiquitin_family |
| Cluster‐2.22785 | 1,408 | None |
| Cluster‐2.23374 | 1,385 | Predicted NCR peptide (crp1420_Cluster‐2.23374_0M_1) |
| Cluster‐2.28787 | 1,311 | Putative late nodulin |
| Cluster‐2.22402 | 1,292 | Predicted NCR peptide (crp1520_Cluster‐2.22402_0M_1) |
| Cluster‐2.30081 | 1,289 | Late_nodulin_protein |
The 50 most highly abundant transcripts in the Melilotus nodule transcriptome, with the average expression level in transcripts per million (TPM) and the functional annotation
| Gene ID | TPM | Functional prediction |
|---|---|---|
| Cluster‐3554.18801 | 38,953 | Belongs to the globin family |
| Cluster‐3554.18778 | 14,091 | Belongs to the globin family |
| Cluster‐3554.16063 | 10,412 | Late_nodulin_protein |
| Cluster‐3554.15387 | 7,885 | Putative late nodulin |
| Cluster‐3554.16088 | 6,146 | Putative late nodulin |
| Cluster‐3554.18892 | 6,104 | None |
| Cluster‐3554.15771 | 5,813 | Late_nodulin_protein |
| Cluster‐3554.18802 | 5,362 | Belongs to the globin family |
| Cluster‐3554.18596 | 5,347 | Predicted NCR peptide (crp1430_Cluster‐3554.18596_0M_1) |
| Cluster‐3554.15456 | 3,912 | Putative translationally controlled tumor protein |
| Cluster‐3554.18808 | 3,864 | Belongs to the globin family |
| Cluster‐3554.33215 | 3,302 | Hypothetical protein |
| Cluster‐3554.18775 | 3,097 | Putative late nodulin |
| Cluster‐3554.23000 | 2,990 | Two predicted NCR peptides (crp1180_Cluster‐3554.23000_0M_1 and crp1180_Cluster‐3554.23000_0M_2) |
| Cluster‐3554.36681 | 2,877 | Predicted NCR peptide (crp1500_Cluster‐3554.36681_0M_1) |
| Cluster‐3554.21555 | 2,868 | Predicted NCR peptide (crp1430_Cluster‐3554.21555_0M_1) |
| Cluster‐3554.16074 | 2,809 | Putative BURP domain‐containing protein |
| Cluster‐3554.29297 | 2,784 | Hypothetical protein |
| Cluster‐3554.18196 | 2,723 | None |
| Cluster‐3554.18256 | 2,571 | Predicted NCR peptide (crp1440_Cluster‐3554.18256_0M_1) |
| Cluster‐3554.27063 | 2,522 | None |
| Cluster‐3554.22577 | 2,449 | Putative blue (type 1) copper binding protein |
| Cluster‐3554.15361 | 2,409 | eEF1A |
| Cluster‐3554.21311 | 2,384 | Belongs to the globin family |
| Cluster‐3554.18838 | 2,340 | Late_nodulin_protein |
| Cluster‐3554.18700 | 2,296 | Late_nodulin_protein |
| Cluster‐3554.11713 | 2,261 | None |
| Cluster‐3554.18706 | 2,259 | None |
| Cluster‐3554.23445 | 2,214 | None |
| Cluster‐3554.17366 | 2,141 | Predicted NCR peptide (crp1430_Cluster‐3554.17366_0M_1) |
| Cluster‐3554.23502 | 2,073 | Hypothetical protein |
| Cluster‐3554.25153 | 2,009 | Metallothionein‐like protein 2 |
| Cluster‐3554.21451 | 1,953 | Belongs to the globin family |
| Cluster‐3554.28600 | 1,931 | Hypothetical protein |
| Cluster‐3554.22971 | 1,849 | Hypothetical protein |
| Cluster‐3554.18877 | 1,812 | Belongs to the glyceraldehyde‐3‐phosphate dehydrogenase family |
| Cluster‐3554.13172 | 1,732 | Predicted NCR peptide (crp1440_Cluster‐3554.13172_0M_1) |
| Cluster‐3554.18195 | 1,704 | Zinc_knuckle |
| Cluster‐3554.30751 | 1,653 | Putative late nodulin |
| Cluster‐3554.21774 | 1,635 | Late_nodulin_protein |
| Cluster‐3554.13784 | 1,629 | Predicted NCR peptide (crp1420_Cluster‐3554.13784_0M_1) |
| Cluster‐3554.24033 | 1,611 | Prolyl isomerase (PPIase) |
| Cluster‐3554.30294 | 1,556 | Metallothionein‐like protein |
| Cluster‐3554.24764 | 1,552 | None |
| Cluster‐3554.18368 | 1,552 | Nucleoside diphosphate kinase 1 |
| Cluster‐3554.25344 | 1,537 | Metallothionein‐like protein 1 |
| Cluster‐3554.23646 | 1,514 | None |
| Cluster‐3554.9874 | 1,512 | Belongs to the universal ribosomal protein uL13 family |
| Cluster‐3554.18476 | 1,503 | Late_nodulin_protein |
| Cluster‐3554.31722 | 1,476 | Putative late nodulin |
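The TPM values in the two tables above are normalized for transcript length and sequencing depth, so that the values within one sample sum to one million. The sketch below shows the standard TPM definition under the assumption that raw read counts and effective transcript lengths are available; the quantification tool used in the study is not named in this section, and the function name `tpm` and the toy values are hypothetical.

```python
def tpm(counts, lengths):
    """Convert read counts to transcripts per million (TPM).

    counts  -- mapping of transcript ID to read count
    lengths -- mapping of transcript ID to effective length (bp)
    """
    # Reads per base: normalizes for transcript length.
    rate = {tid: counts[tid] / lengths[tid] for tid in counts}
    total = sum(rate.values())
    if total == 0:
        raise ValueError("all counts are zero")
    # Rescale so the values sum to one million.
    return {tid: r / total * 1_000_000 for tid, r in rate.items()}


# Hypothetical counts and lengths for three transcripts.
toy_counts = {"tA": 100, "tB": 300, "tC": 600}
toy_lengths = {"tA": 1000, "tB": 1000, "tC": 2000}
toy_tpm = tpm(toy_counts, toy_lengths)
```

Read this way, the top entry of the first table (30,662 TPM) corresponds to roughly 3% of all transcripts in that sample.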
FIGURE 3 Transcript abundances for conserved and species‐specific transcripts. Box plots displaying the distribution of average transcript abundances from triplicate samples, shown separately for genes with orthologs in both Medicago and Melilotus (orange), annotated transcripts found in only Medicago or Melilotus (blue), or transcripts that lack annotations and are found in only Medicago or Melilotus (green). Statistically significant differences between the distributions within a species are indicated with asterisks (p‐value < 1e−10; pairwise Wilcoxon tests).
FIGURE 4 Correlation between transcript abundances of orthologous transcripts in Medicago and Melilotus. Each datapoint represents the transcript abundance of a single‐copy orthologous transcript in Medicago and Melilotus. Red datapoints represent transcripts that are differentially abundant between the two species (|log2[fold change]| > 5, adjusted p‐value < .05); all other datapoints are in gray. The blue line represents the robust linear regression line, calculated with the rlm function of the MASS package in R.
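The thresholds in the Figure 4 caption are strict: |log2[fold change]| > 5 means more than a 32-fold difference in abundance. The sketch below applies the two stated cutoffs to toy data. It is an illustration only: the study's differential abundance pipeline is not described in this section, and the pseudocount, function names, and example values are assumptions made here.

```python
import math


def log2_fold_change(abundance_a, abundance_b, pseudocount=1.0):
    """log2(a / b) with a pseudocount to avoid division by zero.

    The pseudocount is an assumption made for this sketch.
    """
    return math.log2((abundance_a + pseudocount) / (abundance_b + pseudocount))


def is_differentially_abundant(log2_fc, adj_p, lfc_cutoff=5.0, alpha=0.05):
    """Apply the two thresholds stated in the Figure 4 caption."""
    return abs(log2_fc) > lfc_cutoff and adj_p < alpha


# Hypothetical ortholog pairs: (ID, abundance in species A,
# abundance in species B, adjusted p-value).
toy_pairs = [
    ("og1", 6399.0, 99.0, 0.001),   # 64-fold higher in A
    ("og2", 150.0, 120.0, 0.30),    # similar abundance
    ("og3", 0.0, 1023.0, 0.0001),   # 1024-fold higher in B
    ("og4", 3199.0, 99.0, 0.01),    # exactly 32-fold: not above the cutoff
]

flagged = [
    oid for oid, a, b, p in toy_pairs
    if is_differentially_abundant(log2_fold_change(a, b), p)
]
```

In practice the fold change and adjusted p-value would come from a dedicated statistical package rather than from a ratio of means; only the thresholding step is shown here.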
FIGURE 5 Nodule‐specific cysteine‐rich (NCR) peptide profiles of Medicago and Melilotus. NCR peptides were predicted from the Medicago (orange) and Melilotus (blue) transcriptome assemblies, and the properties of the NCR peptides are shown in these graphs. (a) Box plots showing the distribution of the abundance (in transcripts per million, TPM) of NCR transcripts, based on triplicate samples. The difference in the distributions for the two species was statistically significant (p‐value < .001; pairwise Wilcoxon test). (b) Box plots showing the distribution of the amino acid lengths of mature NCR peptides. No statistically significant difference in the distributions for the two species was detected. (c, d) Histograms showing the distributions of the isoelectric points (pI) for the mature NCR peptides. Histograms are based either on the number of NCR peptides with a given pI value (c) or the total abundance of the transcripts encoding NCR peptides with a given pI value (d). (e, f) Histograms showing distributions of pI for four‐cysteine (e) and six‐cysteine (f) mature NCR peptides, based on total abundance of the transcripts encoding NCR peptides with a given pI value.
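Panels (c) and (d) of Figure 5 differ only in weighting: in (c) each peptide counts once, whereas in (d) each peptide contributes the abundance of its transcript. The sketch below illustrates both views, assuming pI values have already been computed for each mature peptide. The 9.5 cutoff for "highly cationic" comes from the abstract; the bin width of 0.5, the function names, and the toy values are assumptions.

```python
from collections import defaultdict


def pi_histograms(peptides, bin_width=0.5):
    """Bin peptides by pI, by count and by summed transcript abundance.

    peptides -- iterable of (pI, TPM) pairs
    Returns two dicts keyed by the lower edge of each bin.
    """
    by_count = defaultdict(int)
    by_abundance = defaultdict(float)
    for pi, tpm in peptides:
        edge = (pi // bin_width) * bin_width
        by_count[edge] += 1          # panel (c) style: one vote per peptide
        by_abundance[edge] += tpm    # panel (d) style: weighted by TPM
    return dict(by_count), dict(by_abundance)


def highly_cationic_share(peptides, cutoff=9.5):
    """Fraction of total NCR abundance from peptides with pI > cutoff."""
    total = sum(tpm for _, tpm in peptides)
    cationic = sum(tpm for pi, tpm in peptides if pi > cutoff)
    return cationic / total if total else 0.0


# Hypothetical (pI, TPM) pairs, not data from the study.
toy_peptides = [(4.2, 500.0), (4.4, 1500.0), (8.1, 200.0), (9.8, 1800.0)]
counts, abundance = pi_histograms(toy_peptides)
share = highly_cationic_share(toy_peptides)
```

In the toy data, one of four peptides is highly cationic by count, yet it accounts for 45% of the abundance, which shows why the two panels can tell different stories.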