Literature DB >> 31585525

A homology-guided, genome-based proteome for improved proteomics in the alloploid Nicotiana benthamiana.

Jiorgos Kourelis¹, Farnusch Kaschani², Friederike M Grosse-Holz¹, Felix Homma¹, Markus Kaiser², Renier A L van der Hoorn³.

Abstract

BACKGROUND: Nicotiana benthamiana is an important model organism of the Solanaceae (Nightshade) family. Several draft assemblies of the N. benthamiana genome have been generated, but many of the gene-models in these draft assemblies appear incorrect.
RESULTS: Here we present an improved proteome based on the Niben1.0.1 draft genome assembly guided by gene models from other Nicotiana species. Due to the fragmented nature of the Niben1.0.1 draft genome, many protein-encoding genes are missing or partial. We complement these missing proteins by similarly annotating other draft genome assemblies. This approach overcomes problems caused by mis-annotated exon-intron boundaries and mis-assigned short read transcripts to homeologs in polyploid genomes. With an estimated 98.1% completeness; only 53,411 protein-encoding genes; and improved protein lengths and functional annotations, this new predicted proteome is better in assigning spectra than the preceding proteome annotations. This dataset is more sensitive and accurate in proteomics applications, clarifying the detection by activity-based proteomics of proteins that were previously predicted to be inactive. Phylogenetic analysis of the subtilase family of hydrolases reveal inactivation of likely homeologs, associated with a contraction of the functional genome in this alloploid plant species. Finally, we use this new proteome annotation to characterize the extracellular proteome as compared to a total leaf proteome, which highlights the enrichment of hydrolases in the apoplast.
CONCLUSIONS: This proteome annotation provides the community working with Nicotiana benthamiana with an important new resource for functional proteomics.

Entities: Chemical Disease Species

Keywords: Genome annotation; Nicotiana benthamiana; Proteomics; Solanaceae; Subtilases

Mesh：

Substances：
Hydrolases

Year: 2019 PMID： 31585525 PMCID： PMC6778390 DOI： 10.1186/s12864-019-6058-6

Source DB: PubMed Journal: BMC Genomics ISSN： 1471-2164 Impact factor: 3.969

Background

Nicotiana benthamiana has risen to prominence as a model organism for several major reasons. First, N. benthamiana is highly susceptible to viruses, resulting in highly efficient virus-induced gene-silencing (VIGS) for rapid reverse genetic screens [42]. This hypersusceptibility to viruses is due to an ancient disruptive mutation in the RNA-dependent RNA polymerase 1 gene (Rdr1), present in the lineage of N. benthamiana which is used in laboratories around the world [1]. Reverse genetics using N. benthamiana have confirmed many genes important for disease resistance [43, 58]. Second, N. benthamiana is highly amenable to the generation of stable transgenic lines [6, 48] and to transient expression of transgenes [17]. This easy manipulation has facilitated rapid forward genetic screens and has established N. benthamiana as the plant bioreactor of choice for the production of biopharmaceuticals [50]. Finally, N. benthamiana is a member of the Solanaceae (Nightshade) family which includes important crops such as potato (Solanum tuberosum), tomato (Solanum lycopersicum), eggplant (Solanum melongena), and pepper (Capsicum ssp.), as well as tobacco (Nicotiana tabacum) and petunia (Petunia ssp.). N. benthamiana belongs to the Suaveolentes section of the Nicotiana genus, and has an ancient allopolyploid origin (~6Mya, [5]) accompanied by chromosomal re-arrangements resulting in a complex genome with 19 chromosomes per haploid genome, a reduced number when compared to the ancestral allotetraploid 24 chromosomes. The estimated haploid genome size is ~ 3.1Gb [17, 27, 56]. There are four independent draft assemblies of the N. benthamiana genome [3, 32], as well as a de-novo transcriptome generated from short-read RNAseq [33]. These datasets have greatly facilitated research in N. benthamiana, allowing for efficient prediction of off-targets of VIGS [14] and genome editing using CRISPR/Cas9 [29], as well as RNAseq and proteomics studies [18]. During our research using proteomics and reverse genetic approaches, we realized that many of the gene models in these draft assemblies are incorrect, and that putative truncated gene-products are often annotated as functional protein-encoding genes. This is at least partly because these draft assemblies are highly fragmented and because N. benthamiana is an paleotetraploid. Mapping short read sequences onto polyploid draft genomes frequently results in mis-assigned transcripts and mis-annotated gene models [53]. The de-novo transcriptome assembly also has a high proportion of chimeric transcripts. Because of incorrect annotations, extensive bioinformatics analysis is required to select target genes for reverse genetic approaches such as gene silencing and editing, or for phylogenetic analysis of gene families. We realized that the genome annotations of several other species in the Nicotiana genus (N. sylvestris, N. tomentosiformis, N. tabacum, and N. attenuata) generated using the NCBI Eukaryotic Genome Annotation Pipeline provided more accurate gene models, and we therefore decided to re-annotate the proteome of available N. benthamiana draft genomes using these Nicotiana gene models as a template. We generated a full proteome annotation on the Niben1.0.1 draft genome assembly and extracted proteins missing from this annotation from the other draft genome assemblies. Here we show that this dataset explains activity-based proteomics datasets better, is more accurate and sensitive for proteomics, and that this annotation greatly facilitates phylogenetic analysis of gene families.

Results and discussion

Generation of the genome-based proteomes in N. benthamiana genome assemblies

The observation that predicted proteins from other Nicotiana species were more accurate for several gene families inspired us to use the corresponding coding sequences to re-annotate the proteome of N. benthamiana. We used Scipio [23] to transfer the gene models from Nicotiana species to N. benthamiana. Scipio refines the transcription start-site, exon-intron boundaries, and the stop-codon positions of protein-encoding sequences aligned to the genome using BLAT [23]. When input protein-encoding sequences are well-annotated, this method is more accurate and sensitive than other gene prediction methods [22]. To generate the input dataset, we selected the predicted protein sequences from recently sequenced Nicotiana species [44, 45, 59] generated using the NCBI Eukaryotic Genome Annotation Pipeline (Fig. 1) and used CD-HIT at a 95% identity cut-off to reduce the redundancy and remove partial sequences (Fig. 1, Step 1). The resulting dataset, Nicotiana_db95, contains 85,453 protein-encoding sequences from various Nicotiana species. We then used this Nicotiana_db95 dataset as an input dataset to annotate gene-models in the Niben1.0.1 draft genome assembly using Scipio (Fig. 1, Step 2 Additional files 5, 8 and 9).

Fig. 1

Bioinformatics pipeline for improved Nicotiana benthamiana proteome annotation. The predicted proteins of Nicotiana species generated by the NCBI Eukaryotic Genome Annotation Pipeline were retrieved from Genbank and clustered at 95% identity threshold to reduce redundancy (Step 1), and used to annotate the Niben1.0.1 genome assembly (Step 2). Only those proteins with an alignment coverage ≥60% to the Nicotiana predicted proteins as determined by BLASTP were retained (Step 3) to produce the NbD core dataset. Similarly, the other draft genome assemblies were annotated (Step 4), and only those proteins with an alignment coverage ≥90% to the Nicotiana predicted proteins as determined by BLASTP were retained (Step 5). CD-HIT-2D was used at 100% sequence identity to retain proteins missing in NbD dataset (Step 6), resulting in supplemental dataset NbE. NbD and NbE can be combined (NbDE) to maximise the spectra annotation for proteomics experiments The Scipio-generated dataset contains a majority of full-length protein-encoding sequences, but also partial sequences caused by fragmentation in the Niben1.0.1 dataset. In addition, we noticed many incomplete gene products caused by premature stop-codons and frameshift mutations. To remove these partial sequences from the protein dataset, we used BLASTP against the NCBI Nicotiana reference proteomes and retained only protein-encoding sequences with ≥60% coverage. This resulted in 53,411 protein-encoding gene annotations (NbD dataset, Fig. 1, Step 3). To complement the NbD dataset, we used the same approach to re-annotate the Nbv0.5, Niben0.4.4 and Nbv0.3 draft assemblies using Scipio and the Nicotiana_db95 dataset as template (Step 4). We retained proteins with ≥90% BLASTP coverage with the Nicotiana reference proteomes in NCBI (Fig. 1, Step 5) and we used CD-HIT-2D at a 100% threshold to select coding sequences not present in the NbD dataset (Step 6). The resulting supplemental NbE dataset contains 21,391 additional protein-encoding sequences (Fig. 1, Additional files 6 and 7) . Besides new gene models, this NbE dataset contains sequences that are nearly identical to those in the NbD dataset. These proteins may be derived from homeologs caused by polyploidisation, but can also be caused by sequence polymorphisms between different sequenced plants and by sequencing and assembly errors. For protein annotation in mass-spectrometry (MS) experiments, however, the combination of the NbD and NbE datasets (the NbDE dataset, Additional files 10, 11, 12, 13 and 14) will maximise the annotation of the spectra.

The NbDE proteome is more complete, sensitive, accurate, and relatively small

We next compared the predicted proteome to the published predicted proteomes. The NbD and NbDE datasets have relatively few entries when compared to the other datasets (Fig. 2a). The preceding datasets include the predicted proteomes from the Niben0.4.4 (76,379 entries, [3])) and Niben1.0.1 (57,140 entries, [3])). We also included a previously described curated dataset in which gene-models from Niben1.0.1 were corrected using RNAseq reads (74,093 entries, [18])), and the predicted proteome (Nbv5.1*) derived from a de-novo transcriptome (191,039 entries, [33]). We also included the Nicotiana_db95 dataset (85,453 entries) in this comparison.

Fig. 2

Increased lengths, coverage and annotation of N. benthamiana proteins. a NbD/NbDE datasets have relatively few entries when compared to preceding datasets. b NbD/NbDE datasets contain nearly all benchmark genes as full-length genes, according to Benchmarking Universal Single-Copy Orthologs (BUSCO) of embryophyta. c The NbD/NbDE datasets have higher number of annotated PFAM domains. d NbD/NbDE datasets have relatively longer protein lengths. Violin and boxplot graph of log10 protein length distribution of each dataset. Jittered dots show the raw underlying data. e NbD/NbDE annotated proteins have a higher percentage coverage to the tomato proteins as determined by BLASTP To determine the completeness of the NbD and NbDE datasets, we analysed the presence of BUSCOs (Benchmarking Universal Single-Copy Orthologs [47, 57]). We used the embryophyta BUSCO set, which contains 1440 highly conserved plant genes that are predominantly found as single-copy genes [47]. Nicotiana_db95 has one fragmented and nine missing BUSCOs, indicating that this dataset contains 99.3% of the Nicotiana genes (Fig. 2a). In the NbD dataset, 98.1% of the BUSCO proteins were identified as full-length proteins, and only 0.6% of the BUSCOs is fragmented and 1.3% is missing (Fig. 2a). The 98.1% full length BUSCO proteins in the NbD dataset consists of 45% single-copy genes and 53.1% duplicated genes, consistent with the allotetraploid nature of N. benthamiana. The combined NbDE dataset contains 99.0% of the BUSCOs and contains 71.1% duplicated full length BUSCOs (Fig. 2a), but part of this increased duplication may be due to small sequence variations between the different genome assemblies rather than from genes missing in the NbD annotation. The BUSCO scores of the NbD and NbDE datasets are superior when compared to previously published annotations. The best previously predicted proteome is from dataset Nbv5.1, which has 96.9% of the BUSCOs as full-length proteins, whereas 0.8% is missing and 2.2% is incomplete (Fig. 2a). However, Nbv5.1 contains nearly five times more protein coding sequences than NbD and 69.7% of BUSCOs are duplicated. Second, we investigated the number of unique PFAM identifiers found with each entry in each proteome, because mis-annotated sequences and fragmented gene products are less likely to receive a PFAM annotation [53]. Indeed, over 80% of the proteins in the NbD and NbDE datasets get at least one PFAM identifier whereas this match is lower with the preceding datasets, indicating that proteins in NbD and NbDE are better annotated (Fig. 2b). Third, we mapped the length distributions of the predicted proteins in comparison with the Nicotiana_db95 dataset. The proteins in the NbD and NbDE datasets are significantly longer than those in the previously predicted proteomes except for the Nbv5.1 primary + alternate proteome (Fig. 2c). This is probably because the Niben0.4.4, Niben1.0.1 and curated datasets contain many partial genes, while the Nbv5.1 primary + alternative proteome has a high proportion of chimeric sequences, which are biased towards long transcripts due to the assembly of short sequencing reads. Fourth, to verify that the increased protein lengths is not due to incorrect annotation, we performed BLASTP searches against the Solanum lycopersicum RefSeq proteins (annotation v103), using the N. benthamiana annotated proteins as the query and the S. lycopersicum RefSeq proteins as the target. 70% of the NbD and NbDE proteins show a full coverage of tomato proteins, whereas in the other datasets less than 60% of the proteins show full coverage (Fig. 2d). This indicates that in addition to a more complete annotation with fewer missing genes as determined by BUSCO, the NbD and NbDE datasets are also more accurate in the annotation of gene models. Finally, since phylogenetic analysis of gene families in closely related species often relies on gene-annotations, we compared our NbD proteome annotation against the predicted proteomes of Solanaceae species for which genomes have been sequenced (Additional file 1: Figure S1). Our NbD proteome compare well to the predicted proteomes of other sequences Solanaceae species generated by the NCBI Eukaryotic Genome Annotation Pipeline. In addition, since the predicted proteomes of some of these species were not annotated using the NCBI pipeline, they miss a relatively high proportion of genes (up to 28.5% of genes missing or fragmented). Therefore, care must be taken to not over-interpret results derived from phylogenetic analysis using these datasets.

Improved annotation of spectra in proteomics experiments

We next tested the annotation of MS spectra from four biological replicates of total leaf extracts using the different datasets. Both the NbD and NbDE datasets outperform the preceding datasets (Fig. 3a). This is notable because the NbD dataset has the fewest entries and yet works well in spectra annotation. More specifically, the NbD dataset also identifies the highest number of unique peptides per protein, consistent with having the fewest entries and increased length (Fig. 3b). These metrics show that the new NbD and NbDE datasets are more sensitive and accurate for proteomics than the currently available datasets.

Fig. 3

NbD/NbDE datasets outperform the annotation of spectra in proteomics. a Percentage of annotated MS/MS spectra in total leaf extract samples. b Average number of unique peptides assigned per protein in the different datasets. a and b Means and standard error of the mean are shown for four biological replicates of total leaf extracts. c Mis-annotations of papain-like Cys proteases (PLCPs) detected by activity-based protein profiling [54]. Leaf extracts were labelled with activity-based probes for PLCPs and labelled proteins were purified and analysed by MS. Shown are the protein annotations found in the NbDE (top) Niben1.0.1 (middle) and curated datasets (bottom), highlighting mis-annotations (red) caused by partial transcripts, mis-annotation of exon-intron boundaries, and mis-assemblies We independently validated the NbD and NbDE annotations on an independent dataset where we re-analysed a previously published apoplastic proteome of agro-infiltrated N. benthamiana as compared to non-infiltrated N. benthamiana (PRIDE repository PXD006708, [18]). Both the NbD and NbDE annotations are more sensitive and accurate than the curated dataset on this experiment (18,059 and 18,352 vs 17,954 peptides detected; 21.85 ± 3.0 and 22.2% ±3.1% vs 21.7 ± 3.0% spectra identified). This independently confirms the high performance of the NbD and NbDE datasets. To confirm that our new annotations are correct and biologically relevant we examined the annotation of 14 papain-like Cys proteases (PLCPs) that we identified recently from agroinfiltrated N. benthamiana leaves [54]. These proteases were active proteins because they reacted with an activity-based probe (DCG-04, a biotinylated PLCP inhibitor) to facilitate the purification and detection of these proteins. Of the 14 detected PLCPs, eight were identical between the datasets. However, six PLCPs were mis-annotated in the Niben1.0.1 and curated datasets (Fig. 3c). Two mis-annotated PLCPs lacked catalytic residues; four PLCPs lacked a Cys residue that is crucial for PLCP stability; two PLCPs lacked large parts of the sequence, one PLCP (ALP) carried a C-terminal tandem fusion, and one lacked the signal peptide required for targeting to the endomembrane system. Importantly, none of these mis-annotated proteins could have been active proteases, confirming that their annotation is incorrect. By contrast, the NbDE annotation would predict functional, active proteases that could react with activity-based probes and hence explain the purification of these proteins in activity profiling experiments.

Two examples of improved subtilase annotations

To further illustrate gene annotations in the different datasets and show the relevance of adding missing genes from different genome assemblies with the NbE dataset, we analysed the gene-models of two subtilases which are missing in the Niben1.0.1 draft genome assembly. Both subtilases are encoded by a single-exon gene-models in the NbDE dataset (Fig. 4). The sequence of NbE05066806.1 is present on three non-overlapping contigs in the Niben1.0.1 dataset, one of which contains three single nucleotide polymorphisms (SNPs, Fig. 4a). Two of these SNPs are also present in the Niben0.4.4 dataset, but this annotation contains an additional 132 bp insertion which is annotated as coding. The coding sequence of this subtilase is incomplete in the Nbv0.5 dataset and we identified 13 sequences corresponding to partial or chimeric variants of this subtilase in the Nb5.1 primary + alternate predicted proteome.

Fig. 4

Examples of subtilase mis-annotations in the different genome assemblies. a Gene-models corresponding to subtilase NbE05066806 and the corresponding annotations in the various datasets. This subtilase gene is fragmented in Niben1.0.1; truncated in Nbv0.5; and carries two SNPs and an extra sequence in Niben0.4.4. b Gene-models corresponding to subtilase NbE03059263 and the corresponding annotations in the various datasets. This subtilase has an inactivated homeolog (dark grey) that was not retained in the NbDE dataset as it encodes a protein with < 60% coverage because it contains premature stop codons. The truncated proteins caused mis-assembly in the Niben1.0.1 dataset, resulting in a hybrid sequence. Mis-annotated exon-intron boundaries also effected gene models in Niben1.0.1, Niben0.4.4 and Nbv5.1. Peptides matched to the different gene models are indicated below the gene models A more complicated situation exists for subtilase NbE03059263, which we detected in the apoplast by proteomics. This subtilase has a close paralog which contains several inactivating mutations which was not retained in the NbDE annotation because it contains a premature stop codon and the coverage of the encoded protein is therefore below 60% (Fig. 4b). Consequently, only NbE03059263 is detected with 17 unique peptides. In Niben1.0.1, however, sequences of these two homologous subtilases are misassembled, resulting in hybrids. In addition, several exon-intron boundaries have been misannotated, so the predicted proteins lack several crucial sequences. Consequently, only 12 of the peptides match the annotated subtilases in the Niben1.0.1 dataset, of which seven are unique. In Niben0.4.4, one subtilase gene carries a 133 bp insertion that is annotated as an intron, but this insertion causes an amino acid substitution that prevents a match with one of the peptides. The paralog in Niben0.4.4 is mis-annotated as protein-encoding and consequently only 10 of the 16 peptides are unique. Finally, this subtilase is represented by four transcripts in the Nbv5.1 dataset, two of which encode a C-terminally truncated protein that lacks a match with three peptides, and none of the peptides are unique. Transcript sequences encoding the truncated and inactivated proteins are absent from this Nbv5.1 dataset. These two examples illustrate the different issues that occur with the gene annotations in the different datasets.

Inactivated subtilases are consistent with a contracting functional genome

To study inactivation of homeologs further, we investigated the subtilase-encoding gene family. Several subtilases are implicated in immunity, most notably the tomato P69 clade of subtilases [51]. Our NbDE dataset contains 64 complete subtilase genes, and one partial gene. Sixteen of the subtilase genes are likely duplicated, and an additional 18 are inactive subtilases with > 60% coverage to the NCBI Nicotiana RefSeq proteins. Three subtilases are missing in the NbD dataset, highlighting the utility of combining the NbD and NbE datasets to increase coverage of the proteome. Remarkably, no SBT3 clade family members were identified in N. benthamiana (Table 1, Additional file 1: Figures S2 and S3). Three N. benthamiana subtilases may possess phytaspase activity based on the presence of an aspartic acid residue at the pro-domain junction as well as a histidine residue in the S1 pocket which is thought to bind to P1 aspartic acid (Additional file 1: Figures S2 and S3, [38]).

Table 1

Subtilase gene number according to familya

	Arabidopsis thaliana	Solanum lycopersicum	Nicotiana benthamiana
SBT1	9	53	37 (24)
SBT2	6	4	9
SBT3	17 (1)	1	0
SBT4^b	16 (1)	3	8 (8)
SBT5		6	8 (8)
SBT6	1	fragmented	1
SBT7	1	1	2 (3)
Total	54 (2)	68	65 (43)

a truncated/inactive proteins in brackets; b including SBT5.2

Subtilase gene number according to familya a truncated/inactive proteins in brackets; b including SBT5.2 In comparison, the Niben1.0.1 genome annotation predicts 103 different subtilase gene products. However, 38 of these genes encode inactive subtilases and 16 subtilase genes are absent from Niben1.0.1 (Additional file 2: Table S1 for a comprehensive comparison). Importantly, none of the remaining 49 subtilase-encoding genes in the Niben1.0.1 dataset are correctly annotated to encode a functional product. Furthermore, the predicted proteome from the Nbv5.1 primary + alternate transcriptome contains more than 400 subtilase gene products, which can be explained by chimeric sequences. By searching the Niben1.0.1 and Niben0.4.4 genome assemblies using BLASTN, we identified 43 subtilase genes with internal stop-codons and/or frame-shift mutations and are therefore likely non-coding. Phylogenetic analysis shows that probable homeologs are often inactivated (Fig. 5). This pattern of inactivation in the subtilase gene family is consistent with a contracting functional genome upon polyploidization, where for many functional protein-encoding genes there is a corresponding homeolog that is inactivated (Fig. 5, Table 1, and Additional file 1: Figure S2). In conclusion, the new genome annotation represents a significant improvement over previous annotations and facilitates more accurate and meaningful phylogenetic analysis of gene families in N. benthamiana.

Fig. 5

Birth and death of subtilase paralogs in N. benthamiana. The evolutionary history of the subtilase gene family was inferred by using the Maximum Likelihood method based on the Whelan and Goldman model. The bootstrap consensus tree inferred from 500 replicates is taken to represent the evolutionary history of the taxa analysed. Non-functional subtilases are indicated in grey. Subtilases identified in apoplastic fluid (AF) and/or total extract (TE) are indicated with yellow and green dots, respectively. Naming of subtilase clades is according to [51]. Additional file 1: Figure S2 includes the individual names

Hydrolases are enriched in the leaf secretome of N. benthamiana

Finally, we used the NbDE dataset to analyse the extracellular protein repertoire (secretome) of the N. benthamiana apoplast. The plant apoplast is the primary interface in plant-pathogen interactions [12, 31] and apoplastic proteins include many enzymes that may act in plant-pathogen interactions. To identify apoplastic proteins, we performed MS analysis of apoplastic fluids (AFs) and total extracts (TEs) isolated from the same leaves in four biological replicates. Annotation using the NbDE dataset show that these AF and TE proteomes are clearly distinct (Fig. 6a). We assigned 510 proteins to the apoplast because they were only detected in AF samples or highly enriched in the AF samples over TE samples (log2FC ≥ 1.5, p ≤ 0.01, BH-adjusted moderated t-test, Fig. 6b). Similarly, we assigned 1042 proteins as intracellular because they were only detected in the TE samples or enriched in TE samples over AF samples (log2FC ≤ − 1.5, p ≤ 0.01, Fig. 6b). The remaining 833 proteins were considered both apoplastic and intracellular. As expected, the apoplastic proteome is significantly enriched for proteins containing a SignalP-predicted signal peptide, while the intracellular proteins and proteins present both in the apoplast and intracellular are significantly enriched for proteins lacking a signal peptide (BH-adjusted hypergeometric test, p < 0.001).

Fig. 6

Annotation of the N. benthamiana apoplastic proteome. a Correlation matrix heat map of the log2 transformed LFQ intensity of protein groups in the four biological replicates of apoplastic fluid (AF) and total extract (TE) samples. Biological replicates are clustered on similarity. b A volcano plot is shown plotting log2 fold difference (log2FC) of AF/TE over –log10 BH-adjusted moderated p-values. Proteins with log2FC ≥ 1.5 and p ≤ 0.01 were considered apoplastic, as well as those only found in AF. Conversely, proteins with a log2FC ≤ 1.5 and p ≤ 0.01 were considered intracellular, as well as those found only in TE. c Percentage of proteins in each fraction annotated with biological process-associated GO-SLIM terms. d Percentage of proteins in each fraction annotated with molecular function-associated GO-SLIM terms. c and d GO-SLIM annotations are shown when significantly enriched or depleted (BH-adjusted hypergeometric test, p < 0.05) in at least one of the fractions (AF, TE, or both). Each bubble indicates the percentage of genes containing that specific GO-SLIM annotation in that compartment. Colours indicate whether the GO-SLIM annotations are enriched or depleted in that fraction (p < 0.05, n.s., non-significant). The apoplastic proteome is enriched for proteins acting in catabolic processes and carbohydrate and lipid metabolic processes (Fig. 6c), which is reflected in the enrichment of peptidases, glycosidases and other hydrolases (Fig. 6d, Additional file 3: Table S2 for a full list). Proteins considered predominantly intracellular are enriched for GO-SLIM terms associated with translation, photosynthesis and transport as biological processes (Fig. 6c), and a similar pattern is seen for GO-SLIM terms describing molecular functions (Fig. 6d). Proteins present both in TE and in AF are enriched for GO-SLIM terms associated with biosynthetic processes and homeostasis (Fig. 6c). These processes are usually performed by proteins acting at multiple subcellular localizations. To specify which peptidases are enriched in the apoplast, we assigned the PFAM-annotations to MEROPS peptidase identifiers [40]. Three of the 15 different families of peptidases detected in the apoplast have significantly more members enriched in the AF as compared to TE: the subtilase family (S08; 13 members, p < 0.001), serine carboxypeptidase-like family (S10; 8 members, p < 0.01), and aspartic peptidase family (A01; 17 members, p < 0.001). By contrast, the proteasome is enriched in the intracellular fraction (T01; 26 members, p < 0.001) (BH-adjusted hypergeometric test, Additional file 4: Table S3 for a full list).

Conclusions

Homology-guided re-annotation of the Nicotiana benthamiana proteins resulted in an improved proteome for the alloploid model plant Nicotiana benthamiana. This approach identifies many genes missing from previous annotations, and improves the annotation of exon-intron boundaries and overcomes problems associated with mis-assignment of short reads from inactivated homeologs. By removing sequences with < 60% coverage to well-annotated Nicotiana proteins, we removed partial sequences caused by frameshift mutations and premature stop codons. Besides the core NbD dataset containing 53,411 coding sequences, we provide a supplemental NbE dataset with 21,391 coding sequences, including homeologs. Both the core NbD dataset and the combined NbDE datasets have longer protein sequences with increased BUSCO scores and higher coverage to the tomato proteome. These datasets also have improved frequency of PFAM annotations while maintaining a relatively low number of entries. Both datasets outperform the preceding datasets in the annotation of spectra during proteomics experiments. These datasets provide the research community with improved capacity to annotate spectra during proteomics experiments. These datasets will provide a valuable basis for further genome annotation. Further improvements of the genome annotation of N. benthamiana will involve long read genome sequencing and mapping of RNA sequencing datasets using a single plant lineage. This may remove remaining misannotated intron-exon boundaries and more accurately describe the expression of splice variants and predict the accumulation of truncated proteins. However, until such full genome annotation is available, the NbDE proteomics dataset presented with this work will be valuable for the research community.

Methods

Sequence retrieval

Table 2 summarizes the genomes and genome annotations used in this study.

Table 2

Genomes and gene annotations useda

Species	Genome build	Annotation	Reference
Arabidopsis thaliana ecotype Colombia	GCF_000001735.4	RefSeq	Arabidopsis Genome Initiative, 2000
Beta vulgaris subsp. vulgaris	GCF_000511025.2	NCBI Beta vulgaris subsp. vulgaris Annotation Release 101	Dohm et al., 2014
Capsicum annuum cv. CM334	GCA_000512255.2	Pepper.v.1.55.proteins.annotated	[37]
Capsicum annuum cv. Zunla-1	GCF_000710875.1	NCBI Capsicum annuum Annotation Release 100	[24]
Capsicum annuum var. glabriusculum	GCA_000950795.1	CaChiltepin.pep	[37]
Daucus carota subsp. sativus cv. DH1	GCF_001625215.1	NCBI Daucus carota subsp. sativus Annotation Release 100	[21]
Nicotiana attenuata strain UT	GCF_001879085.1	NCBI Nicotiana attenuata Annotation Release 100	[59]
Nicotiana benthamiana	Niben1.0.1	Niben101_annotation.proteins.wdesc	[3]
Nicotiana benthamiana	Niben0.4.4	Niben.genome.v0.4.4.proteins.wdesc	[3]
Nicotiana benthamiana	Nbv0.3	–	[32]
Nicotiana benthamiana	Nbv0.5	–	[32]
Nicotiana benthamiana	Nbv5.1 transcriptome	Nbv5.1_transcriptome_primary_alternate_correct	[33]
Nicotiana obtusifolia cv. 1x inbred	GCA_002018475.1	NIOBT_r1.0	[59]
Nicotiana sylvestris	GCF_000393655.1	NCBI Nicotiana sylvestris Annotation Release 100	[44]
Nicotiana tabacum cv. TN90	GCF_000715135.1	NCBI Nicotiana tabacum Annotation Release 100	[45]
Nicotiana tomentosiformis	GCF_000390325.1	NCBI Nicotiana tomentosiformis Annotation Release 101	[44]
Petunia axillaris N	Petunia_axillaris_v1.6.2	Petunia_axillaris_v1.6.2_proteins	[2]
Petunia inflata S6	Petunia_inflata_v1.0.1	Petunia_inflata_v1.0.1_proteins	[2]
Solanum lycopersicum cv. Heinz 1706	GCF_000188115.4	NCBI Solanum lycopersicum Annotation Release 103	[8]
Solanum melongena L. cv. Nakate-Shinkuro	GCA_000787875.1	SME_r2.5.1_pep	[20]
Solanum pennellii	GCF_001406875.1	NCBI Solanum pennellii Annotation Release 101	[60]
Solanum tuberosum cv. DM 1–3516 R44	GCF_000226075.1	NCBI Solanum tuberosum Annotation Release 101	[7]

a Where available the NCBI assembly accession and annotation was taken

Genomes and gene annotations useda a Where available the NCBI assembly accession and annotation was taken

Annotation

In order to extract gene-models from the published N. benthamiana draft genomes we combined NCBI Nicotiana RefSeq protein sequences in one dataset, removed all partial proteins and those containing undetermined residues, added 110 genes which we had previously been manually curated and 565 full-length N. benthamiana proteins from GenBank leading to a dataset containing 226,543 protein sequences. We used CD-HIT (v4.6.8) [16] to cluster these sequences at a 95% identity threshold and reduce the redundancy in our dataset while removing partials (Nicotiana_db95; 85,453 sequences). This dataset was used to annotate the gene-models in the different N. benthamiana genome builds using Scipio version 1.4.1 [23] which was run to allow for duplicated genes. After running Scipio we used Augustus (v3.3) [49] to extract complete and partial gene models. Transdecoder (v5.5.0) [19] was used to retrieve the single-best ORF containing homology to the Nicotiana_db95 dataset as determined by BLASTP for gene models containing internal stop codons. The predicted proteins were aligned to the NCBI Nicotiana RefSeq protein sequences using BLASTP (NCBI-BLAST v2.8.1+;), and only those with a coverage ≥60% in the Niben1.0.1 genome assembly, or ≥ 90% in the Niben0.4.4, Nbv0.3, and Nbv0.5 genome assemblies were maintained. A custom R script was used to convert extracted Niben1.0.1 gene models in a full genome annotation, which were manually inspected resulting in NbD. CD-HIT-2D was used at 100% identity to identify gene models missing in NbD, but present in the extracted gene models from the other genome assemblies (the NbE set). The addition of these sequences to NbD resulted in the NbDE annotation. This dataset was annotated using SignalP (v4.1) [35], PFAM (v32) [13] and annotated the with GO terms and UniProt identifiers using Sma3s v2 [4]. We ran BUSCO (v3.0.2; dependencies: NCBI-BLAST v2.8.1+; HMMER v3.1; Augustus v3.3) [47, 57] on the different N. benthamiana predicted proteomes using the plants set (Embryophyta_odb9), to validate their completeness. Additionally, we used BLASTP to compare these proteins with the NCBI Solanum lycopersicum RefSeq protein sequences.

Sample preparation for proteomics and definition of biological replicates

Four-week old N. benthamiana plants were used. The AF was extracted by vacuum infiltrating N. benthamiana leaves with ice-cold MilliQ. Leaves were dried to remove excess liquid, and apoplastic fluid was extracted by centrifugation of the leaves in a 20 ml syringe barrel (without needle or plunger) in a 50 ml falcon tube at 2000×g, 4 °C for 25 min. Samples were snap-frozen in liquid nitrogen and stored at − 80 °C prior to use. TE was collected by removing the central vein and snap-freezing the leaves in liquid nitrogen followed by grinding in a pestle and mortar and addition of three volumes of phosphate-buffered saline (PBS) (w/v). One biological replicate was defined as a sample, AF or TE, consisting of one leaf from three independent plants (3 leaves total). Four independent biological replicates were taken for AF and TE.

Protein digestion and sample clean-up

AF and TE sample corresponding to 15 μg of protein was taken for each sample (based on Bradford assay). Dithiothreitol (DTT) was added to a concentration of 40 mM, and the volume adjusted to 250 μl with MS-grade water (Sigma). Proteins were precipitated by the addition of 4 volumes of ice-cold acetone, followed by a 1 h incubation at − 20 °C and subsequent centrifugation at 18,000×g, 4 °C for 20 min. The pellet was dried at room temperature (RT) for 5 min and resuspended in 25 μL 8 M urea, followed by a second chloroform/methanol precipitation. The pellet was dried at RT for 5 min and resuspended in 25 μL 8 M urea. Protein reduction and alkylation was achieved by sequential incubation with DTT (final 5 mM, 30 min, RT) and iodoacetamide (IAM; final 20 mM, 30 min, RT, dark). Non-reacted IAM was quenched by raising the DTT concentration to 25 mM. Protein digestion was started by addition of 1000 ng LysC (Wako Chemicals GmbH) and incubation for 3 h at 37 °C while gently shaking (800 rpm). The samples were then diluted with ammoniumbicarbonate (final concentration 80 mM) to a final urea concentration of 1 M. 1000 ng Sequencing grade Trypsin (Promega) was added and the samples were incubated overnight at 37 °C while gently shaking (800 rpm). Protein digestion was stopped by addition of formic acid (FA, final 5% v/v). Tryptic digests were desalted on home-made C18 StageTips [39] by passing the solution over 2 disc StageTips in 150 μL aliquots by centrifugation (600–1200×g). Bound peptides were washed with 0.1% FA and subsequently eluted with 80% Acetonitrile (ACN). Using a vacuum concentrator (Eppendorf) samples were dried, and the peptides were resuspended in 20 μL 0.1% FA solution.

LC-MS/MS

The samples were analysed as in [18]. Briefly, samples were run on an Orbitrap Elite instrument (Thermo) [30] coupled to an EASY-nLC 1000 liquid chromatography (LC) system (Thermo) operated in the one-column mode. Peptides were directly loaded on a fused silica capillary (75 μm × 30 cm) with an integrated PicoFrit emitter (New Objective) analytical column packed in-house with Reprosil-Pur 120 C18-AQ 1.9 μm resin (Dr. Maisch), taking care to not exceed the set pressure limit of 980 bar (usually around 0.5–0.8 μl/min). The analytical column was encased by a column oven (Sonation; 45 °C during data acquisition) and attached to a nanospray flex ion source (Thermo). Peptides were separated on the analytical column by running a 140 min gradient of solvent A (0.1% FA in water;; Ultra-Performance Liquid Chromatography (UPLC) grade) and solvent B (0.1% FA in ACN; UPLC grade) at a flow rate of 300 nl/min (gradient: start with 7% B; gradient 7 to 35% B for 120 min; gradient 35 to 100% B for 10 min and 100% B for 10 min) at a flow rate of 300 nl/min.). The mass spectrometer was operated using Xcalibur software (version 2.2 SP1.48) in positive ion mode. Precursor ion scanning was performed in the Orbitrap analyzer (FTMS; Fourier Transform Mass Spectrometry) in the scan range of m/z 300–1800 and at a resolution of 60,000 with the internal lock mass option turned on (lock mass was 445.120025 m/z, polysiloxane) [34]. Product ion spectra were recorded in a data-dependent manner in the ion trap (ITMS) in a variable scan range and at a rapid scan rate. The ionization potential was set to 1.8 kV. Peptides were analysed by a repeating cycle of a full precursor ion scan (1.0 × 106 ions or 50 ms) followed by 15 product ion scans (1.0 × 104 ions or 50 ms). Peptides exceeding a threshold of 500 counts were selected for tandem mass (MS2) spectrum generation. Collision induced dissociation (CID) energy was set to 35% for the generation of MS2 spectra. Dynamic ion exclusion was set to 60 s with a maximum list of excluded ions consisting of 500 members and a repeat count of one. Ion injection time prediction, preview mode for the Fourier transform mass spectrometer (FTMS, the orbitrap), monoisotopic precursor selection and charge state screening were enabled. Only charge states higher than 1 were considered for fragmentation.

Peptide and protein identification

Peptide spectra were searched in MaxQuant (version 1.5.3.30) using the Andromeda search engine [11] with default settings and label-free quantification and match-between-runs activated [9, 10] against the datasets specified in the text including a known contaminants dataset. Included modifications were carbamidomethylation (static) and oxidation and N-terminal acetylation (dynamic). Precursor mass tolerance was set to ±20 ppm (first search) and ± 4.5 ppm (main search), while the MS/MS match tolerance was set to ±0.5 Da. The peptide spectrum match FDR and the protein FDR were set to 0.01 (based on a target-decoy approach) and the minimum peptide length was set to 7 amino acids. Protein quantification was performed in MaxQuant [52], based on unique and razor peptides including all modifications.

Proteomics processing in R

Identified protein groups were filtered for reverse and contaminants proteins and those only identified by matching, and only those protein groups identified in 3 out of 4 biological replicates of either AF or TE were selected. The LFQ values were log2 transformed, and missing values were imputed using a minimal distribution as implemented in imputeLCMD (v2.0) [26]. A moderated t-test was used as implemented in Limma (v3.34.3) [36, 41] and adjusted using Benjamini–Hochberg (BH) adjustment to identify protein groups significantly differing between AF and TE. Bonafide apoplastic protein groups were those only detected in AF and those significantly (p ≤ 0.01) log2 fold change ≥1.5 in AF samples. Protein groups only detected in TE and those significantly (p ≤ 0.01) log2 fold change ≤ − 1.5 depleted in AF samples were considered intracellular. The remainder was considered both apoplastic and intra-cellular. Majority proteins were annotated with SignalP, PFAM [16], MEROPS (v12) [40], GO, and UniProt keywords identifiers. A BH-adjusted Hypergeometric test was used to identify those terms that were either depleted or enriched (p ≤ 0.05) in the bonafide AF protein groups as compared to bonafide AF depleted proteins or protein groups present both in the AF and TE.

Phylogenetic analysis

Predicted proteomes were annotated with PFAM identifiers, and all sequences containing a Peptidase S8 (PF00082) domain were extracted from the different datasets. Additionally, we manually curated the subtilase gene-family in the Niben1.0.1 draft genome, identifying truncated proteins that were annotated as protein-encoding genes, as well as missing genes and incorrect gene models or genes in which the reference sequence was absent in Niben1.0.1. Tomato subtilases were retrieved from Solgenomics, and other previously characterized subtilases [51] were retrieved from NCBI. Clustal Omega [28, 46] was used to align these sequences. The truncated-gene products sequences were substituted with the best blast hit in NCBI to visualize subtilase inactivation in the alignment and phylogenetic tree. Determining the best model for maximum likelihood phylogenetic analysis and the phylogenetic analysis was performed in MEGA X [25]. The evolutionary history was inferred by using the Maximum Likelihood method based on the Whelan and Goldman model. A discrete Gamma distribution was used to model evolutionary rate differences among sites, and the rate variation model allowed for some sites to be evolutionarily invariable. All positions with less than 80% site coverage were eliminated. NbD005386.1 was used to root the phylogenetic trees. Figure S1. Comparison of Solanaceae proteomes. Figure S2. Phylogenetic analysis of the subtilase gene-family with names. Figure S3. Phylogenetic analysis of the subtilase gene-family of tomato and Arabidopsis and including other previously characterized subtilases. (PDF 518 kb) Table S1. Gene-model comparison. (XLSX 18 kb) Table S2. GO-SLIM term enrichment complete at p ≤ 0.05. (XLSX 23 kb) Table S3. MEROPS family term enrichment complete. (XLSX 14 kb) Dataset S1. New Niben1.0.1 gff3 annotation. (GFF 58963 kb) Dataset S2. FASTA file of NbE genomic sequence ±1 kb. (FASTA 222960 kb) Dataset S3. gff3 annotation of NbE gene-models. (GFF 33474 kb) Dataset S4. NbD proteome. (FASTA 26317 kb) Dataset S5. NbD transcriptome. (FASTA 71108 kb) Dataset S6. NbDE proteome. (FASTA 36990 kb) Dataset S7. NbDE transcriptome. (FASTA 99903 kb) Dataset S8. Sma3s v2 annotation of NbDE. (TSV 158675 kb) Dataset S9. PFAM32 annotation of NbDE. (TSV 18754 kb) Dataset S10. SignalP4.1 annotation of NbDE. (TSV 5130 kb)

58 in total

1. NLR network mediates immunity to diverse plant pathogens.

Authors: Chih-Hang Wu; Ahmed Abd-El-Haliem; Tolga O Bozkurt; Khaoula Belhaj; Ryohei Terauchi; Jack H Vossen; Sophien Kamoun
Journal: Proc Natl Acad Sci U S A Date: 2017-07-11 Impact factor: 11.205

2. MEGA X: Molecular Evolutionary Genetics Analysis across Computing Platforms.

Authors: Sudhir Kumar; Glen Stecher; Michael Li; Christina Knyaz; Koichiro Tamura
Journal: Mol Biol Evol Date: 2018-06-01 Impact factor: 16.240

3. A high-quality carrot genome assembly provides new insights into carotenoid accumulation and asterid genome evolution.

Authors: Massimo Iorizzo; Shelby Ellison; Douglas Senalik; Peng Zeng; Pimchanok Satapoomin; Jiaying Huang; Megan Bowman; Marina Iovene; Walter Sanseverino; Pablo Cavagnaro; Mehtap Yildiz; Alicja Macko-Podgórni; Emilia Moranska; Ewa Grzebelus; Dariusz Grzebelus; Hamid Ashrafi; Zhijun Zheng; Shifeng Cheng; David Spooner; Allen Van Deynze; Philipp Simon
Journal: Nat Genet Date: 2016-05-09 Impact factor: 38.330

4. The EMBL-EBI bioinformatics web and programmatic tools framework.

Authors: Weizhong Li; Andrew Cowley; Mahmut Uludag; Tamer Gur; Hamish McWilliam; Silvano Squizzato; Young Mi Park; Nicola Buso; Rodrigo Lopez
Journal: Nucleic Acids Res Date: 2015-04-06 Impact factor: 16.971

5. Draft genome sequence of eggplant (Solanum melongena L.): the representative solanum species indigenous to the old world.

Authors: Hideki Hirakawa; Kenta Shirasawa; Koji Miyatake; Tsukasa Nunome; Satomi Negoro; Akio Ohyama; Hirotaka Yamaguchi; Shusei Sato; Sachiko Isobe; Satoshi Tabata; Hiroyuki Fukuoka
Journal: DNA Res Date: 2014-09-18 Impact factor: 4.458

6. CD-HIT: accelerated for clustering the next-generation sequencing data.

Authors: Limin Fu; Beifang Niu; Zhengwei Zhu; Sitao Wu; Weizhong Li
Journal: Bioinformatics Date: 2012-10-11 Impact factor: 6.937

7. 2016 update of the PRIDE database and its related tools.

Authors: Juan Antonio Vizcaíno; Attila Csordas; Noemi del-Toro; José A Dianes; Johannes Griss; Ilias Lavidas; Gerhard Mayer; Yasset Perez-Riverol; Florian Reisinger; Tobias Ternent; Qing-Wei Xu; Rui Wang; Henning Hermjakob
Journal: Nucleic Acids Res Date: 2015-11-02 Impact factor: 16.971

8. The transcriptome, extracellular proteome and active secretome of agroinfiltrated Nicotiana benthamiana uncover a large, diverse protease repertoire.

Authors: Friederike Grosse-Holz; Steven Kelly; Svenja Blaskowski; Farnusch Kaschani; Markus Kaiser; Renier A L van der Hoorn
Journal: Plant Biotechnol J Date: 2017-12-17 Impact factor: 9.803

9. High-throughput sequencing data and the impact of plant gene annotation quality.

Authors: Aleksia Vaattovaara; Johanna Leppälä; Jarkko Salojärvi; Michael Wrzaczek
Journal: J Exp Bot Date: 2019-02-20 Impact factor: 6.992

10. The Pfam protein families database in 2019.

Authors: Sara El-Gebali; Jaina Mistry; Alex Bateman; Sean R Eddy; Aurélien Luciani; Simon C Potter; Matloob Qureshi; Lorna J Richardson; Gustavo A Salazar; Alfredo Smart; Erik L L Sonnhammer; Layla Hirsh; Lisanna Paladin; Damiano Piovesan; Silvio C E Tosatto; Robert D Finn
Journal: Nucleic Acids Res Date: 2019-01-08 Impact factor: 16.971

13 in total

1. Naphthylphthalamic acid associates with and inhibits PIN auxin transporters.

Authors: Lindy Abas; Martina Kolb; Johannes Stadlmann; Dorina P Janacek; Kristina Lukic; Claus Schwechheimer; Leonid A Sazanov; Lukas Mach; Jiří Friml; Ulrich Z Hammes
Journal: Proc Natl Acad Sci U S A Date: 2020-12-21 Impact factor: 11.205

2. Comparison of the Distinct, Host-Specific Response of Three Solanaceae Hosts Induced by Phytophthora infestans.

Authors: Jie Lu; Tingli Liu; Xiong Zhang; Jie Li; Xun Wang; Xiangxiu Liang; Guangyuan Xu; Maofeng Jing; Zhugang Li; Ingo Hein; Daolong Dou; Yanju Zhang; Xiaodan Wang
Journal: Int J Mol Sci Date: 2021-10-12 Impact factor: 5.923

3. Effector-mediated relocalization of a maize lipoxygenase protein triggers susceptibility to Ustilago maydis.

Authors: Indira Saado; Khong-Sam Chia; Ruben Betz; André Alcântara; Aladár Pettkó-Szandtner; Fernando Navarrete; John C D'Auria; Michael V Kolomiets; Michael Melzer; Ivo Feussner; Armin Djamei
Journal: Plant Cell Date: 2022-07-04 Impact factor: 12.085

4. The Arabidopsis SAC9 enzyme is enriched in a cortical population of early endosomes and restricts PI(4,5)P₂ at the plasma membrane.

Authors: Alexis Lebecq; Mehdi Doumane; Aurelie Fangain; Vincent Bayle; Jia Xuan Leong; Frédérique Rozier; Maria Del Marques-Bueno; Laia Armengot; Romain Boisseau; Mathilde Laetitia Simon; Mirita Franz-Wachtel; Boris Macek; Suayib Üstün; Yvon Jaillais; Marie-Cécile Caillaud
Journal: Elife Date: 2022-08-31 Impact factor: 8.713

5. Evolution of a guarded decoy protease and its receptor in solanaceous plants.

Authors: Jiorgos Kourelis; Shivani Malik; Oliver Mattinson; Sonja Krauter; Parvinderdeep S Kahlon; Judith K Paulus; Renier A L van der Hoorn
Journal: Nat Commun Date: 2020-09-02 Impact factor: 14.919

6. Barley stripe mosaic virus γb protein targets thioredoxin h-type 1 to dampen salicylic acid-mediated defenses.

Authors: Zhihao Jiang; Xuejiao Jin; Meng Yang; Qinglin Pi; Qing Cao; Zhenggang Li; Yongliang Zhang; Xian-Bing Wang; Chenggui Han; Jialin Yu; Dawei Li
Journal: Plant Physiol Date: 2022-06-27 Impact factor: 8.005

7. Metabolic effects of agro-infiltration on N. benthamiana accessions.

Authors: Margit Drapal; Eugenia M A Enfissi; Paul D Fraser
Journal: Transgenic Res Date: 2021-04-28 Impact factor: 2.788

8. Identification of two subtilisin-like serine proteases engaged in the degradation of recombinant proteins in Nicotiana benthamiana.

Authors: Alejandro A Puchol Tarazona; Daniel Maresch; Annette Grill; Janet Bakalarz; Juan A Torres Acosta; Alexandra Castilho; Herta Steinkellner; Lukas Mach
Journal: FEBS Lett Date: 2020-12-11 Impact factor: 3.864

9. Agromonas: a rapid disease assay for Pseudomonas syringae growth in agroinfiltrated leaves.

Authors: Pierre Buscaill; Nattapong Sanguankiattichai; Yoon Joo Lee; Jiorgos Kourelis; Gail Preston; Renier A L van der Hoorn
Journal: Plant J Date: 2021-01-12 Impact factor: 6.417

Review 10. Multi-Omics-Based Discovery of Plant Signaling Molecules.

Authors: Fei Luo; Zongjun Yu; Qian Zhou; Ancheng Huang
Journal: Metabolites Date: 2022-01-13