Giuseppe Andolfo, Juliane C Dohm, Heinz Himmelbauer.
Abstract
The activation of plant immunity is mediated by resistance (R)-gene receptors, also known as nucleotide-binding leucine-rich repeat (NB-LRR) genes, which in turn trigger the authentic defense response. R-gene identification is a crucial goal for both classic and modern plant breeding strategies for disease resistance. The conventional method identifies NB-LRR genes using a protein motif/domain-based search (PDS) within an automatically predicted gene set of the respective genome assembly. PDS proved to be imprecise because repeat masking prior to automatic genome annotation inadvertently prevents comprehensive NB-LRR gene detection. Furthermore, R-genes have diversified in a species-specific manner, so NB-LRR gene identification cannot be universally standardized. Here, we present the full-length Homology-based R-gene Prediction (HRP) method for the comprehensive identification and annotation of a genome's R-gene repertoire. Our method substantially resolved the complex genomic organization of tomato (Solanum lycopersicum) NB-LRR gene loci and outperformed the well-established RenSeq approach. HRP efficiency was also tested on three differently assembled and annotated Beta sp. genomes, where HRP identified up to 45% more full-length NB-LRR genes than previous approaches. HRP also proved to be a more refined strategy for R-gene allele mining, as shown by the identification of previously undiscovered Fom-2 homologs in five Cucurbita sp. genomes. In summary, our high-performance method for full-length NB-LRR gene discovery will propel the identification of novel R-genes towards the development of improved cultivars.
Keywords: Beta vulgaris; Cucurbita; Solanum lycopersicum; disease resistance genes; gene prediction; genome annotation; plant breeding; repeat masking
Year: 2022 PMID: 35365907 PMCID: PMC9322396 DOI: 10.1111/tpj.15756
Source DB: PubMed Journal: Plant J ISSN: 0960-7412 Impact factor: 7.091
Number of tomato NB‐LRR genes as identified by HRP, RGAugury and RenSeq in S. lycopersicum Heinz‐1706 v2.4
| Protein domains | RenSeq | RGAugury | HRP |
|---|---|---|---|
| Full‐length | | | |
| CC‐NB‐LRR | 193 | 151 | 198 |
| RPW8‐NB‐LRR | 2 | – | 2 |
| TIR‐NB‐LRR | 26 | 19 | 31 |
| Total full‐length | 221 | 170 | 231 |
| Partial | | | |
| CC | – | – | 3 |
| CC‐NB | 14 | 8 | 4 |
| RPW8 | – | – | 1 |
| TIR‐LRR | 1 | – | – |
| TIR‐NB | 3 | 7 | 5 |
| NB | 57 | 63 | 64 |
| TIR | 10 | – | 7 |
| LRR | 20 | 2 | 48 |
| Total partial | 102 | 80 | 132 |
| Total | 326 | 250 | 363 (324) |
CC: coiled coil; LRR: leucine‐rich repeat; NB: nucleotide binding; RPW8: resistance to powdery mildew 8; TIR: Toll/interleukin receptor.
RenSeq: tomato R‐gene enrichment and sequencing annotation (Andolfo et al., 2014).
RGAugury: NB‐LRR genes annotated by RGAugury (Li et al., 2016) in the S. lycopersicum Heinz‐1706 v2.3 proteome.
Value in brackets: number of confirmed NB‐LRR genes previously identified by the RenSeq approach.
Figure 1. Re‐annotation of two erroneously annotated NB‐LRR genes located on chromosome 10 of the S. lycopersicum Heinz‐1706 genome. (a) Mapping of the tomato full‐length NB‐LRR gene Solyc10g085460 identified a single gene encompassing two gene models previously annotated by Andolfo et al. (2014) (green arrows). (b) The Solyc10g055050 gene (red arrow) was confirmed by RenSeq as a partial gene (green arrow). The corrected exon annotation identified by the HRP method was confirmed in an InterProScan analysis. Domains are shown as gold (CC), rose (NB) and spring green (LRR) arrows. [Colour figure can be viewed at wileyonlinelibrary.com]
Gene sets used to generate the input set of protein sequences for the HRP method and number of full‐length NB‐LRR genes identified by PDS
| Species | Accession | Protein‐coding genes | References | R‐genes |
|---|---|---|---|---|
| | KWS2320 | 25 813 | Dohm et al. | 146 |
| | KWS230/DH1440 | 25 368 | Dohm et al. | 138 |
| | STR06A6001 | 31 355 | Dohm et al. | 139 |
| | YMoBv | 25 927 | Dohm et al. | 140 |
| | YTiBv | 25 626 | Dohm et al. | 140 |
| | C869/EL10 | 24 255 | Funk et al. | 143 |
| | | | | |
| | M4021 | 27 513 | Lehner et al. | 76 |
| | | | | |
| | WB42 | 27 617 | Rodríguez del Río et al. | 84 |
| | | | | |
| | BETA548 | 25 069 | Rodríguez del Río et al. | 84 |
| | | | | |
| | Viroflay | 21 703 | Minoche et al. | 49 |
| Total | | | | 1139 |
Number of protein‐coding genes predicted in the RefBv‐1.0 genome assembly.
Number of full‐length NB‐LRR genes.
Comparison of NB‐LRR genes identified in two different isogenic B. vulgaris ssp. vulgaris genome assemblies by R‐protein motif/domain‐based search (PDS) and by full‐length Homology‐based R‐gene Prediction (HRP)
| Protein domains | PDS (RefBeet‐1.2) | PDS (RefBv‐1.0) | HRP (RefBeet‐1.2) | HRP (RefBv‐1.0) |
|---|---|---|---|---|
| Full‐length | | | | |
| CC‐NB‐LRR | 95 | 144 | 176 | 170 |
| RPW8‐NB‐LRR | 1 | 1 | 1 | 1 |
| TIR‐NB‐LRR | 1 | 1 | 1 | 1 |
| Total full‐length | 97 | 146 | 178 | 172 |
| Partial | | | | |
| CC | 10 | 9 | 2 | 6 |
| CC‐NB | 5 | 4 | 1 | 4 |
| CC‐LRR | 14 | – | – | – |
| TIR‐NB | 2 | 2 | 2 | 2 |
| NB | 30 | 31 | 26 | 29 |
| TIR | 2 | 1 | 2 | 1 |
| LRR | 43 | 42 | 20 | 40 |
| Total partial | 106 | 89 | 53 | 82 |
| Total | 203 | 235 | 231 | 254 |
CC: coiled coil; LRR: leucine‐rich repeat; NB: nucleotide binding; RPW8: resistance to powdery mildew 8; TIR: Toll/interleukin receptor.
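The gain of HRP over PDS in the table above can be expressed in more than one way, and the two conventions give quite different percentages. The following is a minimal sketch using only the "Total full‐length" row; the function name `hrp_gain` and the dictionary layout are illustrative assumptions, and which denominator the authors used for their summary figure is not stated in this record.

```python
# Full-length NB-LRR gene counts copied from the "Total full-length" row
# of the PDS vs HRP comparison table.
FULL_LENGTH = {
    "RefBeet-1.2": {"PDS": 97, "HRP": 178},
    "RefBv-1.0": {"PDS": 146, "HRP": 172},
}


def hrp_gain(counts):
    """Return the additional full-length genes found by HRP, expressed
    (a) relative to the PDS count and (b) as a share of the HRP set.

    Both conventions are shown because the denominator is an assumption.
    """
    extra = counts["HRP"] - counts["PDS"]
    return {
        "extra_genes": extra,
        "pct_increase_over_pds": 100 * extra / counts["PDS"],
        "pct_of_hrp_set_new": 100 * extra / counts["HRP"],
    }


GAINS = {assembly: hrp_gain(c) for assembly, c in FULL_LENGTH.items()}

if __name__ == "__main__":
    for assembly, gain in GAINS.items():
        print(
            f"{assembly}: +{gain['extra_genes']} genes, "
            f"{gain['pct_increase_over_pds']:.1f}% over PDS, "
            f"{gain['pct_of_hrp_set_new']:.1f}% of the HRP set"
        )
```

For RefBeet‐1.2 the 81 additional genes amount to roughly 83.5% more than PDS, or roughly 45.5% of the HRP set; for RefBv‐1.0 the 26 additional genes give smaller values under either convention.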
Figure 2. Homology‐based NB‐LRR gene prediction in sugar beet. (a) Detailed analysis of an NB‐LRR gene cluster on RefBv‐1.0 scaffold 549. Re‐annotation of two erroneously (b) fused and (c) split NB‐LRR genes located on RefBv‐1.0 scaffolds 7674 and 4947, respectively. The Beta vulgaris genomic regions are shown with R‐genes annotated by PDS (red arrows) and the novel full‐length NB‐LRRs predicted by full‐length Homology‐based R‐gene Prediction (HRP) (blue arrows). The corrected annotation was confirmed in an InterProScan analysis. Domains are shown as gold (CC), rose (NB) and spring green (LRR) arrows.
Number of full‐length Fom‐2 homologs identified in five Cucurbita genome assemblies using HRP or PDS. All NB‐LRR genes previously annotated by PDS were confirmed by HRP using one Fom‐2 homolog as input sequence
| Cucurbit genome assemblies | (v2.0) | (v1.1) | (v1.0) | (v4.1) | (v1.0) |
|---|---|---|---|---|---|
| PDS | – | 12 | 26 | 2 | 6 |
| HRP | 37 | 14 | 35 | 26 | 36 |
Number of Fom‐2 homologs identified by conventional domain search (Andolfo et al., 2021b).
Figure 3. Clusters of Fom‐2 homologs in Cucurbita missed by automatic gene annotation pipelines. NB‐LRR gene clusters composed of four full‐length (blue arrows) and one partial (turquoise arrow) R‐genes predicted by HRP on chromosome 6 of (a) C. argyrosperma and (b) its wild relative, C. sororia. The Cucurbita genome annotation released by Barrera‐Redondo et al. (2019) is depicted as burgundy arrows. The corrected annotation was confirmed in an InterProScan analysis. Domains are shown as gold (CC), rose (NB) and spring green (LRR) arrows.
Figure 4. Step‐by‐step workflow for NB‐LRR gene prediction and annotation. The HRP pipeline was designed to detect conserved domains (CC: PF18052; LRR: SSF52058 and SSF52047; NB: PF00931; TIR: PF01582) found in well‐characterized plant R‐genes. It integrates the results of a protein motif/domain‐based search (MEME/MAST and InterProScan software) with an R‐gene prediction based on homology to full‐length NB‐LRR genes, performed by GenBlastG within assembled genome sequences.
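The domain accessions named in the Figure 4 caption can be mapped to the class labels used in the tables. The following is a minimal sketch, assuming the domain hits for one predicted protein are already available as a list of accession strings. The function name `classify_domains` and the full‐length rule (NB plus LRR plus an N‐terminal CC or TIR domain) are illustrative assumptions read off the table class names, not the HRP implementation. RPW8 is left out because the caption gives no accession for it.

```python
# Accession-to-domain mapping as listed in the Figure 4 caption.
DOMAIN_ACCESSIONS = {
    "PF18052": "CC",
    "PF00931": "NB",
    "PF01582": "TIR",
    "SSF52058": "LRR",
    "SSF52047": "LRR",
}

# Order in which domains appear in the class labels (e.g. "CC-NB-LRR").
LABEL_ORDER = ("CC", "TIR", "NB", "LRR")


def classify_domains(accessions):
    """Map a protein's domain-hit accessions to (label, category).

    Assumption: a gene counts as full-length when it carries NB and LRR
    together with an N-terminal CC or TIR domain; anything else with at
    least one recognised domain is partial. Returns (None, None) when no
    recognised accession is present.
    """
    domains = {
        DOMAIN_ACCESSIONS[acc] for acc in accessions if acc in DOMAIN_ACCESSIONS
    }
    if not domains:
        return None, None
    label = "-".join(name for name in LABEL_ORDER if name in domains)
    is_full = {"NB", "LRR"} <= domains and bool(domains & {"CC", "TIR"})
    return label, "full-length" if is_full else "partial"


if __name__ == "__main__":
    # Hypothetical hit lists for three predicted proteins.
    print(classify_domains(["PF18052", "PF00931", "SSF52058"]))
    print(classify_domains(["PF01582", "PF00931"]))
    print(classify_domains(["SSF52047"]))
```

Collapsing the two LRR superfamily accessions into one label keeps the output aligned with the table rows, which do not distinguish between them.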