Aria Dolatabadian, Philipp E Bayer, Soodeh Tirnaz, Bhavna Hurgobin, David Edwards, Jacqueline Batley.
Abstract
Methods based on single nucleotide polymorphism (SNP), copy number variation (CNV) and presence/absence variation (PAV) discovery provide a valuable resource to study gene structure and evolution. However, as a result of these structural variations, a single reference genome is unable to cover the entire gene content of a species. Therefore, pangenomics analysis is needed to ensure that the genomic diversity within a species is fully represented. Brassica napus is one of the most important oilseed crops in the world and exhibits variability in its resistance genes across different cultivars. Here, we characterized resistance gene distribution across 50 B. napus lines. We identified a total of 1749 resistance gene analogs (RGAs), of which 996 are core and 753 are variable, 368 of which are not present in the reference genome (cv. Darmor-bzh). In addition, a total of 15 318 SNPs were predicted within 1030 of the RGAs. The results showed that core R-genes harbour more SNPs than variable genes. More nucleotide binding site-leucine-rich repeat (NBS-LRR) genes were located in clusters than as singletons, with variable genes more likely to be found in clusters. We identified 106 RGA candidates linked to blackleg resistance quantitative trait locus (QTL). This study provides a better understanding of resistance genes to target for genomics-based improvement and improved disease resistance.
Keywords: Brassica napus; RGAugury; pangenome; presence/absence variation; resistance gene
Year: 2019 PMID: 31553100 PMCID: PMC7061875 DOI: 10.1111/pbi.13262
Source DB: PubMed Journal: Plant Biotechnol J ISSN: 1467-7644 Impact factor: 9.803
The number of different RGA candidates and subfamilies found on the reference genomes, pangenome additional contigs and reference genome unplaced contigs
| RGAs | Reference genome: A genome | Reference genome: C genome | Reference genome: A and C | Pangenome additional contigs | Reference genome unplaced contigs | Pangenome |
|---|---|---|---|---|---|---|
| CN | 13 (4–9) | 5 (2–3) | 18 (6–12) | 10 (0–10) | 1 (1–0) | 29 (7–22) |
| CNL | 3 (2–1) | 10 (8–2) | 13 (10–3) | 17 (0–17) | 0 (0–0) | 30 (10–20) |
| NBS | 10 (5–5) | 20 (10–10) | 30 (15–15) | 43 (0–43) | 2 (0–2) | 75 (15–60) |
| NL | 34 (19–15) | 34 (20–14) | 68 (39–29) | 76 (0–76) | 2 (2–0) | 146 (41–105) |
| OTHER | 7 (5–2) | 12 (7–5) | 19 (12–7) | 6 (0–6) | 2 (1–1) | 27 (13–14) |
| RN | 2 (2–0) | 1 (0–1) | 3 (2–1) | 2 (0–2) | 0 (0–0) | 5 (2–3) |
| RNL | 4 (3–1) | 3 (0–3) | 7 (3–4) | 0 (0–0) | 0 (0–0) | 7 (3–4) |
| TN | 8 (8–0) | 7 (4–3) | 15 (12–3) | 12 (0–12) | 0 (0–0) | 27 (12–15) |
| TNL | 12 (6–6) | 18 (8–10) | 30 (14–16) | 13 (0–13) | 0 (0–0) | 43 (14–29) |
| TX | 29 (12–17) | 51 (25–26) | 80 (37–43) | 31 (0–31) | 3 (0–3) | 114 (37–77) |
| Total | 122 (66–56) | 161 (84–77) | 283 (150–133) | 210 (0–210) | 10 (4–6) | 503 (154–349) |
| RLP | 37 (34–3) | 39 (28–11) | 76 (62–14) | 70 (0–70) | 2 (1–1) | 148 (63–85) |
| RLK | 485 (393–92) | 500 (372–128) | 985 (765–220) | 88 (0–88) | 25 (14–11) | 1098 (779–319) |
| Total | 522 (427–95) | 539 (400–139) | 1061 (827–234) | 158 (0–158) | 27 (15–12) | 1246 (842–404) |
| Grand total | 644 (493–151) | 700 (484–216) | 1344 (977–367) | 368 (0–368) | 37 (19–18) | 1749 (996–753) |
The numbers in parentheses represent the number of core and variable genes, respectively.
The total number of RGAs across the 50 lines on the reference genome, pangenome additional contigs and reference genome unplaced contigs
| RGAs | Reference genome: A genome | Reference genome: C genome | Reference genome: A and C | Pangenome additional contigs | Reference genome unplaced contigs | Pangenome |
|---|---|---|---|---|---|---|
| CN | 602 | 241 | 843 | 156 | 50 | 1049 |
| CNL | 145 | 498 | 643 | 164 | 0 | 807 |
| NBS | 455 | 919 | 1374 | 635 | 94 | 2103 |
| NL | 1581 | 1611 | 3192 | 1147 | 100 | 4439 |
| OTHER | 332 | 563 | 895 | 19 | 99 | 1013 |
| RN | 100 | 49 | 149 | 31 | 0 | 180 |
| RNL | 199 | 142 | 341 | 0 | 0 | 341 |
| TN | 400 | 319 | 719 | 124 | 0 | 843 |
| TNL | 544 | 836 | 1380 | 129 | 0 | 1509 |
| TX | 1374 | 2332 | 3706 | 297 | 143 | 4146 |
| Total | 5732 | 7510 | 13 242 | 2702 | 486 | 16 430 |
| RLP | 1823 | 1928 | 3751 | 1380 | 98 | 5229 |
| RLK | 23 989 | 24 662 | 48 651 | 1748 | 1212 | 51 611 |
| Total | 25 812 | 26 590 | 52 402 | 3128 | 1310 | 56 840 |
| Grand total | 31 544 | 34 100 | 65 644 | 5830 | 1796 | 73 270 |
| Non‐synthetics | 19 727 | 21 256 | 40 983 | 2848 | 1126 | 44 957 |
| Synthetics | 11 817 | 12 844 | 24 661 | 2982 | 670 | 28 313 |
Figure 1. The distribution of variable genes and RLP, RLK and NBS domains across the reference genomes. These densities were normalized by the genome‐wide maximum of each measurement so that they peak at 1.
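The normalization described in the caption can be illustrated with a minimal sketch. It assumes each measurement is available as a list of per-window values; the function name `normalize_by_max` and the example counts are hypothetical and are not taken from the study.

```python
def normalize_by_max(values):
    """Scale non-negative densities so that the largest value becomes 1."""
    peak = max(values)
    if peak == 0:
        # An all-zero track has no peak to scale to; return zeros unchanged.
        return [0.0 for _ in values]
    return [v / peak for v in values]


# Hypothetical per-window gene counts (illustrative values only).
window_counts = [2, 8, 4, 0]
normalized = normalize_by_max(window_counts)
print(normalized)  # [0.25, 1.0, 0.5, 0.0]
```

Dividing each track by its own genome-wide maximum puts measurements with very different absolute scales on a common 0 to 1 axis, so that their distributions can be compared in one plot.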
The number of different RGA candidates and subfamilies on the reference genome
| Class | A01 | A02 | A03 | A04 | A05 | A06 | A07 | A08 | A09 | A10 | C01 | C02 | C03 | C04 | C05 | C06 | C07 | C08 | C09 | Total |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| RLK | 44 (23–21) | 50 (36–14) | 65 (60–5) | 41 (29–12) | 35 (29–6) | 64 (56–8) | 54 (48–6) | 32 (30–2) | 55 (42–13) | 45 (40–5) | 43 (17–26) | 60 (26–34) | 87 (69–18) | 55 (47–8) | 35 (32–3) | 60 (53–7) | 60 (56–4) | 35 (29–6) | 65 (43–22) | 985 (765–220) |
| RLP | 5 (5–0) | 3 (3–0) | 5 (4–1) | 1 (0–1) | 3 (3–0) | 2 (2–0) | 3 (3–0) | 7 (7–0) | 5 (4–1) | 3 (3–0) | 2 (2–0) | 7 (4–3) | 7 (4–3) | 3 (2–1) | 6 (5–1) | 1 (1–0) | 2 (2–0) | 7 (5–2) | 4 (3–1) | 76 (62–14) |
| RN | 1 (1–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 1 (1–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 1 (0–1) | 0 (0–0) | 0 (0–0) | 3 (2–1) |
| TNL | 2 (0–2) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 2 (0–2) | 0 (0–0) | 0 (0–0) | 3 (1–2) | 5 (5–0) | 0 (0–0) | 1 (1–0) | 3 (1–2) | 7 (2–5) | 0 (0–0) | 2 (2–0) | 0 (0–0) | 2 (1–1) | 0 (0–0) | 3 (1–2) | 30 (14–16) |
| RNL | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 2 (2–0) | 1 (1–0) | 0 (0–0) | 1 (0–1) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 1 (0–1) | 0 (0–0) | 2 (0–2) | 7 (3–4) |
| NL | 3 (0–3) | 12 (8–4) | 4 (3–1) | 0 (0–0) | 4 (4–0) | 1 (1–0) | 2 (0–2) | 3 (1–2) | 5 (2–3) | 0 (0–0) | 5 (1–4) | 2 (1–1) | 6 (4–2) | 0 (0–0) | 1 (1–0) | 1 (0–1) | 8 (7–1) | 4 (3–1) | 7 (3–4) | 68 (39–29) |
| NBS | 2 (0–2) | 2 (1–1) | 1 (1–0) | 0 (0–0) | 1 (0–1) | 1 (1–0) | 0 (0–0) | 1 (1–0) | 1 (0–1) | 1 (1–0) | 2 (1–1) | 5 (1–4) | 4 (2–2) | 0 (0–0) | 0 (0–0) | 1 (1–0) | 2 (2–0) | 2 (1–1) | 4 (2–2) | 30 (15–15) |
| CN | 2 (0–2) | 3 (2–1) | 1 (0–1) | 0 (0–0) | 4 (1–3) | 2 (0–2) | 0 (0–0) | 1 (1–0) | 0 (0–0) | 0 (0–0) | 2 (0–2) | 1 (1–0) | 2 (1–1) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 18 (6–12) |
| CNL | 1 (0–1) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 1 (1–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 1 (1–0) | 0 (0–0) | 2 (1–1) | 0 (0–0) | 2 (2–0) | 1 (1–0) | 1 (1–0) | 0 (0–0) | 2 (2–0) | 1 (1–0) | 1 (0–1) | 13 (10–3) |
| TN | 0 (0–0) | 1 (1–0) | 1 (1–0) | 0 (0–0) | 0 (0–0) | 1 (1–0) | 0 (0–0) | 0 (0–0) | 3 (3–0) | 2 (2–0) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 1 (0–1) | 0 (0–0) | 2 (1–1) | 2 (2–0) | 1 (1–0) | 1 (0–1) | 15 (12–3) |
| OTHER | 0 (0–0) | 0 (0–0) | 1 (0–1) | 0 (0–0) | 0 (0–0) | 2 (1–1) | 0 (0–0) | 1 (1–0) | 2 (2–0) | 1 (1–0) | 0 (0–0) | 2 (1–1) | 2 (0–2) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 6 (4–2) | 1 (1–0) | 1 (1–0) | 19 (12–7) |
| TX | 6 (2–4) | 8 (3–5) | 0 (0–0) | 0 (0–0) | 0 (0–0) | 3 (1–2) | 2 (2–0) | 2 (0–2) | 6 (4–2) | 2 (0–2) | 7 (2–5) | 11 (1–10) | 5 (2–3) | 1 (0–1) | 3 (3–0) | 7 (3–4) | 8 (7–1) | 1 (1–0) | 8 (6–2) | 80 (37–43) |
| Core | 31 | 54 | 69 | 29 | 38 | 64 | 55 | 43 | 63 | 47 | 25 | 36 | 86 | 50 | 44 | 59 | 83 | 42 | 59 | 977 |
| Variable | 35 | 25 | 9 | 13 | 12 | 13 | 8 | 8 | 20 | 8 | 39 | 55 | 36 | 11 | 4 | 13 | 11 | 10 | 37 | 367 |
| Core (%) | 46.97 | 68.35 | 88.46 | 69.05 | 76.00 | 83.12 | 87.30 | 84.31 | 75.90 | 85.45 | 39.06 | 39.56 | 70.49 | 81.97 | 91.67 | 81.94 | 88.30 | 80.77 | 61.46 | 72.69 |
| Variable (%) | 53.03 | 31.65 | 11.54 | 30.95 | 24.00 | 16.88 | 12.70 | 15.69 | 24.10 | 14.55 | 60.94 | 60.44 | 29.51 | 18.03 | 8.33 | 18.06 | 11.70 | 19.23 | 38.54 | 27.31 |
| Total | 66 | 79 | 78 | 42 | 50 | 77 | 63 | 51 | 83 | 55 | 64 | 91 | 122 | 61 | 48 | 72 | 94 | 52 | 96 | 1344 |
The numbers in parentheses represent the number of core and variable genes, respectively.
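As a worked check, the "Core (%)" and "Variable (%)" rows can be recomputed from the "Core" and "Variable" counts. The sketch below uses three columns copied from the table (A01, C05 and the overall total). The helper name `core_variable_percent` is illustrative, and rounding to two decimals is an assumption based on how the table is printed.

```python
# (core, variable) RGA counts copied from the "Core" and "Variable" rows above.
counts = {
    "A01": (31, 35),
    "C05": (44, 4),
    "Total": (977, 367),
}


def core_variable_percent(core, variable):
    """Return (core %, variable %) of the combined count, rounded to 2 decimals."""
    total = core + variable
    return round(100 * core / total, 2), round(100 * variable / total, 2)


percentages = {
    name: core_variable_percent(core, variable)
    for name, (core, variable) in counts.items()
}
for name, (core_pct, variable_pct) in percentages.items():
    print(f"{name}: core {core_pct}%, variable {variable_pct}%")
```

The three results match the tabulated percentages: 46.97/53.03 for A01, 91.67/8.33 for C05 and 72.69/27.31 overall.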
The number of SNPs and their effects in the reference genome, pangenome additional contigs and reference genome unplaced contigs
| Category | Variant | Reference genome: A genome | Reference genome: C genome | Reference genome: A and C | Pangenome additional contigs | Reference genome unplaced contigs | Total number |
|---|---|---|---|---|---|---|---|
| Non‐synonymous | Variants_impact_HIGH | 33 | 82 | 115 | 13 | 3 | 131 |
| | Variants_impact_LOW | 2963 | 2114 | 5077 | 174 | 80 | 5331 |
| | Variants_impact_MODERATE | 2015 | 2555 | 4570 | 245 | 98 | 4913 |
| | Sum | 5011 | 4751 | 9762 | 432 | 181 | 10 375 |
| Total non‐synonymous | | | | | | | 10 375 (7027–3348) |
| Synonymous | Variants_effect_synonymous_variant | 2782 | 1923 | 4705 | 162 | 76 | 4943 |
| Total synonymous | | | | | | | 4943 (3557–1386) |
| Total SNPs | | | | | | | 15 318 (10 584–4734) |
| Mis‐sense | Variants_effect_mis‐sense_variant | 2015 | 2555 | 4570 | 245 | 98 | 4913 |
| Total mis‐sense | | | | | | | 4913 (3099–1814) |
| Non‐sense | Variants_effect_stop_gained | 18 | 46 | 64 | 5 | 1 | 70 |
| | Variants_effect_stop_lost | 4 | 5 | 9 | 3 | 1 | 13 |
| | Variants_effect_stop_retained_variant | 2 | 1 | 3 | 0 | 0 | 3 |
| | Sum | | | 76 | 8 | 2 | 86 |
| Total non‐sense | | | | | | | 86 (48–38) |
| Other effects | Variants_effect_5_prime_UTR_premature_start_codon_gain_variant | 0 | 0 | 0 | 0 | 0 | 0 |
| | Variants_effect_initiator_codon_variant | 1 | 2 | 3 | 0 | 0 | 3 |
| | Variants_effect_splice_acceptor_variant | 1 | 18 | 19 | 2 | 1 | 22 |
| | Variants_effect_splice_donor_variant | 8 | 11 | 19 | 2 | 0 | 21 |
| | Variants_effect_splice_region_variant | 237 | 268 | 505 | 27 | 7 | 539 |
| | Variants_effect_start_lost | 2 | 2 | 4 | 1 | 0 | 5 |
| | Sum | | | 550 | 32 | 8 | 590 |
| Total | | | | | | | 590 (422–168) |
The numbers in parentheses represent the number of SNPs on core and variable genes, respectively.
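The totals in the table can be cross-checked with a short sketch. It assumes only the figures printed above: the synonymous and non‐synonymous totals should add up to the total SNP count, and each total should equal its core plus variable split. The variable names are illustrative.

```python
# (total, core, variable) SNP counts copied from the table above.
non_synonymous = (10375, 7027, 3348)
synonymous = (4943, 3557, 1386)
reported_total = (15318, 10584, 4734)

# The element-wise sum of the two classes should reproduce the reported totals.
combined = tuple(a + b for a, b in zip(non_synonymous, synonymous))

# Each total should also equal its own core + variable split.
splits_consistent = all(
    total == core + variable
    for total, core, variable in (non_synonymous, synonymous, reported_total)
)

# Percentage of all predicted SNPs that fall on core genes.
core_share = round(100 * reported_total[1] / reported_total[0], 1)
print(combined == reported_total, splits_consistent, core_share)
```

The totals are internally consistent, and about 69% of the predicted SNPs lie on core genes, in line with the abstract's statement that core R-genes harbour more SNPs than variable genes.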
Figure 2. The density of NBS genes vs. variable NBS genes vs. STOP, synonymous and non‐synonymous SNPs across the reference genomes. These densities were normalized by the genome‐wide maximum of each measurement so that they peak at 1.
Figure 3. Waterfall plot of the blackleg‐linked QTL (Rlm4 locus). Gene order is determined by the position in the reference assembly.