Víctor M González, Núria Aventín, Emilio Centeno, Pere Puigdomènech.
Abstract
BACKGROUND: Plant NBS-LRR resistance genes tend to be found in clusters, which have been shown to be hot spots of genome variability. In melon, half of the 81 predicted NBS-LRR genes group in nine clusters, and a 1 Mb region on linkage group V contains the highest density of R-genes and presence/absence gene polymorphisms found in the melon genome. This region is known to contain the locus of Vat, an agronomically important gene that confers resistance to aphids. However, the presence of duplications makes the sequencing and annotation of R-gene clusters difficult, usually resulting in multi-gapped sequences with higher-than-average error rates.
Year: 2014 PMID: 25523007 PMCID: PMC4378003 DOI: 10.1186/1471-2164-15-1131
Source DB: PubMed Journal: BMC Genomics ISSN: 1471-2164 Impact factor: 3.969
Melon genome sequence improvement metrics
| Whole genome | |||
|---|---|---|---|
| v3.5 | Post-PAGIT1 | ||
|
| 1,599 | 1,599 | |
|
| 29,865 | 20,406 | |
|
| 375,485,313 | 375,516,019 | |
|
| 336,097,046 | 337,325,315 | |
|
| 28,306 | 20,295 | |
|
| 9,133 | 2,581 | |
|
| 39,388,267 (10.5%) | 38,190,704 (10.2%) | |
|
| 361,983,232 | 362,182,953 | |
|
| 322,595,151 | 323,992,316 | |
|
| 13,502,081 | 13,333,066 | |
|
| 13,501,895 | 13,332,999 | |
|
| 18.3 | 13.9 | |
|
| |||
|
| 53.771 | ||
|
| 46.659 | ||
|
| 121.178 | ||
|
| |||
|
|
|
| |
|
| 1,118,599 | 1,120,734 | 1,066,373 |
|
| 120 | 84 | 27 |
|
| 202,018 (18%) | 192,656 (17.1%) | 57,617 (5.4%) |
|
| 1,656 | 2,007 | 1,859 |
1 Immediately after the prinseq processing step, which was performed following the iCORN and redundancy-removal steps.
2 Only A, T, C, and G, not N, are counted.
3 Contiguous strings of Ns.
4 Most of the 20-bp gaps that the assembler produced while building the reference assembly mark adjacent contigs that would overlap were it not for short, low-quality or erroneous sequences at their ends. These gaps are good targets for IMAGE, the gap-closing software in the PAGIT toolkit, which trims the end sequence from contigs while attempting to extend them with re-sequencing data and to overlap adjacent contigs.
5 Reads left unmapped after aligning the set of DHL92 high-quality Illumina PE reads (see Methods section).
6 Based on the output reports of the iCORN software.
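The N-content percentages reported in the table can be reproduced from the whole-genome figures. The short sketch below assumes, since the row labels did not survive in this copy, that the three rows involved are total assembly length, A/T/C/G bases only (footnote 2), and N bases; the dictionary keys and the function name `n_summary` are illustrative and not taken from the paper.

```python
# Whole-genome values copied from the table above.
# Assumption: the rows are total length, non-N bases, and reported N bases;
# the row labels were lost, so this mapping is inferred from the footnotes.
assemblies = {
    "v3.5": {"total": 375_485_313, "acgt": 336_097_046, "n_reported": 39_388_267},
    "Post-PAGIT": {"total": 375_516_019, "acgt": 337_325_315, "n_reported": 38_190_704},
}


def n_summary(total: int, acgt: int) -> tuple[int, float]:
    """Return the number of N bases and their share of the assembly in percent."""
    n_bases = total - acgt
    return n_bases, round(100 * n_bases / total, 1)


for name, values in assemblies.items():
    n_bases, pct = n_summary(values["total"], values["acgt"])
    # Compare the derived count with the figure given in the table.
    matches = n_bases == values["n_reported"]
    print(f"{name}: {n_bases:,} N ({pct}%), matches table: {matches}")
```

Under this reading, the derived N counts agree with the tabulated values for both assemblies, which supports the assumed row order.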
Figure 1. A graphical view of the melon genome sequence improvement. (A) Map of the DHL92 Illumina PE reads against the reference genome (DHL92, CM3.5 version) in the region containing genes MELO3C004311 to MELO3C004313. Three major gaps mark the position of stretches of Ns in the reference genome, one of which causes gene MELO3C004312 to be only partially annotated. Colored vertical lines above the coverage track signal discrepancies between the aligned reads and the reference genome. (B) Mapping of DHL92 Illumina PE reads against the improved reference genome. The main gaps were closed and the number of discrepancy lines was greatly reduced. Figure produced using IGV software.
Effects of sequence improvement on the annotation of R-genes
| Gene Id1 | Class2 | Sequence changes3 | Notes |
|---|---|---|---|
|
| TN | No | 3′-partial |
|
| TNL | (T)14 -- > (T)11 (I) | |
| (T)10 -- > (T)9 (I) | |||
|
| TN | No | 3′-partial |
|
| TNL | (T)7 -- > (T)8 (I) | Additional a.a.s (gap closed) |
| (T)4 -- > (T)3 (I) | |||
| 1 T deleted (I) | |||
| 20 N - > [559 bp (I + E) + 30 N (in poly(AT), I)] | |||
|
| TNL | (N)5054 gap closed (E) | Additional a.a.s (gap closed) |
| (CTATAATTG) -- > (CATTG) (E) | |||
| (TGTCGTTTA) -- > (T9GCGTTA) (I) | |||
| (TTATA9T) -- > (TA9T) (I) | |||
| (A6CA5G) -- > (A7CA6G) (I) | |||
|
| N | No | 5′- and 3′-partial |
|
| TNL | (N)767 gap closed (E) | Additional a.a.s (gap closed) |
| (A)20 -- > (A)26 (I) | |||
| (A)16 -- > (A)24 (I) | |||
| (T)7(A)11 -- > (T)8(A)10 (I) | |||
|
| TN | (N)20 - > (N)1 (E) | 3′-partial |
|
| TN | (N)2153 gap closed (E) | |
| (CA5CCT12) -- > (CAAT10) (I) | 3′-partial | ||
| (CTTTTA13) -- > (CTTTA11) (I) | Additional a.a.s (gap closed) | ||
| (GAAAG) -- > (GAG) (E) | |||
|
| TNL | (A)14 -- > (A)7 (I) | |
| (AA) deleted (I) | |||
| (A)21 -- > (A)15 (I) | |||
|
| TNL | (AA) deleted (I) | |
|
| TNL | (N)73 - > (N)28 (I) | (N)28 in poly(TA) (intron) |
| (TACCA28GAAA) -- > (TCCA28GA) (I) | |||
| (A10CCAAAG) -- > (A13CCAAAAG) (I) | |||
| (AAAAGTTTA6) -- > (AAAAGTTTCA6) (I) | |||
| (GAAAGTATA5TCA) -- > (GAAAGTATAAATCA) (I) | |||
| (TTA10TTTGAAAA) -- > (A11TGAAAA) (I) | |||
|
| NL | (CTTGAATGAAACTTA) -- > (CTTGAAC) (E) | 5′-partial5 |
| (GTGAAACTTA11CATGG) -- > (GTA8CATGG) (I) | |||
| (TACAATT) -- > (TACTT) (E) | |||
| (CTA11CAT) -- > (CTA8CAT) (E) | |||
|
| TNL | (N)78 gap closed (I) | |
| (GTTCCGTATTCTAATTAT4ACT3) -- > (GTCCGATCAATAT3ACT) (I) | |||
| (TAT6C) -- > (TATTC) (E) | |||
|
| TN | (N)30 gap closed (E) | 3′-partial |
| (GA4TTTACT) -- > (GA5TACT) (I) | |||
|
| TN | (N)468 gap closed (introns + exons) | 3′-partial |
| Additional a.a.s (gap closed) | |||
|
| TN | (N)20 gap closed (E) | 3′-partial |
| Additional a.a.s (gap closed) | |||
|
| CNL | (N)1414 gap closed (E + I) | Additional a.a.s (gap closed) |
| (N)4442 gap closed (exons + introns) | |||
|
| CN | (N)4962 gap closed (exons + introns) | 3′-partial |
| Additional a.a.s (gap closed) | |||
|
| NL | (N)4489 -- > (N)2460 (introns + exon) | Add. a.a.s (gap partially solved) |
|
| CNL | (N)3747 gap closed (E) | Additional a.a.s (gap closed) |
|
| CN | No | 3′-partial |
|
| NL | No |
1 MELO3C004261, MELO3C004265, MELO3C004295, MELO3C004304, and MELO3C004319 are ORFs that code for, respectively, 141-, 220-, 98-, 109-, and 233-aa peptides with homology to TNL proteins. Because of their short length, they are most probably not R-genes and were therefore not considered for further analysis.
2 TN: TIR-NBS; TNL: TIR-NBS-LRR; N: NBS; NL: NBS-LRR; CNL: CC-NBS-LRR; CN: CC-NBS.
3 Sequence changes between assemblies CM3.5 and CM3.6.1 in each R-gene region (start to stop codon, including introns). I, intron; E, exon.
4 The previously annotated gene MELO3C004302 belongs to the MELO3C004303 ORF, as deduced by sequence comparison with other TNL R-proteins.
5 Gene MELO3C004304 codes for a 109-aa peptide with a TIR domain, and may represent a 5′ fragment of gene MELO3C004302-4303.
6 The previously annotated gene MELO3C004310 belongs to the MELO3C004309 ORF, as deduced by sequence comparison with other TNL R-proteins.
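Many entries in the "Sequence changes" column use a compact run-length notation in which a base followed by a number appears to stand for that base repeated that many times, e.g. (A6CA5G) -- > (A7CA6G). The sketch below expands such strings so that the two versions can be compared directly. It rests on that reading of the notation, which the footnotes do not spell out, and the function name `expand_runs` is illustrative. The parenthesised form such as (T)14 is not handled.

```python
import re

# A single base (A, C, G, T, or N) followed by a repeat count, e.g. "A6" or "A28".
# Assumption: this is how the compact notation in the table should be read.
_RUN = re.compile(r"([ACGTN])(\d+)")


def expand_runs(compact: str) -> str:
    """Expand run-length notation, e.g. 'A6CA5G' -> 'AAAAAACAAAAAG'."""
    return _RUN.sub(lambda m: m.group(1) * int(m.group(2)), compact)


# One intronic change listed in the table: (A6CA5G) -- > (A7CA6G)
before = expand_runs("A6CA5G")
after = expand_runs("A7CA6G")
print(before, "->", after, f"({len(after) - len(before):+d} bp)")
```

Expanding both sides makes the net length change of each homopolymer correction explicit, which is otherwise easy to misread in the compact form.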
Genetic variants and their effects on genes in melon cultivars DHL92 and C-836
| DHL921 | C-8362 | |||||||
|---|---|---|---|---|---|---|---|---|
| Whole genome | R-genes region | Whole genome | R-genes region | |||||
| CM3.5 | Improved | CM3.5a | Improvedb | CM3.5 | Improved | CM3.5a | CM3.6.1b | |
|
| 1 | 0.5 | 4.1 | 0.25 | 12.2 | 12 | 8.9 | 9.5 |
|
| 365,012 | 193,146 | 3,085 | 180 | 4,482,184 | 4,455,971 | 6,761 | 7,219 |
|
| ||||||||
|
| 18,606 | 15,867 | 2,106 | 21 | 3,463,344 | 3,488,405 | 5,311 | 5,631 |
|
| 77,841 | 63,846 | 261 | 112 | 424,469 | 416,645 | 522 | 685 |
|
| ||||||||
|
| 90,809 | 45,563 | 254 | 31 | 244,983 | 242,498 | 390 | 415 |
|
| 1,992 | 2,115 | 12 | 2 | 11,240 | 12,037 | 16 | 23 |
|
| ||||||||
|
| 172,138 | 62,162 | 429 | 7 | 323,714 | 271,219 | 510 | 442 |
|
| 3,570 | 3,576 | 23 | 7 | 14,380 | 15,136 | 12 | 23 |
|
| ||||||||
|
| 1,469 | -c | 125d | 2d | 62,321 | -c | 393d | 520d |
|
| 54 | -c | 4d | 0d | 1,461 | -c | 2d | 6d |
|
| 934 | -c | 97d | 5d | 62,200 | -c | 210d | 249d |
|
| 4,317 | -c | 232d | 7d | 131,515 | -c | 625d | 800d |
|
| 50,688 | -c | 118d | 12d | 526,990 | -c | 309d | 496d |
|
| 700 | -c | 5d | 0d | 8,809 | -c | 8d | 15d |
1 DHL92 Illumina PE reads mapped either to the melon reference assembly (CM3.5) or to the improved sequence.
2 C-836 Illumina PE reads mapped either to the melon reference assembly (CM3.5) or to the improved sequence.
a Region comprising genes MELO3C004258 to MELO3C004324 (CM3.5, scaffold0003: 5,478,521 - 6,236,414).
b Region comprising genes MELO3C004258 to MELO3C004324 (improved scaffold0003: 5,478,926 - 6,186,068).
c Gene annotation not available.
d Only TIR-NBS-LRR and CC-NBS-LRR genes are considered.
Figure 2. Clusters of R-genes in the MELO3C004258-MELO3C004324 region, phylogenetic relationships, and synteny with cucumber and watermelon. Melon genes, represented by vertical lines accompanied by gene Ids, are grouped in four sub-clusters based on proximity: TIR-NBS-LRR genes in clusters A (red lines), B (yellow lines), and C (green lines); CC-NBS-LRR genes in cluster D (blue lines). Syntenic genes in cucumber and watermelon are shown in the same colour as their melon orthologues. Colored lines connect the pairs of genes in two different species that show the highest protein identity/blastp e-values; however, when the genes in a cluster are too similar for one-to-one orthology relationships to be established unequivocally, a colored line connects a gene to a cluster of genes instead. ORFs with homology to NBS-LRR R-genes that are too short to be considered true genes are in grey. Gene Ids in red indicate putatively expressed genes (on the basis of information from EST databases). All genes are transcribed from left to right, except for the few genes marked with red arrows. For each species, phylogenetic relationships based on the alignment of the conserved NBS regions are also shown. Note that branch lengths have no phylogenetic meaning, because the original trees were distorted to project them onto the actual arrangement of genes on the genome.
Expression data support of R-genes
| Gene | ICUGI Genea | ICUGI Unigene | Cultivar | Library |
|---|---|---|---|---|
|
| ||||
| 91460.1 | Csa2M022790.1 | - | ||
| 91470.1 | Csa2M022270.1 | CU122153 | Vlaspik | Fruit (mixed, 1–50 DAP) |
| CU171360 | WI 1983H | |||
| 91680.1 | Csa2M021540.1 | - | ||
| 91690.1 | Csa2M021520.1 | - | ||
| 91710.1 | Csa2M021510.1 | - | ||
| 91780.1 | Csa2M020940.1 | CU141826 | WI 1983H | Hermaphrodite flower |
| CU173837 | WI 1983H | Hermaphrodite flower | ||
| 91820.1 | Csa2M020890.1 | CU134688 | WI 1983G | Gynoecious flower |
| 91840.1 | Csa2M020870.1 | CU162592 | WI 1983H | Hermaphrodite flower |
| 91880.1 | Csa2M014830.1 | - | ||
|
| ||||
| MELO3C004258 | - | |||
| MELO3C004259 | MU66045 | T-111 | Callus | |
| MELO3C004260 | - | |||
| MELO3C004262 | - | |||
| MELO3C004266 | - | |||
| MELO3C004288 | - | |||
| MELO3C004289 | MU63434 | PI161375 | Callus | |
| MELO3C004290 | - | |||
| MELO3C004291 | - | |||
| MELO3C004292 | - | |||
| MELO3C004294 | MU53509 | pat81 | Root (healthy) | |
| Root (infected, | ||||
| MELO3C004295 | - | |||
| MELO3C004301 | MU66172 | T-111 | Callus | |
| MELO3C004302-3 | MU53927 | PI161375 | Callus | |
| MELO3C004309-10 | - | |||
| MELO3C004311 | MU43621 | PI161375 | Callus | |
| Cantaloupe C-35 | CMV Cotyledon infected | |||
| MU48550 | PI161375 | Callus | ||
| MELO3C004312 | MU52721 | PI161375 | Callus | |
| Dulce | Mixed Fruit (4 devel. stages) | |||
| MU54890 | T-111 | Fruit (15 DAP) | ||
| MELO3C004313 | MU67511 | Védrantais | Callus | |
| MELO3C004317 | - | |||
| MELO3C004318 | MU61424 | Cantaloupe C-35 | Healthy leaf | |
| MELO3C004319 | - | |||
| MELO3C004320 | MU55681 | T-111 | Fruit (46 DAP) | |
| MELO3C004321 | MU63490 | PI161375 | Callus | |
| MELO3C004323 | - | |||
| MELO3C004324 | - | |||
|
| ||||
| Cla019863 | WMU41608 | 97103 | Fruit (10 DAP) | |
| WMU77867 | 97103 | Fruit (34 DAP) | ||
| WMU45091 | 97103 | Fruit (18 DAP) & Fruit (26 DAP) | ||
| Cla019857 | WMU48640 | 97103 | Fruit (10 DAP) | |
| Cla019856 | - | |||
| Cla019855 | - | |||
| Cla019854 | - | |||
| Cla019850 | WMU79003 | Illinwake Red | Norm. and subst. library | |
| (Mix 12, 24, 36 DAP flesh fruit; driver: leaf) |
a The cucumber gene annotation used is taken from Phytozome (Gy14 cultivar draft genome), whereas the genome assembly deposited at ICUGI is that of the cucumber 9930 inbred line.
Figure 3. Density of SNPs in the cluster of R-genes across five melon cultivars. (A) Vertical colored lines mark the positions where a SNP is detected in any of the following melon cultivars: CV: C-836, IQ: C-1012, PI: PI 161375, DHL: DHL92, PS: T-111 “Piel de Sapo”. Reference: DHL92 MELO3C004258-MELO3C004324 improved region. Blue lines: same nucleotide as in the reference; red lines: alternative nucleotide (two alleles); yellow lines: alternative nucleotide (one allele only). The “R-Genes” track shows the position of the TIR- and CC-NBS-LRR genes. (B) As in A, but only SNPs in the coding sequences of the R-genes are shown. Figure produced with SVAMP software.
Figure 4. Sequence comparison between the Vat region of the PI 161375 variety and gene MELO3C004317. Dissimilar DNA regions are marked with horizontal red lines. Putative protein products are also shown: orange sectors, CC domains; red sectors, NBS domains; yellow sectors, LRR domains; green sectors, LRR C-terminus.