Valentina Tranchida-Lombardo1, Riccardo Aiese Cigliano2, Irantzu Anzar2, Simone Landi1, Samuela Palombieri1, Chiara Colantuono3, Hamed Bostan3, Pasquale Termolino1, Riccardo Aversano3, Giorgia Batelli1, Maria Cammareri1, Domenico Carputo3, Maria Luisa Chiusano3, Clara Conicella1, Federica Consiglio1, Nunzio D'Agostino4, Monica De Palma1, Antonio Di Matteo3, Silvana Grandillo1, Walter Sanseverino2, Marina Tucci1, Stefania Grillo1.
Abstract
Tomato is a high-value crop and the primary model for fleshy fruit development and ripening. Breeding priorities include increased fruit quality, shelf life and tolerance to stresses. To contribute towards this goal, we re-sequenced the genomes of the Corbarino (COR) and Lucariello (LUC) landraces, both of which combine adaptation to water deficit, prolonged fruit shelf life and good fruit quality. Through the newly developed pipeline Reconstructor, we generated the genome sequences of COR and LUC using datasets of 65.8 M and 56.4 M paired-end reads of 30–150 bp, respectively. New contigs including reads that could not be mapped to the tomato reference genome were assembled, and a total of 43,054 and 44,579 gene loci were annotated in COR and LUC, respectively. Both genomes showed novel regions with similarity to Solanum pimpinellifolium and Solanum pennellii. In addition to small deletions and insertions, 2,000 and 1,700 single nucleotide polymorphisms (SNPs) could exert potentially disruptive effects on 1,371 and 1,201 genes in COR and LUC, respectively. A detailed survey of the SNPs occurring in genes related to fruit quality, shelf life and stress tolerance identified several candidates of potential relevance. Variations in ethylene response components may contribute to the distinctive phenotypes of COR and LUC.
Year: 2018 PMID: 29149280 PMCID: PMC5909465 DOI: 10.1093/dnares/dsx045
Source DB: PubMed Journal: DNA Res ISSN: 1340-2838 Impact factor: 4.458
Figure 1. Main morpho-physiological characteristics of the tested genotypes, Corbarino (COR) and Lucariello (LUC). Brix degree values are means ± SD (n = 6). The percentage of sound fruits is the percentage of fruits showing no signs of decay (e.g. pathogen attack or injury) or wrinkling after 30 or 60 days of storage (n = 100).
Resequencing data from Corbarino (COR) and Lucariello (LUC) were used to perform an iterative variant calling analysis to identify polymorphisms relative to the tomato reference genome Heinz 1706.
| Variant calling | COR | LUC | Common |
|---|---|---|---|
| SNPs | 129,426 | 120,112 | 85,578 |
| Deletions | 16,847 | 14,894 | 10,958 |
| Insertions | 44,152 | 39,289 | 27,651 |
| Total | 190,425 | 174,295 | 124,187 |
The numbers of high-quality SNPs and deletion/insertion polymorphisms (DIPs) in COR and LUC, and those common to both, are shown.
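The relationship between the per-landrace counts and the "Common" column can be made explicit with a little arithmetic. The following is a minimal sketch, not part of the original analysis: the counts are copied from the table above, while the function names `column_totals` and `shared_fraction` are hypothetical helpers introduced only for illustration.

```python
# Counts copied from the variant-calling table: (COR, LUC, Common).
counts = {
    "SNPs": (129_426, 120_112, 85_578),
    "Deletions": (16_847, 14_894, 10_958),
    "Insertions": (44_152, 39_289, 27_651),
}


def column_totals(table):
    """Sum each column (COR, LUC, Common) across the variant types."""
    return tuple(sum(row[i] for row in table.values()) for i in range(3))


def shared_fraction(table):
    """Fraction of each landrace's variants that also occurs in the other."""
    cor, luc, common = column_totals(table)
    return common / cor, common / luc


totals = column_totals(counts)
cor_shared, luc_shared = shared_fraction(counts)

print("Totals (COR, LUC, Common):", totals)
print(f"Shared with the other landrace: COR {cor_shared:.1%}, LUC {luc_shared:.1%}")
```

The column sums reproduce the "Total" row of the table, and the two fractions indicate that roughly two thirds of the variants called in each landrace are also found in the other.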
Validation of the putative structural variants through reciprocal alignments of Corbarino (COR) and Lucariello (LUC) genomes to the assembled and unassembled (BAC sequences) Heinz 1706 reference genome and verification by PCR
| A. Deletions | COR | LUC | Common |
|---|---|---|---|
| Size range, bp | 256–19,538 | 229–18,654 | |
| Sequences including > 87% of Ns | 180 | 173 | |
| Sequences including < 87% of Ns: | | | |
| in genic regions | 12 | 7 | 2 |
| in intergenic regions | 13 | 16 | 3 |
| in highly repetitive regions | 60 | 59 | 15 |
| Total | 265 | 255 | |

| B. Insertions | COR | LUC | Common |
|---|---|---|---|
| Size range, bp | 446–1,012 | 441–1,032 | |
| Unconfirmed insertions | 3 | 3 | 1 |
| Confirmed total length insertions | 6 | 5 | 0 |
| Confirmed fragmented insertions: | | | |
| in genic regions | 4 | 4 | 2 |
| in intergenic regions | 10 | 13 | 6 |
| in highly repetitive regions | 6 | 11 | 3 |
| Total | 29 | 36 | |
The size range of the variants is indicated. A: percentage of unspecified nucleotides (Ns) and genomic context of the COR and LUC deletions. B: number of unconfirmed and confirmed insertions and genomic context of the fragmented insertions.
a Verified also by PCR.
Number and classification of genes annotated in Corbarino (COR) and Lucariello (LUC)
| Annotation | COR | LUC |
|---|---|---|
| Genes | 43,054 | 44,579 |
| Transcripts | 62,369 | 65,047 |
| Protein coding transcripts | 46,065 | 45,815 |
| High confidence lncRNAs | 14,091 | 16,838 |
| Low confidence lncRNAs | 2,213 | 2,394 |
a According to the criteria listed in reference 33.
Figure 2. Barplot showing the changes of Heinz 1706 gene annotations in Corbarino (COR) and Lucariello (LUC). Loci are described as (1) ‘unchanged’ if they are identical between Heinz 1706 and the two ecotypes; (2) ‘merged’ if they originated from the fusion of two or more Heinz 1706 genes; (3) ‘split’ if they derive from a Heinz 1706 gene that was split in two; and (4) ‘updated’ if the locus in COR and/or LUC showed a different exon/UTR structure.
Mapping of the transcripts of each genome against the other two genomes, considering multiple hits
| Mapping | Starting number of transcripts | Number of mapped transcripts | Number of identified loci |
|---|---|---|---|
| COR-Trans_vs_Heinz250 | 62,369 | 62,206 | 77,220 |
| COR-Trans_vs_LUC250 | 62,369 | 62,200 | 77,227 |
| Heinz-Trans_vs_COR250 | 34,725 | 34,484 | 42,337 |
| Heinz-Trans_vs_LUC250 | 34,725 | 34,489 | 42,352 |
| LUC-Trans_vs_COR250 | 65,094 | 64,820 | 79,259 |
| LUC-Trans_vs_Heinz250 | 65,094 | 64,813 | 79,277 |
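To read the mapping table more easily, the three columns can be turned into two ratios per comparison: the fraction of transcripts that mapped and the number of identified loci per mapped transcript. This is an illustrative sketch only; the figures are copied from the table, and `mapping_stats` is a hypothetical helper name, not something described in the source.

```python
# Rows copied from the mapping table: (starting transcripts, mapped transcripts, loci).
mapping = {
    "COR-Trans_vs_Heinz250": (62_369, 62_206, 77_220),
    "COR-Trans_vs_LUC250": (62_369, 62_200, 77_227),
    "Heinz-Trans_vs_COR250": (34_725, 34_484, 42_337),
    "Heinz-Trans_vs_LUC250": (34_725, 34_489, 42_352),
    "LUC-Trans_vs_COR250": (65_094, 64_820, 79_259),
    "LUC-Trans_vs_Heinz250": (65_094, 64_813, 79_277),
}


def mapping_stats(start, mapped, loci):
    """Return the mapped fraction and the loci identified per mapped transcript."""
    return {
        "mapped_fraction": mapped / start,
        "loci_per_mapped_transcript": loci / mapped,
    }


stats = {name: mapping_stats(*row) for name, row in mapping.items()}

for name, s in stats.items():
    print(
        f"{name}: {s['mapped_fraction']:.2%} mapped, "
        f"{s['loci_per_mapped_transcript']:.2f} loci per mapped transcript"
    )
```

In every comparison more than 99% of the transcripts map, and the number of loci exceeds the number of mapped transcripts, which is consistent with multiple hits per transcript being counted.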
Types and number of SNPs with potential disruptive effect on gene function in Corbarino (COR) and Lucariello (LUC)
| Type | Subtype | COR variants | COR genes | LUC variants | LUC genes |
|---|---|---|---|---|---|
| Mutations creating reading frame shifts | Disruptive inframe deletions | 24 | 24 | 18 | 18 |
| | Disruptive inframe insertions | 13 | 12 | 9 | 9 |
| | Frameshift variants | 146 | 140 | 100 | 96 |
| | Inframe deletions | 9 | 9 | 11 | 11 |
| | Inframe insertions | 15 | 14 | 14 | 14 |
| Mutations affecting ORF length | Start codon loss | 2 | 2 | 5 | 5 |
| | Stop codon gain | 29 | 27 | 28 | 27 |
| | Stop codon loss | 1 | 1 | 1 | 1 |
| Mutations affecting protein sequence | Missense variants | 1,749 | 1,186 | 1,477 | 1,055 |
| Mutations affecting splicing sites | Stop retained variants | 1 | 1 | 2 | 2 |
| Complex mutations creating reading frame shifts and affecting ORF length | Frameshift variants and start loss | 13 | 13 | 11 | 11 |
| | Frameshift variants and stop gain | 3 | 3 | 2 | 2 |
| | Frameshift variants and stop loss | 1 | 1 | 1 | 1 |
| Complex mutations affecting splicing sites and ORF length | Missense variants and splice region variants | 1 | 1 | 1 | 1 |
| | Splice region variants and stop retained variants | 2 | 2 | 1 | 1 |
| | Stop loss and splice region variant | 6 | 6 | 6 | 6 |
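As a cross-check, the variant counts in this table can be summed and compared with the approximately 2,000 and 1,700 potentially disruptive variants quoted in the abstract. The sketch below assumes each row is read as (COR variants, COR genes, LUC variants, LUC genes); the helper names `variant_totals` and `share` are hypothetical. The gene columns are deliberately not summed, since a single gene may presumably carry more than one type of variant.

```python
# Rows copied from the table, assumed to read as
# (COR variants, COR genes, LUC variants, LUC genes).
effects = {
    "Disruptive inframe deletions": (24, 24, 18, 18),
    "Disruptive inframe insertions": (13, 12, 9, 9),
    "Frameshift variants": (146, 140, 100, 96),
    "Inframe deletions": (9, 9, 11, 11),
    "Inframe insertions": (15, 14, 14, 14),
    "Start codon loss": (2, 2, 5, 5),
    "Stop codon gain": (29, 27, 28, 27),
    "Stop codon loss": (1, 1, 1, 1),
    "Missense variants": (1749, 1186, 1477, 1055),
    "Stop retained variants": (1, 1, 2, 2),
    "Frameshift variants and start loss": (13, 13, 11, 11),
    "Frameshift variants and stop gain": (3, 3, 2, 2),
    "Frameshift variants and stop loss": (1, 1, 1, 1),
    "Missense variants and splice region variants": (1, 1, 1, 1),
    "Splice region variants and stop retained variants": (2, 2, 1, 1),
    "Stop loss and splice region variant": (6, 6, 6, 6),
}


def variant_totals(table):
    """Sum the variant columns (index 0 for COR, index 2 for LUC)."""
    cor = sum(row[0] for row in table.values())
    luc = sum(row[2] for row in table.values())
    return cor, luc


def share(table, subtype):
    """Fraction of all listed variants that belongs to one subtype, per landrace."""
    cor_total, luc_total = variant_totals(table)
    row = table[subtype]
    return row[0] / cor_total, row[2] / luc_total


cor_total, luc_total = variant_totals(effects)
cor_missense, luc_missense = share(effects, "Missense variants")

print("Total variants (COR, LUC):", cor_total, luc_total)
print(f"Missense share: COR {cor_missense:.1%}, LUC {luc_missense:.1%}")
```

Under this reading, missense variants account for the large majority of the listed variants in both landraces.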
Figure 3. A gene ontology enrichment analysis (GOEA) was performed to identify enriched functions among the LUC and COR polymorphic genes. The barplot shows the most enriched GO categories (molecular function and biological process domains) of the genes affected by missense and nonsense mutations.