Víctor M González, Andrej Benjak, Elizabeth Marie Hénaff, Gisela Mir, Josep M Casacuberta, Jordi Garcia-Mas, Pere Puigdomènech.
Abstract
BACKGROUND: Cucumis melo (melon) belongs to the Cucurbitaceae family, whose economic importance among horticultural crops is second only to Solanaceae. Melon has high intra-specific genetic variation, morphological diversity and a small genome size (454 Mb), which make it suitable for a wide variety of molecular and genetic studies. A number of genetic and genomic resources have already been developed, such as several genetic maps, BAC genomic libraries, a BAC-based physical map and EST collections. Sequence information would be invaluable to complete the picture of the melon genomic landscape, furthering our understanding of this species' evolution from its relatives and providing an important genetic tool. However, little sequence data is available to date: only a few melon genes and genomic regions are deposited in public databases. The development of massively parallel sequencing methods makes it possible to envisage new strategies for obtaining long fragments of genomic sequence faster and at lower cost than with previous Sanger-based methods.
Year: 2010 PMID: 21073723 PMCID: PMC3095328 DOI: 10.1186/1471-2229-10-246
Source DB: PubMed Journal: BMC Plant Biol ISSN: 1471-2229 Impact factor: 4.215
Figure 1. Distribution of the genetic markers used to anchor the sequenced BACs to the C. melo genetic map. Linkage groups are numbered according to the C. melo map of Deleu et al. [24]. Map distances are indicated on the left in cM. Markers in italics have been placed in an approximate position from Oliver et al. [21].
Correspondence between sequenced BAC clones, genetic markers and assembled contigs/scaffolds.*
| Linkage group | Genetic marker (a) | BAC name | Scaffold name | Scaffold length (bp) | Stretches of Ns (number) | Stretches of Ns, length (bp) | BAC-ends (b) | |
|---|---|---|---|---|---|---|---|---|
| I | MC216 | Cm57_M112 | Contig311 | 626 | 0 | 0 | 0 | |
| I | MC279 | Cm31_J021 | Scaffold00087 | 126,619 | 3 | 1,334 | 2 | |
| I | EST1.16 | Cm33_F232 | Scaffold00078 | 113,787 | 11 | 9,016 | 2 | |
| I | EST5.27 | Cm43_O211 | Scaffold00052a | 131,697 | 10 | 3,452 | 2 | |
| II | MC313 | Cm05_B012 | Scaffold0006 | 126,054 | 4 | 4,978 | 2 | |
| II | 52B5SP6 | Cm52_B051 | Scaffold52B05 | 138,922 | 38 | 29,090 | 2 | |
| II | MC252 | Cm46_G131 | Scaffold0009 | 151,031 | 2 | 529 | 2 | |
| III | MC127 | Cm05_P102 | Scaffold05P10 | 114,263 | 7 | 5,100 | 1 | |
| III | MC148 | Cm45_K102 | Scaffold00035 | 105,652 | 2 | 630 | 2 | |
| III | CmEXP2 | Cm24_H211 | Scaffold00033 | 86,310 | 3 | 1,125 | 2 | |
| III | MC054 | Cm52_C091 | Scaffold00024b | 61,053 | 2 | 666 | 1 | |
| III | MC032 | Cm55_F191 | Scaffold55F19 | 110,853 | 11 | 9,792 | 2 | |
| IV | MC344 | Cm33_M052 | Scaffold00077 | 148,622 | 6 | 2,636 | 2 | |
| IV | CmelF4A-2 | Cm59_B111 | Scaffold59B11 | 100,000 | 5 | 5,558 | 2 | |
| IV | MC275 | Cm11_I121 | Scaffold11I12 | 110,000 | 9 | 6,210 | 2 | |
| IV | MC239 | Cm06_A031 | Scaffold00012a | 75,205 | 0 | 0 | 1 | |
| IV | MC060 | Cm46_O062 | Scaffold000413 | 103,741 | 2 | 668 | 2 | |
| IV | CmEthInd | Cm14_C181 | Scaffold0001 | 108,322 | 1 | 478 | 2 | |
| V | MC007 | Cm52_M232 | Scaffold00070 | 112,968 | 2 | 567 | 2 | |
| V | MC233 | Cm24_G051 | Scaffold00017 | 82,645 | 1 | 247 | 2 | |
| | | | Contig00219 | 810 | 0 | 0 | 0 | |
| V | EST2.22 | Cm46_I241 | Scaffold00044 | 12,974 | 4 | 4,832 | 1 | |
| | | | Scaffold00071 | 20,426 | 7 | 9,835 | 1 | |
| V | MRGH63 | | ScaffoldMRGH63 | 302,015 | 9 | 4,457 | | |
| | | Cm13_J041,4 | | | | | 2 | |
| | | Cm14_M221 | | | | | 2 | |
| | | Cm43_H201,2 | | | | | 2 | |
| V | MC276 | Cm01_N31 | Scaffold00015 | 180,444 | 5 | 2,607 | 1 | |
| VI | MC268 | Cm02_C042 | Scaffold00031 | 105,693 | 4 | 1,877 | 2 | |
| VI | MC008 | Cm31_G082 | Scaffold00033 | 109,145 | 3 | 764 | 2 | |
| VI | MC251 | Cm02_K141 | Scaffold00058 | 121,212 | 8 | 2,788 | 2 | |
| VI | CI_56-B01 | Cm27_F031 | Scaffold27F03 | 96,265 | 1 | 506 | 2 | |
| VI | MC042 | Cm20_H141 | Scaffold00018 | 96,294 | 3 | 891 | 2 | |
| VII | MC373 | Cm55_C151 | Scaffold00057 | 98,578 | 1 | 316 | 2 | |
| VII | F271 | Cm45_K011 | Scaffold45K01 | 100,000 | 11 | 7,803 | 2 | |
| VII | F149 | Cm47_C022 | - | - | - | - | - | - |
| VII | EST5.15 | Cm47_A051 | Scaffold0004 | 101,589 | 3 | 1,515 | 2 | |
| VIII | F080 | Cm22_K191 | Scaffold00081 | 99,638 | 6 | 2,137 | 2 | |
| VIII | Cfd9 | Cm06_D161 | Scaffold00025 | 102,876 | 13 | 10,731 | 1 | |
| VIII | MC208 | Cm19_K172 | Scaffold00023 | 125,428 | 1 | 242 | 2 | |
| IX | MC092 | Cm24_H032 | Scaffold24H03 | 106,131 | 3 | 1,282 | 2 | |
| IX | F036 | Cm34_G201 | Scaffold00069 | 125,129 | 2 | 1,825 | 2 | |
| IX | MC203 | Cm54_J042 | Scaffold54J04 | 100,000 | 6 | 13,775 | 2 | |
| IX | CmERF1 | Cm54_I131 | Scaffold54I13 | 94,153 | 3 | 1,036 | 2 | |
| IX | EST1.17 | Cm10_D041 | Scaffold10D04 | 154,039 | 45 | 28,530 | 2 | |
| X | EST1.10 | Cm03_A211 | Scaffold00079 | 126,557 | 2 | 482 | 2 | |
| X | CmXTH5 | Cm41_H091 | Scaffold0005 | 136,275 | 2 | 696 | 2 | |
| X | EST5.29 | Cm19_G012 | Scaffold00013 | 100,283 | 14 | 8,175 | 2 | |
| X | CmEXP3 | Cm54_E012 | Scaffold54E01 | 100,000 | 18 | 13,832 | 2 | |
| XI | MC337 | Cm12_F092 | Scaffold00028 | 118,830 | 3 | 1,840 | 2 | |
| XI | MC375 | Cm03_C121 | Scaffold00014 | 128,906 | 13 | 9,538 | 1 | |
| XI | EST6.79 | Cm59_N091 | Scaffold00085 | 102,799 | 1 | 338 | 1 | |
| XI | A_08-D10 | Cm24_I032 | Scaffold24I03 | 121,276 | 16 | 8,734 | 1 | |
| XI | EST2.75 | Cm33_O171 | Scaffold00051 | 123,309 | 3 | 1,296 | 2 | |
| XII | MC123 | Cm59_C102 | Scaffold59C10 | 10,343 | 3 | 1,861 | 0 | |
| XII | MC132 | Cm03_I021 | Scaffold00086 | 79,495 | 4 | 3,249 | 1 | |
| XII | MC330 | Cm09_A171 | Scaffold00034 | 96,336 | 1 | 271 | 2 | |
| XII | MC286 | Cm05_O102 | Scaffold00020 | 142,670 | 8 | 2,708 | 2 | |
| - | - | Cm21_I082 | Scaffold00061 | 146,020 | 12 | 5,169 | 2 | |
| - | - | Cm12_I232 | Scaffold00010 | 114,336 | 9 | 8,517 | 2 | |
* Additional information regarding sequence and annotation characteristics of the assembled sequence can be found in Additional file 3, Table S2.
a Genetic marker information can be found in Additional file 2, Table S1.
b One (1), both (2) or none (0) of the BAC-ends found on the scaffold/contig sequence.
1 First pool of BACs.
2 Second pool of BACs.
3 Marker sequence not found. Scaffold assignment based on information derived from the C. melo physical map http://melonomics.upv.es/static/files/public/physical_map/ and BAC-end information.
4 Sequenced previously by Shotgun-Sanger [35], Acc. No. EF657230.
Details of the 454 FLX runs from which sequence data were obtained.
| Pool | Sequencing | Library | No. of reads | No. of | Total length | Average |
|---|---|---|---|---|---|---|
| | | Shotgun | 445,232 | - | 110,498,601 | 248 |
| | 2/4 | Paired end | 89,392 | 3,152 | 23,214,413 | 260 |
| | 2/2 | Paired end | 557,452 | 126,681 | 139,772,537 | 251 |
| | 2/2 | Shotgun | 261,304 | - | 64,679,158 | 247 |
| | 3/8 | Paired end | 155,166 | 56,990 | 40,110,640 | 259 |
a Includes 8,046 reads obtained from the titration process of the samples as well as 20,627 reads from a 1/2 region that was poorly sequenced.
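The read counts, total lengths and average lengths in the table above are linked by simple arithmetic: the average read length should be close to the total length divided by the number of reads. The following is a minimal Python sketch of that check, using the five rows as transcribed here. The names `RUNS`, `average_read_length` and `check_runs`, the row labels, and the 1 bp tolerance are illustrative assumptions, not part of the original study.

```python
# Rows transcribed from the table above:
# (label, number of reads, total length in bp, reported average length in bp).
# The labels are informal descriptions, not identifiers from the paper.
RUNS = [
    ("shotgun (first row)", 445_232, 110_498_601, 248),
    ("2/4 paired end", 89_392, 23_214_413, 260),
    ("2/2 paired end", 557_452, 139_772_537, 251),
    ("2/2 shotgun", 261_304, 64_679_158, 247),
    ("3/8 paired end", 155_166, 40_110_640, 259),
]


def average_read_length(total_bp, n_reads):
    """Mean read length in bp, as total bases divided by read count."""
    return total_bp / n_reads


def check_runs(runs, tolerance=1.0):
    """Return (label, computed, reported, ok) for each run.

    `ok` is True when the computed mean differs from the reported
    value by at most `tolerance` bp (an assumed, arbitrary threshold).
    """
    results = []
    for label, n_reads, total_bp, reported in runs:
        computed = average_read_length(total_bp, n_reads)
        ok = abs(computed - reported) <= tolerance
        results.append((label, computed, reported, ok))
    return results


if __name__ == "__main__":
    for label, computed, reported, ok in check_runs(RUNS):
        print(f"{label:22s} computed={computed:7.2f} reported={reported} ok={ok}")
```

The comparison uses a tolerance and not exact equality because the reported averages do not all match simple rounding of the ratio, so the published figures may have been computed or rounded differently.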
Metrics for BAC assemblies and final results after manual correction.*
| | 35 BACs | 23 BACs | 57 BACs | Manual correction |
|---|---|---|---|---|
| | 514 | 247 | 797 | - |
| | 3,936,343 | 2,325,066 | 6,127,262 | - |
| | 7,658 | 9,413 | 7,687 | - |
| | 32,583 | 32,458 | 30,630 | - |
| | 117,242 | 112,451 | 123,360 | - |
| | 99.5% | 99.5% | 99.5% | - |
| | 58 | 32 | 87 | 73 |
| | 41 | 25 | 62 | 57 |
| | 4,040,161 | 2,307,575 | 6,206,490 | 6,340,685 |
| | 69,657 | 72,111 | 71,338 | 86,882 |
| | 107,196 | 113,599 | 107,604 | 113,787 |
| | 222,620 | 200,453 | 212,424 | 303,725b |
| | 479 | 234 | 798 | 744 |
| | 224,871 | 121,734 | 417,982 | 382,726 |
| | 469 | 520 | 524 | 514 |
| | x46 | x25 | x39 | x39 |
* Reads from all 57 BACs were processed together in one assembly run. Additional assemblies of each BAC pool were done independently and served for comparison purposes and to manually correct some scaffolds in the global assembly.
a Only contigs larger than 500 bp.
b Two previously published BACs were included in this scaffold (see Methods section and Additional file 1, Figure S1).
c Contigs larger than 100 bp.
Differences between Sanger- and 454-sequences of BAC Cm13_J04.
| Length of Sanger-sequence | 98,716 bp | |||
|---|---|---|---|---|
| Stretches of Ns on 454-sequence | 5 | 3,572 bp (3.6%) | ||
| ≤10 | 847 | 0 | ||
| 11 | 5 | 0 | ||
| 12 | 5 | 3 | (A/T)11 | |
| 13 | 3 | 2 | (A/T)12 | |
| 14 | 2 | 1 | (A/T)13 | |
| 15 | 3 | 2 | (A/T)14 | |
| 1 | (A/T)13 | |||
| 16 | 1 | 1 | (A/T)14 | |
| 17 | 3 | 2 | (A/T)15 | |
| 18 | 1 | 1 | (A/T)17 | |
| 22 | 1 | 1 | (A/T)19 | |
| 24 | 1 | 0 | ||
| 28 | 1 | 1 | A15CA13 | |
| 5-7 | 49 | 0 | ||
| (CT)15 | (CT)15CTACTTACTTACTTACNNNNNNNC(CT)14 | |||
| (GA)21 | (GA)21GTAGTACGTACN23(GA)6 | |||
1 Number of homopolymers in the Sanger sequence.
2 Number of homopolymers in the 454 sequence showing differences with the corresponding homopolymers in the Sanger sequence.
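The table above tallies homopolymer runs by length. As an illustration of how such a tally could be produced from a sequence, the following Python sketch groups consecutive identical bases and counts the runs of each length. It is not the authors' procedure: the function name `homopolymer_runs`, its parameters and the toy sequence are assumptions made for the example.

```python
from collections import Counter
from itertools import groupby


def homopolymer_runs(seq, bases="AT", min_len=5):
    """Count homopolymer runs by length.

    Returns a Counter mapping run length -> number of runs, considering
    only runs of the given bases that are at least `min_len` long.
    """
    counts = Counter()
    for base, group in groupby(seq.upper()):
        run_length = sum(1 for _ in group)
        if base in bases and run_length >= min_len:
            counts[run_length] += 1
    return counts


if __name__ == "__main__":
    # Invented toy sequence: one T run of 5, one A run of 12, one T run of 12.
    toy = "G" + "A" + "T" * 5 + "C" + "A" * 12 + "GC" + "T" * 12 + "C"
    for length, n in sorted(homopolymer_runs(toy).items()):
        print(f"length {length}: {n} run(s)")
```

Running the same tally on two versions of a sequence, such as a Sanger and a 454 assembly of the same BAC, gives per-length counts that can then be compared. A locus-by-locus comparison like the one in the table would additionally require aligning the two sequences.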
Characteristics of the C. melo BAC sequences.a
| 6,230,040 bp | |
| 5,958,994 bp | |
| 616 | |
| 451 (73.2%) | |
| 11 | |
| 9.9 genes/100 kb (1.5 - 19.7, SD: 4.3) | |
| 238 bp | |
| 393 bp | |
| 4.9 (1-29, SD: 4.4) (74% of genes ≤ 6 exons) | |
| 386 (34-2,156, SD: 268) | |
| 45.6 (4.3 - 95.5, SD: 20.6) | |
| 33 (30.2 - 38.7, SD: 1.34) | |
| 4,430 (74,590 bp, 1.25% of total sequence) | |
| 139 |
a From the analysis of all 57 scaffolds plus one contig longer than 20 kb.
b Genes from transposons not counted.
c Partial genes at BAC borders counted as 0.5 genes.
d Transposon proteins not considered.
e ORFs without introns not considered.
f Minimum repeat lengths considered: 10 bp (mononuc.), 12 bp (di- and trinuc.), 16 bp (tetranuc.), 20 bp (pentanuc.) and 24 bp (hexanuc.).
g See Table 6 for a more detailed analysis of transposon content.
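Footnote f sets a minimum total repeat length for each motif size. The following Python sketch shows one way such thresholds could be applied using regular expressions with back-references. It is not the tool used in the study: the names `MIN_LENGTH`, `is_primitive` and `find_ssrs` are hypothetical, the toy sequence is invented, and only perfect (uninterrupted) repeats are detected.

```python
import re

# Minimum total repeat length (bp) per motif size, taken from footnote f.
MIN_LENGTH = {1: 10, 2: 12, 3: 12, 4: 16, 5: 20, 6: 24}


def is_primitive(unit):
    """True if `unit` is not itself a repetition of a shorter motif."""
    for k in range(1, len(unit)):
        if len(unit) % k == 0 and unit[:k] * (len(unit) // k) == unit:
            return False
    return True


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect repeats."""
    seq = seq.upper()
    hits = []
    for k, min_len in MIN_LENGTH.items():
        min_repeats = -(-min_len // k)  # ceiling division
        # One captured motif followed by at least (min_repeats - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. "CTCT", which is a dinucleotide repeat in disguise.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


if __name__ == "__main__":
    # Invented toy sequence with an A10 run, a (CT)15 and a (GA)21 repeat.
    toy = "GG" + "A" * 10 + "CC" + "CT" * 15 + "TT" + "GA" * 21 + "CC"
    for start, motif, repeats in find_ssrs(toy):
        print(f"start={start} motif={motif} repeats={repeats}")
```

The primitive-motif filter keeps a dinucleotide repeat from being reported again as a tetra- or hexanucleotide repeat. Because matches are found left to right without overlap, this simple scan can miss repeats in unusual adjacent configurations.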
Transposon content in the C. melo sequenced BACs.a
| Family | Copies | Total length (bp) | Analyzed sequence (%) |
|---|---|---|---|
| | 15 | 30,238 | 0.48 |
| | 4 | 8,726 | 0.14 |
| | 6 | 17,836 | 0.28 |
| | 1 | 842 | 0.01 |
| | 1 | 830 | 0.01 |
| | 15 | 49,606 | 0.79 |
| | 18 | 80,452 | 1.28 |
| | 3 | 9,664 | 0.15 |
| | 77 | 313,326 | 5.0 |
a From the analysis of all contigs and scaffolds longer than 2 kb.
Figure 2. Overview of microsynteny between four melon scaffolds and four regions in the . Genes are represented by square blocks. Homologous genes are illustrated with the same colour and indicated by connecting lines of the corresponding colour. Ab initio predicted genes with no homology to public EST or protein databases are shown in black. Transposable elements are in gray, with red asterisks as an additional mark for retrotransposons. Genes coding for NBS-LRR disease resistance proteins are represented by square blocks filled with blue vertical lines. Putative pseudogenes are depicted as black-edged boxes. The annotation of C. melo scaffoldMRGH63 and scaffold00077 was complemented using information from ca. 57 kb and 96 kb, respectively, of unpublished melon sequence (represented in the figure as red-edged boxes). Figure drawn to scale.