
The genome of Apis mellifera: dialog between linkage mapping and sequence assembly.

Michel Solignac, Lan Zhang, Florence Mougel, Bingshan Li, Dominique Vautrin, Monique Monnerot, Jean-Marie Cornuet, Kim C Worley, George M Weinstock, Richard A Gibbs.   

Abstract

Two independent genome projects for the honey bee, a microsatellite linkage map and a genome sequence assembly, interactively produced an almost complete organization of the euchromatic genome. Assembly 4.0 now includes 626 scaffolds that were ordered and oriented into chromosomes according to the framework provided by the third-generation linkage map (AmelMap3). Each construct was used to control the quality of the other. The co-linearity of markers in the sequence and the map is almost perfect and argues in favor of the high quality of both.

Year:  2007        PMID: 17381825      PMCID: PMC1868943          DOI: 10.1186/gb-2007-8-3-403

Source DB:  PubMed          Journal:  Genome Biol        ISSN: 1474-7596            Impact factor:   13.583


Most eukaryotic genome sequencing projects are preceded by the construction of physical, genetic and/or cytological maps. For the honey bee genome project there was no physical map, and because of the low resolution of the cytogenetic map, the meiotic map was the only resource for organizing the sequence assembly on the chromosomes. The first-generation map, AmelMap1, comprised 541 markers on 24 linkage groups for 16 chromosomes [1,2]. Saturation was achieved by the addition of 601 markers prepared from cDNA [3] and bacterial artificial chromosome (BAC) [4] sequences. The resulting map, AmelMap2, was not published, but was used by the Human Genome Sequencing Center at Baylor College of Medicine for the first assembly of the Apis mellifera genome in January 2004. From that time a dialog was set up between the map and sequence projects that became interactive, each taking advantage of the progress of the other. The density of the third-generation map, AmelMap3, was doubled and contributed greatly to the ultimate assembly (version 4.0, March 2006) of the honey bee genome [5].

AmelMap3 comprises 2,008 microsatellite markers (see Additional data file 1) and is 4,000 cM long (M.S., F.M., D.V., M.M. and J.-M.C., unpublished work). Improvements in the map between the second and third generations resulted exclusively from the addition of markers designed from the sequence. Of these, 587 were placed in previously mapped scaffolds of assemblies 1.1 and 2.0 to reduce long genetic distances, orient scaffolds and homogenize the marker density along and among chromosomes; a further 436 were placed in 379 large unplaced scaffolds (GroupUn), which efficiently increased the fraction of the sequence integrated into chromosomes in the later assemblies (Tables 1 and 2). Chromosomes were oriented by half-tetrad analysis [6]. This orientation was later confirmed by positioning telomeric regions [7] and by cytogenetic analysis [5].
Table 1

Improvements between assembly versions 1.1 (January 2004) and 4.0 (March 2006)

| | AmelMap2 / assembly 1.1 | AmelMap3 / assembly 4.0 |
|---|---|---|
| Number of markers | 1,050* | 2,013† |
| Total mapped sequence | 110 Mb (53%) | 186 Mb (79%) |
| Total unmapped sequence (GroupUn) | 96 Mb (47%) | 49 Mb (21%) |
| Total scaffold length | 206 Mb | 235 Mb |

Although the size of the assembled genome increased by 29 Mb (12% of the version 4.0 genome) as a result of additional sequencing reads and better assembly, a total of 76 Mb of sequence (32% of the genome) was mapped to chromosomes with longer scaffolds and additional markers in AmelMap3 compared with AmelMap2. *The number of markers used for the assembly differs from that given in the text (1,142). Markers without accession numbers (92) were omitted. †After the freeze of assembly 4.0, some markers were added and others removed from AmelMap3, which now comprises 2,008 markers.
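The figures in the Table 1 notes follow directly from the table entries and the marker counts given in the text. As a rough check, the short sketch below recomputes them; the variable names are illustrative only, and percentages are rounded to whole numbers as in the table.

```python
# Lengths in Mb, copied from Table 1.
mapped = {"1.1": 110, "4.0": 186}
total = {"1.1": 206, "4.0": 235}

growth = total["4.0"] - total["1.1"]          # extra assembled sequence
newly_mapped = mapped["4.0"] - mapped["1.1"]  # extra sequence placed on chromosomes

# Both shares are expressed relative to the version 4.0 assembly.
growth_pct = round(100 * growth / total["4.0"])
newly_mapped_pct = round(100 * newly_mapped / total["4.0"])

# Marker counts from the text: AmelMap1 plus the cDNA- and BAC-derived markers,
# then minus the markers without accession numbers.
amelmap2_markers = 541 + 601
markers_used = amelmap2_markers - 92

print(growth, growth_pct, newly_mapped, newly_mapped_pct)
print(amelmap2_markers, markers_used)
```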

Table 2

Number of consistently mapped scaffolds

| | Assembly 3.0 | Assembly 4.0 |
|---|---|---|
| Total number of scaffolds | 9,863 | 9,868 |
| Consistently mapped scaffolds | 431 | 626 |
| Number of scaffolds broken | 2 | 2 |
| Number of scaffolds with inconsistency ignored | 7 | 2 |

The increase in the number of mapped scaffolds (195) between versions 3.0 and 4.0 of the genome assembly is less than the total number of unplaced scaffolds (379) in version 3.0 that were mapped in version 4.0, because many scaffolds were merged into previously mapped scaffolds or combined with other previously unmapped scaffolds.

Great care was taken to eradicate errors in the final versions (AmelMap3, assembly 4.0). For single markers with uncertain chromosomal positions, new markers were designed; in three cases the scaffold moved, and in two cases the marker did not amplify the expected product. In three cases, two blocks of markers on the same scaffold mapped to two different positions; adding markers narrowed down the chimeric region at which the assembly had to be split. Most of the remaining discrepancies were local marker misorderings, eliminated by correcting genotyping errors detected as double crossovers. A few trivial differences persist between the latest versions of the map and the assembly. Sixteen small scaffolds were reversed and the order of eight groups of short scaffolds will also be revisited. These differences are attributable to the fact that the last map improvements occurred after the freeze of the version 4.0 assembly.
Four unresolved discrepancies remain: the map positions of two short scaffolds (1.43 and 3.37), the orientation of a long scaffold (10.30) and remnants, in a false position, of the break of scaffold 6.37. This generally excellent co-linearity argues in favor of the quality of the two constructions. If some mistakes remain within scaffolds, they should be below the level of resolution of the map (average 93 kb).

This agreement could seem to be a circular argument, as the map is the framework of the assembly. This is not the case: the genetic map and the sequence scaffolds were constructed independently. The maps were calculated with a version of the software CarthaGène [8] that does not use physical information, and the assembly did not use the map to construct the scaffolds but only to organize them. The eradication of errors in the map, even if it used the sequence to detect them and to help resolve them, was based on genetic methods (controls or addition of genotypes).

To evaluate the final correctness, the scaffolds that contained at least three markers with two non-null genetic distances were selected. The number of markers flanking non-null distances was 1,319 (that is, two-thirds of the total) and they showed only four local and unresolved mistakes (0.3%). In addition, the 387 markers that are at a null genetic distance within scaffolds are always clustered in the sequence. This accurate co-linearity within scaffolds may be considered indicative of that between scaffolds, which cannot be tested in this way.

In the mouse, a very detailed genetic map existed before the sequence of the genome, but of the 12,000 markers, only 2,605 were considered as 'unambiguously' mapped and were used to assess the accuracy of the assembly [9]; most of the conflicts (1.8% of chromosomal misassignment and 0.7% of local misordering) were attributable to mapping errors.
For the rat genome, the radiation hybrid map was consistent for 98% of markers with the genetic maps and for 96% with the genome sequence [10]. Among the 626 honey bee scaffolds, 320, representing a physical length of 152 Mb, are oriented (Table 3); the other half were too short to be oriented genetically; they represent only 18.4% of the physical length. Among them, 113 scaffolds forming 44 blocks are not ordered relative to one another (due to null genetic distances). The unoriented scaffolds are nevertheless placed on chromosomes, but their orientation is random.
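The co-linearity figures quoted above can be re-derived from the counts given in the text. The sketch below does so; note that reading the 93 kb average resolution as "mapped sequence length divided by the number of AmelMap3 markers" is an assumption made here for illustration, not something the text states.

```python
flanking_markers = 1319   # markers flanking non-null genetic distances
unresolved = 4            # local, unresolved discrepancies
total_markers = 2008      # markers in AmelMap3

# Share of tested markers involved in an unresolved discrepancy.
error_rate_pct = 100 * unresolved / flanking_markers

# Fraction of all markers that could be tested this way ("two-thirds").
flanking_fraction = flanking_markers / total_markers

# Assumption: average resolution ~ mapped sequence / number of markers.
mapped_bp = 186_694_876   # total mapped length from Table 3
mean_spacing_kb = mapped_bp / total_markers / 1000

print(f"{error_rate_pct:.1f}% unresolved, "
      f"{flanking_fraction:.2f} of markers tested, "
      f"~{mean_spacing_kb:.0f} kb between markers")
```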
Table 3

Total number of scaffolds mapped in the honey bee genome and corresponding physical length of each of the 16 chromosomes

| Linkage group | Unoriented scaffolds | Unordered scaffolds (blocks) | Total scaffolds | Unoriented length (bp) | Oriented length (bp) | Total length (bp) |
|---|---|---|---|---|---|---|
| 1 | 37 | 11 (4) | 83 | 4,324,756 | 21,509,334 | 25,834,090 |
| 2 | 21 | 4 (2) | 43 | 2,072,401 | 11,899,776 | 13,972,177 |
| 3 | 15 | 8 (3) | 39 | 1,707,550 | 10,013,970 | 11,721,520 |
| 4 | 13 | 2 (1) | 27 | 1,741,230 | 9,215,460 | 10,956,690 |
| 5 | 13 | 2 (1) | 33 | 1,898,448 | 11,002,244 | 12,900,692 |
| 6 | 30 | 4 (2) | 55 | 3,630,628 | 11,408,455 | 15,039,083 |
| 7 | 28 | 15 (6) | 47 | 3,141,542 | 7,407,431 | 10,548,973 |
| 8 | 26 | 16 (7) | 47 | 2,825,708 | 8,063,515 | 10,889,223 |
| 9 | 11 | 2 (1) | 26 | 1,566,427 | 8,266,480 | 9,832,907 |
| 10 | 22 | 7 (3) | 45 | 2,686,951 | 7,755,626 | 10,442,577 |
| 11 | 22 | 7 (3) | 42 | 3,091,854 | 9,380,123 | 12,471,977 |
| 12 | 16 | 4 (1) | 30 | 1,527,861 | 8,331,149 | 9,859,010 |
| 13 | 4 | 0 | 21 | 399,867 | 8,866,870 | 9,266,737 |
| 14 | 10 | 3 (1) | 25 | 990,212 | 7,786,449 | 8,776,661 |
| 15 | 28 | 22 (6) | 42 | 2,097,987 | 6,011,700 | 8,109,687 |
| 16 | 10 | 6 (3) | 21 | 745,354 | 5,327,518 | 6,072,872 |
| Total | 306 | 113 (44) | 626 | 34,448,776 | 152,246,100 | 186,694,876 |
| Percentage of total length | | | | 18.4% | 81.6% | |

Unordered scaffolds are a subset of unoriented scaffolds (number of blocks of unordered scaffolds in brackets).
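Because Table 3 carries many long numbers, a quick internal consistency check is useful. The sketch below, with values transcribed from the table and illustrative names, confirms that each chromosome's unoriented and oriented lengths add up to its total, recomputes the column totals, and derives the number of oriented scaffolds and the scaffold count for chromosomes 12-16. The computed unoriented share comes out close to the 18.4% given in the table.

```python
# group -> (unoriented, unordered, total scaffolds,
#           unoriented bp, oriented bp, total bp)
table3 = {
    1: (37, 11, 83, 4_324_756, 21_509_334, 25_834_090),
    2: (21, 4, 43, 2_072_401, 11_899_776, 13_972_177),
    3: (15, 8, 39, 1_707_550, 10_013_970, 11_721_520),
    4: (13, 2, 27, 1_741_230, 9_215_460, 10_956_690),
    5: (13, 2, 33, 1_898_448, 11_002_244, 12_900_692),
    6: (30, 4, 55, 3_630_628, 11_408_455, 15_039_083),
    7: (28, 15, 47, 3_141_542, 7_407_431, 10_548_973),
    8: (26, 16, 47, 2_825_708, 8_063_515, 10_889_223),
    9: (11, 2, 26, 1_566_427, 8_266_480, 9_832_907),
    10: (22, 7, 45, 2_686_951, 7_755_626, 10_442_577),
    11: (22, 7, 42, 3_091_854, 9_380_123, 12_471_977),
    12: (16, 4, 30, 1_527_861, 8_331_149, 9_859_010),
    13: (4, 0, 21, 399_867, 8_866_870, 9_266_737),
    14: (10, 3, 25, 990_212, 7_786_449, 8_776_661),
    15: (28, 22, 42, 2_097_987, 6_011_700, 8_109_687),
    16: (10, 6, 21, 745_354, 5_327_518, 6_072_872),
}

# Per chromosome: unoriented + oriented length should equal the total length.
row_sums_ok = all(u_bp + o_bp == t_bp
                  for (_, _, _, u_bp, o_bp, t_bp) in table3.values())

# Column totals, for comparison with the "Total" row.
totals = [sum(col) for col in zip(*table3.values())]
unoriented_n, unordered_n, scaffolds_n, unoriented_bp, oriented_bp, total_bp = totals

oriented_n = scaffolds_n - unoriented_n
unoriented_pct = 100 * unoriented_bp / total_bp
small_chrom_scaffolds = sum(table3[g][2] for g in range(12, 17))

print(row_sums_ok, totals)
print(oriented_n, f"{unoriented_pct:.2f}%", small_chrom_scaffolds)
```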

Missing sequences in the gaps are probably very short, as suggested by short interscaffold genetic distances. Manual superscaffolding of the five smallest chromosomes (12-16) [11], mainly achieved through relaxing matching criteria, conserved the general structure of the map, included 178 GroupUn scaffolds in the gaps and reduced the 139 scaffolds to 25 superscaffolds by the addition of only 5.5% of the sequence length. For all chromosome arms, the telomeric regions are reached and the centromeric regions are close to being so [5,7]. Consequently, most of the euchromatic sequence of the chromosome arms is now organized and perhaps only 5% is not included in the assembly.

It may be asked whether a genetic map alone provides sufficient information to organize an assembly. The large genetic length of the honey bee genome (about 4,000 cM) compared to its relatively small physical size (about 230 Mb) was assuredly a great advantage, because it suffices to genotype small families to observe recombination between markers at a short physical distance. The same resolution in organisms with shorter maps (that is, most organisms, if not all [12]) would require a larger genotyping effort in terms of the number of individuals, but it might be limited to a few markers within the largest scaffolds to get a reasonable picture of the genome organization.
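The advantage of a long genetic map can be made concrete with a back-of-the-envelope calculation, taking the genetic length as about 4,000 cM and the physical size as about 230 Mb. The sketch below is illustrative only: the family size of 100 meioses, the 0.093 Mb interval and the 1 cM/Mb comparison genome are hypothetical values, and centimorgans are treated as percent recombination, which is only a reasonable approximation over short intervals.

```python
genetic_length_cM = 4000   # approximate length of the honey bee map
physical_size_Mb = 230     # approximate physical size of the genome

# Average recombination density across the genome.
rate_cM_per_Mb = genetic_length_cM / physical_size_Mb


def expected_recombinants(cM_per_Mb, interval_Mb, n_meioses):
    """Expected number of recombinant individuals between two markers.

    Treats 1 cM as 1% recombination, a short-interval approximation.
    """
    return n_meioses * (cM_per_Mb * interval_Mb) / 100


# Hypothetical scenario: two markers 0.093 Mb apart, 100 meioses genotyped.
bee = expected_recombinants(rate_cM_per_Mb, 0.093, 100)
# Hypothetical comparison genome recombining at 1 cM/Mb.
other = expected_recombinants(1.0, 0.093, 100)

print(f"{rate_cM_per_Mb:.1f} cM/Mb; expected recombinants: "
      f"{bee:.2f} (honey bee) vs {other:.3f} (1 cM/Mb genome)")
```

Under these assumptions, a genome recombining at 1 cM/Mb would need many times more genotyped individuals to observe the same number of recombinants over the same interval.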

Additional data files

Additional data file 1, a list of the primers used for mapping.

References

1.  Initial sequencing and comparative analysis of the mouse genome.

Authors:  Robert H Waterston; Kerstin Lindblad-Toh; Ewan Birney; Jane Rogers; Josep F Abril; Pankaj Agarwal; Richa Agarwala; Rachel Ainscough; Marina Alexandersson; Peter An; Stylianos E Antonarakis; John Attwood; Robert Baertsch; Jonathon Bailey; Karen Barlow; Stephan Beck; Eric Berry; Bruce Birren; Toby Bloom; Peer Bork; Marc Botcherby; Nicolas Bray; Michael R Brent; Daniel G Brown; Stephen D Brown; Carol Bult; John Burton; Jonathan Butler; Robert D Campbell; Piero Carninci; Simon Cawley; Francesca Chiaromonte; Asif T Chinwalla; Deanna M Church; Michele Clamp; Christopher Clee; Francis S Collins; Lisa L Cook; Richard R Copley; Alan Coulson; Olivier Couronne; James Cuff; Val Curwen; Tim Cutts; Mark Daly; Robert David; Joy Davies; Kimberly D Delehaunty; Justin Deri; Emmanouil T Dermitzakis; Colin Dewey; Nicholas J Dickens; Mark Diekhans; Sheila Dodge; Inna Dubchak; Diane M Dunn; Sean R Eddy; Laura Elnitski; Richard D Emes; Pallavi Eswara; Eduardo Eyras; Adam Felsenfeld; Ginger A Fewell; Paul Flicek; Karen Foley; Wayne N Frankel; Lucinda A Fulton; Robert S Fulton; Terrence S Furey; Diane Gage; Richard A Gibbs; Gustavo Glusman; Sante Gnerre; Nick Goldman; Leo Goodstadt; Darren Grafham; Tina A Graves; Eric D Green; Simon Gregory; Roderic Guigó; Mark Guyer; Ross C Hardison; David Haussler; Yoshihide Hayashizaki; LaDeana W Hillier; Angela Hinrichs; Wratko Hlavina; Timothy Holzer; Fan Hsu; Axin Hua; Tim Hubbard; Adrienne Hunt; Ian Jackson; David B Jaffe; L Steven Johnson; Matthew Jones; Thomas A Jones; Ann Joy; Michael Kamal; Elinor K Karlsson; Donna Karolchik; Arkadiusz Kasprzyk; Jun Kawai; Evan Keibler; Cristyn Kells; W James Kent; Andrew Kirby; Diana L Kolbe; Ian Korf; Raju S Kucherlapati; Edward J Kulbokas; David Kulp; Tom Landers; J P Leger; Steven Leonard; Ivica Letunic; Rosie Levine; Jia Li; Ming Li; Christine Lloyd; Susan Lucas; Bin Ma; Donna R Maglott; Elaine R Mardis; Lucy Matthews; Evan Mauceli; John H Mayer; Megan McCarthy; W Richard McCombie; Stuart McLaren; 
Kirsten McLay; John D McPherson; Jim Meldrim; Beverley Meredith; Jill P Mesirov; Webb Miller; Tracie L Miner; Emmanuel Mongin; Kate T Montgomery; Michael Morgan; Richard Mott; James C Mullikin; Donna M Muzny; William E Nash; Joanne O Nelson; Michael N Nhan; Robert Nicol; Zemin Ning; Chad Nusbaum; Michael J O'Connor; Yasushi Okazaki; Karen Oliver; Emma Overton-Larty; Lior Pachter; Genís Parra; Kymberlie H Pepin; Jane Peterson; Pavel Pevzner; Robert Plumb; Craig S Pohl; Alex Poliakov; Tracy C Ponce; Chris P Ponting; Simon Potter; Michael Quail; Alexandre Reymond; Bruce A Roe; Krishna M Roskin; Edward M Rubin; Alistair G Rust; Ralph Santos; Victor Sapojnikov; Brian Schultz; Jörg Schultz; Matthias S Schwartz; Scott Schwartz; Carol Scott; Steven Seaman; Steve Searle; Ted Sharpe; Andrew Sheridan; Ratna Shownkeen; Sarah Sims; Jonathan B Singer; Guy Slater; Arian Smit; Douglas R Smith; Brian Spencer; Arne Stabenau; Nicole Stange-Thomann; Charles Sugnet; Mikita Suyama; Glenn Tesler; Johanna Thompson; David Torrents; Evanne Trevaskis; John Tromp; Catherine Ucla; Abel Ureta-Vidal; Jade P Vinson; Andrew C Von Niederhausern; Claire M Wade; Melanie Wall; Ryan J Weber; Robert B Weiss; Michael C Wendl; Anthony P West; Kris Wetterstrand; Raymond Wheeler; Simon Whelan; Jamey Wierzbowski; David Willey; Sophie Williams; Richard K Wilson; Eitan Winter; Kim C Worley; Dudley Wyman; Shan Yang; Shiaw-Pyng Yang; Evgeny M Zdobnov; Michael C Zody; Eric S Lander
Journal:  Nature       Date:  2002-12-05       Impact factor: 49.962

2.  Exceptionally high levels of recombination across the honey bee genome.

Authors:  Martin Beye; Irene Gattermeier; Martin Hasselmann; Tanja Gempe; Morten Schioett; John F Baines; David Schlipalius; Florence Mougel; Christine Emore; Olav Rueppell; Anu Sirviö; Ernesto Guzmán-Novoa; Greg Hunt; Michel Solignac; Robert E Page
Journal:  Genome Res       Date:  2006-10-25       Impact factor: 9.043

3.  CARTHAGENE: constructing and joining maximum likelihood genetic maps.

Authors:  T Schiex; C Gaspin
Journal:  Proc Int Conf Intell Syst Mol Biol       Date:  1997

4.  Canonical TTAGG-repeat telomeres and telomerase in the honey bee, Apis mellifera.

Authors:  Hugh M Robertson; Karl H J Gordon
Journal:  Genome Res       Date:  2006-10-25       Impact factor: 9.043

5.  Annotated expressed sequence tags and cDNA microarrays for studies of brain and behavior in the honey bee.

Authors:  Charles W Whitfield; Mark R Band; Maria F Bonaldo; Charu G Kumar; Lei Liu; Jose R Pardinas; Hugh M Robertson; M Bento Soares; Gene E Robinson
Journal:  Genome Res       Date:  2002-04       Impact factor: 9.043

6.  New genomic resources for the honey bee (Apis mellifera L.): development of a deep-coverage BAC library and a preliminary STC database.

Authors:  J P Tomkins; M Luo; G C Fang; D Main; J L Goicoechea; M Atkins; D A Frisch; R E Page; E Guzmán-Novoa; Y Yu; G Hunt; R A Wing
Journal:  Genet Mol Res       Date:  2002-12-31

7.  High-density rat radiation hybrid maps containing over 24,000 SSLPs, genes, and ESTs provide a direct link to the rat genome sequence.

Authors:  Anne E Kwitek; Jo Gullings-Handley; Jiaming Yu; Danilo C Carlos; Kimberly Orlebeke; Jeff Nie; Jeffrey Eckert; Angela Lemke; Jaime Wendt Andrae; Susan Bromberg; Dean Pasko; Dan Chen; Todd E Scheetz; Thomas L Casavant; M Bento Soares; Val C Sheffield; Peter J Tonellato; Howard J Jacob
Journal:  Genome Res       Date:  2004-04       Impact factor: 9.043

8.  Manual superscaffolding of honey bee (Apis mellifera) chromosomes 12-16: implications for the draft genome assembly version 4, gene annotation, and chromosome structure.

Authors:  Hugh M Robertson; Justin T Reese; Natalia V Milshina; Richa Agarwala; Michel Solignac; Kimberly K O Walden; Christine G Elsik
Journal:  Insect Mol Biol       Date:  2007-05-16       Impact factor: 3.585

9.  Whole-genome scan in thelytokous-laying workers of the Cape honeybee (Apis mellifera capensis): central fusion, reduced recombination rates and centromere mapping using half-tetrad analysis.

Authors:  Emmanuelle Baudry; Per Kryger; Mike Allsopp; Nikolaus Koeniger; Dominique Vautrin; Florence Mougel; Jean-Marie Cornuet; Michel Solignac
Journal:  Genetics       Date:  2004-05       Impact factor: 4.562

10.  Insights into social insects from the genome of the honeybee Apis mellifera.

Authors:  Honeybee Genome Sequencing Consortium
Journal:  Nature       Date:  2006-10-26       Impact factor: 49.962
