
The physical and genetic framework of the maize B73 genome.

Fusheng Wei, Jianwei Zhang, Shiguo Zhou, Ruifeng He, Mary Schaeffer, Kristi Collura, David Kudrna, Ben P Faga, Marina Wissotski, Wolfgang Golser, Susan M Rock, Tina A Graves, Robert S Fulton, Ed Coe, Patrick S Schnable, David C Schwartz, Doreen Ware, Sandra W Clifton, Richard K Wilson, Rod A Wing.

Abstract

Maize is a major cereal crop and an important model system for basic biological research. Knowledge gained from maize research can also be used to genetically improve its grass relatives such as sorghum, wheat, and rice. The primary objective of the Maize Genome Sequencing Consortium (MGSC) was to generate a reference genome sequence that was integrated with both the physical and genetic maps. Using a previously published integrated genetic and physical map, combined with incoming maize genomic sequence, new sequence-based genetic markers, and an optical map, we dynamically picked a minimum tiling path (MTP) of 16,910 bacterial artificial chromosome (BAC) and fosmid clones that were used by the MGSC to sequence the maize genome. The final MTP resulted in a significantly improved physical map that reduced the number of contigs from 721 to 435, incorporated a total of 8,315 mapped markers, and ordered and oriented the majority of FPC contigs. The new integrated physical and genetic map covered 2,120 Mb (93%) of the 2,300-Mb genome, of which 405 contigs were anchored to the genetic map, totaling 2,103.4 Mb (99.2% of the 2,120 Mb physical map). More importantly, 336 contigs, comprising 94.0% of the physical map (approximately 1,993 Mb), were ordered and oriented. Finally, we used all available physical, sequence, genetic, and optical data to generate a golden path (AGP) of chromosome-based pseudomolecules, herein referred to as the B73 Reference Genome Sequence version 1 (B73 RefGen_v1).

Year:  2009        PMID: 19936061      PMCID: PMC2774505          DOI: 10.1371/journal.pgen.1000715

Source DB:  PubMed          Journal:  PLoS Genet        ISSN: 1553-7390            Impact factor:   5.917


Introduction

Maize is an important crop and a model biological system. With global climate change and increasing caloric and raw material demands, the development of higher yielding and more stress-resistant maize cultivars is a major challenge facing 21st century breeders. Approximately 50 million years ago maize shared a common lineage with all grass and cereal ancestors [1]. Subsequently, the maize ancestor underwent allotetraploidization and diploidization [2]–[5], prior to domestication some 10,000 years ago in the Americas. The present day maize genome is genetically diploid (n = 10), and has a genome size (GS) of approximately 2300–2700 Mb [6], 85% of which is composed of transposable elements [7]. With the smaller and less complex cereal genome sequences of rice (GS = 389 Mb; [8]) and sorghum (GS = 700 Mb; [9]) already completed, the generation of a whole genome sequence of maize offers the greatest technical challenge to date for any complex plant genome. Since 1998 the U.S.A. National Science Foundation's Plant Genome Research Program has invested heavily in the development of resources and pilot projects to build a foundation to sequence the maize genome, including generation of maize genetic [10]–[14], physical [15]–[18], and optical maps [19], sequencing maize gene space by methylation filtration and high C0t selection [20]–[22], BAC end sequencing [23], random BAC sequencing [24], sequencing large contiguous maize regions [25], and the maize full-length cDNA project [26]. These investments came to fruition in 2005 with the funding of the Maize Genome Sequencing Consortium (MGSC) to use a novel clone-by-clone approach to sequence the genome of the maize inbred B73, a process that was completed in 2009 [7].
Here we present a detailed account of the utilization of a previously described genetically-integrated sequence-ready physical framework map of the B73 maize genome (721 contigs anchored with 1092 genetic markers, covering ∼94% of the genome [18]) as the vade mecum to dynamically select a minimum tiling path (MTP) of BAC clones across the genome. We describe our progress in integrating new and more complex resources into the physical map to better guide the generation, validation and annotation of a reference genome sequence for maize. These processes included the use of maize genome sequence and optical map information to merge, break, anchor and orient FPC contigs. Upon completion of the shotgun sequencing and sequence improvement of most large-insert clones, we combined all available evidence (i.e. sequence, physical, genetic, and optical map information) to construct a golden path (AGP) of pseudomolecules across the maize genome, hereinafter referred to as the “B73 RefGen_v1”.

Results/Discussion

Generation of a Minimum Tiling Path (MTP) of Bacterial Artificial Chromosome (BAC) and fosmid clones to sequence the B73 maize genome

To sequence the maize genome (B73), we employed a clone-by-clone approach and selected a minimum tiling path of BACs across the integrated genetic and physical map. Initially we selected 3,200 BACs that were spaced approximately 800 kb apart across the genome. Additional criteria used to select these “seed” BAC clones were: 1) each had a genomic insert that was larger than the average insert sizes of the BAC libraries; 2) each had a pair of high-quality end sequences; 3) each had a high-quality fingerprint; and 4) where possible, each had an associated genetic and/or overgo marker [27]. These combined criteria ensured that the genomic position of each seed BAC clone was known, that each clone could be easily validated prior to shotgun library construction/sequencing, and that a maximum amount of sequence could be obtained from each region due to the large clone insert size. Because the previously published B73 maize BES data set [23] was not adequate to walk from seed BACs, the MGSC resequenced BAC ends for the ZMMBBc EcoRI/MboI BAC library, resulting in a total of 340,869 new BESs to aid clone walking/sequencing. The ZMMBBc library was selected because it had the larger average insert size of the two BAC libraries used to generate the physical map. Combined, we employed 815,473 BESs (70% paired) for the maize genome sequencing project. In addition, the MGSC also generated a total of 827,571 (72% paired end) fosmid end sequences/trace files that were used primarily for MTP gap filling (see below). Once a seed BAC was sequenced we employed one of two methods to select adjacent BAC clones that had minimal sequence overlap. The first method, termed the sequence-tagged connector (STC) approach [28], utilized the BES data set (FASTA and trace files) to identify BESs that minimally aligned to the seed BAC sequence on either side of the sequence.
Once an MTP BAC clone was identified, its position on the physical map was checked, then validated by BAC end sequencing prior to incorporation in the production sequencing pipeline. To make MTP clone selection more efficient, we developed a web-based MTP Tilepath pipeline interface (Figure 1A and 1B) that is described in detail in Text S1.
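
As an illustration of the STC selection step, the following minimal sketch picks, among BAC end sequences (BESs) that align to a finished seed BAC, the clone whose aligned end lies closest to the end of the seed and therefore overlaps it least. The `BesHit` record, the function name, the toy clone names, and the assumption that the unaligned mate of each BES extends beyond the seed are all hypothetical; only the 95% identity threshold is taken from the candidate list described for Figure 1A. It is a sketch of the idea, not the MGSC Tilepath pipeline.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class BesHit:
    clone: str       # candidate clone name (toy data, not real MTP clones)
    seed_pos: int    # position on the seed BAC where the BES aligns (bp)
    identity: float  # percent identity of the BES alignment


def pick_walk(hits: List[BesHit], seed_len: int, side: str,
              min_identity: float = 95.0) -> Optional[BesHit]:
    """Return the usable hit with the smallest overlap on the requested side."""
    usable = [h for h in hits if h.identity >= min_identity]
    if not usable:
        return None
    if side == "right":
        # Overlap is approximated by the distance to the right end of the seed.
        return min(usable, key=lambda h: seed_len - h.seed_pos)
    if side == "left":
        # Overlap is approximated by the distance to the left end of the seed.
        return min(usable, key=lambda h: h.seed_pos)
    raise ValueError("side must be 'left' or 'right'")


hits = [
    BesHit("cand_A", 120_000, 99.5),
    BesHit("cand_B", 150_000, 99.8),
    BesHit("cand_C", 158_000, 91.0),  # rejected: identity below threshold
]
best = pick_walk(hits, seed_len=160_000, side="right")
print(best.clone)
```

In practice the chosen clone would still be checked against the physical map and validated by BAC end sequencing, as described above.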
Figure 1

Sequencing pipeline for MTP clone selection and gap analysis.

(A) An example of STC-based clone walking. Candidate walking clone list for seed BAC c0245B14. The list showed clones in which BES shared >95% sequence identity with the seed BAC; (B) Gbrowse view of sequence and trace alignment of candidate clone b0566J07 to seed BAC c0245B14. (C) Gap analysis pipeline to check gaps between adjoining clones.

The second method used for MTP selection relied solely on the underlying BAC fingerprints used to assemble the maize integrated genetic and physical map. This method was employed due to the scale of the project and the timeline mandated to complete the project. It simply was impossible to exclusively use the STC approach, because the improved seed-BAC and MTP-walk sequences were not generated rapidly enough to supply the shotgun library and production sequencing pipelines with adequate numbers of BACs to complete the project on time. To select MTP BAC clones for sequencing with fingerprints, we used an e-value score of e−9 to e−15 between adjacent BAC clones in the maize high information content fingerprint (HICF) map [17] to ensure minimal overlap. E-value scores for evaluating fingerprint overlap were assessed using the FPC Analysis function [29] and resulted in an average overlap of adjacent BAC clones of 38 kb across the genome. Such overlap can thereby exclude false overlaps created by two identical (or nearly identical) retrotransposons whose sizes are normally less than 15 kb. This e-value parameter could be used for MTP selection of other genomes where HICF physical maps are available. The final step in MTP generation was to check and fill gaps with either BAC or fosmid clones. To simplify this task, we developed a comprehensive web-based MTP interface (Figure 1C) that is described in Text S1.
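
The fingerprint-based selection described above can be sketched as a simple window filter. The example below keeps candidate neighbours whose fingerprint overlap score falls between e−15 and e−9 and, as an assumption made here for illustration, prefers the weakest score inside that window because it implies the smallest supported overlap. The function name, the dictionary layout, and the toy scores are hypothetical; the scores themselves would come from FPC, which is not called here.

```python
from typing import Dict, Optional


def pick_fingerprint_walk(scores: Dict[str, float],
                          weakest: float = 1e-9,
                          strongest: float = 1e-15) -> Optional[str]:
    """Pick the candidate whose overlap score lies in [strongest, weakest]
    and is closest to the weak end of the window."""
    in_window = {c: s for c, s in scores.items() if strongest <= s <= weakest}
    if not in_window:
        return None
    # A larger score means weaker fingerprint evidence, i.e. less overlap.
    return max(in_window, key=in_window.get)


candidates = {
    "cand_A": 1e-30,  # too much overlap: outside the window
    "cand_B": 1e-13,
    "cand_C": 1e-10,
    "cand_D": 1e-6,   # overlap not supported strongly enough
}
choice = pick_fingerprint_walk(candidates)
print(choice)
```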
To ensure high-confidence overlap between two contigs we set the following criteria: 1) two adjoining clones must have overlap in Megablast searches with over 99.9% sequence identity; 2) the highest scoring overlap must be between each clone, and not with any other clone in other parts of the genome; 3) the BES of one clone must align to the sequence of its adjacent clone with over 95% identity; and 4) if the sequence identity in the BAC-end search was less than 99%, the sequence alignment along with the trace chromatograph was manually checked. If any one of these criteria was not met, the clone was flagged and manually annotated. In conclusion, we selected a total of 16,910 MTP clones across the maize genome (3,200 seed, 5,748 STC walks, 6,048 FP walks, 1,795 BAC gaps and 63 Fosmid gaps, and 56 BACs from outside projects). The full list of MTP clones and an interactive website can be accessed at http://www2.genome.arizona.edu/genomes/maize and in Table S1.
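
The four overlap criteria above lend themselves to a small decision function. The sketch below is one possible encoding under stated assumptions: the `Join` record and its field names are hypothetical, the identity values are assumed to have been computed beforehand (for example from Megablast output), and the three outcome labels are introduced here only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Join:
    megablast_identity: float    # % identity of the clone-to-clone overlap
    best_hit_is_neighbour: bool  # top-scoring overlap is with the adjacent clone
    bes_identity: float          # % identity of the BES against the adjacent clone


def review_join(j: Join) -> str:
    """Classify a join as 'pass', 'check_trace' or 'flag'."""
    if j.megablast_identity <= 99.9:   # criterion 1: must be over 99.9%
        return "flag"
    if not j.best_hit_is_neighbour:    # criterion 2: best hit must be the neighbour
        return "flag"
    if j.bes_identity <= 95.0:         # criterion 3: must be over 95%
        return "flag"
    if j.bes_identity < 99.0:          # criterion 4: inspect alignment and trace
        return "check_trace"
    return "pass"


print(review_join(Join(99.95, True, 99.5)))
print(review_join(Join(99.95, True, 97.0)))
print(review_join(Join(99.50, True, 99.9)))
```

Joins labelled `flag` correspond to the clones that were flagged and manually annotated; `check_trace` corresponds to the manual inspection of the alignment and trace chromatograph.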

Improvement of the maize integrated genetic and physical map

In our previous study [18], we were unable to merge or genetically anchor additional FPC contigs based on fingerprint evidence alone. By utilizing maize genome sequence and genetic map information we were able to significantly improve the physical map by performing new contig merges, breaking mis-assembled contigs and anchoring additional FPC contigs to the maize genetic map. Using the same rules described above for gap checking, in combination with the maize genome sequence, we were able to perform 109 FPC contig merges, and identified ten FPC contigs that were incorrectly merged (Table S2). These latter contigs were broken apart and then merged into 17 new FPC contigs. This analysis resulted in a total of 435 FPC contigs in the maize physical map, which covered ∼93% (2120 Mb) of the 2300-Mb genome. In addition, 170 small low-coverage FPC contigs (∼25 Mb in total) shown to represent contaminating cotton sequences were removed from the physical map assembly. The contamination was identified by Kmer [30] and BAC end sequence analyses. All contaminated clones were from the ZMMBBb library and most likely originated during the BAC library construction process. To fully integrate the physical map with the maize genetic map we utilized all publicly available marker data from the IBM2 2008 Neighbors Map (Schaeffer, Sanchez-Villeda, and Coe, 2008; http://maizegdb.org/map.php), and the literature. The IBM2 2008 Neighbors map contains 15,932 markers (11,475 publicly available). However, due to the long history of these genetic markers, dating back 20 years or more, the nucleotide sequences of many markers were not deposited into centralized databases, such as maizeGDB or GenBank. To integrate additional genetic markers at the sequence level, we conducted extensive literature and Google searches and identified 2,864 markers with sequences not associated with markers in maizeGDB or linked to GenBank entries. 
In total, we obtained 9,229 sequence-based genetic markers with available sequences (http://www2.genome.arizona.edu/genomes/maize). Of these, 8,315 markers could be mapped onto both the physical map and the B73 RefGen_v1 (Table S3). We could not pinpoint the genomic locations of 134 markers (indicated as “no hit” in Table S3), perhaps due to lack of sequence or genome coverage in the related regions, or their origin as inbred-specific sequences. Gore et al. [31] reported that about 7.8% of the maize sequences could be inbred specific. The low genetic map resolution of these markers made it impossible to determine the cause for no coverage. Additionally, 780 markers were placed on different chromosomes in contrast to their reported genetic positions (Table S3). Most of these 780 markers were from low-resolution maps and their genetic positions could not be validated. Of the 90 bin markers (Table 1, partial; the full list is in Table S4) used to divide the maize genome genetically, we could confidently place 87 markers on both the physical map and the B73 RefGen_v1. There were three bin markers (RFLP markers umc5a, agrr37b, and csu93b) with physical positions that conflicted with their genetic positions. Most likely, those multiple copy markers were InDels that were present in different parental lines, but absent in B73 or in gaps, because each marker only had one locus in the B73 genome (RefGen_v1), instead of multiple ones in their original mapping parents.
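Table 1

Position of bin markers in the B73 physical map and RefGen_v1 (a).

| Marker | Chr | Bin | Genetic (b) | Original Map | Type (c) | Seq. Source (d) | Start (e) | End (e) | Clone | FPC Ctg |
|---|---|---|---|---|---|---|---|---|---|---|
| tub1 | 1 | 1.01 | 2.5 | IBM2 | F | X52878 | 2022607 | 2024984 | c0363D20 | 1 |
| umc157a(chn) | 1 | 1.02 | 114.4 | IBM2 | P | G10823 | 12357364 | 12357663 | c0140E02 | 5 |
| umc76a | 1 | 1.03 | 198.4 | IBM2 | F | G10866 | 29364559 | 29364266 | c0380M20 | 9 |
| asg45(ptk) | 1 | 1.04 | 294.3 | Gnp2004 | P | AY771210 | 52239800 | 52240131 | b0109M14 | 12 |
| csu3 | 1 | 1.05 | 405 | IBM2 | F | DQ123891 | 81360132 | 81360551 | c0122B13 | 20 |
| umc67a | 1 | 1.06 | 496.6 | IBM2 | P | G13173 | 175505327 | 175505029 | c0152A14 | 36 |
| asg62 | 1 | 1.07 | 607.3 | IBM2 | F | DQ001865 | 198707401 | 198707865 | c0479A09 | 41 |
| umc128a | 1 | 1.08 | 722.4 | IBM2 | F | umc128 | 227601774 | 227602233 | b0310F15 | 46 |
| cdj2 | 1 | 1.09 | 812.3 | IBM2 | F | AY109456 | 252192856 | 252193562 | b0611E16 | 52 |
| umc107a(croc) | 1 | 1.1 | 886.9 | IBM2 | P | G10803 | 266927146 | 266927488 | c0293G16 | 56 |
| umc161a | 1 | 1.11 | 963.6 | IBM2 | F | AY771212 | 282140672 | 282141394 | c0086K08 | 61 |
| bnl6.32 | 1 | 1.12 | 1113 | IBM2 | F | bnl6.32 | 296840063 | 296840574 | c0455B14 | 63 |
| bnl8.45a | 2 | 2.01 | 3.3 | Gnp2004 | P | G10776 | 1546872 | 1547084 | b0252P05 | 68 |
| lox6 | 2 | 2.02 | 50.9 | IBM2 | F | AY771214 | 4175012 | 4174428 | c0468P22 | 69 |
| umc6a | 2 | 2.03 | 164.8 | IBM2 | F | G10856 | 14920255 | 14920433 | c0530G21 | 72 |
| umc34 | 2 | 2.04 | 243.3 | IBM2 | F | DQ001866 | 28063927 | 28064503 | c0030B11 | 74 |
| umc131 | 2 | 2.05 | 342.4 | IBM2 | F | umc131 | 71031565 | 71031939 | c0244C01 | 82 |
| umc255a | 2 | 2.06 | 364.5 | IBM2 | P | umc255 | 149697523 | 149697768 | b0120F07 | 90 |
| umc5a | 7 | 2.07 | 405.8 | Gnp2004 | P | umc5 | 116650892 | 116650654 | b0022A14 | 315 |
| asg20 | 2 | 2.08 | 478.7 | IBM2 | F | DQ123894 | 201356132 | 201355954 | c0158O02 | 103 |
| umc49a | 2 | 2.09 | 591.5 | IBM2 | F | DQ123895 | 219604574 | 219604915 | c0184K09 | 108 |
| php20581b(tb) | 2 | 2.1 | 692.7 | Gnp2004 | P | G10795 | 231788583 | 231788397 | b0109B01 | 109 |
| umc32a | 3 | 3.01 | 11.3 | UMC98 | P | umc32 | 1726276 | 1725856 | c0286H14 | 111 |
| csu32a | 3 | 3.02 | 60 | IBM2 | F | DQ123896 | 3837012 | 3837402 | c0299P11 | 111 |
| asg24a(gts) | 3 | 3.03 | 109 | IBM2 | P | AY771217 | 8405715 | 8482306 | b0166B24 | 112 |
| asg48a | 3 | 3.04 | 152.7 | IBM2 | F | G13184 | 12862813 | 12862593 | c0385I07 | 113 |
| umc102a | 3 | 3.05 | 297.9 | IBM2 | F | DQ005498 | 122406867 | 122407553 | c0072M24 | 124 |
| im30p1 | 3 | 3.06 | 391.4 | IBM2 | P | G10766 | 166733121 | 166732779 | b0583P10 | 131 |
| bnl6.16a | 3 | 3.07 | 520.7 | IBM2 | F | G10768 | 189303505 | 189303133 | c0328L01 | 138 |
| umc17a | 3 | 3.08 | 585.5 | IBM2 | F | AY771218 | 203506017 | 203506852 | b0460H12 | 145 |
| umc63a | 3 | 3.09 | 697.2 | IBM2 | F | G10857 | 214210836 | 214210676 | b0347M11 | 147 |
| cyp1 | 3 | 3.1 | 845.2 | Gnp2004 | P | DQ005499 | 230486027 | 230486291 | b0147G12 | 153 |

(a) This is a partial list. The full list is in Table S4.

(b) Genetic position.

(c) Marker type; P: Placement, not as accurate as Framework (F).

(d) Sequence source; marker names with no GenBank accession number indicate that the sequences are available at http://www2.genome.arizona.edu/genomes/maize.

(e) Positions in B73 RefGen_v1.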

After integration, 97.8% of the physical map could be assigned to the maize genetic map, as compared to 86.1% [18] prior to the genome sequence. Among the 435 contigs in the updated physical map, 392 could be anchored, totaling 2073 Mb (97.8% of the 2120 Mb physical map). Among these 392 anchored FPC contigs, 163 (totaling ∼1222.9 Mb; 57.7% of the physical map) could be ordered and oriented in the maize genome, 92 (comprising ∼387.4 Mb) could be ordered, but not oriented, and 137 (∼462.8 Mb) had only rough genomic positions and were not ordered and oriented. Finally, the genomic positions of 43 FPC contigs (∼47 Mb; 2.2%) could not be determined due to lack of any sequence overlap and/or genetic linkage information. Development and mapping of polymorphic genetic markers from these latter contigs would be the most efficient approach to incorporate them into the integrated genetic and physical map of maize or other species.
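
The anchoring figures in the preceding paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the contig counts and sizes quoted in the text; the variable names and the grouping into a dictionary are introduced here for illustration.

```python
PHYSICAL_MAP_MB = 2120.0

# (number of FPC contigs, total size in Mb) for each anchoring class
anchored = {
    "ordered_and_oriented": (163, 1222.9),
    "ordered_not_oriented": (92, 387.4),
    "rough_position_only": (137, 462.8),
}
unplaced = (43, 47.0)

n_anchored = sum(n for n, _ in anchored.values())
mb_anchored = sum(mb for _, mb in anchored.values())
pct_anchored = 100 * mb_anchored / PHYSICAL_MAP_MB
pct_oriented = 100 * anchored["ordered_and_oriented"][1] / PHYSICAL_MAP_MB

print(n_anchored, n_anchored + unplaced[0])
print(round(mb_anchored, 1), round(pct_anchored, 1), round(pct_oriented, 1))
```

The three anchored classes sum to the 392 anchored contigs, and adding the 43 unplaced contigs recovers the 435 contigs of the updated physical map.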

Ordering and orienting maize physical contigs using the maize optical map

Zhou et al. [19] reported the construction of an optical map for the B73 maize genome. The optical map was constructed by generating SwaI restriction maps of high molecular weight genomic DNA at 400-fold redundancy. The restriction maps were assembled into a whole genome optical map consisting of 66 contigs, many fewer than the 435 contigs in the maize physical map. To interdigitate the optical map with the integrated physical and genetic maps, we generated a contig-based in silico maize optical map by digesting the contig-based pseudomolecules (described below) with the SwaI restriction enzyme. The resulting in silico restriction map was then aligned to the maize optical map (see details in [19]) and used to assist with the ordering and anchoring of additional FPC contigs. For example, in Figure 2A, Ctg33 was well anchored on maize Chr1, while Ctg36 was only ordered but not oriented. Both FPC Ctg33 and 36 were mapped adjacent to one another in the maize optical map (Omcontig_0) thus allowing Ctg36 to be oriented correctly. In another example (Figure 2B and 2C), Ctg304 was well anchored (ordered and oriented) on maize Chr7, but the chromosomal positions of Ctg459 and 470 were unknown. These three contigs mapped next to each other in the following order: Ctg304, 470, and 459 on maize Omcontig_10. These data provided a genome context for the two orphan FPC contigs (Ctg459 and 470).
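
To make the in silico digestion step concrete, here is a minimal sketch that cuts a sequence at every SwaI site and returns the ordered fragment lengths, which is the kind of restriction map that can then be compared with an optical map. SwaI recognizes ATTTAAAT and cuts in the middle of the site, leaving blunt ends; because the site is palindromic, searching one strand is sufficient. The function name and the toy sequence are hypothetical, and the alignment of the resulting fragment list to the optical map (see [19]) is not shown.

```python
from typing import List

SWAI_SITE = "ATTTAAAT"  # SwaI recognition sequence
CUT_OFFSET = 4          # blunt cut in the middle of the site: ATTT^AAAT


def in_silico_digest(seq: str, site: str = SWAI_SITE,
                     cut_offset: int = CUT_OFFSET) -> List[int]:
    """Return ordered fragment lengths (bp) after cutting seq at every site."""
    seq = seq.upper()
    cuts = []
    i = seq.find(site)
    while i != -1:
        cuts.append(i + cut_offset)
        i = seq.find(site, i + 1)  # step by 1 so overlapping sites are found
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]


# Toy pseudomolecule with two SwaI sites (not real maize sequence).
toy = "GGCC" * 5 + "ATTTAAAT" + "ACGT" * 10 + "ATTTAAAT" + "CC" * 3
fragments = in_silico_digest(toy)
print(fragments)
```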
Figure 2

Use of the maize optical map for FPC contig anchoring.

In each panel, the top blue fragments represent a maize optical SwaI restriction map, and the bottom orange fragments represent the in silico optical SwaI restriction map from contig-based pseudomolecules. Red fragments in (B) and (C) indicate a misassembly in the pseudomolecule that required manual editing. (A) Well-anchored Ctg36 helped to orient Ctg33, which was previously only ordered, but not oriented. (B) Anchored Ctg407 aided order and orientation of Ctg470, which was neither ordered nor oriented. (C) The newly anchored Ctg470 facilitated ordering and orienting of Ctg459.

Combining the optical map analysis with the improved integrated genetic and physical map, we were able to anchor an additional 13 FPC contigs to the maize genetic map, which resulted in a final total of 405 anchored FPC contigs comprising 99.2% of the 2120 Mb physical map. More importantly, more than twice as many FPC contigs (336 as opposed to 163), comprising 94.0% of the physical map (∼1993 Mb), could be ordered and oriented. For the remaining contigs, 21 (containing ∼20.6 Mb) could be ordered, but not oriented; and 48 (∼90.1 Mb) had only approximate genomic positions and were neither ordered nor oriented. The final 17.1 Mb contained 30 contigs with no genome context. The efficiency of using the maize optical map for anchoring is remarkable due to its deep coverage, large single molecule, and contig sizes. The anchoring quality of each contig, including the evidence used for anchoring, ordering and orienting, is shown in Table 2 (partial; the full list is in Table S5). The final integrated genetic and physical map can be downloaded at: http://www2.genome.arizona.edu/genomes/maize.
Table 2

Contig anchoring quality and contig positions in B73 RefGen_v1 (a).

| Contig | Genetic Position | Order/Orient (b) | Number of Clones | Number of Markers | Physical Length (Kb) | Chr | Start (c) | End (c) |
|---|---|---|---|---|---|---|---|---|
| 1 | 2.5 | 2 | 276 | 98 | 2523 | 1 | 1 | 2299274 |
| 2 | 13.5 | 2 | 132 | 53 | 1092 | 1 | 2300275 | 3419854 |
| 3 | 26.1 | 2 | 334 | 160 | 2126 | 1 | 3420855 | 5929995 |
| 4 | 82.8 | 2 | 616 | 262 | 4165 | 1 | 5930996 | 10045647 |
| 5 | 103 | 2 | 350 | 152 | 3224 | 1 | 10046648 | 13079531 |
| 6 | 124.7 | 2 | 459 | 172 | 2851 | 1 | 13080532 | 16193432 |
| 7 | 145 | 1 | 109 | 44 | 1156 | 1 | 16194433 | 17299506 |
| 8 | 160.6 | 2 | 793 | 280 | 5600 | 1 | 17300507 | 23505871 |
| 9 | 170 | 2 | 801 | 249 | 5933 | 1 | 23506872 | 29869500 |
| 10 | 205 | 2 | 2427 | 745 | 18541 | 1 | 29870501 | 48303993 |
| 12 | 290.1 | 2 | 506 | 179 | 4277 | 1 | 48304994 | 52419976 |
| 13 | 292.4 | 2 | 72 | 35 | 715 | 1 | 52420977 | 53153490 |
| 14 | 325.7 | 2 | 1672 | 434 | 12700 | 1 | 53154491 | 65772465 |
| 16 | 360.9 | 3 | 281 | 77 | 1920 | 1 | 65773466 | 67717626 |
| 474 | 385 | 0 | 79 | 39 | 705 | 1 | 67718627 | 68401065 |
| 17 | 386.4 | 2 | 247 | 63 | 2361 | 1 | 68402066 | 70727174 |
| 18 | 391.8 | 3 | 379 | 119 | 3189 | 1 | 70728175 | 73586599 |
| 19 | 392.95 | 3 | 626 | 136 | 4640 | 1 | 73587600 | 78384554 |
| 20 | 398.2 | 2 | 798 | 198 | 6350 | 1 | 78385555 | 84665058 |
| 22 | 406 | 3 | 253 | 68 | 1832 | 1 | 84666059 | 86598526 |
| 24 | 415 | 3 | 482 | 99 | 3973 | 1 | 86599527 | 90264916 |
| 23 | 417 | 2 | 578 | 114 | 4395 | 1 | 90265917 | 94733311 |
| 106 | 227.1 | 4 | 939 | 190 | 7359 | 9 | 53598148 | 60378895 |
| 432 | 227.2 | 4 | 701 | 135 | 4875 | 9 | 60379896 | 65011726 |
| 448 | 227.3 | 4 | 576 | 90 | 4512 | 9 | 65012727 | 69221205 |
| 425 | unknown | 5 | 218 | 51 | 1871 | 0 | 9718511 | 11319527 |
| 427 | unknown | 5 | 236 | 35 | 2033 | 0 | 11479785 | 13269771 |
| 429 | unknown | 5 | 166 | 41 | 1759 | 0 | 13443808 | 14680007 |

(a) This is a partial list. The full list is in Table S5.

(b) Code: 0, chromosomal assignment is known, but not ordered and oriented; 1, ordered, but not oriented; 2, genetically anchored and oriented; 3, anchored and oriented with assistance from optical map; 4, the block was anchored, but order and orientation are unknown; 5, unknown chromosomal context.

(c) Positions in B73 RefGen_v1.

Generation of A Golden Path (AGP) of the maize B73 genome

A major objective of the MGSC was to sequence the genome, integrate the sequence into the maize genetic and physical maps, and provide a high-quality reference sequence in low copy regions. The final step of the MGSC, before annotation, was to generate a set of ten pseudomolecules that represented the ten chromosomes of maize, called “a golden path” or “AGP.” AGPs greatly simplify the analysis of a genome because an AGP removes all redundant overlapping sequences between BACs and fosmids, and provides a convenient set of contiguous sequences for annotation, as opposed to having to download over 16,000 individual BAC sequences and assemble them into a genome sequence independently. Most BAC sequences, generated by the MGSC and deposited in GenBank, contained multiple sequence contigs (on average 11 per clone), some of which were neither ordered nor oriented; it was thus very challenging to construct the AGP. The main task in building a whole genome AGP is to determine the extent of overlapping sequence between adjacent clones, order and orient sequence contigs in the overlapping regions, and finally remove all redundant overlapping sequence. To accomplish this task, we built a semi-automated web-based AGP pipeline connected to a MySQL relational database that was run with custom Perl scripts. All available sequence data, including BAC, BAC-end, fosmid-end, and marker sequence information from both the MGSC and outside projects, were then loaded into the MySQL database. A set of comparisons was then performed between neighboring BAC sequences and/or BES using BLAST, which resulted in the identification of the left and/or right end of each BAC on adjacent BACs, as well as overlapping sequence between two adjoining clones.
Employing a user-friendly graphical interface (Figure 3A and 3B), we manually curated the order and orientation of BAC pieces in overlapping regions, and removed overlapping or redundant sequences from the final pseudomolecule according to sequence alignment. All processing information was saved into our database for creating the AGP file.
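
The core bookkeeping of removing redundant overlap between adjacent clones can be sketched as follows. This is not the MGSC Perl/MySQL pipeline: it assumes that the clones are already ordered and oriented, that the overlap length between each adjacent pair has already been determined (for example by BLAST and manual curation), and that the earlier clone keeps the shared bases. The function name, the toy clone names, and the (name, start, end) output rows are hypothetical.

```python
from typing import List, Tuple


def build_path(clones: List[Tuple[str, str]],
               overlaps: List[int]) -> Tuple[str, List[Tuple[str, int, int]]]:
    """clones: ordered (name, sequence) pairs.
    overlaps[i]: bases shared by clone i and clone i + 1.
    Returns the merged sequence and (name, start, end) rows, 1-based inclusive."""
    if len(overlaps) != len(clones) - 1:
        raise ValueError("need exactly one overlap per adjacent clone pair")
    merged, rows, pos = [], [], 1
    for i, (name, seq) in enumerate(clones):
        # Skip the bases that the previous clone already contributed.
        skip = overlaps[i - 1] if i > 0 else 0
        part = seq[skip:]
        merged.append(part)
        rows.append((name, pos, pos + len(part) - 1))
        pos += len(part)
    return "".join(merged), rows


clones = [("cloneA", "AAAACCCC"), ("cloneB", "CCCCGGGG"), ("cloneC", "GGTTTT")]
path, rows = build_path(clones, overlaps=[4, 2])
print(path)
for row in rows:
    print(row)
```

Each output row records which stretch of the pseudomolecule a clone contributes, which is the kind of information an AGP file tabulates.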
Figure 3

Direct comparison of sequence overlap between adjacent clones before (A) and after (B) the semi automated AGP pipeline.

At present, a total of 16,910 clones assigned to 435 FPC contigs have been processed by the AGP pipeline. After removal of sequence overlap and ordering and orienting sequence contigs within the overlapping regions, we were able to generate a maize AGP composed of 2048 Mb of pseudomolecule sequences in 61,161 scaffolds from 125,325 sequence contigs, which covers ∼97% of the 2120-Mb physical map. Table 3 summarizes the sizes, scaffolds, and contig number of each maize chromosome plus those that are unanchored. The AGP and maize B73 RefGen_v1 are available at: http://www2.genome.arizona.edu/genomes/maize.
Table 3

Sequence summary of the maize chromosomes in B73 RefGen_v1.

| Chr | Length (bp) | Scaffold Number | Scaffold Length (bp) | Scaffold Average (bp) | Contig Number | Contig Length (bp) | Contig Average (bp) |
|---|---|---|---|---|---|---|---|
| 0 (a) | 14680007 | 647 | 14588907 | 22549 | 1206 | 14531607 | 12049 |
| 1 | 300239041 | 8696 | 299312341 | 34420 | 17683 | 298405441 | 16875 |
| 2 | 234752839 | 6661 | 234044439 | 35137 | 13694 | 233333939 | 17039 |
| 3 | 230558137 | 6612 | 229865037 | 34765 | 13509 | 229167937 | 16964 |
| 4 | 247095508 | 6834 | 246365608 | 36050 | 13975 | 245638708 | 17577 |
| 5 | 216915529 | 6547 | 216219929 | 33026 | 13148 | 215551729 | 16394 |
| 6 | 169254300 | 5257 | 168698300 | 32090 | 10986 | 168119000 | 15303 |
| 7 | 170974187 | 5239 | 170418187 | 32529 | 10858 | 169851287 | 15643 |
| 8 | 174515299 | 5452 | 173935699 | 31903 | 11450 | 173330599 | 15138 |
| 9 | 152350485 | 4653 | 151852185 | 32635 | 9243 | 151386685 | 16379 |
| 10 | 149686045 | 4563 | 149202145 | 32698 | 9573 | 148697745 | 15533 |
| Total | 2061021377 | 61161 | 2054502777 | 33592 | 125325 | 2048014677 | 16342 |

(a) Total of all unanchored contigs.

Conclusion

We used an integrated genetic and physical map to select and validate an MTP of clones across the maize genome as the template to generate a whole genome sequence. Using individual BAC assemblies, over 8,300 sequence-based genetic markers, and the optical map, we significantly improved the integrated genetic and physical map of maize, which in turn resulted in the generation of an AGP across the maize genome. The tremendous resources generated by this project will greatly facilitate basic and applied research on multiple fronts, including comparative and functional genomics studies, genome structure and evolution, map-based gene cloning, and molecular breeding. Although the first release of the maize genome (i.e. B73 RefGen_v1) is now realized, as with any genome sequence, several improvements are still needed to produce an even more accurate reference sequence for maize. First, six percent of the genome (127.8 Mb in total) still needs to be genetically ordered and oriented. This includes 20.6 Mb (1.0% of the physical map in 21 contigs) to be oriented, 90.1 Mb (4.3% in 48 contigs) to be precisely ordered and oriented, and finally, 17.1 Mb (0.8% in 30 contigs) to be genetically mapped. Second, the physical map covers ∼93% of the B73 genome in 435 contigs, and significant physical gaps remain to be bridged. For example, approximately 5% of the maize full-length cDNA data set could not be mapped to the genome (i.e. B73 RefGen_v1; [26]). Finally, we must continue to better orient sequence contigs within BACs using multiple data types, such as the optical map, syntenic relationships across the cereal genomes, full-length cDNA evidence, and paired-end whole genome shotgun sequence. Data generated from the maize diversity project should provide enough evidence to anchor most unanchored contigs (Ed Buckler, pers. comm.).
Efforts to further improve the B73 RefGen_v1 are now underway, and new AGP releases will be made available regularly through the AGI website (www2.genome.arizona.edu/genome/maize).

Materials and Methods

Physical map editing and anchoring

All steps related to physical map editing were as previously described [18].

Sequence based genetic marker integration

See Text S1.

MTP clone selection pipeline

See Text S1.

AGP generation pipeline

See Text S1.

Supporting Information

Table S1. MTP clones and their physical position, sequence characteristics, and overlap information. (2.82 MB XLS)

Table S2. Contig number and orientation change after merging and breaking. (0.03 MB XLS)

Table S3. Genetic markers and their genetic, physical, and RefGen_v1 positions. (1.80 MB XLS)

Table S4. The position of bin markers in the B73 physical map and RefGen_v1. (0.03 MB XLS)

Table S5. Contig anchoring quality and contig positions in B73 RefGen_v1. (0.05 MB XLS)

Text S1. The maize MTP pipeline, the maize AGP pipeline, and sequence-based genetic markers. (0.40 MB DOC)

1.  DNA sequence evidence for the segmental allotetraploid origin of maize.

Authors:  B S Gaut; J F Doebley
Journal:  Proc Natl Acad Sci U S A       Date:  1997-06-24       Impact factor: 11.205

2.  Genetic, physical, and informatics resources for maize. On the road to an integrated map.

Authors:  Karen C Cone; Michael D McMullen; Irie Vroh Bi; Georgia L Davis; Young-Sun Yim; Jack M Gardiner; Mary L Polacco; Hector Sanchez-Villeda; Zhiwei Fang; Steven G Schroeder; Seth A Havermann; John E Bowers; Andrew H Paterson; Carol A Soderlund; Fred W Engler; Rod A Wing; Edward H Coe
Journal:  Plant Physiol       Date:  2002-12       Impact factor: 8.340

3.  Genetic dissection of intermated recombinant inbred lines using a new genetic map of maize.

Authors:  Yan Fu; Tsui-Jung Wen; Yefim I Ronin; Hsin D Chen; Ling Guo; David I Mester; Yongjie Yang; Michael Lee; Abraham B Korol; Daniel A Ashlock; Patrick S Schnable
Journal:  Genetics       Date:  2006-09-01       Impact factor: 4.562

4.  The B73 maize genome: complexity, diversity, and dynamics.

Authors:  Patrick S Schnable; Doreen Ware; Robert S Fulton; Joshua C Stein; Fusheng Wei; Shiran Pasternak; Chengzhi Liang; Jianwei Zhang; Lucinda Fulton; Tina A Graves; Patrick Minx; Amy Denise Reily; Laura Courtney; Scott S Kruchowski; Chad Tomlinson; Cindy Strong; Kim Delehaunty; Catrina Fronick; Bill Courtney; Susan M Rock; Eddie Belter; Feiyu Du; Kyung Kim; Rachel M Abbott; Marc Cotton; Andy Levy; Pamela Marchetto; Kerri Ochoa; Stephanie M Jackson; Barbara Gillam; Weizu Chen; Le Yan; Jamey Higginbotham; Marco Cardenas; Jason Waligorski; Elizabeth Applebaum; Lindsey Phelps; Jason Falcone; Krishna Kanchi; Thynn Thane; Adam Scimone; Nay Thane; Jessica Henke; Tom Wang; Jessica Ruppert; Neha Shah; Kelsi Rotter; Jennifer Hodges; Elizabeth Ingenthron; Matt Cordes; Sara Kohlberg; Jennifer Sgro; Brandon Delgado; Kelly Mead; Asif Chinwalla; Shawn Leonard; Kevin Crouse; Kristi Collura; Dave Kudrna; Jennifer Currie; Ruifeng He; Angelina Angelova; Shanmugam Rajasekar; Teri Mueller; Rene Lomeli; Gabriel Scara; Ara Ko; Krista Delaney; Marina Wissotski; Georgina Lopez; David Campos; Michele Braidotti; Elizabeth Ashley; Wolfgang Golser; HyeRan Kim; Seunghee Lee; Jinke Lin; Zeljko Dujmic; Woojin Kim; Jayson Talag; Andrea Zuccolo; Chuanzhu Fan; Aswathy Sebastian; Melissa Kramer; Lori Spiegel; Lidia Nascimento; Theresa Zutavern; Beth Miller; Claude Ambroise; Stephanie Muller; Will Spooner; Apurva Narechania; Liya Ren; Sharon Wei; Sunita Kumari; Ben Faga; Michael J Levy; Linda McMahan; Peter Van Buren; Matthew W Vaughn; Kai Ying; Cheng-Ting Yeh; Scott J Emrich; Yi Jia; Ananth Kalyanaraman; An-Ping Hsia; W Brad Barbazuk; Regina S Baucom; Thomas P Brutnell; Nicholas C Carpita; Cristian Chaparro; Jer-Ming Chia; Jean-Marc Deragon; James C Estill; Yan Fu; Jeffrey A Jeddeloh; Yujun Han; Hyeran Lee; Pinghua Li; Damon R Lisch; Sanzhen Liu; Zhijie Liu; Dawn Holligan Nagel; Maureen C McCann; Phillip SanMiguel; Alan M Myers; Dan Nettleton; John Nguyen; Bryan W Penning; Lalit Ponnala; Kevin L Schneider; David C Schwartz; Anupma Sharma; Carol Soderlund; Nathan M Springer; Qi Sun; Hao Wang; Michael Waterman; Richard Westerman; Thomas K Wolfgruber; Lixing Yang; Yeisoo Yu; Lifang Zhang; Shiguo Zhou; Qihui Zhu; Jeffrey L Bennetzen; R Kelly Dawe; Jiming Jiang; Ning Jiang; Gernot G Presting; Susan R Wessler; Srinivas Aluru; Robert A Martienssen; Sandra W Clifton; W Richard McCombie; Rod A Wing; Richard K Wilson
Journal:  Science       Date:  2009-11-20       Impact factor: 47.728

5.  Linkage relationships of 19 enzyme Loci in maize.

Authors:  M M Goodman; C W Stuber; K Newton; H H Weissinger
Journal:  Genetics       Date:  1980-11       Impact factor: 4.562

6.  A first-generation haplotype map of maize.

Authors:  Michael A Gore; Jer-Ming Chia; Robert J Elshire; Qi Sun; Elhan S Ersoz; Bonnie L Hurwitz; Jason A Peiffer; Michael D McMullen; George S Grills; Jeffrey Ross-Ibarra; Doreen H Ware; Edward S Buckler
Journal:  Science       Date:  2009-11-20       Impact factor: 47.728

7.  Anchoring 9,371 maize expressed sequence tagged unigenes to the bacterial artificial chromosome contig map by two-dimensional overgo hybridization.

Authors:  Jack Gardiner; Steven Schroeder; Mary L Polacco; Hector Sanchez-Villeda; Zhiwei Fang; Michele Morgante; Tim Landewe; Kevin Fengler; Francisco Useche; Michael Hanafey; Scott Tingey; Hugh Chou; Rod Wing; Carol Soderlund; Edward H Coe
Journal:  Plant Physiol       Date:  2004-03-12       Impact factor: 8.340

8.  Duplicated chromosome segments in maize (Zea mays L.): further evidence from hexokinase isozymes.

Authors:  J F Wendel; C W Stuber; M D Edwards; M M Goodman
Journal:  Theor Appl Genet       Date:  1986-03       Impact factor: 5.699

9.  A single molecule scaffold for the maize genome.

Authors:  Shiguo Zhou; Fusheng Wei; John Nguyen; Mike Bechner; Konstantinos Potamousis; Steve Goldstein; Louise Pape; Michael R Mehan; Chris Churas; Shiran Pasternak; Dan K Forrest; Roger Wise; Doreen Ware; Rod A Wing; Michael S Waterman; Miron Livny; David C Schwartz
Journal:  PLoS Genet       Date:  2009-11-20       Impact factor: 5.917

10.  Sequencing, mapping, and analysis of 27,455 maize full-length cDNAs.

Authors:  Carol Soderlund; Anne Descour; Dave Kudrna; Matthew Bomhoff; Lomax Boyd; Jennifer Currie; Angelina Angelova; Kristi Collura; Marina Wissotski; Elizabeth Ashley; Darren Morrow; John Fernandes; Virginia Walbot; Yeisoo Yu
Journal:  PLoS Genet       Date:  2009-11-20       Impact factor: 5.917

  52 in total

1.  DNA methylation epigenetically silences crossover hot spots and controls chromosomal domains of meiotic recombination in Arabidopsis.

Authors:  Nataliya E Yelina; Christophe Lambing; Thomas J Hardcastle; Xiaohui Zhao; Bruno Santos; Ian R Henderson
Journal:  Genes Dev       Date:  2015-10-15       Impact factor: 11.361

2.  Maligner: a fast ordered restriction map aligner.

Authors:  Lee M Mendelowitz; David C Schwartz; Mihai Pop
Journal:  Bioinformatics       Date:  2015-12-03       Impact factor: 6.937

3.  Fine quantitative trait loci mapping of carbon and nitrogen metabolism enzyme activities and seedling biomass in the maize IBM mapping population.

Authors:  Nengyi Zhang; Yves Gibon; Amit Gur; Charles Chen; Nicholas Lepak; Melanie Höhne; Zhiwu Zhang; Dallas Kroon; Hendrik Tschoep; Mark Stitt; Edward Buckler
Journal:  Plant Physiol       Date:  2010-10-22       Impact factor: 8.340

4.  In silico genotyping of the maize nested association mapping population.

Authors:  Baohong Guo; William D Beavis
Journal:  Mol Breed       Date:  2010-09-26       Impact factor: 2.589

5.  Generation of a BAC-based physical map of the melon genome.

Authors:  Víctor M González; Jordi Garcia-Mas; Pere Arús; Pere Puigdomènech
Journal:  BMC Genomics       Date:  2010-05-28       Impact factor: 3.969

6.  Choosing a genome browser for a Model Organism Database: surveying the maize community.

Authors:  Taner Z Sen; Lisa C Harper; Mary L Schaeffer; Carson M Andorf; Trent E Seigfried; Darwin A Campbell; Carolyn J Lawrence
Journal:  Database (Oxford)       Date:  2010-07-06       Impact factor: 3.451

7.  MaizeGDB becomes 'sequence-centric'.

Authors:  Taner Z Sen; Carson M Andorf; Mary L Schaeffer; Lisa C Harper; Michael E Sparks; Jon Duvick; Volker P Brendel; Ethalinda Cannon; Darwin A Campbell; Carolyn J Lawrence
Journal:  Database (Oxford)       Date:  2009-12-07       Impact factor: 3.451

8.  Detailed analysis of a contiguous 22-Mb region of the maize genome.

Authors:  Fusheng Wei; Joshua C Stein; Chengzhi Liang; Jianwei Zhang; Robert S Fulton; Regina S Baucom; Emanuele De Paoli; Shiguo Zhou; Lixing Yang; Yujun Han; Shiran Pasternak; Apurva Narechania; Lifang Zhang; Cheng-Ting Yeh; Kai Ying; Dawn H Nagel; Kristi Collura; David Kudrna; Jennifer Currie; Jinke Lin; Hyeran Kim; Angelina Angelova; Gabriel Scara; Marina Wissotski; Wolfgang Golser; Laura Courtney; Scott Kruchowski; Tina A Graves; Susan M Rock; Stephanie Adams; Lucinda A Fulton; Catrina Fronick; William Courtney; Melissa Kramer; Lori Spiegel; Lydia Nascimento; Ananth Kalyanaraman; Cristian Chaparro; Jean-Marc Deragon; Phillip San Miguel; Ning Jiang; Susan R Wessler; Pamela J Green; Yeisoo Yu; David C Schwartz; Blake C Meyers; Jeffrey L Bennetzen; Robert A Martienssen; W Richard McCombie; Srinivas Aluru; Sandra W Clifton; Patrick S Schnable; Doreen Ware; Richard K Wilson; Rod A Wing
Journal:  PLoS Genet       Date:  2009-11-20       Impact factor: 5.917

9.  A single molecule scaffold for the maize genome.

Authors:  Shiguo Zhou; Fusheng Wei; John Nguyen; Mike Bechner; Konstantinos Potamousis; Steve Goldstein; Louise Pape; Michael R Mehan; Chris Churas; Shiran Pasternak; Dan K Forrest; Roger Wise; Doreen Ware; Rod A Wing; Michael S Waterman; Miron Livny; David C Schwartz
Journal:  PLoS Genet       Date:  2009-11-20       Impact factor: 5.917

10.  10 reasons to be tantalized by the B73 maize genome.

Authors:  Virginia Walbot
Journal:  PLoS Genet       Date:  2009-11-20       Impact factor: 5.917
