Martin Mascher, Gary J Muehlbauer, Daniel S Rokhsar, Jarrod Chapman, Jeremy Schmutz, Kerrie Barry, María Muñoz-Amatriaín, Timothy J Close, Roger P Wise, Alan H Schulman, Axel Himmelbach, Klaus F X Mayer, Uwe Scholz, Jesse A Poland, Nils Stein, Robbie Waugh.
Abstract
Next-generation whole-genome shotgun assemblies of complex genomes are highly useful, but fail to link nearby sequence contigs with each other or provide a linear order of contigs along individual chromosomes. Here, we introduce a strategy based on sequencing progeny of a segregating population that allows de novo production of a genetically anchored linear assembly of the gene space of an organism. We demonstrate the power of the approach by reconstructing the chromosomal organization of the gene space of barley, a large, complex and highly repetitive 5.1 Gb genome. We evaluate the robustness of the new assembly by comparison to a recently released physical and genetic framework of the barley genome, and to various genetically ordered sequence-based genotypic datasets. The method is independent of the need for any prior sequence resources, and will enable rapid and cost-efficient establishment of powerful genomic information for many species.
Keywords: Hordeum vulgare; barley; genetic mapping; genome assembly; next-generation sequencing; population sequencing; technical advance
Year: 2013 PMID: 23998490 PMCID: PMC4298792 DOI: 10.1111/tpj.12319
Source DB: PubMed Journal: Plant J ISSN: 0960-7412 Impact factor: 6.417
Figure 1. Schematic representation of POPSEQ. (a) A segregating population (80–100 individuals) is constructed from a bi-parental cross. (b) A whole-genome shotgun is generated for one parent, and used to construct a gene space assembly (alternatively, the POPSEQ data itself may be used for this purpose). On this assembly, gene models (green arrows) are defined using RNA-seq. In parallel, POPSEQ, and, if necessary, genotyping-by-sequencing (GBS), is performed on the population, and a medium-density framework genetic map is calculated (thousands to tens of thousands of loci). (c) SNPs detected and typed by POPSEQ along with associated WGS contigs are integrated into the framework map through nearest-neighbor search. (d) The result of POPSEQ is a sequence assembly in linear order that contains comprehensive information on the gene space. It may be enhanced by performing POPSEQ on additional populations.
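The nearest-neighbor integration in panel (c) can be illustrated with a minimal sketch. It assumes that genotype calls are encoded per individual as "A", "B" or None (missing), and that the nearest framework marker is the one with the lowest mismatch rate over individuals called in both vectors. The function names and the `min_compared` and `max_mismatch_rate` thresholds are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from typing import List, Optional, Sequence, Tuple

Genotype = Optional[str]  # "A", "B", or None for a missing call (assumed encoding)
Marker = Tuple[str, float, Sequence[Genotype]]  # (chromosome, position in cM, calls)


def hamming(a: Sequence[Genotype], b: Sequence[Genotype]) -> Tuple[int, int]:
    """Return (mismatches, compared) over individuals called in both vectors."""
    mismatches = 0
    compared = 0
    for x, y in zip(a, b):
        if x is None or y is None:
            continue  # low-coverage data: skip individuals without a call
        compared += 1
        if x != y:
            mismatches += 1
    return mismatches, compared


def anchor_snp(
    snp: Sequence[Genotype],
    framework: List[Marker],
    min_compared: int = 3,            # hypothetical threshold
    max_mismatch_rate: float = 0.1,   # hypothetical threshold
) -> Optional[Tuple[str, float]]:
    """Assign a SNP to the position of its nearest framework marker, or None."""
    best_pos = None
    best_rate = None
    for chrom, cm, calls in framework:
        mismatches, compared = hamming(snp, calls)
        if compared < min_compared:
            continue  # too few shared calls to judge similarity
        rate = mismatches / compared
        if best_rate is None or rate < best_rate:
            best_rate = rate
            best_pos = (chrom, cm)
    if best_rate is None or best_rate > max_mismatch_rate:
        return None  # no sufficiently similar framework marker
    return best_pos


framework = [
    ("1H", 0.0, ["A", "A", "B", "B", "A"]),
    ("1H", 10.0, ["A", "B", "B", "A", "A"]),
    ("2H", 5.0, ["B", "B", "A", "A", "B"]),
]
sparse_snp = ["A", None, "B", "A", None]
position = anchor_snp(sparse_snp, framework)
```

In this toy example the SNP has calls for only three of five individuals, yet it matches the second framework marker on all three, so it is placed at that marker's position. A contig would then inherit the position of the SNPs it carries.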
Sequence data generated in this study
| | M × B WGS | OWB WGS | M × B GBS | Morex |
|---|---|---|---|---|
| Population | Morex × Barke RIL F8 | Oregon Wolfe Barleys DH | Morex × Barke RIL F8 | – |
| Sequencing technology | Whole-genome shotgun; HiSeq 2000 | Whole-genome shotgun; HiSeq 2000 | Genotyping-by-sequencing; HiSeq 2000 | Whole-genome shotgun; HiSeq 2000 |
| Number of sequencing lanes | 12 | 12 | 1 | 2 |
| Number of sequenced individuals | 90 (+ parents) | 82 (+ parents) | 92 (+ parents) | 1 |
| Approximate coverage per sample | 1× | 1× | 1× (10 Mb represented) | 15× |
| Number of SNPs detected | 5 123 696 | 6 543 684 | 21 397 | – |
| Mean number of present genotype calls per marker | 33 | 31 | 58 | – |
Anchoring statistics
| | M × B (iSelect) | OWB | M × B (GBS map) | M × B + OWB | IBSC |
|---|---|---|---|---|---|
| Number of SNPs used for anchoring | 4 381 020 | 6 117 837 | 4 429 475 | 11 229 709 | 498 165 |
| Framework map | iSelect | OWB GBS | M × B GBS | iSelect/OWB GBS | iSelect |
| Number of anchored contigs | 498 856 | 591 779 | 512 293 | 747 077 | 138 443 |
| Size of anchored contigs (Mb) | 927 (50%) | 1000 (53%) | 934 (50%) | 1222 (65%) | 410 (16%) |
| Median length of anchored contigs (bp) | 1006 | 973 | 977 | 891 | 1431 |
| Number of anchored HC genes | 16 682 (64%) | 15 743 (60%) | 16 729 (64%) | 20 932 (80%) | 15 719 (60%) |
| Number of anchored LC genes | 28 337 (56%) | 29 033 (55%) | 28 559 (56%) | 37 609 (71%) | 19 415 (36%) |
The Morex × Barke iSelect framework map is described in International Barley Genome Sequencing Consortium (2012) and Comadran .
High-confidence genes as described in International Barley Genome Sequencing Consortium (2012).
Low-confidence genes as described in International Barley Genome Sequencing Consortium (2012).
Figure 2. POPSEQ validation. WGS contigs anchored to three genetic maps. These plots show the colinearity of contigs anchored to the Morex × Barke iSelect framework map and (a) the physical and genetic framework of barley (International Barley Genome Sequencing Consortium, 2012), (b) a Morex × Barke genetic map constructed by genotyping-by-sequencing (GBS), and (c) a GBS map (Poland ) constructed in the OWB. WGS contigs are shown as dots, and are mostly within 5 cM of the diagonal: 90.8% in (a), 99.2% in (b), and 93.2% in (c).
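The colinearity percentages in this caption can be understood through a short sketch. It assumes that each contig has one position (in cM) in each of the two maps being compared, and that "within 5 cM of the diagonal" means the absolute difference between the two positions is at most 5 cM. The function name and the toy data are hypothetical.

```python
from typing import Iterable, Tuple


def fraction_near_diagonal(
    pairs: Iterable[Tuple[float, float]], tolerance_cm: float = 5.0
) -> float:
    """Fraction of contigs whose two map positions differ by <= tolerance_cm."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("no contig positions supplied")
    near = sum(1 for x, y in pairs if abs(x - y) <= tolerance_cm)
    return near / len(pairs)


# Toy data: (position in map 1, position in map 2) for four contigs.
toy_pairs = [(0.0, 1.0), (10.0, 12.0), (50.0, 70.0), (30.0, 30.0)]
share = fraction_near_diagonal(toy_pairs)
```

Here three of the four toy contigs fall within the tolerance, so `share` is 0.75. Applied to real anchoring results, the same calculation would yield a figure comparable to those reported for panels (a) to (c).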