María Muñoz-Amatriaín, Stefano Lonardi, MingCheng Luo, Kavitha Madishetty, Jan T Svensson, Matthew J Moscou, Steve Wanamaker, Tao Jiang, Andris Kleinhofs, Gary J Muehlbauer, Roger P Wise, Nils Stein, Yaqin Ma, Edmundo Rodriguez, Dave Kudrna, Prasanna R Bhat, Shiaoman Chao, Pascal Condamine, Shane Heinen, Josh Resnik, Rod Wing, Heather N Witt, Matthew Alpert, Marco Beccuti, Serdar Bozdag, Francesca Cordero, Hamid Mirebrahim, Rachid Ounit, Yonghui Wu, Frank You, Jie Zheng, Hana Simková, Jaroslav Dolezel, Jane Grimwood, Jeremy Schmutz, Denisa Duma, Lothar Altschmied, Tom Blake, Phil Bregitzer, Laurel Cooper, Muharrem Dilbirligi, Anders Falk, Leila Feiz, Andreas Graner, Perry Gustafson, Patrick M Hayes, Peggy Lemaux, Jafar Mammadov, Timothy J Close.
Abstract
Barley (Hordeum vulgare L.) possesses a large and highly repetitive genome of 5.1 Gb that has hindered the development of a complete sequence. In 2012, the International Barley Sequencing Consortium released a resource integrating whole-genome shotgun sequences with a physical and genetic framework. However, because only 6278 bacterial artificial chromosomes (BACs) in the physical map were sequenced, fine structure was limited. To gain access to the gene-containing portion of the barley genome at high resolution, we identified and sequenced 15 622 BACs representing the minimal tiling path of 72 052 physical-mapped gene-bearing BACs. This generated ~1.7 Gb of genomic sequence containing an estimated two-thirds of all Morex barley genes. Exploration of these sequenced BACs revealed that although distal ends of chromosomes contain most of the gene-enriched BACs and are characterized by high recombination rates, there are also gene-dense regions with suppressed recombination. We made use of published map-anchored sequence data from Aegilops tauschii to develop a synteny viewer between barley and the ancestor of the wheat D-genome. Except for some notable inversions, there is a high level of collinearity between the two species. The software HarvEST:Barley provides facile access to BAC sequences and their annotations, along with the barley-Ae. tauschii synteny viewer. These BAC sequences constitute a resource to improve the efficiency of marker development, map-based cloning, and comparative genomics in barley and related crops. Additional knowledge about regions of the barley genome that are gene-dense but low in recombination is particularly relevant.
Keywords: Aegilops tauschii; BAC sequencing; Barley; HarvEST:Barley; Hordeum vulgare L.; centromere BACs; gene distribution; recombination frequency; synteny
Year: 2015 PMID: 26252423 PMCID: PMC5014227 DOI: 10.1111/tpj.12959
Source DB: PubMed Journal: Plant J ISSN: 0960-7412 Impact factor: 6.417
Statistics of the gene‐bearing BAC sequence assembly for nodes ≥200 bp in size
| Chr. arm | No. BACs | Avg. no. nodes per BAC | Length of assembled reads (bp) | Avg. BAC length (bp) | Avg. N50 | Avg. L50 | No. unique HC gene models | No. unique LC gene models |
|---|---|---|---|---|---|---|---|---|
| 1H | 1959 | 19.5 | 209 800 854 | 107 096 | 23 697 | 2.8 | 2866 | 3502 |
| 2HS | 1048 | 18.5 | 111 465 824 | 106 361 | 24 703 | 2.7 | 1486 | 1935 |
| 2HL | 1391 | 19.1 | 149 179 441 | 107 246 | 22 978 | 2.7 | 2241 | 2529 |
| 3HS | 862 | 18.2 | 92 368 717 | 107 156 | 26 301 | 2.6 | 1242 | 1704 |
| 3HL | 1389 | 18.8 | 148 465 120 | 106 886 | 24 223 | 2.7 | 2132 | 2576 |
| 4HS | 862 | 17.9 | 94 522 065 | 109 654 | 26 577 | 2.6 | 1048 | 1442 |
| 4HC | 60 | 15.8 | 5 866 127 | 97 769 | 28 999 | 2.2 | 20 | 40 |
| 4HL | 1100 | 18.7 | 120 519 518 | 109 563 | 25 756 | 2.7 | 1536 | 1740 |
| 5HS | 640 | 19.1 | 69 074 495 | 107 929 | 24 696 | 2.8 | 812 | 1207 |
| 5HL | 1623 | 20.1 | 173 282 123 | 106 767 | 22 285 | 2.9 | 2777 | 3288 |
| 6HS | 823 | 19.7 | 87 025 477 | 105 742 | 22 248 | 2.8 | 1070 | 1624 |
| 6HL | 1113 | 19.3 | 120 662 552 | 108 412 | 23 922 | 2.8 | 1610 | 1942 |
| 7HS | 1196 | 19.0 | 129 484 046 | 108 264 | 24 580 | 2.7 | 1770 | 2496 |
| 7HL | 1150 | 20.1 | 122 082 348 | 106 159 | 23 221 | 2.8 | 1734 | 2182 |
| NA | 406 | 41.6 | 63 956 654 | 157 529 | 16 942 | 5.2 | 994 | 1287 |
| All | 15 622 | 19.7 | 1 697 755 361 | 108 677 | 23 906 | 2.8 | 15 707 | 19 330 |
N50: length for which the collection of all nodes (contigs) of that length or longer contains at least half of the sum of the lengths of all nodes (contigs) in the BAC assembly.
L50: minimum number of nodes (contigs) accounting for more than 50% of the BAC assembly.
Gene models hitting 10 or more BACs are not included in the count.
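The N50 and L50 definitions above can be made concrete with a short sketch. It is an illustration only, not the authors' pipeline: the function name `n50_l50` and the example node lengths are hypothetical, and the 200 bp minimum node size is taken from the table title. The table reports these values averaged over BACs, whereas the sketch handles a single BAC assembly.

```python
def n50_l50(node_lengths, min_len=200):
    """Return (N50, L50) for one BAC assembly.

    Only nodes (contigs) of at least min_len bp are considered,
    mirroring the >=200 bp cutoff stated in the table title.
    """
    lengths = sorted((n for n in node_lengths if n >= min_len), reverse=True)
    if not lengths:
        raise ValueError("no nodes at or above the minimum length")
    total = sum(lengths)
    n50 = None
    l50 = None
    running = 0
    for count, length in enumerate(lengths, start=1):
        running += length
        # N50: nodes of this length or longer hold at least half of the total.
        if n50 is None and 2 * running >= total:
            n50 = length
        # L50: fewest nodes that account for more than 50% of the total.
        if l50 is None and 2 * running > total:
            l50 = count
        if n50 is not None and l50 is not None:
            break
    return n50, l50


# Hypothetical node lengths (bp) for a single BAC; the 150 bp node is ignored.
example = [40_000, 30_000, 20_000, 10_000, 150]
print(n50_l50(example))  # (30000, 2)
```

The two thresholds differ slightly ("at least half" for N50, "more than 50%" for L50), so the sketch tests them separately rather than deriving one from the other.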
Comparison between gene models found in two different sequence assemblies of 997 BACs. High‐confidence (HC) and low‐confidence (LC) gene models predicted by IBSC (2012) were considered. A minimum sequence length of 200 bp and an e‐value of 1e−20 were the cutoffs used for the BLAST alignments. Numbers do not include gene models hitting ≥10 BACs.
| Sequencing technology | Avg. HC gene models / BAC | Total unique HC gene models | Avg. LC gene models / BAC | Total unique LC gene models |
|---|---|---|---|---|
| 454 | 2.96 | 2604 | 3.26 | 2950 |
| Illumina | 2.89 | 2571 | 3.18 | 2904 |
| Both | 2.83 | 2489 | 3.08 | 2785 |
Sequencing institutions for 454 sequencing included IPK Gatersleben, the Fritz Lipmann Institute in Jena and Eurofins Scientific; the 454 sequences were published in IBSC (2012).
Sequencing institution for Illumina sequencing was UCR.
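The counting rules stated for this comparison (length and e‐value cutoffs, and exclusion of gene models hitting ≥10 BACs) can be sketched as follows. This is a minimal illustration under assumptions, not the authors' code: the input is a hypothetical list of `(gene_model, bac, alignment_length, evalue)` tuples already parsed from BLAST output, the 200 bp cutoff is interpreted here as alignment length, and the function name `summarize_hits` is invented for the example.

```python
from collections import defaultdict


def summarize_hits(hits, n_bacs, min_len=200, max_evalue=1e-20, max_bacs_per_gene=9):
    """Return (total unique gene models, average gene models per BAC).

    hits: iterable of (gene_model, bac, alignment_length, evalue) tuples.
    n_bacs: number of BACs in the assembly set (997 in the comparison above).
    """
    bacs_by_gene = defaultdict(set)
    for gene, bac, aln_len, evalue in hits:
        # Keep only alignments that pass both cutoffs.
        if aln_len >= min_len and evalue <= max_evalue:
            bacs_by_gene[gene].add(bac)
    # Drop gene models that hit 10 or more distinct BACs.
    kept = {g: b for g, b in bacs_by_gene.items() if len(b) <= max_bacs_per_gene}
    gene_bac_pairs = sum(len(b) for b in kept.values())
    return len(kept), gene_bac_pairs / n_bacs


# Hypothetical hits on ten BACs named b0..b9.
hits = [("g1", "b0", 500, 1e-50), ("g1", "b1", 450, 1e-40)]
hits += [("g2", "b0", 150, 1e-60)]                        # too short
hits += [("g3", "b0", 800, 1e-10)]                        # e-value too high
hits += [("g4", f"b{i}", 900, 1e-80) for i in range(10)]  # hits 10 BACs
hits += [("g5", "b2", 300, 1e-25)]
print(summarize_hits(hits, n_bacs=10))  # (2, 0.3)
```

A gene model found on several BACs is counted once in the unique total but once per BAC in the per‐BAC average, so the average multiplied by the number of BACs can exceed the unique total.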
Figure 1. BAC distribution along barley chromosomes 2H and 5H and syntenic relationships with rice chromosomes. Grey bars represent the number of sequenced barley BACs and their units are shown on the left Y‐axis. Colored lines represent the proportion of BACs containing only one HC gene model (blue), three or more HC genes (red) or zero HC gene models (yellow), and the scale is shown on the right Y‐axis. BAC densities are calculated for a sliding window of 40 Mb at 2.5 Mb intervals based on the physical coordinates (archived golden path) provided by IBSC (2012). Grey arrows indicate gene‐dense regions different from distal ends. Barley–rice synteny is represented by lines connecting each mapped BAC to the position on the rice genome determined by BLASTX (see Experimental procedures). Densities of expressed rice genes across chromosomes are also displayed (adapted from Figure S2 in IRGSP, 2005), where blue bars indicate the frequency of gene models in 100 kb windows, red boxes indicate centromeres and white boxes represent physical gaps.
Figure 2. Relationship between recombination frequency (RF) and gene density (GD) along the seven barley chromosomes. Recombination rates are calculated from the cM/Mb ratios in sliding windows of 40 Mb with 2.5 Mb increments, and are represented by a color gradient from green (RF = 0) to red (RF = 1.14). Gene densities are estimated based on the total number of unique HC genes per window with respect to the total number of sequenced BACs assigned to that window, and are represented by the same color gradient from green (GD = 0.67) to red (GD = 3.16). Grey arrows indicate the most evident genomic regions of relatively high gene density and very low recombination.
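The sliding‐window quantities described in the Figure 2 caption can be sketched as below. This is a hedged illustration, not the published method. The function names, the input layouts (markers as `(position_Mb, cM)` pairs, BACs as `(position_Mb, gene IDs)` pairs), the use of full windows only, and the estimate of the cM span as the maximum minus the minimum genetic position within a window are all assumptions made for the example.

```python
def window_starts(chrom_len_mb, size=40.0, step=2.5):
    """Start positions (Mb) of full-size sliding windows along a chromosome."""
    n = int((chrom_len_mb - size) // step) + 1
    return [i * step for i in range(max(n, 0))]


def recombination_frequency(markers, start, size=40.0):
    """cM/Mb in the window [start, start + size).

    markers: iterable of (position_mb, genetic_position_cm).
    Assumption: the cM span is max - min of the markers in the window.
    """
    cms = [cm for pos, cm in markers if start <= pos < start + size]
    if len(cms) < 2:
        return None  # not enough markers to estimate a span
    return (max(cms) - min(cms)) / size


def gene_density(bacs, start, size=40.0):
    """Unique HC genes per sequenced BAC in the window [start, start + size).

    bacs: iterable of (position_mb, iterable_of_gene_ids).
    """
    in_window = [genes for pos, genes in bacs if start <= pos < start + size]
    if not in_window:
        return None
    unique_genes = set().union(*in_window)
    return len(unique_genes) / len(in_window)


# Hypothetical data for a 50 Mb chromosome.
markers = [(1.0, 0.0), (20.0, 10.0), (39.0, 40.0), (45.0, 42.0)]
bacs = [(5.0, {"a", "b"}), (15.0, {"b", "c"}), (45.0, {"d"})]
for start in window_starts(50.0):
    print(start, recombination_frequency(markers, start), gene_density(bacs, start))
```

Counting unique genes in the numerator means a gene shared by overlapping BACs contributes once per window, which matches the caption's wording of unique HC genes per window relative to the number of sequenced BACs assigned to it.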