Emma S Mace, Shuaishuai Tai, Edward K Gilding, Yanhong Li, Peter J Prentis, Lianle Bian, Bradley C Campbell, Wushu Hu, David J Innes, Xuelian Han, Alan Cruickshank, Changming Dai, Céline Frère, Haikuan Zhang, Colleen H Hunt, Xianyuan Wang, Tracey Shatte, Miao Wang, Zhe Su, Jun Li, Xiaozhen Lin, Ian D Godwin, David R Jordan, Jun Wang.
Abstract
Sorghum is a food and feed cereal crop adapted to heat and drought and a staple for 500 million of the world's poorest people. Its small diploid genome and phenotypic diversity make it an ideal C4 grass model as a complement to C3 rice. Here we present high-coverage (16-45×) resequenced genomes of 44 sorghum lines representing the primary gene pool and spanning dimensions of geographic origin, end-use and taxonomic group. We also report the first resequenced genome of S. propinquum, identifying 8 M high-quality SNPs, 1.9 M indels and specific gene loss and gain events in S. bicolor. We observe strong racial structure and a complex domestication history involving at least two distinct domestication events. These assembled genomes enable the leveraging of existing cereal functional genomics data against the novel diversity available in sorghum, providing an unmatched resource for the genetic improvement of sorghum and other grass species.
Year: 2013 PMID: 23982223 PMCID: PMC3759062 DOI: 10.1038/ncomms3320
Source DB: PubMed Journal: Nat Commun ISSN: 2041-1723 Impact factor: 14.919
Summary SNP statistics in Sorghum bicolor genotypes.
| Improved inbreds | 2,284,285 | 2.334 | 2.014 | −0.213 |
| Landraces | 3,092,165 | 2.514 | 2.277 | −0.295 |
| Wild & Weedy | 3,465,947 | 3.881 | 3.177 | −0.349 |
| Total | 4,946,038 | 3.048 | 3.416 | −0.827 |
*Excluding S. propinquum. An additional 3,116,493 genome-wide SNPs were identified in the S. propinquum accessions.
Figure 1. Analysis of the phylogenetic relationship and LD decay of sorghum.
(a) A neighbor-joining tree constructed using SNP data. (b) Principal component analysis of sorghum accessions. (c) LD decay determined by squared correlations of allele frequencies (r²) against distance between polymorphic sites in wild and weedy sorghum genotypes (red), landraces (green) and improved inbreds (blue).
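The r² statistic named in panel (c) can be illustrated with a minimal Python sketch. It assumes each inbred line contributes one haplotype and that alleles are coded 0/1 at two biallelic sites; the function name `ld_r2` and the toy inputs are hypothetical and are not taken from the study's pipeline.

```python
import math


def ld_r2(site_a, site_b):
    """Squared allelic correlation (r^2) between two biallelic sites.

    site_a and site_b hold one 0/1 allele code per haplotype, in the same
    sample order. Assumption: each inbred line is treated as one haplotype.
    """
    if len(site_a) != len(site_b) or len(site_a) == 0:
        raise ValueError("sites must be non-empty and of equal length")
    n = len(site_a)
    p_a = sum(site_a) / n                                   # allele frequency at site A
    p_b = sum(site_b) / n                                   # allele frequency at site B
    p_ab = sum(x * y for x, y in zip(site_a, site_b)) / n   # frequency of the 1-1 haplotype
    d = p_ab - p_a * p_b                                    # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        return math.nan                                     # a monomorphic site has no defined r^2
    return d * d / denom
```

Averaging such pairwise values within bins of physical distance would give a decay curve of the kind described in the caption; the binning scheme is left open here because the record does not state it.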
Figure 2. Signatures of selection in two genomic regions.
(a,b) Diversity (θπ) (wild and weedy group in red; landraces in blue, improved inbreds in green) and FST values (black line) in 10-kb intervals. (c,d) Predicted gene models in genomic regions with candidate genes under selection highlighted, Sb01g032830 (SbGS3) and Sb04g028060 (SSIIb), respectively. (e,f) LD blocks around candidate genes under selection. Red and white spots indicate strong (r²=1) and weak (r²=0) LD, respectively. (g,h) Gene trees for GS3 and SSIIb, respectively.
Figure 3. Summary of sorghum resequencing data.
Concentric circles show the different features that were drawn using the Circos program (ref. 57). The 10 chromosomes are portrayed along the perimeter of each circle. (a) Gene content density distribution. (b) Genomic diversity (θπ) of wild and weedy genotypes (red), landraces (green) and improved inbreds (blue). (c) Tajima’s D of wild and weedy genotypes (red), landraces (green) and improved inbreds (blue). (d) Number of SNPs in wild and weedy genotypes (red), landraces (green) and improved inbreds (blue). (e) FST values of improved inbreds versus landraces. (f) FST values of improved inbreds versus wild and weedy genotypes. (g) FST values of landraces versus wild and weedy genotypes. (h) A graphical view of duplicated annotated genes is indicated by connections between segments.
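For readers unfamiliar with the statistics plotted in panels (b) and (c), the following Python sketch shows how θπ, Watterson's θ and Tajima's D relate to one another, using the standard Tajima (1989) definitions. It is an illustration under stated assumptions, not the authors' implementation: the function name `tajimas_d`, the input format (one alternate-allele count per segregating site in a window) and the toy values are hypothetical, and the results are window totals rather than per-site rates.

```python
import math


def tajimas_d(alt_counts, n):
    """Return (theta_pi, theta_w, D) for one genomic window.

    alt_counts: alternate-allele count at each segregating site (0 < k < n).
    n: number of sampled haplotypes (assumption: one per inbred line).
    """
    s = len(alt_counts)                        # number of segregating sites S
    if n < 4 or s == 0:
        return 0.0, 0.0, math.nan              # D is undefined without variation
    if any(k <= 0 or k >= n for k in alt_counts):
        raise ValueError("every site must be segregating (0 < k < n)")

    # theta_pi: mean number of pairwise differences, summed over sites
    theta_pi = sum(2 * k * (n - k) / (n * (n - 1)) for k in alt_counts)

    # Watterson's theta: S scaled by the harmonic number a1
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    theta_w = s / a1

    # Variance constants from Tajima (1989)
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n * n + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    d = (theta_pi - theta_w) / math.sqrt(e1 * s + e2 * s * (s - 1))
    return theta_pi, theta_w, d
```

With these definitions, an excess of rare variants (for example, a window where every site is a singleton) pulls θπ below Watterson's θ and gives a negative D, while an excess of intermediate-frequency variants gives a positive D.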