Sangrong Sun1,2, Jinpeng Wang1,2, Jigao Yu1,2, Fanbo Meng1,2, Ruiyan Xia1, Li Wang1,2, Zhenyi Wang1,2, Weina Ge1,2, Xiaojian Liu1, Yuxian Li1,2, Yinzhe Liu1,2, Nanshan Yang1,2, Xiyin Wang1,2.
Abstract
Grass genomes are structurally complex: they share a common tetraploidization, and particular genomes have been further affected by additional polyploidizations. These events and the subsequent genomic re-patterning have resulted in complex, interwoven gene homology both within and between genomes. Accurately deciphering the structure of these complicated plant genomes would help us better understand their compositional and functional evolution at multiple scales. Here, we build on our previous research by performing a hierarchical alignment of the common wheat genome against eight other sequenced grass genomes with the most up-to-date assemblies and annotations. With these data, we constructed a list of homologous genes and then, layer by layer, separated the orthologs and paralogs established by speciation and recursive polyploidization, respectively. Compared with the other grasses, the far fewer collinear outparalogous genes within each of the three subgenomes of common wheat suggest that homoeologous recombination and genomic fractionation likely occurred after its formation. In sum, this work contributes to the establishment of an important and timely comparative genomics platform for researchers in the grass community and possibly beyond. The homologous gene list can be found in the Supplemental material.
Keywords: common wheat; gene collinearity; genome; grass; polyploidization
Year: 2017 PMID: 28912789 PMCID: PMC5582351 DOI: 10.3389/fpls.2017.01480
Source DB: PubMed Journal: Front Plant Sci ISSN: 1664-462X Impact factor: 5.753
Figure 1. Alignment of the Poaceae chromosomes with rice as reference. Based on gene collinearity, the chromosomes were aligned with rice used as the reference. The whole-genome duplication (WGD) in the common ancestor of these Poaceae plants caused all of them to have at least two circles of chromosomes. An additional lineage-specific diploidization event caused maize to have four chromosomes, and an independent hybridization event caused common wheat to have six such chromosomes. Each grass species has another circle containing additional duplicated regions. Genes are colored according to their correspondence with the rice chromosome. For example, the genes from all Poaceae plants having orthologs on the rice chromosome 1 are given in blue. A, Aegilops tauschii (wheat D genome); B, Brachypodium distachyon; F, Setaria italica; H, Hordeum vulgare (barley genome); O, Oryza sativa (rice genome); S, Sorghum bicolor; T, Triticum urartu (wheat A genome); Z, Zea mays; a, genome A of Triticum aestivum (common wheat); b, genome B of Triticum aestivum; d, genome D of Triticum aestivum.
Figure 2. Alignment of the wheat crop chromosomes with barley as reference. Based on gene collinearity, the chromosomes were aligned with barley used as the reference. The whole-genome duplication (WGD) in the common ancestor of these Poaceae plants caused all of them to have at least two concentric circles of chromosomes, and an additional hybridization event caused common wheat to have six such chromosomes. Each grass species has another concentric circle containing additional duplicated regions. Genes are colored according to their correspondence with the barley chromosome. For example, the genes from all Poaceae plants having orthologs on the barley chromosome 1 are shown in blue. A, Aegilops tauschii (wheat D genome); H, Hordeum vulgare (barley genome); T, Triticum urartu (wheat A genome); a, genome A of Triticum aestivum (common wheat); b, genome B of Triticum aestivum; d, genome D of Triticum aestivum.
Number of paralogous and orthologous blocks within and among the selected Poaceae genomes.
| 433 | 138 | 167 | 145 | 121 | 168 | 307 | 146 | 126 | 152 | |||
| 118 | 226 | 709 | 264 | 239 | 241 | 276 | 615 | 601 | 610 | |||
| 72 | 47 | 561 | 114 | 97 | 110 | 146 | 160 | 244 | 240 | |||
| 87 | 103 | 92 | 141 | 136 | 141 | 239 | 464 | 396 | 407 | |||
| 109 | 110 | 63 | 78 | 78 | 158 | 261 | 119 | 119 | 131 | |||
| 94 | 92 | 60 | 73 | 80 | 137 | 219 | 96 | 107 | 106 | |||
| 116 | 81 | 58 | 74 | 85 | 79 | 292 | 111 | 122 | 119 | |||
| 207 | 116 | 75 | 123 | 171 | 142 | 152 | 217 | 209 | 213 | |||
| genome A | 67 | 72 | 71 | 108 | 82 | 67 | 85 | 119 | 24 | 37 | ||
| genome B | 73 | 75 | 73 | 107 | 75 | 71 | 74 | 121 | 51 | 8 | ||
| genome D | 84 | 87 | 76 | 104 | 76 | 69 | 77 | 126 | 56 | 55 | ||
Numbers in bold on the main diagonal denote the paralogous blocks within a genome, numbers above the diagonal denote the orthologous blocks between two genomes, and numbers below the diagonal denote the out-paralogous blocks between two genomes.
Number of paralogous and orthologous gene pairs within and among the selected Poaceae genomes.
| 5,000 | 4,591 | 6,642 | 16,330 | 14,564 | 15,427 | 14,678 | 7,998 | 7,998 | 8,388 | |||
| 1,714 | 4,295 | 6,871 | 4,920 | 4,384 | 4,782 | 4,971 | 6,157 | 6,026 | 6,448 | |||
| 1,213 | 623 | 6,379 | 4,674 | 4,284 | 5,062 | 4,889 | 4,859 | 4,929 | 4,582 | |||
| 2,211 | 1,419 | 1,296 | 8,245 | 7,201 | 8,516 | 8,047 | 9,044 | 8,679 | 8,875 | |||
| 10,062 | 1,562 | 1,071 | 2,958 | 15,441 | 15,002 | 15,663 | 8,631 | 8,778 | 9,699 | |||
| 10,548 | 1,536 | 1,135 | 3,273 | 5,839 | 13,634 | 15,842 | 7,727 | 8,294 | 8,470 | |||
| 10,280 | 1,363 | 1,052 | 3,129 | 4,544 | 4,660 | 14,371 | 9,205 | 9,934 | 10,163 | |||
| 8,370 | 1,672 | 1,311 | 3,130 | 4,635 | 4,816 | 3,851 | 9,282 | 9,359 | 9,605 | |||
| genome A | 2,483 | 1,133 | 988 | 2,034 | 4,527 | 4,479 | 3,980 | 4,679 | 8,197 | 8,255 | ||
| genome B | 2,484 | 1,151 | 1,023 | 2,040 | 4,508 | 4,953 | 5,135 | 3,712 | 1,447 | 7,993 | ||
| genome D | 2,597 | 1,221 | 1,094 | 2,079 | 5,239 | 5,391 | 6,003 | 4,940 | 1,530 | 1,536 | ||
Numbers in bold on the main diagonal denote the paralogous gene pairs within a genome, numbers above the diagonal denote the orthologous gene pairs between two genomes, and numbers below the diagonal denote the out-paralogous gene pairs between two genomes.
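The diagonal/upper/lower convention described in this note can be sketched in code. The following is a minimal illustration, not the authors' pipeline: the function name `split_homology_matrix` is hypothetical, and the 3 x 3 example uses the last three values of the genome A, B, and D rows above, placed under the assumption that the bold diagonal cells are absent from the extracted rows (they are entered as `None`).

```python
from typing import Dict, List, Optional, Tuple

Matrix = List[List[Optional[int]]]


def split_homology_matrix(
    labels: List[str], matrix: Matrix
) -> Tuple[
    Dict[str, Optional[int]],
    Dict[Tuple[str, str], Optional[int]],
    Dict[Tuple[str, str], Optional[int]],
]:
    """Split a square genome-by-genome matrix by the table's convention.

    Diagonal cells      -> paralogs within one genome
    Cells above it      -> orthologs between two genomes
    Cells below it      -> out-paralogs between two genomes
    """
    paralogs: Dict[str, Optional[int]] = {}
    orthologs: Dict[Tuple[str, str], Optional[int]] = {}
    outparalogs: Dict[Tuple[str, str], Optional[int]] = {}
    for i, row_label in enumerate(labels):
        for j, col_label in enumerate(labels):
            value = matrix[i][j]
            if i == j:
                paralogs[row_label] = value
            elif j > i:
                orthologs[(row_label, col_label)] = value
            else:
                outparalogs[(row_label, col_label)] = value
    return paralogs, orthologs, outparalogs


# Assumed placement of the wheat subgenome corner of the gene-pair table;
# None marks diagonal cells that are not present in the rows above.
labels = ["A", "B", "D"]
matrix: Matrix = [
    [None, 8197, 8255],
    [1447, None, 7993],
    [1530, 1536, None],
]

paralogs, orthologs, outparalogs = split_homology_matrix(labels, matrix)
for pair, count in orthologs.items():
    print("orthologous pairs", pair, count)
for pair, count in outparalogs.items():
    print("out-paralogous pairs", pair, count)
```

Keeping the three kinds of counts in separate dictionaries makes it easy to compare, for instance, orthologous against out-paralogous pair counts for the same two subgenomes.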
Number of paralogous and orthologous genes within and among the selected Poaceae genomes.
| 4,723/4,653 | 4,461/4,510 | 6,472/6,467 | 14,545/13,284 | 12,601/12,697 | 13,530/12,892 | 12,989/11,124 | 7,742/7,800 | 7,742/7,800 | 8,113/8,176 | |||
| 8.5/12.1 | 18.5/11.7 | 16.3/16.8 | 41.4/34.4 | 37.1/32.9 | 53.1/33.4 | 40.0/28.8 | 33.5/20.2 | 32.8/20.2 | 34.8/21.2 | |||
| 1,571/1,622 | 3,545/3,893 | 6,238/6,291 | 4,615/4,495 | 4,125/4,158 | 4,554/4,359 | 4,639/4,038 | 5,443/5,604 | 5,384/5,530 | 5,690/5,867 | |||
| 4.1/2.9 | 14.7/7.0 | 15.7/11.3 | 13.1/8.1 | 12.1/7.5 | 17.9/7.8 | 14.3/7.3 | 23.5/10.1 | 22.8/9.9 | 24.4/10.5 | |||
| 1,175/1,143 | 597/570 | 5,863/5,664 | 4,576/4,144 | 4,172/3,956 | 4,945/4,432 | 4,713/3,839 | 4,623/4,558 | 4,650/4,558 | 4,198/4,318 | |||
| 3.0/4.7 | 1.1/2.4 | 14.8/23.4 | 13.0/17.1 | 12.3/16.4 | 19.4/18.3 | 14.5/15.9 | 20.0/18.9 | 19.7/18.9 | 18.0/17.9 | |||
| 2,079/2,098 | 1,290/1,268 | 1,176/1,214 | 7,731/7,251 | 6,753/6,647 | 8,128/7,521 | 7,547/6,288 | 8,270/8,439 | 8,061/8,272 | 8,234/8,468 | |||
| 5.4/5.3 | 2.3/3.2 | 4.9/3.1 | 22.0/18.2 | 19.9/16.7 | 31.9/18.9 | 23.3/15.8 | 35.7/21.2 | 34.1/20.8 | 35.3/21.3 | |||
| 8,128/8,496 | 1,454/1,463 | 993/1,055 | 2,522/2,757 | 14,253/15,044 | 13,416/13,490 | 14,089/12,769 | 7,394/8,233 | 7,562/8,319 | 7,986/9,026 | |||
| 21.1/24.2 | 2.6/4.2 | 4.1/3.0 | 6.3/7.8 | 41.9/42.8 | 52.6/38.4 | 43.4/36.3 | 32.0/23.4 | 32.0/23.7 | 34.3/25.7 | |||
| 8,933/8,744 | 1,459/1,436 | 1,069/1,115 | 2,794/3,081 | 5,475/5,335 | 12,986/12,222 | 14,904/12,713 | 7,048/7,320 | 7,447/7,687 | 7,582/7,854 | |||
| 23.1/25.7 | 2.6/4.2 | 4.4/3.3 | 7.0/9.1 | 15.6/15.7 | 50.9/35.9 | 45.9/37.4 | 30.5/21.5 | 31.5/22.6 | 32.5/23.1 | |||
| 8,532/8,550 | 1,245/1,278 | 976/1,024 | 2,791/2,932 | 4,040/4,297 | 4,007/4,400 | 12,615/11,315 | 7,893/8,692 | 8,242/9,138 | 8,447/9,338 | |||
| 22.1/33.5 | 2.2/5.0 | 4.0/4.0 | 7.0/11.5 | 11.5/16.8 | 11.8/17.3 | 38.9/44.4 | 34.1/34.1 | 34.9/35.8 | 36.2/36.6 | |||
| 6,153/6,966 | 1,365/1,562 | 1,068/1,282 | 2,381/2,930 | 3,738/4,263 | 3,898/4,471 | 3,142/3,574 | 6,995/8,607 | 7,163/8,726 | 7,352/8,921 | |||
| 15.9/21.5 | 2.5/4.8 | 4.4/4.0 | 6.0/9.0 | 10.6/13.1 | 11.5/13.8 | 12.3/11.0 | 30.2/26.5 | 30.3/26.9 | 31.5/27.5 | |||
| A | 2,390/2,363 | 1,061/1,023 | 929/925 | 1,897/1,827 | 4,149/3,784 | 4,140/3,907 | 3,664/3,526 | 4,183/3,553 | 7,817/7,796 | 7,812/7,798 | ||
| 6.2/10.2 | 1.9/4.4 | 3.8/4.0 | 4.8/7.9 | 11.8/16.4 | 12.2/16.9 | 14.4/15.2 | 12.9/15.4 | 33.1/33.7 | 33.5/33.7 | |||
| B | 2,400/2,357 | 1,076/1,009 | 962/957 | 1,901/1,833 | 4,228/3,832 | 4,577/4,338 | 4,613/4,202 | 3,450/2,927 | 1,369/1,359 | 7,660/7,633 | ||
| 6.2/10.0 | 1.9/4.3 | 4.0/4.1 | 4.8/7.8 | 12.0/16.2 | 13.5/18.4 | 18.1/17.8 | 10.6/12.4 | 5.9/5.7 | 32.9/32.3 | |||
| D | 2,518/2,480 | 1,140/1,078 | 1,017/1,012 | 1,935/1,875 | 4,803/4,378 | 4,957/4,721 | 5,429/5,044 | 4,400/3,766 | 1,458/1,459 | 1,449/1,458 | ||
| 6.5/10.6 | 2.0/4.6 | 4.2/4.3 | 4.9/8.0 | 13.7/18.8 | 14.6/20.3 | 21.3/21.6 | 13.6/16.2 | 6.3/6.3 | 6.1/6.3 | |||
Numbers in bold on the main diagonal denote the paralogous genes (upper) and their percentage of the total genes (lower) within a genome. Numbers above the diagonal denote the orthologous genes (upper two numbers) and their percentages (lower two numbers) in the two compared genomes (horizontal/vertical), while numbers below the diagonal denote the corresponding values for the out-paralogous genes between the two compared genomes.
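How the paired gene counts and percentages in such a cell relate to a list of gene pairs can be sketched as follows. This is a hedged illustration only: the function name `genes_and_percentages`, the gene identifiers, and the genome sizes are all hypothetical toy values, since the total gene numbers per genome are not given here. A gene can take part in more than one pair, so the number of distinct genes can be smaller than the number of pairs.

```python
from typing import List, Tuple


def genes_and_percentages(
    pairs: List[Tuple[str, str]], total_first: int, total_second: int
) -> Tuple[Tuple[int, int], Tuple[float, float]]:
    """Count distinct genes on each side of a pair list.

    Returns (count_first, count_second) and the matching percentages
    of each genome's total gene number, rounded to one decimal place.
    """
    first_genes = {a for a, _ in pairs}
    second_genes = {b for _, b in pairs}
    counts = (len(first_genes), len(second_genes))
    percents = (
        round(100 * len(first_genes) / total_first, 1),
        round(100 * len(second_genes) / total_second, 1),
    )
    return counts, percents


# Toy data (hypothetical identifiers): four pairs involving three
# distinct genes in each of the two genomes.
pairs = [("g1", "h1"), ("g1", "h2"), ("g2", "h3"), ("g3", "h3")]
counts, percents = genes_and_percentages(pairs, total_first=10, total_second=20)
print(f"{counts[0]}/{counts[1]} genes, {percents[0]}/{percents[1]} percent")
```

The two numbers in each output pair mirror the "first genome/second genome" layout of the table cells.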
Figure 3. Alignment of the local regions sharing homology. Ae, Aegilops tauschii; Bd, Brachypodium distachyon; Si, Setaria italica; Hv, Hordeum vulgare; Rice, Oryza sativa; Sb, Sorghum bicolor; Tu, Triticum urartu; Zm, Zea mays; Ta A, genome A of Triticum aestivum; Ta B, genome B of Triticum aestivum; Ta D, genome D of Triticum aestivum. Genes are shown with pointed boxes to indicate their transcriptional direction. Homologous genes between neighboring chromosomes (indicated by the straight lines) are linked by lines with circles at their ends.