Andrea Gottlieb, Hans-Georg Müller, Alicia N Massa, Humphrey Wanjugi, Karin R Deal, Frank M You, Xiangyang Xu, Yong Q Gu, Ming-Cheng Luo, Olin D Anderson, Agnes P Chan, Pablo Rabinowicz, Katrien M Devos, Jan Dvorak.
Abstract
Wheat and maize genes were hypothesized to be clustered into islands, but the hypothesis was not statistically tested. The hypothesis is statistically tested here in four grass species differing in genome size: Brachypodium distachyon, Oryza sativa, Sorghum bicolor, and Aegilops tauschii. Density functions obtained under a model where gene locations follow a homogeneous Poisson process, and thus are not clustered, are compared with a model-free situation quantified through a non-parametric density estimate. A simple homogeneous Poisson model for gene locations is not rejected for the small O. sativa and B. distachyon genomes, indicating that genes are distributed largely uniformly in those species, but is rejected for the larger S. bicolor and Ae. tauschii genomes, providing evidence for clustering of genes into islands. It is proposed to call the gene islands "gene insulae" to distinguish them from other types of gene clustering that have been proposed. An average S. bicolor and Ae. tauschii insula is estimated to contain 3.7 and 3.9 genes, with an average intergenic distance within an insula of 2.1 and 16.5 kb, respectively. Inter-insular distances are greater than 8 and 81 kb and average 15.1 and 205 kb in S. bicolor and Ae. tauschii, respectively. A greater gene density observed in the distal regions of the Ae. tauschii chromosomes is shown to be primarily caused by shortening of inter-insular distances. The comparison of the four grass genomes suggests that gene locations are largely a function of a homogeneous Poisson process in small genomes. Nonrandom insertions of LTR retroelements during genome expansion create gene insulae, which become less dense and further apart with the increase in genome size. High concordance in relative lengths of orthologous intergenic distances among the investigated genomes, including the maize genome, suggests functional constraints on gene distribution in the grass genomes.
Year: 2013 PMID: 23326580 PMCID: PMC3543359 DOI: 10.1371/journal.pone.0054101
Source DB: PubMed Journal: PLoS One ISSN: 1932-6203 Impact factor: 3.240
Figure 1. Density functions for intergenic distances in B. distachyon (A), rice (B), sorghum (C), and Ae. tauschii (D).
Shown are the exponential density fitted by maximum likelihood, Equation (2) (solid), and the non-parametric density estimate, Equation (4) (dashed), with bandwidth h = 1.25 (A), h = 1.75 (B), h = 1.50 (C), and h = 15.00 (D).
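The comparison drawn in Figure 1 can be sketched as follows. This is a minimal illustration, not the authors' code: Equations (2) and (4) are not reproduced in this record, so the Gaussian kernel, the synthetic distances, and the function names `exponential_mle_rate`, `exponential_density`, and `kernel_density` are all assumptions made for the example.

```python
import math
import random


def exponential_mle_rate(distances):
    """Maximum likelihood rate of an exponential model: 1 / sample mean."""
    return len(distances) / sum(distances)


def exponential_density(x, rate):
    """Exponential density at x (zero for negative x)."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0


def kernel_density(x, distances, h):
    """Kernel density estimate at x with bandwidth h.

    A Gaussian kernel is assumed here; the kernel used in the paper
    is not stated in this record.
    """
    norm = len(distances) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in distances) / norm


# Synthetic intergenic distances (kb), drawn from an exponential model
# purely for illustration; these are not data from the study.
rng = random.Random(0)
distances = [rng.expovariate(0.17) for _ in range(60)]

rate = exponential_mle_rate(distances)
for x in (0.0, 5.0, 10.0, 20.0):
    parametric = exponential_density(x, rate)
    nonparametric = kernel_density(x, distances, h=1.5)
    print(f"x={x:5.1f} kb  exponential={parametric:.4f}  kernel={nonparametric:.4f}")
```

When distances really are exponential, the two curves should track each other apart from smoothing effects near zero; a second bump in the kernel estimate is the kind of departure the figure is meant to reveal.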
Summary of test results for the null hypothesis that gene locations are uniformly distributed in the four species.
| Species | No. intergenic distances | Estimated rate parameter | Estimated standard error of the rate estimate | p-value |
| B. distachyon | 52 | 0.174 | 0.024 | 0.330 |
| Rice | 57 | 0.167 | 0.022 | 0.064 |
| Sorghum | 62 | 0.170 | 0.022 | 0.007 |
| Ae. tauschii | 81 | 0.012 | 0.001 | 0.000 |
The estimated exponential rate parameter is the maximum likelihood estimator of the rate, as given in Equation (1).
The p-value is based on a χ² goodness-of-fit test.
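As a rough check on the table, the usual large-sample standard error of an exponential-rate estimate, rate / sqrt(n), gives the reported standard errors to three decimals when applied to the tabulated rates and sample sizes. Whether the authors used exactly this formula is not stated in this record. The χ² test below is likewise only a sketch: the equal-probability binning, the choice of six bins, and the function names are assumptions, and the p-value is approximate.

```python
import math


def rate_standard_error(rate, n):
    """Large-sample standard error of the exponential-rate MLE: rate / sqrt(n)."""
    return rate / math.sqrt(n)


def chi_square_gof_exponential(distances, n_bins=6):
    """Chi-square goodness-of-fit test of a fitted exponential model.

    Bins have equal probability under the fitted exponential (an assumed
    binning scheme). Returns (statistic, degrees of freedom, p-value).
    """
    if n_bins < 4 or n_bins % 2:
        raise ValueError("n_bins must be an even number >= 4")
    n = len(distances)
    rate = n / sum(distances)
    # Interior bin edges at the k/n_bins quantiles of the fitted exponential.
    edges = [-math.log(1.0 - k / n_bins) / rate for k in range(1, n_bins)]
    observed = [0] * n_bins
    for d in distances:
        observed[sum(d > e for e in edges)] += 1
    expected = n / n_bins
    stat = sum((o - expected) ** 2 / expected for o in observed)
    df = n_bins - 2  # bins minus 1, minus 1 estimated parameter
    # Closed-form chi-square survival function, valid for even df.
    half = stat / 2.0
    p_value = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(df // 2))
    return stat, df, p_value


# Rates and sample sizes as reported in the table above.
reported = {
    "B. distachyon": (52, 0.174),
    "Rice": (57, 0.167),
    "Sorghum": (62, 0.170),
    "Ae. tauschii": (81, 0.012),
}
for species, (n, rate) in reported.items():
    print(f"{species}: standard error ~ {rate_standard_error(rate, n):.3f}")
```

An even bin count keeps the degrees of freedom even, which lets the p-value be computed in closed form without a statistics library.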
Insular structure in sorghum and Ae. tauschii.
| Statistic | Unit | Sorghum | Ae. tauschii |
| First local minimum | kb | 8.0 | 81.0 |
| Mean of distances shorter than the minimum | kb | 2.1 | 16.7 |
| Mean of distances longer than the minimum | kb | 15.1 | 205.2 |
| Shorter distances | % | 70.0 | 63.0 |
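The statistics in this table can be read as a two-step recipe: locate the first local minimum of the estimated density of intergenic distances, then summarize the distances on either side of it. A minimal sketch of that recipe follows, assuming a Gaussian kernel and a simple grid search. The function names, the grid step, and the toy distances are hypothetical and are not taken from the paper.

```python
import math


def kernel_density(x, data, h):
    """Gaussian kernel density estimate at x with bandwidth h (assumed kernel)."""
    norm = len(data) * h * math.sqrt(2.0 * math.pi)
    return sum(math.exp(-0.5 * ((x - d) / h) ** 2) for d in data) / norm


def first_local_minimum(data, h, step=0.5):
    """Return the first grid point where the density estimate has a local minimum.

    The grid runs from 0 to max(data) in increments of `step`.
    Returns None if no interior local minimum is found.
    """
    n_steps = int(max(data) / step)
    grid = [i * step for i in range(n_steps + 1)]
    dens = [kernel_density(x, data, h) for x in grid]
    for i in range(1, len(grid) - 1):
        if dens[i] < dens[i - 1] and dens[i] <= dens[i + 1]:
            return grid[i]
    return None


def insular_summary(data, threshold):
    """Summarize distances below the threshold (within insulae) and at or above it."""
    short = [d for d in data if d < threshold]
    long_ = [d for d in data if d >= threshold]
    return {
        "mean_short": sum(short) / len(short),
        "mean_long": sum(long_) / len(long_),
        "percent_short": 100.0 * len(short) / len(data),
    }


# Toy distances (kb) with two well-separated groups; not data from the study.
toy = [1.0, 2.0, 3.0, 2.0, 1.5, 2.5, 14.0, 15.0, 16.0, 15.5]
threshold = first_local_minimum(toy, h=1.0)
print(threshold, insular_summary(toy, threshold))
```

The result depends on the bandwidth: a smaller h can create spurious minima, and a larger h can smooth the valley away entirely.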
Fitting the full regression model specified in equation (5) to the Ae. tauschii data.
| Parameter | Estimate | p-value | 95% confidence interval |
| | 8.3 | 0.82 | (−64.5, 81.1) |
| | −5.5 | 0.92 | (−102.8, 91.9) |
| | 387.8 | ∼0.00 | (279.9, 495.7) |
| | −257.8 | ∼0.00 | (−403.0, −112.5) |
Figure 2. Relationships of inter-insular distances (A), intra-insular distances (B), and the number of genes per insula (C) to gene location along the centromere-telomere axes of Ae. tauschii chromosome arms.
Panel (A) shows the regression line fitted under Model (7), relating inter-insular distance to gene location along the centromere-telomere axes of Ae. tauschii chromosome arms.
Fitting the regression model (7) to the Ae. tauschii data.
| Parameter | Estimate | p-value | 95% confidence interval |
| | 427.2 | ∼0.00 | (274.3, 580.1) |
| | −300.8 | 0.01 | (−502.9, −98.6) |
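Model (7) itself is not reproduced in this record. Purely as an illustration of how estimates and 95% confidence intervals of this kind can be obtained, the sketch below fits a straight line by ordinary least squares. The straight-line form, the normal-approximation interval, the function name `ols_with_ci`, and the toy data are all assumptions, and the code does not claim to reproduce the tabulated values.

```python
import math
from statistics import NormalDist


def ols_with_ci(x, y, level=0.95):
    """Least-squares fit of y = intercept + slope * x with confidence intervals.

    Intervals use a normal approximation (estimate +/- z * standard error),
    which is an assumption; a t-based interval would be slightly wider
    for small samples.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    sigma2 = sum(r * r for r in residuals) / (n - 2)  # residual variance
    se_slope = math.sqrt(sigma2 / sxx)
    se_intercept = math.sqrt(sigma2 * (1.0 / n + mean_x ** 2 / sxx))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return {
        "intercept": (intercept, intercept - z * se_intercept, intercept + z * se_intercept),
        "slope": (slope, slope - z * se_slope, slope + z * se_slope),
    }


# Toy data: hypothetical relative positions along a chromosome arm and
# hypothetical inter-insular distances (kb); not data from the study.
position = [0.1, 0.2, 0.4, 0.5, 0.7, 0.8, 0.9]
distance = [410.0, 350.0, 320.0, 260.0, 230.0, 180.0, 170.0]
for name, (estimate, lower, upper) in ols_with_ci(position, distance).items():
    print(f"{name}: {estimate:.1f} ({lower:.1f}, {upper:.1f})")
```

An interval that excludes zero corresponds to a small p-value for that parameter, which is the pattern visible in both regression tables.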