Fátima Al-Shahrour, Pablo Minguez, Tomás Marqués-Bonet, Elodie Gazave, Arcadi Navarro, Joaquín Dopazo.
Abstract
A growing body of evidence shows that genes are not distributed randomly across eukaryotic chromosomes but are instead arranged in functional neighborhoods. Nevertheless, the driving force that originated and maintains such neighborhoods is still a matter of controversy. We present the first detailed multispecies cartography of genome regions enriched in genes with related functions and study the evolutionary implications of such clustering. Our results indicate that the chromosomes of higher eukaryotic genomes contain up to 12% of genes arranged in functional neighborhoods, with a high level of gene co-expression, which are consistently distributed in phylogenies. Unexpectedly, neighborhoods with homologous functions are formed by different (non-orthologous) genes in different species. In fact, rather than being conserved, functional neighborhoods present a higher degree of synteny breaks than the genome average. This scenario is compatible with the existence of selective pressures optimizing the coordinated transcription of blocks of functionally related genes. If these neighborhoods were broken by chromosomal rearrangements, selection would favor further rearrangements reconstructing other neighborhoods of similar function. The picture arising from this study is a dynamic genomic landscape with a high level of functional organization.
Year: 2010 PMID: 20949098 PMCID: PMC2951340 DOI: 10.1371/journal.pcbi.1000953
Source DB: PubMed Journal: PLoS Comput Biol ISSN: 1553-734X Impact factor: 4.475
Characteristics of functional neighborhoods.
| | | | | | | | | | |
| Number of functional neighborhoods | 265 | 208 | 315 | 267 | 25 | 55 | 146 | 163 | 193 |
| Percentage of genes in functional neighborhoods | 7.2% | 4.71% | 11.9% | 12.8% | 1.0% | 1.4% | 5.3% | 5.8% | 3.0% |
| Mean GC content (p-value) | 42.6% (<10⁻³⁰) | 41.4% (0.0352) | 42.29% (0.0019) | 42.72% (<10⁻³⁰) | 41.55% (NS) | 36.37% (NS) | 42.39% (NS) | 35.8% (0.0015) | 35.66% (NS) |
| Mean gene density in functional neighborhoods | 85.84 (<10⁻³⁰) | 57.35 (<10⁻³⁰) | 70.77 (<10⁻³⁰) | 70.26 (<10⁻³⁰) | 69.32 (0.0154) | 61.96 (0.0014) | 54.85 (0.0061) | 63.59 (<10⁻³⁰) | 52.07 (NS) |
| p-value of K-S test of co-expression in functional neighborhoods | 7×10⁻¹⁹ | 2×10⁻²⁹ | 7×10⁻¹¹ | 1.2×10⁻⁸ | NA | 2.8×10⁻¹⁵ | 0.01 | 1.3×10⁻¹⁷ | 1.5×10⁻⁵ |
Functional neighborhoods display both a higher GC content and a higher mean gene density, features that have been described as characteristic of tightly regulated chromosomal domains (28).
These species are seriously affected by poor gene annotation.
Only genes annotated with significantly clustered GO terms are considered here. Genes within the limits of a functional neighborhood that do not match the significant GO term are not considered as members of the cluster.
Total gene density in the functional neighborhoods is reported, including all genes within the limits of the neighborhood regardless of the GO terms associated with them. Window size was selected to include approximately 50 genes per window, with slight variations among organisms.
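The two notes above distinguish between the genes counted as members of a neighborhood (those carrying the significant GO term) and all genes that fall within its limits (used for total gene density). A minimal Python sketch of that bookkeeping follows. The `Gene` record, the coordinates and the GO labels are hypothetical toy values chosen for illustration, not data or code from the study.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Gene:
    # Hypothetical record: a name, a single coordinate (bp) and GO annotations.
    name: str
    position: int
    go_terms: frozenset = field(default_factory=frozenset)


def summarize_neighborhood(genes, start, end, go_term):
    """Split the genes lying in [start, end] into cluster members and all genes.

    members: genes inside the limits that carry the significant GO term.
    inside:  every gene inside the limits, whatever its annotation.
    """
    inside = [g for g in genes if start <= g.position <= end]
    members = [g for g in inside if go_term in g.go_terms]
    return members, inside


# Toy example with made-up GO labels.
genes = [
    Gene("g1", 100, frozenset({"GO:A"})),
    Gene("g2", 200, frozenset({"GO:B"})),
    Gene("g3", 300, frozenset({"GO:A", "GO:B"})),
    Gene("g4", 900, frozenset({"GO:A"})),
]
members, inside = summarize_neighborhood(genes, 50, 500, "GO:A")
member_names = [g.name for g in members]   # genes counted as cluster members
total_in_region = len(inside)              # genes counted for total density
```

In this toy case `g2` lies within the limits and contributes to total gene density, but it is not a member of the cluster because it lacks the significant term.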
Number of genes in functional neighborhoods.
| GO term | genes | % | Orth. | genes | % | Orth. | genes | % | Orth. | genes | % | Orth. | genes | % | Orth. | genes | % | Orth. | genes | % | Orth. | genes | % | Orth. |
| organismal physiological process | 1885 | 15.23 | - | 1673 | 10.82 | 69.61 | 1042 | 11.23 | 31.62 | 604 | 20.20 | 21.31 | 120 | 4.17 | 0 | 164 | 16.46 | 0 | 156 | 35.90 | 0 | 475 | 0 | 0 |
| regulation of physiological process | 2917 | 8.12 | - | 2671 | 4.34 | 65.52 | 2318 | 1.77 | 31.71 | 1240 | 1.94 | 45.83 | 432 | 0 | 0 | 718 | 3.48 | 0 | 650 | 14.31 | 0 | 903 | 3.77 | 2.94 |
| regulation of cellular process | 2980 | 7.89 | - | 2765 | 4.30 | 67.23 | 2367 | 1.73 | 31.71 | 1247 | 1.92 | 45.83 | 437 | 0 | 0 | 728 | 5.08 | 27.03 | 607 | 16.47 | 0 | 932 | 2.58 | 4.17 |
| sensory perception | 603 | 35.66 | - | 455 | 29.45 | 60.45 | 357 | 36.96 | 37.89 | 125 | 37.60 | 27.66 | 26 | 19.23 | 20.00 | 66 | 37.88 | 0 | 90 | 61.11 | 0 | 194 | 5.67 | 0 |
| coagulation | 85 | 3.53 | - | 57 | 0 | 0 | 50 | 6.00 | 100 | 34 | 17.65 | 0 | 11 | 54.55 | 0 | 21 | 0 | 0 | ||||||
| response to external stimulus | 464 | 6.68 | - | 433 | 6.93 | 90 | 311 | 6.43 | 70 | 140 | 11.43 | 56.25 | 32 | 15.62 | 0 | 48 | 10.42 | 0 | ||||||
| response to abiotic stimulus | 384 | 9.64 | - | 414 | 5.31 | 100 | 280 | 9.29 | 50 | 148 | 8.78 | 61.54 | 27 | 0 | 0 | 75 | 13.33 | 0 | ||||||
| cell adhesion | 558 | 5.73 | - | 431 | 4.18 | 100 | 428 | 10.98 | 72.34 | 251 | 13.94 | 54.29 | 94 | 0 | 0 | 85 | 20 | 29.41 | ||||||
| organ development | 524 | 2.48 | - | 798 | 0 | 0 | 752 | 0.93 | 0 | 157 | 0 | 0 | 45 | 0 | 0 | 157 | 3.82 | 0 | ||||||
| sex differentiation | 40 | 7.50 | - | 62 | 4.84 | 100 | 46 | 6.52 | 100 | 15 | 13.33 | 100 | 1 | 0 | 0 | 5 | 80 | 50 | ||||||
| reproductive physiological process | 50 | 18.97 | - | 83 | 0 | 0 | 39 | 10.26 | 75 | 12 | 25 | 66.67 | ||||||||||||
| physiological interaction between organisms | 52 | 21.15 | - | 56 | 8.93 | 100 | 16 | 25 | 75 | 6 | 50 | 66.67 | ||||||||||||
| behavior | 185 | 18.92 | - | 250 | 12.40 | 83.87 | 177 | 9.60 | 74.67 | 58 | 20.69 | 66.67 | ||||||||||||
| Average | | 12.42 | - | | 9.15 | | | 10.52 | | | 18.54 | | | 23.39 | | | 21.16 | | | 31.95 | | | 4.01 | |
The leftmost column corresponds to the GO terms defining functional neighborhoods. The remaining columns correspond to the analyzed species. Each species' column is divided into three sub-columns: 1) “genes”, the total number of genes in the genome of that species annotated with the GO term given in the first column of the row; 2) “%”, the percentage of these genes found within a functional neighborhood; and 3) “Orth.”, the percentage of the genes within the functional neighborhood that are orthologous to their human counterparts.
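The three sub-columns can be expressed as simple set operations. The sketch below is an illustration under stated assumptions, not the authors' pipeline. The function name `go_term_columns` and the toy gene identifiers are hypothetical. The set `orthologous_to_human` is assumed to hold the genes of the species already identified as orthologous to their human counterparts, and how that set is obtained is outside the scope of the sketch.

```python
def go_term_columns(annotated, in_neighborhood, orthologous_to_human):
    """Return the (genes, %, Orth.) triple for one GO term in one species.

    annotated:            genes of the species annotated with the GO term
    in_neighborhood:      genes located in a functional neighborhood
    orthologous_to_human: genes assumed to be orthologous to human counterparts
    """
    clustered = annotated & in_neighborhood
    n_genes = len(annotated)
    # "%": share of the annotated genes that sit in a functional neighborhood.
    pct = 100.0 * len(clustered) / n_genes if n_genes else 0.0
    # "Orth.": share of those clustered genes that are orthologous to human.
    orth = (
        100.0 * len(clustered & orthologous_to_human) / len(clustered)
        if clustered
        else 0.0
    )
    return n_genes, round(pct, 2), round(orth, 2)


# Toy data: 20 annotated genes, 5 of them clustered, 2 of those orthologous.
annotated = {f"g{i}" for i in range(20)}
in_neighborhood = {"g0", "g1", "g2", "g3", "g4", "x1"}
orthologous_to_human = {"g0", "g1", "g7"}
result = go_term_columns(annotated, in_neighborhood, orthologous_to_human)
```

Returning zero when a denominator is empty is a convention chosen for the sketch; the table does not say how such cases were handled.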
Figure 1. Distribution of functions present in functional neighborhoods along the phylogeny.
The point at which a function makes up a functional neighborhood has been deduced from the species sharing functional clusters with this particular GO term. Boxes in yellow contain GO terms unique to taxa, boxes in blue contain GO terms common to clades, and boxes in pink contain GO terms lost in these lineages. In the figure, terms labeled with P were not found in ape, those with G were not found in chicken, those with R were not found in rat, and those with F were not found in fish.
Segmental Duplication (SD) analysis.
| Species | Number of SDs in functional neighborhoods | Number of SDs in the rest of the genome | Total size (in Mbp) of functional neighborhoods | Total size (in Mbp) of the rest of the genome | Observed proportion of SDs in functional neighborhoods | Expected proportion of SDs in functional neighborhoods | Observed proportion of SDs in the rest of the genome | Expected proportion of SDs in the rest of the genome | Total genome size in Mbp (golden path) | P-value |
| | 1630 | 3795 | 932.50 | 1957.03 | 0.3004 | 0.3227 | 0.6995 | 0.6773 | 2889.53 | n.s. |
| | 1851 | 3399 | 952.35 | 1628.47 | 0.3526 | 0.3690 | 0.6474 | 0.6310 | 2580.82 | n.s. |
| | 602 | 13366 | 70.00 | 983.97 | 0.0431 | 0.0664 | 0.9569 | 0.9336 | 1053.97 | n.s. |
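The observed and expected proportions in this table can be recomputed from its count and size columns. The sketch below assumes that the expected proportion is simply the fraction of the genome covered by functional neighborhoods, which matches the tabulated values. The function name is hypothetical, and the test behind the P-value column is not stated here, so it is not reproduced.

```python
def sd_proportions(sd_in, sd_out, size_in_mbp, genome_mbp):
    """Observed and expected proportion of SDs in functional neighborhoods.

    observed: SDs inside neighborhoods divided by all SDs.
    expected: neighborhood size divided by total genome size
              (assumption: SDs spread uniformly along the genome).
    """
    observed = sd_in / (sd_in + sd_out)
    expected = size_in_mbp / genome_mbp
    return observed, expected


# Rows as given in the table: (SDs inside, SDs outside, size inside, genome size).
rows = [
    (1630, 3795, 932.50, 2889.53),
    (1851, 3399, 952.35, 2580.82),
    (602, 13366, 70.00, 1053.97),
]
results = [sd_proportions(*row) for row in rows]
```

In all three rows the observed proportion falls slightly below the expected one, in line with the non-significant P-values reported.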
Functional neighborhoods shared between humans and chimpanzees.
Data from Newman et al. (2005).
| | Length (Mbp) | % of total length | Observed BoS | % of total BoS | BoS density (per Mbp) | Expected BoS | Chi-square value | P-value (Chi-square) |
| | 754.72 | 0.25 | 118 | 0.35 | 0.1563 | 82 | | |
| | 2325.70 | 0.75 | 216 | 0.65 | 0.0929 | 252 | | |
| | 3080.42 | | 334 | | | 334 | 21.17 | 4.2×10⁻⁶ |
Density of breaks of synteny (BoS) in these neighborhoods vs. the rest of the genome. The density of breaks of synteny is higher in shared neighborhoods.
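The chi-square comparison can be sketched in a few lines of Python. The sketch assumes that expected BoS counts are proportional to region length and that the test has one degree of freedom; under those assumptions it gives values close to those in the table. The function name `bos_chi_square` is hypothetical. For one degree of freedom the chi-square upper-tail probability equals `erfc(sqrt(x / 2))`, so only the standard library is needed.

```python
import math


def bos_chi_square(lengths_mbp, observed_bos):
    """Chi-square test of BoS counts against a length-proportional expectation.

    Returns (expected counts, chi-square statistic, p-value, BoS densities).
    Assumes exactly two regions, hence one degree of freedom.
    """
    total_len = sum(lengths_mbp)
    total_bos = sum(observed_bos)
    # Expected counts if breaks were spread uniformly along the genome.
    expected = [total_bos * length / total_len for length in lengths_mbp]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed_bos, expected))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    # BoS per Mbp in each region.
    densities = [o / length for o, length in zip(observed_bos, lengths_mbp)]
    return expected, chi2, p_value, densities


# Lengths (Mbp) and observed BoS for the two regions in the table.
expected, chi2, p_value, densities = bos_chi_square([754.72, 2325.70], [118, 216])
```

With these inputs the expected counts round to 82 and 252, and the statistic comes out near the reported 21.17. Note that using expected counts rounded to integers would shift the statistic slightly.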
Functional neighborhoods shared between humans and chimpanzees.
Data from Newman et al. (2005).
| Neighborhoods | Length (Mbp) | % of total length | Observed BoS | % of total BoS | BoS density (per Mbp) | Expected BoS | Chi-square value | P-value (Chi-square) |
| | 383.02 | 0.51 | 80 | 0.68 | 0.2089 | 60 | | |
| | 371.70 | 0.49 | 38 | 0.32 | 0.1022 | 58 | | |
| | 754.72 | | 118 | | | 117 | 13.56 | 2.31×10⁻⁴ |
Density of breaks of synteny (BoS) in neighborhoods with high orthology vs. neighborhoods with low orthology. Highly orthologous neighborhoods show a lower density of synteny breaks.